DevTeam: Claude Code Multi-Agent Dev System

devteam-michael-harris · last commit 2026-02-26

Primitive shape 21 total
Commands 21
00

Summary

DevTeam: Claude Code Multi-Agent Dev System (michael-harris) — Summary

DevTeam (11 stars) is a Claude Code plugin providing 127 specialized AI agents with SQLite-backed state persistence, dynamic model escalation (haiku → sonnet → opus based on complexity scoring), and a multi-layer quality enforcement loop. The architecture is distinguished by three mechanisms absent from most peers: (1) a Bug Council — 5 specialist agents (Root Cause Analyst, Code Archaeologist, Pattern Matcher, Systems Thinker, Adversarial Tester) convened for complex bugs; (2) a Task Loop that iterates implementation → quality gates → fix until all gates pass with anti-abandonment logic; (3) YAML-based agent capability configuration in .devteam/*.yaml with scope enforcement (allowed files, forbidden directories, max_files_changed). Planning uses an interview-driven flow that asks clarifying questions before generating PRD + sprint-organized tasks. Worktrees are used for parallel sprint tracks. Hooks cover 7 events (PreToolUse, PostToolUse, Stop, SubagentStart, SubagentStop, TaskCompleted, PreCompact) with both command and LLM-prompt hook types. Compared to seeds, it is closest to claude-flow in component count and complexity, but is a Claude Code plugin rather than an npm library, and uniquely adds model escalation on failure and the Bug Council pattern.

01

Overview

DevTeam: Claude Code Multi-Agent Dev System (michael-harris) — Overview

Origin

Created by Michael Harris (GitHub: michael-harris). Claude Code plugin. Last commit February 2026. Implementation plans for future hooks and MCP server integration are documented in IMPLEMENTATION_PLAN_HOOKS.md and IMPLEMENTATION_PLAN_MCP_SERVER.md.

Philosophy

"A multi-agent Claude Code development workflow for planning and executing sprints in a fully automated fashion."

The framework treats software development as a managed process: interview first, research patterns, generate PRD, decompose into sprints, execute with quality loops, escalate on failure. The key philosophical commitment is anti-abandonment — agents cannot give up. The system escalates models and activates the Bug Council rather than surfacing failure to the user.

"127 specialized agents for every aspect of software development" "Claude Code itself is the runtime — it reads the agent markdown files, selects appropriate agents based on task characteristics, and executes them as subagents."

Model Escalation Philosophy

The system explicitly routes different LLM capabilities to different complexity tiers:

  • Haiku (fast, cheap): simple tasks (score 0-4)
  • Sonnet (balanced): moderate tasks (score 5-8)
  • Opus (powerful, expensive): complex tasks (score 9-14), security, architecture

This is the most explicit multi-model routing system in the batch.

Version

Analyzed from main branch (no semver); last pushed 2026-02-26.

02

Architecture

DevTeam: Claude Code Multi-Agent Dev System (michael-harris) — Architecture

Distribution

  • Type: Claude Code plugin
  • Plugin manifest: .claude-plugin/ directory
  • Language: JavaScript/Node.js (hooks), Markdown (agents/commands/skills)

Installation

git clone https://github.com/michael-harris/claude-code-multi-agent-dev-system.git
path/to/repo/install-local.sh

Directory Tree

michael-harris/claude-code-multi-agent-dev-system/
├── .claude-plugin/          # Plugin manifest
├── .claude/                 # Claude Code config
├── .devteam/                # YAML configuration files (27 files)
│   ├── config.yaml          # Main project configuration
│   ├── model-selection.md   # Model escalation rules
│   ├── agent-selection.md   # Agent matching rules
│   ├── task-loop-config.yaml
│   ├── sprint-loop-config.yaml
│   ├── scope-enforcement.md
│   └── ... (21 more config files)
├── agents/                  # 127 agent directories (24 categories)
│   ├── orchestration/       # 11 orchestration agents
│   ├── planning/            # Planning agents
│   ├── diagnosis/           # 5 Bug Council agents
│   ├── backend/             # Language-specific agents
│   ├── frontend/            # React, Vue, Svelte, Angular
│   ├── quality/             # Test, security, performance, accessibility
│   └── ... (17 more categories)
├── commands/                # 21 slash commands
├── skills/                  # 21 skill directories
├── hooks/                   # hooks.json + shell/PS1 scripts
├── scripts/                 # Database + utility scripts
├── templates/               # Project templates
├── mcp-configs/             # MCP server configurations
└── settings.json            # Plugin settings schema

Required Runtime

  • Claude Code
  • Node.js (for hooks)
  • SQLite3 (for state persistence)
  • Bash or PowerShell (hook scripts cross-platform)

Target AI Tools

Claude Code exclusively (hook events are Claude Code-specific)

03

Components

DevTeam: Claude Code Multi-Agent Dev System — Components

Commands (21)

Command Purpose
devteam-plan Interview → research → PRD → tasks → sprints
devteam-implement Execute plan with Task Loop quality enforcement
devteam-bug Interview → diagnose → fix → verify bug flow
devteam-issue Fix a GitHub issue by number
devteam-issue-new Create a new GitHub issue
devteam-review Code review
devteam-test Run tests
devteam-design Design phase
devteam-design-drift Detect design drift
devteam-select Select specific agents
devteam-config Configure devteam settings
devteam-status Show current status
devteam-list List agents and capabilities
devteam-logs View session logs
devteam-help Show help
devteam-reset Reset devteam state
merge-tracks Merge parallel worktree tracks
worktree-cleanup Clean up worktrees
worktree-list List worktrees
worktree-status Worktree status

Agent Categories (127 total across 24 categories)

Orchestration (11)

autonomous-controller, bug-council-orchestrator, code-review-coordinator, quality-gate-enforcer, requirements-validator, scope-validator, sprint-loop, sprint-orchestrator, task-loop, track-merger, workflow-compliance

Bug Council (5)

root-cause-analyst, code-archaeologist, pattern-matcher, systems-thinker, adversarial-tester

Planning

Interview agents, PRD generator, task decomposer, sprint planner

Backend (by language)

Python (FastAPI/Django/Flask), TypeScript (Express/NestJS/Fastify), Go, Java (Spring Boot), C# (ASP.NET), Ruby (Rails), PHP (Laravel)

Frontend

React, Vue, Svelte, Angular specialists; accessibility specialist; performance auditor

Quality

Test writer, security auditor, performance auditor, accessibility specialist, E2E tester

Other categories

Architecture, database, DevOps, mobile, data-AI, SRE, security, product, devrel, scripting, support, ux, specialized

Hooks (7 event types)

Event                     Hook                                        Purpose
PreToolUse (Edit|Write)   pre-tool-use-hook + LLM prompt scope check  Validate file scope; check iteration limits
PostToolUse (Bash)        post-tool-use-hook                          Detect quality gate results
PostToolUse (Edit|Write)  post-tool-use-hook                          Track file modifications in state
Stop                      stop-hook                                   Block exit unless EXIT_SIGNAL or quality gates pass
SubagentStart             log-event                                   Track agent start time
SubagentStop              log-event                                   Track agent completion
TaskCompleted             log-event                                   Log task completion
PreCompact                pre-compact                                 Save state before compaction

YAML Configuration (.devteam/)

27 configuration files including: config.yaml, model-selection.md, agent-capabilities.yaml, task-loop-config.yaml, sprint-loop-config.yaml, scope-enforcement.md, parallel-execution.md, two-phase-architecture.yaml, context-management.yaml, database-config.yaml...

Scripts

Node.js hook runner, SQLite database management scripts, session-start/stop scripts

05

Prompts

DevTeam: Claude Code Multi-Agent Dev System — Prompts & Instructions

Agent Prompt Structure

Every agent markdown file begins with a YAML front-matter block followed by a natural-language body:

---
name: task-loop
description: "Manages iterative quality loop for single task execution with model escalation"
model: opus
tools: Read, Glob, Grep, Bash, Task
memory: project
---

The body is the agent's full system prompt. The front-matter is consumed by Claude Code's plugin loader; the body is injected into the agent's context window at invocation.
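
As a rough illustration of the front-matter/body split described above, the sketch below separates the two parts of an agent file. It is not Claude Code's plugin loader: `split_front_matter` is a hypothetical helper, and it only handles flat `key: value` lines rather than full YAML.

```python
def split_front_matter(text: str) -> tuple[dict[str, str], str]:
    """Split an agent markdown file into (front-matter fields, prompt body).

    Hypothetical helper: handles only flat `key: value` lines, not full YAML.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no front-matter block present
    try:
        end = lines.index("---", 1)  # closing delimiter
    except ValueError:
        return {}, text  # unterminated block: treat everything as body
    fields = {}
    for line in lines[1:end]:
        if ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip().strip('"')
    body = "\n".join(lines[end + 1:]).strip()
    return fields, body
```

Applied to the task-loop example, this would yield `model` as `opus` and `memory` as `project`, with everything after the closing `---` returned as the prompt body.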

Representative Agent Prompts

Task Loop (orchestration:task-loop)

The Task Loop prompt establishes strict role separation:

"The Task Loop only handles looping, iteration, and escalation. All actual work is delegated to specialists."

The agent is explicitly forbidden from writing code, running tests directly, or making implementation decisions. It delegates every action to specialist sub-agents and evaluates results to decide: iterate, escalate, or complete.

Loop directive (verbatim):

Execute → Quality Gates → Pass? → Complete
                       ↑  Fail → Fix Tasks ← Model Escalation

Sprint Orchestrator (orchestration:sprint-orchestrator)

The sprint orchestrator includes a mandatory autonomous execution block:

CRITICAL: Autonomous Execution Mode
✅ Continue through all tasks until sprint completes
✅ Automatically call agents to fix issues when validation fails
✅ Escalate model automatically when needed (sonnet → opus on failure)
✅ Run all quality gates and fix iterations without asking
❌ DO NOT pause execution to ask for permission
❌ DO NOT stop between tasks
❌ DO NOT request confirmation to continue

Hard iteration limit: 10 per task. Tasks failing after 10 iterations are marked failed; the sprint continues with non-blocked tasks.

Scope Constraint Injection (per-task)

Every agent spawned for a specific task receives inline scope constraints injected from the task's JSON definition:

## CRITICAL: Scope Constraints

You are authorized to modify ONLY these files:
${task.scope.allowed_files}

You are FORBIDDEN from modifying:
${task.scope.forbidden_files}
${task.scope.forbidden_directories}

The max_files_changed field sets the upper bound; violations are blocked by the PreToolUse hook.
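
The injection step can be pictured as plain string rendering from the task's scope record. The sketch below is an assumption about how such a block might be produced: `render_scope_block` is a hypothetical name, and only the field names (`allowed_files`, `forbidden_files`, `forbidden_directories`) come from the template above.

```python
def render_scope_block(scope: dict) -> str:
    """Render the scope-constraint section for one task (illustrative only)."""
    allowed = "\n".join(scope.get("allowed_files", []))
    forbidden = "\n".join(
        scope.get("forbidden_files", []) + scope.get("forbidden_directories", [])
    )
    return (
        "## CRITICAL: Scope Constraints\n\n"
        "You are authorized to modify ONLY these files:\n"
        f"{allowed}\n\n"
        "You are FORBIDDEN from modifying:\n"
        f"{forbidden}\n"
    )
```

The rendered text would then be prepended to the specialist agent's task prompt, so the same scope data drives both the prompt and the hook check.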

LLM-Type Prompt Hook (Unique Pattern)

DevTeam uses a "type": "prompt" hook entry — an LLM-evaluated hook that runs before every Edit/Write:

{
  "type": "prompt",
  "prompt": "The agent is about to modify a file. Check if this modification is within the current task scope. The task scope should be defined in the agent's prompt context. If the file being modified is clearly unrelated to the task (e.g., modifying auth code when the task is about UI styling), return ok:false with a reason. If it's reasonable, return ok:true.",
  "description": "LLM-based scope pre-check — catches obvious scope violations before they happen"
}

This is distinct from the command-type hook that also fires on Edit/Write. The prompt hook uses Claude itself (not a bash script) to evaluate semantic scope violations. A file-level command hook handles iteration limits and circuit-breaker logic; the prompt hook catches contextual scope drift.

Model Selection Prompts

Model routing uses a Python-pseudocode algorithm in .devteam/model-selection.md that computes a 0–14 complexity score across five dimensions:

Factor            Points  Details
Files affected    0–3     1=0, 2-3=1, 4-6=2, 7+=3
Estimated lines   0–3     <50=0, 50-149=1, 150-299=2, 300+=3
New dependencies  0–2     0=0, 1-2=1, 3+=2
Task type         0–3     docs=0, test=1, impl=2, arch=3
Risk flags        0–3     +1 each: security, external_integration, breaking_change

Task-type overrides always bypass the algorithm: security → opus, architecture → opus, documentation → haiku, testing → sonnet.
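
To make the scoring table concrete, here is a minimal sketch of the calculation. It is a reading of the table, not the pseudocode from .devteam/model-selection.md: the `Task` class, `bucket` helper, and function names are hypothetical, and treating "security" as a task type for the override is an assumption, since the table lists it only as a risk flag.

```python
from dataclasses import dataclass

RISK_FLAGS = {"security", "external_integration", "breaking_change"}
TYPE_POINTS = {"docs": 0, "test": 1, "impl": 2, "arch": 3}
# Overrides keyed by task type; "security" as a type is an assumption.
OVERRIDES = {"security": "opus", "arch": "opus", "docs": "haiku", "test": "sonnet"}


@dataclass
class Task:
    files_affected: int
    estimated_lines: int
    new_dependencies: int
    task_type: str
    risk_flags: frozenset = frozenset()


def bucket(value: int, thresholds: tuple) -> int:
    """Count how many lower bounds the value reaches (one point each)."""
    return sum(value >= t for t in thresholds)


def complexity_score(task: Task) -> int:
    score = bucket(task.files_affected, (2, 4, 7))      # 0-3 points
    score += bucket(task.estimated_lines, (50, 150, 300))  # 0-3 points
    score += bucket(task.new_dependencies, (1, 3))      # 0-2 points
    score += TYPE_POINTS[task.task_type]                # 0-3 points
    score += len(set(task.risk_flags) & RISK_FLAGS)     # 0-3 points
    return score


def select_model(task: Task) -> str:
    """Pick the starting model: overrides first, then score tiers."""
    if task.task_type in OVERRIDES:
        return OVERRIDES[task.task_type]
    score = complexity_score(task)
    if score <= 4:
        return "haiku"
    return "sonnet" if score <= 8 else "opus"
```

For instance, an implementation task touching 5 files, about 200 lines, 2 new dependencies, and one `breaking_change` flag scores 2 + 2 + 1 + 2 + 1 = 8 and would start on sonnet.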

Context Management Prompts

.devteam/context-management.yaml defines explicit summarization prompts for four scenarios:

  • general_summarize: preserve decisions, problems/solutions, current requirements, remaining work
  • code_summarize: files modified, key changes, dependencies, tests
  • conversation_summarize: questions/answers, decisions, action items, current status
  • research_summarize: discoveries, patterns, technologies, recommendations

Summarization is triggered at phase change, agent handoff, or when context usage exceeds 85% of the 200K limit. The always_preserve block ensures current task requirements, acceptance criteria, current error messages, and latest test/quality-gate results are never compressed.

Bug Council Analyst Prompts

Each of the five Bug Council agents receives a role-specific framing:

  • Root Cause Analyst: error analysis, hypothesis generation from stack traces and logs
  • Code Archaeologist: git history exploration, regression detection, blame analysis
  • Pattern Matcher: similar past bugs, anti-patterns, code smell identification
  • Systems Thinker: dependency graph, architectural issues, cross-service interactions
  • Adversarial Tester: edge cases, security vulnerabilities, boundary conditions

After parallel execution, the bug-council-orchestrator synthesizes the five outputs into a single solution report before delegating implementation back to the Task Loop.

09

Uniqueness

DevTeam: Claude Code Multi-Agent Dev System — Uniqueness

Distinguishing Characteristics

1. LLM-Type Prompt Hook for Scope Enforcement

No other framework in this batch uses Claude Code's "type": "prompt" hook entry — where the hook payload is an LLM prompt rather than a shell command. DevTeam fires a semantic scope check before every file edit:

"The agent is about to modify a file. Check if this modification is within the current task scope... If the file being modified is clearly unrelated to the task (e.g., modifying auth code when the task is about UI styling), return ok:false..."

This runs in parallel with a command-type hook on the same event. The prompt hook catches contextual/semantic violations (wrong module) that file-pattern matching cannot detect. This is architecturally novel: using Claude to police Claude at the tool-use boundary.

2. Bug Council Pattern

The five-analyst parallel Bug Council (Root Cause Analyst, Code Archaeologist, Pattern Matcher, Systems Thinker, Adversarial Tester) is a distinct pattern for handling stuck bugs. Instead of simply escalating to a more powerful model, the council provides multi-perspective synthesis. No other framework in this batch implements a named, parallel, multi-specialist escalation body for debugging.

3. Structured Model Escalation with Complexity Scoring

DevTeam is the most explicit framework for tiered model routing. It defines a 0–14 point complexity algorithm across five factors (files, lines, dependencies, task type, risk flags) that determines not just the starting model but the full escalation chain per tier. The algorithm is documented in pseudocode, and the escalation sequence (haiku→haiku→sonnet→sonnet→opus for simple tasks) is tracked per-task with model history in SQLite. No other framework in this batch documents complexity scoring at this level of specificity.

4. SQLite-Backed State with Full Iteration History

DevTeam persists every task's model history, quality gate results, complexity score, and iteration count to SQLite. This enables: resumable sprints after crash, cost analysis by model/task, agent performance metrics, and debugging of failed tasks by examining their iteration history. Most frameworks use in-memory or markdown-file state; SQLite with explicit table schemas is uncommon.

5. SubagentStart / TaskCompleted Hook Events

DevTeam handles two non-standard Claude Code lifecycle events: SubagentStart and TaskCompleted. These are not in the standard Claude Code hooks specification documented in seed frameworks. DevTeam uses them for agent timing and cost tracking, logging start time on SubagentStart and duration on SubagentStop/TaskCompleted. This enables the --agents flag on /devteam:status to report per-agent performance metrics.

6. Anti-Abandonment Architecture

The Stop hook actively blocks agent exit unless quality gates have passed (or EXIT_SIGNAL: true is set). The Task Loop has a hard limit of 10 iterations per task, but on exhaustion the agent notifies the human and continues trying; it does not stop. This "keep trying" principle is explicitly encoded as a design value: the "127 specialized agents" are the escalation path, and human intervention is not treated as the fallback.

7. YAML-Driven Scope with max_files_changed

Task scope is defined in structured JSON (allowed_files, forbidden_files, forbidden_directories, max_files_changed, scope_rationale). The max_files_changed field is enforced by the PreToolUse hook — not just advisory. Most frameworks either have no scope enforcement or rely on prompt-level instructions. DevTeam enforces scope at three layers simultaneously: task JSON definition, agent prompt injection, and hook interception.

Compared to Seeds

  • vs. BMAD-METHOD: BMAD uses persona-md agents and plan-then-execute; DevTeam adds model escalation, SQLite persistence, the Bug Council, and LLM-type hooks — BMAD has none of these.
  • vs. claude-flow: claude-flow is an npm library with programmatic coordination; DevTeam is a Claude Code plugin with conversational commands. Both have large agent catalogs but DevTeam's is focused exclusively on software development domains.
  • vs. barkain-workflow-orch: barkain uses adaptive nudge hooks and native plan mode as handoff; DevTeam uses hard blocking hooks and a dedicated Bug Council escalation path.
  • vs. open-multi-agent: open-multi-agent's goal-to-DAG decomposition is runtime code; DevTeam's sprint decomposition is driven by a planning interview and stored in SQLite before any code runs.

Weakest Points

  • 127 agents: the large catalog raises questions about maintenance coherence. Many agents are likely thin wrappers with similar prompts.
  • Node.js + SQLite dependencies: heavier runtime footprint than markdown-only frameworks.
  • Claude Code exclusivity: all hook events are Claude Code-specific; zero portability to other runtimes.
  • LLM-type prompt hook effectiveness: using Claude to police Claude creates a potential circular dependency if the LLM judges its own scope violations leniently.
04

Workflow

DevTeam: Claude Code Multi-Agent Dev System — Workflow

Main Development Workflow

/devteam:plan --feature "description"
├── Interview Phase: Clarify requirements with targeted questions
├── Research Phase: Analyze codebase, identify patterns and blockers
├── Plan Phase: Generate PRD + tasks + sprints
└── Output: Sprint-organized task list in SQLite

/devteam:implement [--sprint N] [--eco]
├── Sprint Orchestrator: Execute sprints in order
├── Per Task: Task Loop runs
│   ├── Select implementation agent (keyword matching + file types)
│   ├── Implement (model based on complexity score)
│   ├── Quality Gate Enforcer: tests + types + lint + security + coverage
│   ├── Requirements Validator: check acceptance criteria
│   ├── Scope Validator: verify no out-of-scope changes
│   ├── ALL PASS? → Complete task
│   └── FAIL? → Model escalation + retry (max 10 iterations)
│       ├── Simple task: haiku → sonnet → opus
│       ├── Stuck 3+ times? → Bug Council activation
│       └── Max iterations? → Human notification (keep trying)
└── Sprint Loop: validate after all tasks complete

Bug Fixing Workflow

/devteam:bug "description"
├── Interview: Clarify bug details
├── Diagnose: Root cause analysis
├── Fix: With Task Loop quality enforcement
└── Verify: All gates pass

Bug Council Activation

Triggers: critical/high severity | 3+ failed opus attempts | complexity ≥ 10 | bug_council: true
→ Convene 5 parallel analysts:
  - Root Cause Analyst: error analysis, hypothesis generation
  - Code Archaeologist: git history, regression detection
  - Pattern Matcher: similar bugs, anti-patterns
  - Systems Thinker: dependencies, architectural issues
  - Adversarial Tester: edge cases, security vulnerabilities
→ Synthesized solution

Approval Gates

Gate                 Condition
Plan confirmation    After interview/research, before generating PRD
Quality gates (all)  Tests, types, lint, security, coverage must pass per task
Human notification   After max iterations (not a hard stop; agent keeps trying)

Phase / Artifact Map

Phase               Artifact
Planning            PRD document + task list + sprint assignments in SQLite
Implementation      Code commits
Quality validation  Gate results logged to SQLite
Bug Council         Synthesized solution report
Bug Council Synthesized solution report
06

Memory Context

DevTeam: Claude Code Multi-Agent Dev System — Memory & Context

Persistence Layers

DevTeam uses two distinct persistence mechanisms: SQLite for structured state and YAML/JSON files for learned patterns and configuration.

SQLite Database

Primary state store. Tables tracked by the hook system:

Table Content
sessions Session metadata, start/end times, costs
tasks Task records with status, complexity score, model history, iteration count
context_budgets Per-session context token usage tracking
context_snapshots Checkpoint snapshots at phase/task boundaries
quality_gates Gate results logged per task per iteration

Configuration lives in .devteam/database-config.yaml. State is written after every task completion (auto-commit enabled by default in config.yaml). This enables sprint resumability: if a session crashes mid-sprint, the next /devteam:implement run picks up from the last committed task.
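
The resumability claim can be illustrated with a small SQLite sketch. The column names below are assumptions chosen to match the table descriptions above; the project's real schema is not reproduced here, and `open_state`, `complete_task`, and `next_pending_task` are hypothetical helpers.

```python
import sqlite3


def open_state(path: str = ":memory:") -> sqlite3.Connection:
    """Open the state database and ensure a (hypothetical) tasks table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tasks (
               id TEXT PRIMARY KEY,
               sprint INTEGER,
               status TEXT,
               complexity_score INTEGER,
               model_history TEXT,
               iteration_count INTEGER DEFAULT 0
           )"""
    )
    return conn


def complete_task(conn: sqlite3.Connection, task_id: str) -> None:
    """Mark a task completed; the `with` block commits on success."""
    with conn:
        conn.execute("UPDATE tasks SET status = 'completed' WHERE id = ?", (task_id,))


def next_pending_task(conn: sqlite3.Connection, sprint: int):
    """Return the first task in the sprint that is not completed, or None."""
    row = conn.execute(
        "SELECT id FROM tasks WHERE sprint = ? AND status != 'completed' "
        "ORDER BY id LIMIT 1",
        (sprint,),
    ).fetchone()
    return row[0] if row else None
```

Because each completion is committed individually, a crashed session would lose at most the in-flight task; the next run only has to ask for the first task that is not yet completed.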

In-Agent Memory Field

Agent front-matter uses memory: project. This activates Claude Code's project-level memory scope, which persists facts between separate invocations within the same project directory.

YAML Learned Patterns

.devteam/config.yaml defines a learned patterns store:

memory:
  session_persistence: true
  max_session_files: 10
  learned_patterns:
    enabled: true
    min_confidence: 0.7
    max_patterns: 50

Patterns are stored as files (up to 50, each meeting the 0.7 minimum-confidence threshold). The public config does not specify the format further; this is an aspirational feature described alongside the SQLite state layer.

Context Window Management

All three model tiers share the same 200K context window. The context management YAML defines:

  • Warn threshold: 75% of effective limit (150K tokens)
  • Auto-summarize threshold: 85% (170K tokens)
  • Aggressive-summarize threshold: 95%
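
The three thresholds amount to a simple classification of current token usage. The sketch below assumes inclusive boundaries and uses hypothetical action labels; only the percentages and the 200K limit come from the text above.

```python
CONTEXT_LIMIT = 200_000  # shared by all three model tiers


def context_action(tokens_used: int, limit: int = CONTEXT_LIMIT) -> str:
    """Map token usage to an action label (labels are illustrative)."""
    # Integer arithmetic avoids float rounding at the exact boundaries.
    if tokens_used * 100 >= 95 * limit:
        return "aggressive_summarize"
    if tokens_used * 100 >= 85 * limit:
        return "auto_summarize"
    if tokens_used * 100 >= 75 * limit:
        return "warn"
    return "ok"
```

With the default limit, 150K tokens triggers the warning and 170K triggers auto-summarization, matching the figures in the list.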

Preservation Policy

Items that are never compressed (always_preserve):

  • Current task requirements
  • Acceptance criteria
  • Current error messages
  • Latest test results
  • Quality gate results
  • File paths modified
  • Recent code changes (last 3 edits)
  • Current iteration context

Items that are summarized:

  • Conversation history older than 5 exchanges → bullet summary
  • Research findings → max 10 bullets
  • Previous failed attempts → "lessons learned" format (what worked / what didn't)
  • Code context for unmodified files → one-liner descriptions

Summarization Triggers

Auto-summarization fires on: approaching context limit, phase change (interview → research → plan → implement), agent handoff, and optionally after every iteration.

Checkpoint System

Checkpoints capture full state (session_state, feature_status, git commit SHA, test results, context snapshot) at:

  • Task completion
  • Phase change
  • Quality gate pass
  • Every 3 iterations

Up to 10 checkpoints retained per session. Naming: checkpoint-{session_id}-{iteration}-{timestamp}.
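
A minimal sketch of the checkpoint rules follows. The event labels, the timestamp format, and the assumption that checkpoints are kept in oldest-first order are all illustrative; only the naming pattern, the triggers, and the retention count of 10 are taken from the description above.

```python
CHECKPOINT_EVENTS = {"task_completion", "phase_change", "quality_gate_pass"}


def checkpoint_name(session_id: str, iteration: int, timestamp: str) -> str:
    """Build a name following checkpoint-{session_id}-{iteration}-{timestamp}."""
    return f"checkpoint-{session_id}-{iteration}-{timestamp}"


def should_checkpoint(event: str, iteration: int) -> bool:
    """Checkpoint on the listed events, or on every third iteration."""
    return event in CHECKPOINT_EVENTS or (iteration > 0 and iteration % 3 == 0)


def prune(checkpoints: list, keep: int = 10) -> list:
    """Keep only the newest `keep` entries (input assumed oldest first)."""
    return checkpoints[-keep:]
```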

PreCompact Hook

The PreCompact hook event (hooks/pre-compact.js) fires before Claude Code's built-in context compaction. It serializes the current session state to SQLite before the compaction occurs, ensuring state is not lost when the context window is truncated.

State Recovery

If context is lost mid-session, the recovery chain is:

  1. Load latest checkpoint from SQLite
  2. Reconstruct from git (last committed state)
  3. Reconstruct from features.json (planning artifact)
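
The recovery chain is an ordered fallback: try each source in turn and stop at the first one that yields a state. The sketch below shows that pattern with caller-supplied loader functions; `recover_state` and the loader signatures are hypothetical, not DevTeam's actual recovery code.

```python
def recover_state(loaders):
    """Return (source_name, state) from the first loader that succeeds.

    `loaders` is an ordered list of (name, zero-argument callable) pairs.
    A loader signals "nothing to recover" by returning None or raising.
    """
    for source, load in loaders:
        try:
            state = load()
        except Exception:
            state = None  # treat a failing source as unavailable
        if state is not None:
            return source, state
    raise RuntimeError("no recovery source produced a state")
```

A caller would pass the three sources in priority order, for example `[("checkpoint", load_checkpoint), ("git", load_from_git), ("features.json", load_features)]`, where the three loader names are placeholders.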

Agent-to-Agent Context Passing

State passed between the orchestration chain (sprint-orchestrator → task-loop → quality-gate-enforcer → requirements-validator) is conveyed through the Task's JSON record in SQLite plus inline content in the Task tool call. The Task Loop reads the current task from state, calls specialist agents with the task context, reads their outputs, and writes updated status back to state.

The context management YAML prescribes a summarize_for_handoff action on agent handoff events, compressing non-essential context before spawning the next agent to maximize the available budget for the specialist's work.

07

Orchestration

DevTeam: Claude Code Multi-Agent Dev System — Orchestration

Orchestration Hierarchy

DevTeam has a four-level orchestration stack, each level strictly delegating downward:

Level 1: Commands (/devteam:plan, /devteam:implement, /devteam:bug)
Level 2: Sprint Orchestrator (sprint execution, worktree management)
Level 3: Task Loop (single-task quality iteration, model escalation)
Level 4: Specialist Agents (implementation, quality gates, validation)

Sprint Orchestrator (Level 2)

agents/orchestration/sprint-orchestrator.md — model: opus

Responsibilities:

  • Load sprint from SQLite (tasks, dependencies, parallelizable flags)
  • Spawn Task Loop instances for each task (sequentially or in parallel up to max_concurrent_tasks: 3)
  • Track task completion status in SQLite after each Task Loop returns
  • Call Sprint Loop agent for sprint-level validation after all tasks complete
  • Create PR and sprint summary on success

The Sprint Orchestrator never writes code or runs tests. All implementation is delegated to the Task Loop.

Task Loop (Level 3)

agents/orchestration/task-loop.md — model: opus, formerly "Ralph"

The Task Loop is an iteration controller that delegates every action:

1. Call Implementation Agent(s)     → specialist agents
2. Call Scope Validator             → verify files changed match scope
3. Call Quality Gate Enforcer       → tests, lint, types, security, coverage
4. Call Requirements Validator      → check acceptance criteria
5. Evaluate results → iterate / escalate / complete

The Task Loop explicitly does NOT write code, run tests, fix issues, or make implementation decisions. Those are the specialist agents' jobs.

Iteration limit: 10 per task. After 10 iterations, task is marked failed; sprint continues.
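
The control flow can be sketched as a loop that only calls other components and inspects their verdicts. This is an illustration of the delegation pattern, not the agent prompt itself: `run_task_loop`, the `implement` callable, and the `validators` list are hypothetical stand-ins for the specialist agents.

```python
def run_task_loop(task, implement, validators, max_iterations=10):
    """Iterate implement -> validate until all validators pass or the limit is hit.

    `implement(task, iteration)` and each `check(task)` are supplied by the
    caller; the loop itself never does implementation work.
    """
    failures = []
    for iteration in range(1, max_iterations + 1):
        implement(task, iteration)
        failures = [name for name, check in validators if not check(task)]
        if not failures:
            return {"status": "completed", "iterations": iteration}
    return {"status": "failed", "iterations": max_iterations, "failures": failures}
```

In this sketch the validators list would hold the scope validator, quality gate enforcer, and requirements validator in the order given above, and the returned `failures` list names whichever checks were still failing on the last iteration.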

Model Escalation (within Task Loop)

Escalation occurs when quality gates repeatedly fail:

Simple task (score 0–4):   haiku → haiku → sonnet → sonnet → opus
Moderate task (score 5–8): sonnet → sonnet → opus → opus
Complex task (score 9–14): opus → opus → opus

After 2 consecutive haiku failures → escalate to sonnet. After 2 sonnet failures → escalate to opus. After 3 opus failures → Bug Council activation.
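
The escalation chains above can be written as lookup tables. The sketch assumes that attempts beyond the end of a chain stay on its last model and that the Bug Council check counts failed opus attempts in the task's model history; both are readings of the text rather than confirmed behavior, and the function names are hypothetical.

```python
CHAINS = {
    "simple": ["haiku", "haiku", "sonnet", "sonnet", "opus"],
    "moderate": ["sonnet", "sonnet", "opus", "opus"],
    "complex": ["opus", "opus", "opus"],
}


def model_for_attempt(tier: str, attempt: int) -> str:
    """Return the model for a 1-based attempt number within a tier's chain."""
    chain = CHAINS[tier]
    return chain[min(attempt, len(chain)) - 1]  # past the end: stay on the last model


def bug_council_due(failed_models: list) -> bool:
    """True once opus has failed three times (assumed trigger reading)."""
    return failed_models.count("opus") >= 3
```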

Override rules bypass complexity scoring: security and architecture tasks always start at opus; documentation tasks start at haiku.

Bug Council Pattern

Triggered when: task is critical/high severity, opus has failed 3+ times, complexity score ≥ 10, or bug_council: true is set in task metadata.

Five specialist agents run in parallel:

  1. Root Cause Analyst — error analysis, hypothesis generation
  2. Code Archaeologist — git history, regression detection
  3. Pattern Matcher — similar bugs, anti-patterns
  4. Systems Thinker — dependency graph, architectural issues
  5. Adversarial Tester — edge cases, security vulnerabilities

The bug-council-orchestrator synthesizes the five outputs into a solution report. Implementation then resumes through the Task Loop with the synthesized solution as additional context.
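
A minimal sketch of the trigger check and the parallel fan-out is shown below. The task field names (`severity`, `opus_failures`, `complexity_score`, `bug_council`) and the `run_analyst` callable are assumptions for illustration; in DevTeam the analysts are Claude Code subagents, not Python threads.

```python
from concurrent.futures import ThreadPoolExecutor

ANALYSTS = [
    "root-cause-analyst",
    "code-archaeologist",
    "pattern-matcher",
    "systems-thinker",
    "adversarial-tester",
]


def should_convene(task: dict) -> bool:
    """Any one of the four documented triggers activates the council."""
    return (
        task.get("severity") in {"critical", "high"}
        or task.get("opus_failures", 0) >= 3
        or task.get("complexity_score", 0) >= 10
        or task.get("bug_council") is True
    )


def convene(task: dict, run_analyst) -> dict:
    """Run all five analysts concurrently and collect findings by analyst name."""
    with ThreadPoolExecutor(max_workers=len(ANALYSTS)) as pool:
        findings = list(pool.map(lambda name: run_analyst(name, task), ANALYSTS))
    return dict(zip(ANALYSTS, findings))
```

The dictionary returned by `convene` corresponds to the five outputs that the orchestrator would then synthesize into one report.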

Anti-abandonment principle: the system never stops trying. After Bug Council, the Task Loop continues iteration with opus, notifying the human after max iterations are exhausted while still attempting to resolve.

Quality Gate Enforcement

agents/orchestration/quality-gate-enforcer.md runs up to five checks sequentially (the fifth, coverage, only when configured):

  1. Tests (run test suite; all must pass)
  2. Type checking (no type errors)
  3. Linting (no lint errors)
  4. Security audit (critical/high issues block completion)
  5. Coverage (if minimum_coverage > 0 in config)

Results are written to SQLite (quality_gates table). The PostToolUse (Bash) hook detects gate results from command output and updates state.

agents/orchestration/requirements-validator.md checks each acceptance criterion from the task's JSON against the current codebase. Unmet criteria cause re-iteration.

agents/orchestration/scope-validator.md compares modified files against the task's allowed_files/allowed_patterns list. Violations cause rollback and re-implementation.
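
The scope comparison can be approximated with glob matching over the list of modified files. The sketch below is an assumption about the mechanics: `scope_violations` is a hypothetical helper, the violation message strings are made up, and only the scope field names come from the source.

```python
from fnmatch import fnmatchcase


def scope_violations(modified: list, scope: dict) -> list:
    """Return human-readable violations for a set of modified file paths."""
    violations = []
    allowed = scope.get("allowed_files", []) + scope.get("allowed_patterns", [])
    forbidden_files = scope.get("forbidden_files", [])
    forbidden_dirs = [d.rstrip("/") + "/" for d in scope.get("forbidden_directories", [])]
    for path in modified:
        if path in forbidden_files or any(path.startswith(d) for d in forbidden_dirs):
            violations.append(f"forbidden: {path}")
        elif not any(fnmatchcase(path, pattern) for pattern in allowed):
            violations.append(f"not allowed: {path}")
    limit = scope.get("max_files_changed")
    changed = len(set(modified))
    if limit is not None and changed > limit:
        violations.append(f"too many files: {changed} > {limit}")
    return violations
```

An empty list means the change set is in scope; anything else would feed the rollback and re-implementation path described above.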

Sprint-Level Validation

agents/orchestration/sprint-loop.md runs after all tasks in a sprint complete:

  • All tasks passed quality gates
  • No regressions in previously passing tests
  • Sprint acceptance criteria met
  • PR-ready state (clean build, passing CI)

If sprint-level validation fails, failed tasks are re-entered into the Task Loop.

Parallel Execution

max_concurrent_tasks: 3 allows up to three Task Loop instances simultaneously. Tasks eligible for parallel execution include: testing, documentation, frontend, and independent backend tasks. Sequential enforcement applies to: database migrations, security tasks, and deployments.
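
One way to picture this policy is as batching: parallel-eligible tasks are grouped up to the concurrency limit, while sequential task types each run alone. The sketch below ignores task dependencies for brevity, and the type labels and `plan_batches` name are hypothetical.

```python
SEQUENTIAL_TYPES = {"database_migration", "security", "deployment"}


def plan_batches(tasks: list, max_concurrent: int = 3) -> list:
    """Group tasks into batches; each batch could run concurrently.

    Sequential task types always get a batch of their own, in order.
    """
    batches, current = [], []
    for task in tasks:
        if task["type"] in SEQUENTIAL_TYPES:
            if current:  # flush parallel work queued before the sequential task
                batches.append(current)
                current = []
            batches.append([task])
        else:
            current.append(task)
            if len(current) == max_concurrent:
                batches.append(current)
                current = []
    if current:
        batches.append(current)
    return batches
```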

Worktrees are used for parallel sprint tracks when features span multiple independent workstreams. merge-tracks command handles the merge after parallel worktrees complete.

Hook-Enforced Governance

The hook layer enforces orchestration invariants at runtime:

Hook                      Type     Function
PreToolUse (Edit|Write)   command  Iteration limit check, circuit breaker
PreToolUse (Edit|Write)   prompt   LLM semantic scope check
PreToolUse (Bash)         command  Dangerous operation detection
PostToolUse (Bash)        command  Quality gate result detection
PostToolUse (Edit|Write)  command  File modification state tracking
Stop                      command  Block exit unless gates pass or EXIT_SIGNAL
SubagentStart             command  Agent start time logging
SubagentStop              command  Agent completion logging
TaskCompleted             command  Task completion logging
PreCompact                command  State serialization before context compaction

The Stop hook is the anti-abandonment enforcement point: an agent cannot exit until all quality gates have passed for the current task. The only escape is an explicit EXIT_SIGNAL: true in state (set by human notification after max iterations).
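
The Stop hook's decision reduces to a two-condition check. The sketch below is written in Python for consistency with the other examples here, whereas DevTeam's hooks are Node.js scripts; the state keys are hypothetical, and the `{"decision": "block", "reason": ...}` shape reflects Claude Code's Stop hook output as commonly documented, which should be verified against the current hooks reference.

```python
import json


def stop_decision(state: dict) -> dict:
    """Return the hook output: empty to allow exit, or a block decision."""
    if state.get("EXIT_SIGNAL") is True or state.get("quality_gates_passed") is True:
        return {}  # nothing to object to; the agent may stop
    return {
        "decision": "block",
        "reason": "Quality gates have not passed for the current task.",
    }


def stop_hook_output(state: dict) -> str:
    """Serialize the decision as the JSON a hook script would print."""
    return json.dumps(stop_decision(state))
```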

Agent Selection

Agent selection uses keyword matching on task title/description combined with file type detection. Matching rules live in .devteam/agent-selection.md. The Task Loop reads the matched agent slug from the task's JSON and spawns the corresponding agent from agents/{category}/{slug}.md.
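
Keyword-plus-file-type matching can be sketched as a small scoring function. The rules, agent slugs, and weights below are invented for illustration and are not taken from .devteam/agent-selection.md.

```python
import re
from pathlib import PurePosixPath

# (agent slug, keywords, file extensions): all hypothetical examples.
RULES = [
    ("backend/python-fastapi", {"fastapi", "endpoint"}, {".py"}),
    ("frontend/react", {"react", "component"}, {".tsx", ".jsx"}),
    ("quality/test-writer", {"test", "coverage"}, set()),
]


def select_agent(task: dict):
    """Return the best-matching agent slug, or None if nothing matches."""
    text = f'{task["title"]} {task.get("description", "")}'.lower()
    words = set(re.findall(r"[a-z0-9]+", text))
    extensions = {PurePosixPath(f).suffix for f in task.get("files", [])}
    best, best_score = None, 0
    for slug, keywords, file_exts in RULES:
        # File-type hits are weighted above keyword hits (arbitrary choice).
        score = len(words & keywords) + 2 * len(extensions & file_exts)
        if score > best_score:
            best, best_score = slug, score
    return best
```

The returned slug maps onto the agents/{category}/{slug}.md layout, so a match of `frontend/react` would point at a file under agents/frontend/.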

08

UI CLI Surface

DevTeam: Claude Code Multi-Agent Dev System — UI & CLI Surface

Command Surface

DevTeam exposes 21 slash commands through Claude Code's plugin command system. Commands are defined as markdown files in commands/. Most are invoked as /devteam:<name>; the worktree commands are invoked without the prefix, as /merge-tracks and /worktree-*.

Primary Workflow Commands

Command                Signature                                                              Description
/devteam:plan          [--feature "desc"] [--from path] [--skip-research] [--skip-interview]  Interview → research → PRD → tasks → sprints
/devteam:implement     [--sprint N] [--eco]                                                   Execute plan with Task Loop quality enforcement
/devteam:bug           "description"                                                          Interview → diagnose → fix → verify
/devteam:issue         <number>                                                               Fix a GitHub issue by number
/devteam:issue-new     (none)                                                                 Create a new GitHub issue
/devteam:review        (none)                                                                 Code review
/devteam:test          (none)                                                                 Run tests
/devteam:design        (none)                                                                 Design phase
/devteam:design-drift  (none)                                                                 Detect design drift

Management Commands

Command Description
/devteam:select Select specific agents
/devteam:config Configure devteam settings
/devteam:status Show current status (health, costs, progress)
/devteam:list List agents and capabilities
/devteam:logs View session logs
/devteam:help Show help
/devteam:reset Reset devteam state

Worktree Commands

Command Description
/merge-tracks Merge parallel worktree tracks
/worktree-cleanup Clean up worktrees
/worktree-list List worktrees
/worktree-status Worktree status

Status Command Output Format

/devteam:status produces a structured report covering:

Context: {current}/{limit} tokens ({percent}%) [{status}]

With flags for:

  • --history [n] — last n sessions (default 10)
  • --costs — cost breakdown by model/task
  • --costs --detailed — per-agent cost breakdown
  • --agents — agent performance metrics
  • --session <id> — specific session details
  • --all — comprehensive report
  • --json — machine-readable JSON output

Plan Command File-Based Input

/devteam:plan --from <path> accepts multiple file formats:

Format      Extensions
Markdown    .md
YAML        .yaml, .yml
JSON        .json
Plain Text  .txt
PDF         .pdf (text extraction)

Folder input (--from specs/) loads all supported files from the directory.
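
Resolving a `--from` argument to a list of input files might look like the sketch below. Whether the real command recurses into subfolders is not stated, so this version assumes a non-recursive scan; `collect_inputs` is a hypothetical helper and only the extension list comes from the table above.

```python
from pathlib import Path

SUPPORTED = {".md", ".yaml", ".yml", ".json", ".txt", ".pdf"}


def collect_inputs(source) -> list:
    """Return supported files for a file or folder path (non-recursive)."""
    path = Path(source)
    if path.is_dir():
        return sorted(
            p for p in path.iterdir()
            if p.is_file() and p.suffix.lower() in SUPPORTED
        )
    return [path] if path.suffix.lower() in SUPPORTED else []
```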

No Interactive TUI

DevTeam has no terminal UI beyond the Claude Code conversation interface. All interaction is through slash commands and conversational responses. The status output is text-formatted for the Claude Code chat panel. There is no separate dashboard, web UI, or TUI component.

Plugin Installation

git clone https://github.com/michael-harris/claude-code-multi-agent-dev-system.git
path/to/repo/install-local.sh

The install script places the plugin in Claude Code's plugin directory. The .claude-plugin/ manifest registers all commands, skills, agents, and hooks with Claude Code's plugin system.

Skills Surface

21 skills parallel the 21 commands, providing reusable capabilities that can be composed or called independently from agents. Skills follow the same markdown format as commands but are namespaced as skills rather than commands.

Configuration Surface

/devteam:config manages project-level settings via .devteam/config.yaml. The user-facing configuration points:

  • Model escalation thresholds (simple/moderate/complex boundaries)
  • Parallel execution settings (max_concurrent_tasks)
  • Quality gate requirements (min coverage, type check, lint)
  • Bug Council activation threshold (default: 3 failures)
  • GitHub integration (auto-create issues/PRs)
  • Debug settings (verbose logging, dry run mode)

YAML changes take effect on the next /devteam:implement run. No restart required.

Runtime Dependencies Visible to User

  • Node.js must be installed (for hooks)
  • SQLite3 must be available
  • Bash or PowerShell (cross-platform hook scripts)

These are checked at plugin load time with informative error messages if missing.
