Claude Swarm (affaan-m)

claude-swarm · affaan-m/claude-swarm · ★ 172 · last commit 2026-02-11

Decompose complex tasks via Opus into a dependency DAG, execute waves of Haiku workers in parallel, then Opus quality-gates the combined output — mirroring real engineering team structure.

Best when: Strategic model routing (Opus for planning+review, Haiku for execution) is 3x cheaper and architecturally sound; senior engineers shouldn't execute boilerpl…

Skip if: Using the same expensive model for both planning and execution; running tasks without dependency awareness (causes file conflicts)

vs seeds

claude-flow: in multi-agent orchestration scope, but uses the Claude Agent SDK directly (not MCP) with a dependency-DAG wave schedule…

Primitive shape 6 total

Commands 3 Subagents 3

Summary

Claude Swarm (affaan-m) — Summary

Claude Swarm is a Python CLI tool for multi-agent orchestration of Claude Code: Opus 4.6 decomposes a task into a dependency graph of subtasks, Haiku worker agents execute them in parallel waves with file-locking for conflict prevention, and Opus 4.6 runs a Quality Gate review over all agent outputs before reporting results. Every execution is recorded as JSONL events and replayable.

Problem it solved: Complex software tasks have internal dependencies that make naive parallelism produce conflicts — Claude Swarm builds an explicit dependency DAG, respects ordering, prevents file conflicts with pessimistic locking, enforces a hard budget cap, and adds a model-appropriate quality gate (senior architect designs, junior engineers execute, senior reviews).

Distinctive traits: (1) Strategic multi-model routing: Opus for planning + quality gate, Haiku for worker execution — by explicit design; (2) Dependency-aware wave scheduling: tasks only start when dependencies complete; (3) File conflict detection via pessimistic locking across parallel workers; (4) JSONL session recording with claude-swarm replay <id>; (5) --demo flag for animated TUI preview without API key.

Target audience: Developers working on complex refactors or feature additions where subtask ordering matters, and who want automated parallelism with budget enforcement and quality review.

differs_from_seeds: Most similar to claude-flow in the MCP-anchored toolserver cluster (both spawn multiple agents, both use Claude), but architecturally different: claude-swarm uses the Claude Agent SDK directly (not MCP tools) and implements its own wave-based dependency scheduler. The Opus→Haiku→Opus quality-gate model closely mirrors the "senior engineer designs and reviews, junior executes" pattern. JSONL session recording with replay is unique to this framework in the entire batch.

Overview

Claude Swarm (affaan-m) — Overview

Origin

Created by Affaan Mustafa for the Claude Code Hackathon (Feb 10-16, 2026). MIT license. Python package (claude-swarm on PyPI). Version 0.2.0.

Philosophy

From README:

Phase 1:   Opus 4.6 decomposes task into dependency graph
Phase 2:   Parallel agents execute subtasks with live dashboard
Phase 2.5: Opus 4.6 Quality Gate reviews all agent outputs
Phase 3:   Results summary with costs and session replay

Core belief: Real engineering teams have a senior architect who designs the plan, junior engineers who execute in parallel, and the senior reviews the combined result — AI agents should mirror this structure.

Manifesto-style quote

"This mirrors real engineering team structure: a senior architect designs the plan, junior engineers execute in parallel, and the senior reviews the combined result."

Strategic model selection

"Opus 4.6 handles the hardest reasoning task — analyzing your codebase, understanding the architecture, identifying dependencies between subtasks, and producing a parallelizable execution plan. This requires deep understanding of code relationships and optimal task splitting." "Haiku handles the parallelizable work — each agent follows focused instructions from the plan. Using Haiku here is 3x cheaper while maintaining 90% of Sonnet's capability for focused tasks."

Target users

Developers with complex multi-step tasks (refactors, feature additions) who want automated parallelism with budget control, conflict prevention, and a formal quality gate.

Architecture

Claude Swarm (affaan-m) — Architecture

Distribution

PyPI package: pip install claude-swarm
Python 3.11+
CLI binary: claude-swarm

Install

pip install claude-swarm
# or from source:
git clone https://github.com/affaan-m/claude-swarm
cd claude-swarm
pip install -e .

Source structure

claude-swarm/
├── src/claude_swarm/
│   ├── __init__.py
│   ├── cli.py          # Click CLI: main command + sessions + replay
│   ├── config.py       # YAML config loader (swarm.yaml)
│   ├── decomposer.py   # Opus 4.6 task decomposition → dependency graph
│   ├── orchestrator.py # Wave scheduler, file locking, budget enforcement
│   ├── quality_gate.py # Opus 4.6 quality review
│   ├── session.py      # JSONL session recorder
│   ├── types.py        # SwarmPlan, SwarmTask, SwarmAgent, etc.
│   ├── ui.py           # Rich terminal TUI (htop-style)
│   └── demo.py         # Animated demo without API key
├── tests/              # 44 tests
└── pyproject.toml

Dependencies

claude-agent-sdk>=0.1.35 — Anthropic Claude Agent SDK
textual>=1.0.0 — TUI framework
rich>=13.0.0 — Terminal formatting
networkx>=3.0 — Dependency graph (DAG)
anyio>=4.0 — Async I/O
click>=8.0 — CLI
pydantic>=2.0 — Data validation

Required runtime

Python 3.11+
ANTHROPIC_API_KEY
Claude API access (Opus 4.6 + Haiku)

Components

Claude Swarm (affaan-m) — Components

CLI commands

Command	Purpose
`claude-swarm TASK`	Run a swarm: decompose, execute, quality gate, report
`claude-swarm --dry-run TASK`	Show plan without executing
`claude-swarm --demo`	Animated TUI preview without API key
`claude-swarm sessions`	List past sessions
`claude-swarm replay <session-id>`	Replay a session's JSONL events

CLI options

Option	Default	Purpose
`--max-agents` / `-n`	4	Max concurrent workers
`--model` / `-m`	opus	Decomposition model
`--budget` / `-b`	$5.00	Hard cost cap
`--retry` / `-r`	1	Max retries per task
`--config` / `-c`	swarm.yaml	Custom agent topology
`--quality-gate/--no-quality-gate`	on	Enable/disable Opus review
`--no-ui`	off	Disable Rich TUI

Python modules

Module	Purpose
`decomposer.py`	Opus 4.6 → dependency DAG via Claude Agent SDK
`orchestrator.py`	Wave scheduler: reads DAG, dispatches ready tasks, enforces file locks + budget
`quality_gate.py`	Opus 4.6 review of combined agent output → score + verdict
`session.py`	JSONL recording of all events; enables `replay`
`ui.py`	Rich htop-style TUI with agent progress, tools, costs, file conflicts
`config.py`	Parse `swarm.yaml` for custom agent types
`demo.py`	Animated simulation (no API key needed)

YAML configuration (`swarm.yaml`)

swarm:
  name: full-stack-review
  max_concurrent: 4
  budget_usd: 5.0
  model: opus

agents:
  security-reviewer:
    description: Reviews code for OWASP vulnerabilities
    model: opus
    tools: [Read, Grep, Glob]
    prompt: |
      Analyze the code for SQL injection, XSS, CSRF...
  tester:
    description: Writes and runs tests
    model: haiku
    tools: [Read, Write, Edit, Bash]

connections:
  - from: coder
    to: [security-reviewer, tester]
  - from: [security-reviewer, tester]
    to: reviewer
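
One way to read the `connections` section is as a list of dependency edges, where `from` and `to` may each be a single agent name or a list. The sketch below assumes the YAML has already been parsed into plain Python dicts; the helper `expand_connections` is hypothetical and is not taken from claude-swarm's `config.py`.

```python
# Mirrors the `connections` section above, as it might look after YAML parsing.
connections = [
    {"from": "coder", "to": ["security-reviewer", "tester"]},
    {"from": ["security-reviewer", "tester"], "to": "reviewer"},
]


def _as_list(value):
    """Accept either a single agent name or a list of names."""
    return [value] if isinstance(value, str) else list(value)


def expand_connections(connections):
    """Return {agent: sorted list of upstream agents it waits for} (hypothetical helper)."""
    deps = {}
    for edge in connections:
        for src in _as_list(edge["from"]):
            deps.setdefault(src, set())  # sources appear even with no upstream
            for dst in _as_list(edge["to"]):
                deps.setdefault(dst, set()).add(src)
    return {agent: sorted(upstream) for agent, upstream in deps.items()}


dependencies = expand_connections(connections)
print(dependencies["reviewer"])  # ['security-reviewer', 'tester']
```

Normalizing both sides to lists keeps the fan-out (`coder` to two agents) and fan-in (two agents to `reviewer`) cases on the same code path.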

Prompts

Claude Swarm (affaan-m) — Prompts

Verbatim excerpt 1: decomposer.py — DECOMPOSE_SYSTEM_PROMPT

DECOMPOSE_SYSTEM_PROMPT = """You are a task decomposition expert. \
Given a complex software engineering task, break it down into \
independent subtasks that can be executed by separate Claude Code \
agents in parallel.

RULES:
1. Each subtask should be as independent as possible
2. Specify dependencies between tasks (task IDs)
3. Each task should specify which files it will modify
4. Tasks should be small enough for one agent to complete in a few minutes
5. Include a "reviewer" task at the end that depends on all implementation tasks

OUTPUT FORMAT (strict JSON):
{
  "tasks": [
    {
      "id": "task-1",
      "description": "Short description of what to do",
      "agent_type": "coder|reviewer|tester|refactorer|documenter",
      "dependencies": [],
      "files_to_modify": ["src/auth.ts", "src/middleware.ts"],
      "tools": ["Read", "Write", "Edit", "Bash", "Grep", "Glob"],
      "prompt": "Detailed instructions for the agent..."
    }
  ]
}

AGENT TYPES:
- coder: Writes new code or modifies existing code
- reviewer: Reviews code changes for quality/security
- tester: Writes and runs tests
- refactorer: Refactors existing code
- documenter: Updates documentation

IMPORTANT:
- Minimize file overlap between tasks to prevent conflicts
- If two tasks MUST edit the same file, make one depend on the other
- Keep the total number of tasks between 2 and 8
- Each task's prompt should be self-contained with all context needed"""

Prompting technique: Structured JSON output enforcement with explicit conflict-prevention rules. The decomposer is prompted to think about file overlaps explicitly, which feeds the file locking system. This is prompt-chaining — the decomposer's JSON output directly becomes the orchestrator's execution plan.
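
To illustrate how the decomposer's JSON could become an execution plan, here is a minimal sketch that parses the output format shown in the prompt and checks two of its stated rules (2 to 8 tasks, dependencies that refer to known task IDs). The `Task` dataclass and `parse_plan` function are hypothetical stand-ins; the project itself lists pydantic as a dependency and defines its own `SwarmTask` type, whose fields are not shown in the source.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical mirror of one entry in the decomposer's "tasks" array."""
    id: str
    description: str
    agent_type: str
    prompt: str
    dependencies: list[str] = field(default_factory=list)
    files_to_modify: list[str] = field(default_factory=list)
    tools: list[str] = field(default_factory=list)


def parse_plan(raw: str) -> list[Task]:
    """Parse decomposer output and enforce rules stated in the prompt."""
    tasks = [Task(**item) for item in json.loads(raw)["tasks"]]
    if not 2 <= len(tasks) <= 8:
        raise ValueError(f"expected 2-8 tasks, got {len(tasks)}")
    known = {t.id for t in tasks}
    for t in tasks:
        missing = set(t.dependencies) - known
        if missing:
            raise ValueError(f"{t.id} depends on unknown tasks: {sorted(missing)}")
    return tasks


# Illustrative payload in the prompt's output format (contents are made up).
example = json.dumps({"tasks": [
    {"id": "task-1", "description": "Add auth middleware", "agent_type": "coder",
     "dependencies": [], "files_to_modify": ["src/auth.ts"],
     "tools": ["Read", "Write", "Edit"], "prompt": "..."},
    {"id": "task-2", "description": "Review changes", "agent_type": "reviewer",
     "dependencies": ["task-1"], "files_to_modify": [],
     "tools": ["Read", "Grep"], "prompt": "..."},
]})
plan = parse_plan(example)
print([t.id for t in plan])
```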

Verbatim excerpt 2: YAML config agent definition (from README)

agents:
  security-reviewer:
    description: Reviews code for OWASP vulnerabilities
    model: opus
    tools: [Read, Grep, Glob]
    prompt: |
      Analyze the code for SQL injection, XSS, CSRF,
      insecure deserialization, and other OWASP Top 10 vulnerabilities.
      Report findings with severity (Critical/High/Medium/Low) and
      specific remediation steps.

  tester:
    description: Writes and runs tests
    model: haiku
    tools: [Read, Write, Edit, Bash]
    prompt: |
      Write comprehensive tests. Ensure 80% coverage minimum.
      Run tests and fix any failures before completing.

Prompting technique: Role-specific prompt injection per agent type. Each agent gets its own system-level behavioral instructions. The connections graph then defines the dependency order.

Uniqueness

Claude Swarm (affaan-m) — Uniqueness & Positioning

differs_from_seeds

Claude Swarm is the only pure CLI (no desktop UI) in this batch that implements true multi-model orchestration with dependency-aware wave scheduling. Among the 11 seeds, it is closest to claude-flow (MCP-anchored multi-agent with sqlite-backed memory and hive-mind consensus), but differs architecturally: claude-swarm uses the Claude Agent SDK directly (not MCP tools), has no persistent memory store, implements its own wave-based DAG scheduler rather than a consensus protocol, and adds an explicit multi-model quality gate (Opus→Haiku→Opus) as a first-class design principle. JSONL session recording with CLI replay is unique in this entire corpus — no seed or other batch framework offers command-line session replay. The --demo flag (animated TUI without API key) is also a distinctive onboarding innovation.

Positioning

Hackathon project (Feb 2026) that demonstrates strategic model selection as an architectural principle
"Senior architect designs, junior engineers execute, senior reviews" as software-team metaphor
Pure Python: accessible to the Python/data-science developer community, not the Node.js/Electron crowd
BYOK (just need ANTHROPIC_API_KEY), no cloud platform, no subscription

Observable failure modes

No worktree isolation: concurrent agents edit the same directory; file locking prevents conflicts but doesn't provide rollback if locking fails
Quality Gate is a single Opus call, not multi-agent consensus — a single point of bias
Budget enforcement cancels remaining tasks without cleanup or rollback of partial changes
No CLAUDE.md/skills injection: agents receive only the decomposer-generated task prompt
Alpha-stage (v0.2.0, 172 stars): not production-proven

Inspired by

Claude Code Hackathon (Feb 10-16, 2026)
Claude Agent SDK
Real engineering team structure (architect/engineer/reviewer)

Cross-references

Built on claude-agent-sdk (Anthropic's Python SDK)
swarm.yaml auto-detection similar to docker-compose's convention

Workflow

Claude Swarm (affaan-m) — Workflow

Phases

Phase	Actor	Action	Artifact
1. Decompose	Opus 4.6	Analyzes codebase, identifies dependencies, produces execution plan	`SwarmPlan` (JSON dependency DAG)
2. Execute (waves)	Haiku workers (parallel)	Wave 1: independent tasks run in parallel; Wave N: tasks whose dependencies are complete	File edits, test runs, code changes
2.5. Quality Gate	Opus 4.6	Reviews combined output for correctness, consistency, security	Quality score (0-10) + verdict (PASS/FAIL)
3. Results	CLI	Report: N/N tasks completed, total cost, session ID	`swarm-{id}` JSONL session file

Dependency scheduling

networkx DAG tracks task dependencies
Tasks are dispatched in "waves": a task starts only when all its listed dependencies have status COMPLETED
Tasks with no dependencies run in Wave 1 simultaneously
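
The wave grouping described above can be sketched with the standard library. This example swaps in `graphlib.TopologicalSorter` for networkx so that it runs without third-party packages; the task IDs and the `compute_waves` helper are hypothetical. It batches strictly by wave, which is a simplification of a scheduler that dispatches new work as soon as any task finishes.

```python
from graphlib import TopologicalSorter

# Hypothetical plan: task id -> ids it depends on.
dependencies = {
    "task-1": [],
    "task-2": [],
    "task-3": ["task-1"],
    "task-4": ["task-2", "task-3"],
}


def compute_waves(dependencies):
    """Group tasks into waves; each wave only needs tasks from earlier waves."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the plan has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every task whose deps are done
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave as completed
    return waves


waves = compute_waves(dependencies)
print(waves)  # [['task-1', 'task-2'], ['task-3'], ['task-4']]
```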

File conflict prevention

Pessimistic file locking: _file_locks: dict[str, str] maps file path → agent_id
If Agent B requests a file that Agent A holds, Agent B is blocked until Agent A releases
files_to_modify specified per task in the decomposition plan
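
A minimal sketch of a lock table with the `dict[str, str]` shape described above. The class name, method names, and the all-or-nothing acquisition policy are assumptions made for illustration; the source only states the mapping and the blocking behavior.

```python
class FileLockTable:
    """Hypothetical lock table: file path -> id of the agent holding it."""

    def __init__(self):
        self._file_locks: dict[str, str] = {}

    def try_acquire(self, agent_id: str, files: list[str]) -> list[str]:
        """Lock all files or none. Returns conflicting paths (empty on success)."""
        conflicts = [f for f in files
                     if self._file_locks.get(f, agent_id) != agent_id]
        if not conflicts:
            for f in files:
                self._file_locks[f] = agent_id
        return conflicts

    def release(self, agent_id: str) -> None:
        """Drop every lock held by agent_id."""
        self._file_locks = {f: owner for f, owner in self._file_locks.items()
                            if owner != agent_id}


locks = FileLockTable()
first = locks.try_acquire("agent-a", ["src/auth.ts", "src/middleware.ts"])
blocked = locks.try_acquire("agent-b", ["src/middleware.ts", "src/routes.ts"])
locks.release("agent-a")
retry = locks.try_acquire("agent-b", ["src/middleware.ts", "src/routes.ts"])
print(first, blocked, retry)
```

Acquiring every file up front, rather than one at a time, avoids the case where two agents each hold one file the other needs.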

Budget enforcement

max_budget_usd hard cap (default $5.00)
When total_cost >= max_budget_usd, _budget_exceeded = True
Remaining tasks are cancelled; partial results reported
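
The budget rule above amounts to a running total compared against a cap. A small sketch follows; the `BudgetTracker` class and its `record` method are hypothetical, and only the names `max_budget_usd`, `total_cost`, and `_budget_exceeded` come from the description.

```python
class BudgetTracker:
    """Hypothetical tracker for the hard cost cap."""

    def __init__(self, max_budget_usd: float = 5.0):
        self.max_budget_usd = max_budget_usd
        self.total_cost = 0.0
        self._budget_exceeded = False

    def record(self, cost_usd: float) -> bool:
        """Add a cost; returns True once the cap has been reached."""
        self.total_cost += cost_usd
        if self.total_cost >= self.max_budget_usd:
            self._budget_exceeded = True
        return self._budget_exceeded


pending = ["task-3", "task-4"]
budget = BudgetTracker(max_budget_usd=1.0)
under_cap = budget.record(0.40)   # still below the cap
exceeded = budget.record(0.65)    # running total now past 1.00
cancelled = list(pending) if exceeded else []
print(cancelled)
```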

Retry logic

Failed tasks retry up to max_retries times (default: 1 retry)
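
As a sketch of the retry rule, the helper below re-runs a failing task until it succeeds or the retry allowance is used up. `run_with_retries` and the counting convention (one increment per failed attempt) are assumptions; the source states only the default of one retry and a per-task counter.

```python
def run_with_retries(task_id, attempt_fn, retry_counts, max_retries=1):
    """Call attempt_fn until it succeeds or max_retries failures are exceeded."""
    while True:
        try:
            return attempt_fn()
        except Exception:
            retry_counts[task_id] = retry_counts.get(task_id, 0) + 1
            if retry_counts[task_id] > max_retries:
                raise  # allowance used up: surface the last error


calls = {"n": 0}


def flaky():
    """Fails on the first call only, to exercise a single retry."""
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("transient failure")
    return "ok"


retry_counts = {}
result = run_with_retries("task-1", flaky, retry_counts)
print(result, retry_counts)
```

With the default of one retry, a task gets at most two attempts in total.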

Approval gates

--dry-run: shows plan without executing (implicit approval gate)
No interactive approval; once started, the swarm runs to completion or budget exhaustion

Session replay

After execution, claude-swarm replay <session-id> replays the JSONL event log — shows what each agent did, in sequence.

Memory & Context

Claude Swarm (affaan-m) — Memory & Context

Session recording

Every swarm execution is recorded as a JSONL file: swarm-{id} containing all agent events, tool calls, outputs, costs, and status changes. This is the primary persistence artifact.

Session replay

claude-swarm replay <session-id> streams the JSONL events to reconstruct exactly what each agent did. This makes Claude Swarm the only CLI tool in this batch with a formal replay capability.
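
Recording and replaying JSONL comes down to appending one JSON object per line and reading the lines back in order. The sketch below shows that pattern only; the event field names, the file name, and both helper functions are hypothetical and do not reflect the actual schema written by `session.py`.

```python
import json
import tempfile
from pathlib import Path


def record_event(path: Path, event: dict) -> None:
    """Append one event as a single JSON line."""
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")


def replay(path: Path):
    """Yield events in the order they were recorded."""
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            if line.strip():  # tolerate blank lines
                yield json.loads(line)


# Hypothetical session file and event shapes, for illustration only.
session_file = Path(tempfile.mkdtemp()) / "swarm-demo.jsonl"
record_event(session_file, {"type": "task_started", "task_id": "task-1"})
record_event(session_file, {"type": "task_completed", "task_id": "task-1",
                            "cost_usd": 0.02})
events = list(replay(session_file))
for event in events:
    print(event["type"], event["task_id"])
```

Append-only lines mean a crash mid-run loses at most the event being written, and the log can be replayed without loading it all into memory.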

State during execution (in-memory)

SwarmOrchestrator maintains:
- agents: dict[str, SwarmAgent] — live agent state
- completed_task_ids: set[str] — which tasks have finished
- conflicts: list[FileConflict] — detected file conflicts
- total_cost: float — accumulated cost
- _file_locks: dict[str, str] — file path → agent_id
- _retry_counts: dict[str, int] — task → attempt count
All in-memory; only the JSONL session file is persisted

Cross-session memory

No shared memory between swarm runs. Each swarm starts fresh. The only cross-session knowledge is what's in the actual codebase files.

Context per agent

Each worker agent receives its task description + prompt (generated by Opus decomposer) + the project's file system. There is no injected CLAUDE.md or session history.

Context compaction

Not applicable — each worker is a short-lived Claude Agent SDK call (a few turns at most).

Orchestration

Claude Swarm (affaan-m) — Orchestration

Multi-agent support

Yes — core feature. N worker agents (default up to 4 concurrent).

Orchestration pattern

Hierarchical with dependency-aware wave scheduling:

Phase 1: Opus 4.6 (planner) decomposes task → dependency DAG
Phase 2: Worker pool (Haiku) executes tasks in waves; ready tasks (all deps complete) run in parallel
Phase 2.5: Opus 4.6 (reviewer) runs quality gate on all outputs
This is a task-decomposition-tree pattern with parallel execution at each wave

Wave scheduling algorithm

while tasks_remaining:
    ready_tasks = [t for t in pending if all(dep in completed for dep in t.dependencies)]
    dispatch ready_tasks in parallel (up to max_concurrent)
    wait for any to complete
    update completed_task_ids
    handle file conflicts, budget check, retries
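
The loop above can be turned into a runnable sketch. This version uses the standard library's `asyncio` rather than anyio, leaves out file locks, budget checks, and retries, and stands in a fake worker for the real agent call; `run_swarm` and `fake_worker` are hypothetical names.

```python
import asyncio


async def run_swarm(tasks, run_task, max_concurrent=4):
    """tasks: {task_id: [dependency ids]}; run_task: async callable(task_id)."""
    pending = dict(tasks)
    completed_task_ids = set()
    running = {}  # asyncio.Task -> task_id
    order = []    # completion order, returned for inspection
    while pending or running:
        ready = [tid for tid, deps in pending.items()
                 if all(dep in completed_task_ids for dep in deps)]
        # Fill the free worker slots with ready tasks.
        for tid in ready[: max_concurrent - len(running)]:
            del pending[tid]
            running[asyncio.create_task(run_task(tid))] = tid
        if not running:
            raise RuntimeError(f"unsatisfiable dependencies: {sorted(pending)}")
        # Wake up as soon as any worker finishes, then re-check readiness.
        done, _ = await asyncio.wait(set(running),
                                     return_when=asyncio.FIRST_COMPLETED)
        for finished in done:
            tid = running.pop(finished)
            finished.result()  # re-raises a worker's exception, if any
            completed_task_ids.add(tid)
            order.append(tid)
    return order


async def fake_worker(task_id):
    """Stand-in for a real agent call."""
    await asyncio.sleep(0.01)


tasks = {"task-1": [], "task-2": ["task-1"], "task-3": ["task-1"],
         "task-4": ["task-2", "task-3"]}
order = asyncio.run(run_swarm(tasks, fake_worker, max_concurrent=2))
print(order)
```

Waiting for the first completion, rather than for the whole batch, lets a task start as soon as its own dependencies are done even if unrelated tasks are still running.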

Isolation mechanism

None (in-place) — all agents operate in the same working directory. File conflicts are managed via pessimistic locking, not directory isolation.

Multi-model routing

Yes — explicit strategic model selection:

Role	Model	Rationale
Decomposer	Opus 4.6	Best reasoning for dependency analysis
Worker agents	Haiku	3x cheaper, "90% of Sonnet's capability for focused tasks"
Quality Gate	Opus 4.6	Senior review of combined output

File conflict prevention

Pessimistic locking via _file_locks: dict[str, str]. If two agents need the same file simultaneously, the second agent waits. This is the only framework in this batch with explicit file-level locking.

Quality gate

Opus 4.6 reviews all agent outputs after Phase 2:

Score: 0-10
Verdict: PASS/FAIL
Checks: integration issues, missed edge cases, security concerns, task completeness
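
A small sketch of how a score and verdict pair could be validated before being reported. The `QualityVerdict` class is hypothetical; the source gives only the score range and the two verdict values, not the reviewer's actual response format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityVerdict:
    """Hypothetical container for the quality gate result."""
    score: float
    verdict: str

    def __post_init__(self):
        if not 0 <= self.score <= 10:
            raise ValueError(f"score must be in 0-10, got {self.score}")
        if self.verdict not in {"PASS", "FAIL"}:
            raise ValueError(f"verdict must be PASS or FAIL, got {self.verdict!r}")

    @property
    def passed(self) -> bool:
        return self.verdict == "PASS"


review = QualityVerdict(score=8, verdict="PASS")
print(review.passed)
```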

Budget enforcement

Hard cap: when total cost >= max_budget_usd, remaining tasks are cancelled.

Execution mode

One-shot — claude-swarm TASK runs to completion (or budget exhaustion).

Max concurrent agents

Configurable via --max-agents (default: 4).

Consensus mechanism

None formal — quality gate is a single reviewer, not a multi-agent consensus.

UI & CLI Surface

Claude Swarm (affaan-m) — UI & CLI Surface

CLI binary: `claude-swarm`

Full Python CLI built with Click.

Command	Purpose
`claude-swarm TASK`	Main execution
`claude-swarm --dry-run TASK`	Show plan without executing
`claude-swarm --demo`	Animated TUI without API key
`claude-swarm sessions`	List past sessions
`claude-swarm replay <id>`	Replay session events

Terminal UI

Built with the rich library as an "htop-style dashboard" showing:

Agent progress bars
Tool usage per agent
Real-time cost tracking ($X.XX / $Y.YY budget)
File conflict notifications
Task status (PENDING/BLOCKED/RUNNING/COMPLETED/FAILED/CANCELLED)

Status indicators:

... = PENDING
BLK = BLOCKED
RUN = RUNNING
OK  = COMPLETED
ERR = FAILED
CXL = CANCELLED
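
The status labels above map naturally onto an enum. This is a sketch of that mapping only; the `TaskStatus` class and `render_row` helper are hypothetical and are not taken from `types.py` or `ui.py`.

```python
from enum import Enum


class TaskStatus(Enum):
    """Task states paired with the short labels shown in the TUI."""
    PENDING = "..."
    BLOCKED = "BLK"
    RUNNING = "RUN"
    COMPLETED = "OK"
    FAILED = "ERR"
    CANCELLED = "CXL"


def render_row(task_id: str, status: TaskStatus) -> str:
    """Format one dashboard row with a fixed-width status column."""
    return f"[{status.value:<3}] {task_id}"


row = render_row("task-1", TaskStatus.COMPLETED)
print(row)  # [OK ] task-1
```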

Demo mode

claude-swarm --demo runs an animated TUI simulation without an API key — shows the full workflow with fake agents for presentations or evaluation.

No web dashboard

CLI + TUI only. No Electron, no web UI, no mobile monitor.

Observability

JSONL session files per execution
claude-swarm replay <id> for post-execution review
Real-time TUI with cost tracking during execution
Per-agent and total cost reported at completion

Configuration surface

swarm.yaml (auto-detected in project root or .claude/swarm.yaml) for declarative agent topology.
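
Auto-detection of this kind can be sketched as a search over the two candidate paths. The `find_config` helper and the order of precedence (project root first) are assumptions; the source names the two locations but not which one wins.

```python
import tempfile
from pathlib import Path

# Candidate locations named above; the precedence order is an assumption.
CANDIDATES = ("swarm.yaml", ".claude/swarm.yaml")


def find_config(project_root: Path):
    """Return the first existing config path, or None if there is none."""
    for rel in CANDIDATES:
        candidate = project_root / rel
        if candidate.is_file():
            return candidate
    return None


root = Path(tempfile.mkdtemp())
missing = find_config(root)               # no config file yet
(root / ".claude").mkdir()
(root / ".claude" / "swarm.yaml").write_text("swarm: {}\n", encoding="utf-8")
found = find_config(root)
print(found.relative_to(root))
```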

Related frameworks

same archetype · same primary tool · same memory type

Claude-Flow / Ruflo ★ 55k

A6 Multi-agent orchestrator

Eliminates single-agent context limits and sequential bottlenecks by orchestrating fault-tolerant swarms of specialized AI agents…

Hermes Agent (NousResearch) ★ 168k

A6 Multi-agent orchestrator

Self-improving personal AI agent with closed learning loop, 7 terminal backends, and messaging gateway — not tied to any AI…

OpenCode ★ 165k

A6 Multi-agent orchestrator

Terminal-first AI coding agent with multi-model routing, native desktop app, and a typed .opencode/ configuration system for…

OpenHands ★ 75k

A6 Multi-agent orchestrator

Open-source AI software development platform (open-source Devin alternative) with Docker sandbox isolation, 77.6% SWE-bench…

DeerFlow ★ 70k

A6 Multi-agent orchestrator

Long-horizon superagent that researches, codes, and creates by orchestrating parallel sub-agents with isolated contexts in Docker…

oh-my-openagent (omo) ★ 60k

A6 Multi-agent orchestrator

Multi-provider AI agent orchestration for OpenCode: escape vendor lock-in by routing Sisyphus (Claude/Kimi/GLM) and Hephaestus…

Distribution

Type: cli-tool
License: MIT
Install: one-liner
Version: 0.2.0

Surfaces

CLI binary: claude-swarm
CLI subcmds: 3
Local UI: terminal-tui
Tech stack: rich, textual (htop-style terminal UI)

Components

Commands: 3
Skills: 0
Subagents: 3
Hooks: 0
MCP servers: 0
MCP tools: 0
Scripts: 0
Templates: 1

Workflow

Phases: 4
Approval gates: 1
Spec format: yaml
Spec storage: flat-files
Delta or full: none

Orchestration

Multi-agent: Yes
Pattern: hierarchical
Max concurrent: 4
Isolation: none
Consensus: none
Prompt chaining: Yes

Multi-model

Multi-model: Yes
BYOK: Yes
Modal: text

Execution

Mode: one-shot
Crash recovery: No
Compaction: No
Session handoff: No
Streaming: Yes

Memory

Type: file-based
Persistence: session
Search: none
State files: 1 file

Quality

TDD: No
TDD mechanism: none
Validators: 1
Self-review: adversarial-subagent

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: Yes
Audit format: jsonl
Replay: Yes

Tools

Primary: claude-code
Targets: 2
Portability: low

Signals

Stars: 172
Last commit: 2026-02-11
Contributors: 1
Maintainer: dormant
Quality score: 4.1/10
