Batch 16 — Multi-agent workflow loops: Ralph variants, YAML-driven workflows, director+coder splits
Roster (10)
| slug | stars | distribution | cli_binary | local_ui | orchestration | multi_model | tier |
|---|---|---|---|---|---|---|---|
| prodigy-iepathos | 9 | cli-tool (Rust) | prodigy | partial (axum HTTP, no documented dashboard) | parallel-fan-out (MapReduce) | no | A |
| switchboard | 209 | vscode-extension | none | VS Code kanban webview + MCP HTTP | hierarchical | yes (complexity routing) | A |
| slate-v1 | unknown | unknown | unknown | unknown | unknown | unknown | C |
| blackbox-code | unknown | unknown (404) | unknown | unknown | unknown | unknown | C |
| cestdone | 2 | npm-package | cestdone | none | hierarchical (Director+Worker) | yes (--director-model / --worker-model) | A |
| ralph-snarktank | 19,612 | bash-script-bundle | ralph.sh (bash) | none | sequential (continuous-ralph) | no | A |
| ralph-claude-code | 9,205 | bash-script-bundle | ralph (bash, global) | terminal-tui (tmux) | sequential (continuous-ralph) | no | A |
| evo | 770 | cli-tool (Python PyPI) | evo | web-dashboard (port 8080) | task-decomposition-tree | no | A |
| pi-mono | 55,497 | npm-package | pi | terminal-tui (@pi-tui) | none (extensible) | yes (8 providers) | A |
| pi-ralph-orch | 13 | standalone-repo (pi ext) | none (slash commands in pi) | terminal-tui (pi modals) | sequential (continuous-ralph) | no | A |
Intra-batch patterns
All 8 Tier-A frameworks in this batch implement some form of autonomous loop: the shared theme is "keep going until done" rather than "do one thing and stop." The loops diverge sharply, however. Prodigy, evo, and cestDone use process-level isolation (separate OS processes or git worktrees per agent/experiment), while the Ralph variants (snarktank, frankbria, pi-ralph) drive one sequential loop from a single shell or CLI session (snarktank still starts a fresh AI process on every iteration). File-based communication through markdown or JSON files is nearly universal: every framework except pi-mono, which prescribes no state format, stores its cross-iteration state in human-readable files rather than databases. The director+worker split appears in three distinct forms: cestDone's thin-Director/fresh-Worker subprocess pattern, Switchboard's Planner→Lead/Coder/Intern hierarchy, and evo's orchestrator→subagent tree. Quality gate enforcement is the differentiating factor for loop reliability: evo adds formal gates (pass/fail checks), Ralph-frankbria adds a circuit breaker and dual-condition exit, and snarktank/ralph relies on CI feedback loops, while pi-ralph and cestDone use more advisory approaches.
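The file-based state pattern described above can be illustrated with a minimal sketch. It assumes a JSON state file; the file name `loop_state.json`, the field names, and the helper functions are hypothetical and do not come from any framework in this batch.

```python
import json
import tempfile
from pathlib import Path


def load_state(path: Path) -> dict:
    """Return the saved loop state, or a fresh one if no file exists yet."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"iteration": 0, "done": []}


def save_state(path: Path, state: dict) -> None:
    """Write state as indented JSON so a human can read and edit it."""
    path.write_text(json.dumps(state, indent=2) + "\n", encoding="utf-8")


def run_iteration(path: Path, task: str) -> dict:
    """One loop pass: read prior state, record the work, persist it."""
    state = load_state(path)
    state["iteration"] += 1
    state["done"].append(task)
    save_state(path, state)
    return state


# Each call behaves like a separate agent run that shares nothing but the file.
with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "loop_state.json"  # hypothetical file name
    run_iteration(state_file, "write tests")
    final_state = run_iteration(state_file, "fix lint")
```

Because the only shared channel is the file, an iteration can run in a fresh process with an empty context and still pick up where the previous one stopped.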
Most interesting finds
evo: The `[EVO DIRECTIVE]` banner injection via hook channels (PreToolUse, UserPromptSubmit, SessionStart) for orchestrator→subagent mid-run communication is a novel architectural pattern not seen in any other framework in the catalog. Hooks are typically used for monitoring or validation; using them as a downward communication channel to in-flight subagents is genuinely new. Combined with the tree-search frontier strategies (pareto_per_task inspired by GEPA) and remote cloud backends, evo is the most sophisticated autonomous optimization framework analyzed to date.

Switchboard: Cross-provider heterogeneous routing (deliberately routing cheap tasks to Gemini Flash and expensive tasks to Claude Opus) via a visual kanban is a unique architectural stance, treating token cost optimization as a first-class feature of the orchestration layer. No other framework in the catalog routes across competing AI providers at runtime.
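As a rough illustration of complexity-based routing, the sketch below maps a task's complexity score to a model tier. It is not Switchboard's implementation: the 1-10 score, the thresholds, the `Task` type, and the model labels (plain strings, not real API identifiers) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    title: str
    complexity: int  # hypothetical 1-10 score assigned by a planner


# Hypothetical tiers: (maximum complexity, model label), cheapest first.
TIERS = [
    (3, "gemini-flash"),
    (7, "mid-tier-model"),
    (10, "claude-opus"),
]


def route(task: Task) -> str:
    """Pick the cheapest tier whose ceiling covers the task's complexity."""
    for ceiling, model in TIERS:
        if task.complexity <= ceiling:
            return model
    return TIERS[-1][1]  # anything off the scale goes to the top tier


tasks = [Task("rename variable", 1), Task("design auth flow", 9)]
assignments = {t.title: route(t) for t in tasks}
```

Keeping the tier table as data rather than branching logic means the cost policy can be changed without touching the router.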
Items written as Tier C
slate-v1 — URL points to a single-page React app (randomlabs.ai) with no public GitHub repository and no navigable documentation. No material available.
blackbox-code — GitHub URL (https://github.com/blackboxaicode/cli) returns HTTP 404. Repository not found.
Cross-references discovered
- pi-ralph-orch explicitly builds on pi-mono (`@earendil-works/pi-coding-agent` Extension API): direct platform dependency
- ralph-claude-code and ralph-snarktank both implement the Geoffrey Huntley Ralph pattern (ghuntley.com/ralph/): same technique, radically different engineering depth (~200 lines of bash vs. multi-module system with 566 tests)
- pi-ralph-orch credits ralph-orchestrator (mikeyobrien) as its inspiration, placing it in the snarktank/ralph lineage
- evo lists pi as a supported host and uses pi's subagent spawning mechanism (pi-subagents package) — consumer relationship with pi-mono
- snarktank/ralph is explicitly cited as the inspiration for ralph-claude-code (frankbria credits Geoffrey Huntley's technique and references snarktank)
Two Ralph Namesakes — Comparison
Both snarktank/ralph and ralph-claude-code implement the Geoffrey Huntley "Ralph Wiggum" autonomous loop pattern, but they represent opposite ends of the engineering spectrum:
snarktank/ralph (19,612 stars, dormant):
- ~200-line bash script + 2 SKILL.md files
- Always-fresh context per iteration (new AI process)
- No exit gate sophistication: plain `<promise>COMPLETE</promise>` string detection
- No rate limiting, no circuit breaker, no monitoring
- Supports Amp (primary) and Claude Code
- Designed for simplicity; copy-paste deployment
- The reference implementation
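The loop shape listed above can be sketched in a few lines. The original is a bash script; this Python version is only an illustration of the control flow. The `run_agent` callable stands in for launching a fresh AI process, and the iteration cap is an assumption not stated in these notes.

```python
from typing import Callable, Optional

COMPLETE_MARKER = "<promise>COMPLETE</promise>"


def ralph_loop(run_agent: Callable[[int], str], max_iterations: int = 10) -> Optional[int]:
    """Call the agent repeatedly until its output contains the marker.

    Returns the iteration that finished, or None if the cap was reached.
    """
    for iteration in range(1, max_iterations + 1):
        output = run_agent(iteration)  # stands in for a fresh process per pass
        if COMPLETE_MARKER in output:
            return iteration
    return None


def fake_agent(iteration: int) -> str:
    """Hypothetical stand-in agent that declares completion on its third run."""
    if iteration >= 3:
        return "All stories pass. " + COMPLETE_MARKER
    return f"Iteration {iteration}: still working"


finished_at = ralph_loop(fake_agent)
never_finished = ralph_loop(lambda i: "still working", max_iterations=5)
```

The exit test is a single substring check, which is what makes the design easy to copy and also easy to fool: any output that happens to contain the marker ends the loop.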
ralph-claude-code (9,205 stars, active):
- Multi-module bash system (9 main scripts + 9 lib modules) with 566 tests + CI
- Session-based context with configurable 24-hour expiry (optional continuity)
- Dual-condition exit gate (EXIT_SIGNAL: true + completion indicators)
- Rate limiting (100 calls/hour), circuit breaker (CLOSED/HALF_OPEN/OPEN), 3-layer API limit detection
- Claude Code only (not Amp)
- tmux monitoring dashboard
- Interactive setup wizard (ralph-enable)
- Designed for production reliability
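A minimal sketch of two of the safeguards listed above, the dual-condition exit gate and the three-state circuit breaker, follows. It is written in Python for illustration, whereas the project itself is bash. The indicator phrases, the no-progress thresholds, and the state-transition rules are assumptions; only the `EXIT_SIGNAL: true` string and the state names come from these notes.

```python
# Hypothetical completion phrases; the real indicator list is not given here.
COMPLETION_INDICATORS = ("all tasks complete", "nothing left to do")


def should_exit(output: str) -> bool:
    """Exit only when BOTH the explicit signal and a completion phrase appear."""
    has_signal = "EXIT_SIGNAL: true" in output
    has_indicator = any(phrase in output.lower() for phrase in COMPLETION_INDICATORS)
    return has_signal and has_indicator


class CircuitBreaker:
    """Tracks consecutive no-progress iterations (thresholds are assumed)."""

    def __init__(self, half_open_after: int = 2, open_after: int = 4) -> None:
        self.half_open_after = half_open_after
        self.open_after = open_after
        self.no_progress = 0

    @property
    def state(self) -> str:
        if self.no_progress >= self.open_after:
            return "OPEN"  # the loop should stop
        if self.no_progress >= self.half_open_after:
            return "HALF_OPEN"  # warning zone, still running
        return "CLOSED"  # healthy

    def record(self, made_progress: bool) -> str:
        """Update the counter for one iteration and return the new state."""
        self.no_progress = 0 if made_progress else self.no_progress + 1
        return self.state


breaker = CircuitBreaker()
stalled_states = [breaker.record(False) for _ in range(4)]
recovered_state = breaker.record(True)
```

Requiring two independent conditions guards against an agent that mentions being done in passing, while the breaker covers the opposite failure, a loop that never claims completion and never makes progress.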
The frankbria version is essentially "snarktank/ralph made production-grade" — the same conceptual pattern, engineered to handle all the edge cases that appear when running autonomous loops unattended in real projects. The 566-test suite and changelog (tracking bugs like checkbox-regex false positives and session hijacking) reveal the full complexity hidden behind the simple snarktank loop.