Batch 09 Notes
Roster
| # | slug | display_name | stars | distribution_type | tier | key_differentiator |
|---|---|---|---|---|---|---|
| 1 | agentsmesh | AgentsMesh | 2137 | saas-platform | A | Only framework with multi-tenant org hierarchy, iOS client (SwiftUI+TCA), and Rust Core compiled to WASM/NAPI/UniFFI; 6 skills; BSL-1.1 |
| 2 | superset | Superset | 11262 | desktop-app | A | Only desktop-native framework (Electron+Bun); 100+ concurrent agents; ElectricSQL real-time state sync; deslop.md anti-verbosity command; ELv2 |
| 3 | stoneforge | Stoneforge | unknown | cli-tool | A | Novel Director/Worker/Steward three-role hierarchy; Steward as dedicated merge agent; persistent inter-agent channels; document library; no public GitHub (website-only) |
| 4 | bernstein | Bernstein (sipyourdrink-ltd) | 460 | cli-tool | A | HMAC-SHA256 chained audit log (RFC 2104); 44 adapters; EU AI Act/SOC2 compliance framing; Python deterministic scheduler with zero LLM in scheduling loop; Apache-2.0 |
| 5 | bernstein-sipyourdrink | bernstein (redirect alias) | — | canonical: false | C | Redirect alias to sipyourdrink-ltd/bernstein; same repo |
| 6 | martinloop | MartinLoop | 22 | cli-tool | A | Governance wrapper for single-agent loops; 11-class failure taxonomy; Red-Blue adversarial probes (6 probes); hard budget caps (maxUsd/maxIterations/maxTokens); JSONL audit; NVIDIA Inception |
| 7 | greatcto | GreatCTO | 32 | claude-code-plugin | A | 57 specialist agents; 12 jurisdiction auto-detection (GDPR, HIPAA, PCI-DSS, DPDPA, etc.); /crystallize → global pattern library; 11-layer memory; depends on superpowers + beads |
| 8 | kagan | Kagan | 88 | cli-tool | A | Hardcoded human review gate (REVIEW→DONE structurally impossible without is_review_approved()); 14 agent backends (ACP); Textual TUI + React web + VS Code extension; SQLite state |
| 9 | paf-framework | paf-framework | unknown | unknown | C | GitHub repo crack00r/paf-framework returns 404; no public material found |
| 10 | orchestr8 | orchestr8 | 65 | claude-code-plugin | A | JIT context management system (NOT an orchestrator despite name); 383 resource fragments; AITMPL community registry (400+ shared resources); GPG-signed checksums; 95-98% token reduction |
Intra-Batch Patterns
The batch spans three genuinely different architectural categories:
Workforce platforms (AgentsMesh, Superset, Stoneforge): Full platforms managing multiple agents as a team, with dashboards, task queues, and merge workflows. The dominant pattern in the batch. All three support 10+ concurrent agents. All use git worktrees for isolation.
Governance layers (Bernstein, MartinLoop, Kagan): Frameworks that wrap or gate agent execution rather than running agents themselves. Bernstein wraps a pipeline with HMAC audit compliance; MartinLoop wraps a single agent with budget governance and adversarial probes; Kagan wraps any agent with a mandatory human merge gate. These frameworks share a philosophy: AI agents need structured oversight, not just more capability.
Knowledge management (GreatCTO, orchestr8): Frameworks that manage what knowledge is available to agents rather than what agents do. GreatCTO routes tasks to domain specialists with memory layers; orchestr8 loads the right knowledge JIT.
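The HMAC-chained audit log attributed to Bernstein above can be illustrated with a minimal sketch. Each entry's tag is an HMAC-SHA256 over the previous tag plus the current record, so editing or dropping an earlier entry invalidates every later tag. The function names, the JSON record format, and the all-zero genesis value are assumptions made for illustration, not Bernstein's actual schema.

```python
import hashlib
import hmac
import json

GENESIS = b"\x00" * 32  # assumed starting value for the chain


def append_entry(log, key, record):
    """Append a record whose tag covers the previous tag and this record."""
    prev = bytes.fromhex(log[-1]["tag"]) if log else GENESIS
    payload = json.dumps(record, sort_keys=True).encode()
    tag = hmac.new(key, prev + payload, hashlib.sha256).hexdigest()
    log.append({"record": record, "tag": tag})


def verify_chain(log, key):
    """Recompute every tag in order; an edit to any earlier entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        payload = json.dumps(entry["record"], sort_keys=True).encode()
        expected = hmac.new(key, prev + payload, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, entry["tag"]):
            return False
        prev = bytes.fromhex(entry["tag"])
    return True
```

Because verification needs the secret key, an agent that can write to the log file but does not hold the key cannot forge a consistent chain.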
Git worktree isolation is the dominant concurrency mechanism across this batch. AgentsMesh, Superset, Stoneforge, and Kagan all use git worktrees. Bernstein is the exception: it relies on process-level isolation, with one subprocess per agent.
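As a rough sketch of what worktree isolation looks like in practice, the helper below builds a `git worktree add` command that gives one agent its own branch and checkout directory. The `agent/<id>` branch naming, the `.worktrees` directory, and the function name are hypothetical; none of the frameworks above is confirmed to use this layout.

```python
from pathlib import Path


def worktree_add_command(repo_root, agent_id, base_ref="main"):
    """Build the git command that gives one agent its own branch and checkout."""
    branch = f"agent/{agent_id}"  # hypothetical naming scheme
    path = Path(repo_root) / ".worktrees" / agent_id  # hypothetical location
    return [
        "git", "-C", str(repo_root),
        "worktree", "add", "-b", branch, str(path), base_ref,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Each agent then edits files in its own directory while sharing a single object store with the main checkout.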
The "Steward" anti-pattern solved differently: The problem of "who merges agent work" is solved three ways in this batch: (1) Stoneforge's Steward agent merges automatically, (2) Kagan requires a human to approve the merge but automates everything else, (3) MartinLoop prevents bad merges via adversarial probes. Together these span a spectrum from fully automated to fully human-gated.
Commercial licenses appear with the largest platforms: AgentsMesh (BSL-1.1) and Superset (ELv2) both restrict commercial use without a license. The smaller tools (MartinLoop, GreatCTO, Kagan, orchestr8) are MIT/Apache-2.0.
Multi-model support is nearly universal: among the 8 Tier A frameworks, support for multiple LLM providers (BYOK) is the norm. Stoneforge, AgentsMesh, and Superset support Claude Code + Codex + OpenCode. MartinLoop supports Claude and Codex adapters. Kagan supports 14 backends. orchestr8 is the exception: it is Claude Code-only.
Most Interesting Finds
Kagan's hardcoded human review gate is the architecturally purest statement in this batch: is_review_approved() in transition_task() raises IllegalTransition for REVIEW → DONE if the gate hasn't passed. There is no bypass flag, no config option, no admin override. This is policy enforced at the source code level, not the configuration level. No other framework in the batch or the seeds has an equivalent structural guarantee.
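A minimal sketch of such a structural gate follows. Only the names `is_review_approved`, `transition_task`, and `IllegalTransition` come from the description above; the `Task` class, the `approved_by` field, and the `Status` enum are assumptions for illustration, and Kagan's real code may be organized differently.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    IN_PROGRESS = "in_progress"
    REVIEW = "review"
    DONE = "done"


class IllegalTransition(Exception):
    """Raised when a task is moved along a forbidden edge."""


@dataclass
class Task:
    status: Status = Status.IN_PROGRESS
    approved_by: Optional[str] = None  # hypothetical field

    def is_review_approved(self):
        return self.approved_by is not None


def transition_task(task, target):
    # The check is hardcoded: no parameter, flag, or config value can skip it.
    if (
        task.status is Status.REVIEW
        and target is Status.DONE
        and not task.is_review_approved()
    ):
        raise IllegalTransition("REVIEW -> DONE requires an approved review")
    task.status = target
```

The point of the design is that `transition_task` takes no argument that could disable the check, so the only way around it is to change the source code.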
MartinLoop's Red-Blue adversarial probe suite (6 deterministic probes, including assertion deletion, silent reverts, context poisoning, budget self-reporting, and grounding evasion) makes it the only framework in the entire research corpus with pre-commit adversarial testing built into the governance layer. Combined with the 11-class failure taxonomy and the named "Ralph Loop" failure mode, MartinLoop has done the most conceptual work to name and solve specific failure patterns.
orchestr8's JIT loading architecture represents a different response to the multi-agent problem: rather than "run more agents," it asks "how do we give one agent the right knowledge efficiently?" The 95-98% token reduction claim (validated by comparing 45KB upfront vs ~1,600 tokens JIT) is the most concrete efficiency measurement in the batch.
GreatCTO's /crystallize → global pattern library (GP-*.md files built from session incidents, stored in ~/.great_cto/global-patterns/) is the most sophisticated memory architecture in the batch. The pattern captures "what went wrong in project A" and makes it available to agents in project B before they encounter the same problem.
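A small sketch of how a session in project B might pick up crystallized patterns is shown below. It only assumes the `GP-*.md` naming and directory described above; the function name and the dictionary return shape are hypothetical, and GreatCTO's actual loader is not documented here.

```python
from pathlib import Path


def load_global_patterns(pattern_dir):
    """Return {pattern_id: text} for every GP-*.md file in pattern_dir."""
    root = Path(pattern_dir).expanduser()
    if not root.is_dir():
        return {}
    return {
        p.stem: p.read_text(encoding="utf-8")
        for p in sorted(root.glob("GP-*.md"))
    }
```

Called as `load_global_patterns("~/.great_cto/global-patterns/")`, this would return every stored pattern keyed by file stem, ready to be placed in an agent's context before work begins.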
Tier C Items
| slug | reason |
|---|---|
| bernstein-sipyourdrink | Canonical=false; redirect alias to sipyourdrink-ltd/bernstein; same repository |
| paf-framework | GitHub repo crack00r/paf-framework returns 404; no public material |
Cross-References
- bernstein depends on bernstein-sipyourdrink being the correct canonical slug; the sipyourdrink-ltd org owns the repository despite chernistry being cited elsewhere
- greatcto depends on superpowers and beads, both auto-installed; a broken superpowers release could break greatcto's pipeline
- orchestr8 is architecturally closer to superpowers (knowledge library) than to AgentsMesh/Superset (execution platform); should be cross-referenced with batch knowledge-management entries
- kagan's 14 agent backends include goose, openhands, and auggie — frameworks analyzed in other batches; kagan is the aggregation point for ACP-capable agents
- MartinLoop named the "Ralph Loop" failure mode; ralphy-openspec (Batch 01) implements the pattern being governed; these two frameworks represent opposite ends of the Ralph Loop spectrum (enabler vs. governance layer)
- stoneforge's Steward role (dedicated merge agent) is the automated alternative to kagan's mandatory human gate — the most direct architectural comparison in the batch for the "who merges agent work" problem
- superset's 100+ concurrent agent count (via ElectricSQL state sync) is the scale ceiling of this batch; agentsmesh's multi-tenant org hierarchy is the governance ceiling