Batch 23 Notes — Codex orchestration (oh-my-codex family + risk-routing governance)
Roster
| Slug | Stars | Archetype | Primary Innovation |
|---|---|---|---|
| oh-my-codex-yeachan | 29,662 | MCP-anchored + skills | Canonical Codex skills platform; ultragoal ledger; tmux team mode; quantitative ambiguity gating (0.15/0.20/0.30); 46+ skills; 20+ personas |
| oh-my-codex-scalarian | 65 (archived) | npm monorepo | v2 monorepo fork; Codex-only; archived April 2026; local path leakage in README |
| oh-my-codex-sigridjineth | 14 | Claude Code fork | Ambassador fork; tier routing (Haiku/Sonnet/Opus); keyword-detector hooks; Discord/Telegram notifications; LSP/AST tooling |
| oh-my-openagent | 59,562 | Platform-class agent OS | OpenCode plugin (TypeScript+Bun); Sisyphus/Hephaestus named agents; Hashline LINE#ID; Boulder state machine; 52 hooks; SUL-1.0 license |
| am-will-swarms | 203 | skills-only behavioral | Dependency-aware DAG (T1/T2/T3 depends_on arrays); wave-based parallel execution; zero CLI/hooks/MCP; npx skills add install |
| vnx-orchestration | 34 | governance-first orchestrator | NDJSON ledger (1,400+ entries); deterministic triple gate (Codex+Gemini+CI); tmux 2x2 grid operator mode; web dashboard; Python |
| do-it | 20 | skills+hooks behavioral | Risk-tier routing (Light/Standard/Heavy); 23 TOML agent definitions; 7 hook scripts; 5 DIM session-state booleans; git worktree for Heavy tier |
| hotl-plugin | 22 | skills-only behavioral | 8-phase workflow; triple HOTL contracts (intent/verification/governance); .hotl/state/ resumability; 5-tool support; smart routing |
| sandcastle-mattpocock | 5,103 | TypeScript SDK | Container isolation (Docker/Podman/Vercel Firecracker); npm library API; Effect library; merge-to-head branch strategy; mattpocock brand |
| metaswarm | 284 | hierarchical orchestrator | 9-phase + 4-phase inner loop; 19 agents; BEADS knowledge base; self-improving via /self-reflect; cross-model adversarial review; 3-tool support |
oh-my-codex Side-by-Side Comparison
The "oh-my-codex" naming meme was adopted by three different authors with very different implementations:
| Dimension | yeachan (canonical) | scalarian (fork) | sigridjineth (ambassador) |
|---|---|---|---|
| GitHub stars | 29,662 | 65 (archived) | 14 |
| Contributors | 30 | 3 | 1 |
| Status | Active (May 2026) | Archived (April 2026) | Active (May 2026) |
| Canonical flag | YES (explicitly warns against others) | NO | NO (self-describes as Ambassador) |
| Target tool | Codex CLI (primary) | Codex CLI only (no Claude bridge) | Claude Code (primary), Codex (secondary) |
| Skill count | 46+ | 14 | 41 |
| Persona count | 20+ | ~8 | ~20 |
| Ultragoal system | YES — append-only ledger, lifecycle tracking | NO | YES (inherited from canonical) |
| Deep-interview | YES — quantitative ambiguity scores (0.15/0.20/0.30 thresholds) | NO — simplified version without scoring | YES (inherited, no quantitative scoring) |
| Tier routing | NO — single model per run | NO | YES — Haiku/Sonnet/Opus by task complexity |
| Hooks | SessionStart, PreToolUse, PostToolUse, UserPromptSubmit, Stop | Minimal | UserPromptSubmit (keyword-detector + skill-injector), SessionStart (project-memory) |
| tmux team mode | YES — multiple concurrent sessions | NO | NO |
| Notifications | NO | NO | YES — Discord and Telegram |
| LSP/AST tooling | NO | NO | YES — language server protocol integration |
| State persistence | .omx/ directory | Minimal | .omx/ + project-memory |
| CLI binary | omx (TypeScript+Rust) | omx (packages/cli/dist/bin.js) | omx (inherited) |
| Monorepo structure | Single package | v2 monorepo (packages/cli/core/mcp-server) | Single package |
| Notable defect | None documented | Local paths leaked in README (/Users/staticpayload/...) | None documented |
| "oh-my-codex" meaning | Production Codex skills platform; Codex's missing workflow layer | npm monorepo experiment; Codex-only rebuild; abandoned | Codex/Claude skills platform; tier-aware routing with notifications |
Key Differentiator Summary
- yeachan = the platform: The canonical, most-starred, and most feature-rich of the three projects. It defines what oh-my-codex means: tmux team mode for parallelism, quantitative deep-interview gating, 20+ personas, and a TypeScript+Rust binary.
- scalarian = the failed experiment: A v2 monorepo rebuild that went Codex-only, introduced a local developer path leak in its README, and was archived after ~6 months. The monorepo split (packages/cli/core/mcp-server) is the only architectural innovation.
- sigridjineth = the evolved fork: An Ambassador fork (explicitly acknowledged by the canonical project) that adds tier routing (Haiku/Sonnet/Opus), notification webhooks (Discord/Telegram), and LSP/AST tooling, all of which the canonical project lacks. It targets Claude Code first and Codex second. Of the two derivatives, this is the more architecturally interesting.
Intra-Batch Patterns
Risk-Tier Routing Convergence
Three frameworks independently developed risk-aware routing:
- do-it: Light/Standard/Heavy by DIM session booleans (7 hooks enforce routing)
- hotl-plugin: smart routing (question/fix/debug/build) before workflow selection
- oh-my-codex-sigridjineth: Haiku/Sonnet/Opus by task complexity (tier routing)
None of the 11 seed frameworks has this, which suggests the field has converged independently on cost- and complexity-aware agent routing as a natural evolution.
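
To make the routing idea concrete, here is a minimal sketch of mapping boolean session flags to a Light/Standard/Heavy tier. The notes only say that do-it keeps 5 DIM booleans, so the flag names and the escalation rule below are assumptions chosen for illustration, not do-it's actual logic.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SessionFlags:
    # Hypothetical flag names: the real DIM booleans are not listed in these notes.
    touches_public_api: bool = False
    touches_persistence: bool = False
    crosses_module_boundary: bool = False
    lacks_test_coverage: bool = False
    is_irreversible: bool = False


def route_tier(flags: SessionFlags) -> str:
    """Pick a workflow tier from the number of raised risk flags (assumed rule)."""
    raised = sum(getattr(flags, f.name) for f in fields(flags))
    # Assumption: an irreversible change always gets the Heavy tier.
    if flags.is_irreversible or raised >= 3:
        return "Heavy"
    if raised >= 1:
        return "Standard"
    return "Light"
```

The point of the design is that cheap, low-risk requests skip the expensive workflow entirely, while anything risky is escalated before work starts.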
Knowledge Persistence Spectrum
| Framework | Persistence Mechanism | Granularity |
|---|---|---|
| metaswarm | JSONL knowledge base (BEADS) | Per-PR, self-improving |
| hotl-plugin | .hotl/state/ + reports | Per-run resumable |
| do-it | .do-it/session/ (DIM booleans) | Per-session |
| vnx-orchestration | NDJSON ledger (1,400+ entries) | Per-decision |
| oh-my-codex-yeachan | .omx/ + ultragoal ledger | Per-task |
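
The ledger-style mechanisms in the table share one simple idea: append one JSON object per line and never rewrite history. The sketch below illustrates that idea in general terms. The field names (`ts`, `decision`) and the helper names are hypothetical and are not taken from vnx-orchestration or oh-my-codex.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_entry(ledger: Path, decision: str, **details) -> dict:
    """Append one decision record as a single NDJSON line."""
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "decision": decision,
        **details,
    }
    # Mode "a" only ever adds to the end of the file, so earlier lines are untouched.
    with ledger.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry, sort_keys=True) + "\n")
    return entry


def read_entries(ledger: Path) -> list[dict]:
    """Load every record; a missing ledger is treated as empty."""
    if not ledger.exists():
        return []
    with ledger.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Because each line is an independent JSON document, the file can be tailed, grepped, or filtered without parsing the whole history.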
Governance Philosophy Split
| Approach | Frameworks |
|---|---|
| Governance-first (audit trail, receipts) | vnx-orchestration (NDJSON ledger), metaswarm (BEADS + knowledge base) |
| Quality-gate-first (approval gates, reviews) | hotl-plugin (3 HOTL contracts), metaswarm (Design Review Gate) |
| Risk-routing-first (avoid unnecessary overhead) | do-it (DIM tiers), hotl-plugin (smart routing), oh-my-codex-sigridjineth (tier routing) |
Container Isolation Outlier
sandcastle is the only framework in batch 23 (and potentially all 33 batches) using actual container isolation (Docker/Podman/Vercel Firecracker microVM). All other frameworks use git worktrees, file system isolation, or no isolation. This represents a categorically different threat model — sandcastle assumes agent code is potentially hostile to the host; other frameworks assume trusted agents with access to the full repo.
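
As a rough illustration of the threat-model difference, the sketch below builds a container command line that exposes only one working directory to the agent process. It does not use sandcastle's API. The image name and the function name are hypothetical, and only standard `docker run` flags (also accepted by Podman) are assumed.

```python
from pathlib import Path


def container_argv(
    workdir: Path,
    agent_cmd: list[str],
    image: str = "example/agent-sandbox:latest",  # hypothetical image name
    runtime: str = "docker",  # "podman" accepts the same flags
) -> list[str]:
    """Build an argv that runs agent_cmd with only workdir visible to it."""
    mount = f"{workdir.resolve()}:/workspace"
    return [
        runtime, "run",
        "--rm",               # discard the container when the command exits
        "--network", "none",  # illustrative only: no network access at all
        "--volume", mount,    # the single host path the agent can touch
        "--workdir", "/workspace",
        image,
        *agent_cmd,
    ]
```

A git worktree limits where the agent is expected to write, whereas a container limits what it is able to reach. In practice an agent usually needs outbound access to its model provider, so a real setup would likely relax the `--network none` setting shown here.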
External Tool Delegation
Two frameworks explicitly delegate to competing AI models:
- metaswarm: Codex CLI and Gemini CLI as adversarial review delegates (cross-model review)
- vnx-orchestration: Codex + Gemini + CI as triple gate
This pattern (using a cheaper or different model for specific phases) is absent from all 11 seeds and most other batches.
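
A deterministic multi-reviewer gate can be sketched as "run every gate in a fixed order and pass only if all pass". The code below is a generic illustration under that assumption. The `GateResult` type and the callables standing in for Codex, Gemini, and CI are hypothetical and do not reflect how vnx-orchestration or metaswarm invoke those tools.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class GateResult:
    name: str
    passed: bool
    reason: str = ""


Gate = Callable[[str], GateResult]


def run_gates(change_id: str, gates: dict[str, Gate]) -> tuple[bool, list[GateResult]]:
    """Run every gate in insertion order; the change passes only if all gates pass."""
    # Every gate runs even after a failure, so the record shows all verdicts.
    results = [gate(change_id) for gate in gates.values()]
    return all(r.passed for r in results), results
```

Running all gates rather than short-circuiting keeps the audit trail complete, which matters for a governance-first tool.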
Most Interesting Find
metaswarm's selective knowledge priming: bd prime --files --keywords --work-type is the most elegant solution to the "knowledge base bloat" problem seen in any framework. Because entries are stored as JSONL and retrieved by metadata filter, institutional memory can grow to thousands of entries across hundreds of PRs while only the matching subset is loaded into the context window. No other framework across the 33 batches has a comparable solution to this scaling problem.
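
The metadata-filter idea can be sketched in a few lines. This is not metaswarm's implementation: the entry fields (`work_type`, `files`, `keywords`), the matching rule, and the `limit` cap are assumptions made to show why filtered retrieval keeps the primed context small.

```python
import json
from typing import Iterable


def prime(
    lines: Iterable[str],
    files: set[str],
    keywords: set[str],
    work_type: str,
    limit: int = 5,
) -> list[dict]:
    """Return at most `limit` knowledge entries relevant to the current task."""
    selected = []
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines in the JSONL stream
        entry = json.loads(line)
        if entry.get("work_type") != work_type:
            continue
        # Assumed rule: keep an entry if it shares a file or a keyword with the task.
        if files & set(entry.get("files", [])) or keywords & set(entry.get("keywords", [])):
            selected.append(entry)
    return selected[:limit]
```

However large the knowledge base grows, the amount handed to the model is bounded by `limit`, not by the size of the file.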
Runner-up: oh-my-codex-yeachan's quantitative ambiguity scoring (0.15/0.20/0.30 thresholds in deep-interview) — treating ambiguity as a measurable quantity rather than a binary go/no-go is genuinely novel in this space.
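
A threshold gate on an ambiguity score might look like the sketch below. The notes give the three thresholds but not how they are used, so this assumes they are alternative strictness levels. The profile names and the `gate` function are hypothetical.

```python
# Assumption: each threshold is a separate strictness level, not a band boundary.
THRESHOLDS = {"strict": 0.15, "balanced": 0.20, "lenient": 0.30}


def gate(ambiguity: float, profile: str = "balanced") -> str:
    """Return "proceed" once the score is at or below the profile's threshold."""
    if not 0.0 <= ambiguity <= 1.0:
        raise ValueError("ambiguity must be in [0, 1]")
    return "proceed" if ambiguity <= THRESHOLDS[profile] else "ask_more"
```

Under this reading, the same request with a score of 0.18 would proceed under the balanced profile but trigger further questions under the strict one.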
Tier C Items
None. All 10 frameworks had sufficient data for full analysis.
Cross-References
- sandcastle is directly comparable to claude-flow (seed): both are multi-agent runtimes, but sandcastle relies on container isolation while claude-flow relies on MCP tools
- metaswarm credits Superpowers (seed framework) as its foundational skills source
- hotl-plugin is comparable to kiro (seed) for phased workflow with approval gates
- do-it is comparable to BMAD-METHOD (seed) for role-based agent definitions with quality gates
- oh-my-openagent (Hashline LINE#ID, Boulder state machine) has no seed parallel — platform-class agent OS unique in the catalog
- am-will-swarms is comparable to spec-kit (seed) in minimalism but differs in the DAG dependency model
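
The DAG dependency model mentioned for am-will-swarms can be illustrated with a small wave planner: tasks whose dependencies are all complete form one wave and can run in parallel. This is a generic sketch, not code from the project. The input shape (task name mapped to a `depends_on` list) and the function name are assumptions.

```python
def plan_waves(tasks: dict[str, list[str]]) -> list[list[str]]:
    """Group tasks into waves; each wave depends only on earlier waves."""
    remaining = {name: set(deps) for name, deps in tasks.items()}
    unknown = {d for deps in remaining.values() for d in deps} - set(remaining)
    if unknown:
        raise ValueError(f"unknown dependencies: {sorted(unknown)}")

    waves: list[list[str]] = []
    done: set[str] = set()
    while remaining:
        # A task is ready when every dependency has already completed.
        ready = sorted(n for n, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("dependency cycle detected")
        waves.append(ready)
        done.update(ready)
        for name in ready:
            del remaining[name]
    return waves
```

For example, with T2 and T3 both depending on T1, the planner would place T1 alone in the first wave and T2 and T3 together in the second.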