Batch 21 Notes — Minimal & Educational Coding Agents (Codex/Gemini CLI + tiny SWE-agent family)

Roster

#	Slug	Display Name	Stars	Language	Tier	Files
1	`codex-cli`	OpenAI Codex CLI	85,783	Rust	A	11
2	`gemini-cli`	Gemini CLI	104,604	TypeScript	A	11
3	`ra-aid`	RA.Aid	2,224	Python	A	11
4	`ra-aid-che-incubator`	che-incubator demo	2	YAML/Markdown	A	11
5	`nanocoder`	Nanocoder	N/A	N/A	C	2
6	`mini-swe-agent`	mini-swe-agent	4,526	Python	A	11
7	`micro-agent`	Micro Agent (Builder.io)	4,307	TypeScript	A	11
8	`pi-coding-agent`	pi (badlogic/pi-mono)	55,166*	TypeScript	A	11
9	`tmuxai`	TmuxAI	1,833	Go	A	11
10	`hermes-ide`	Hermes IDE	257	TypeScript/Rust	A	11

*pi-mono is a monorepo; the coding-agent package stars are part of the larger monorepo count.

Total: 9 full 11-file reports, 1 Tier C stub (nanocoder).

Tier C Items

nanocoder (Nanocoder-ai/nanocoder): HTTP 404 from GitHub API. No public repository found. Written as Tier C stub: 00-summary.md + METRICS.yaml only, with status: insufficient-public-material.

Intra-Batch Patterns

1. The Massive Baseline Pair

codex-cli (85k stars, Rust) and gemini-cli (104k stars, TypeScript) dominate by star count. Both are full-featured terminal binaries with skill/command systems, but their design philosophies diverge sharply:

codex-cli: Rust binary, sandboxed execution (Apple Seatbelt + Linux bwrap), parallel fan-out orchestration, 12 skills
gemini-cli: TypeScript, Google Search grounding as built-in tool, A2A (Agent-to-Agent) server, TOML command files with shell interpolation !{cmd}

2. Intentional Minimalism as Pedagogy

mini-swe-agent (~130-line DefaultAgent class) and micro-agent (~8-line system prompt, ~200 LOC) are deliberate reference implementations. Both achieve competitive benchmark results with almost no code:

mini-swe-agent: 74% SWE-bench Verified with bash-only tool and stateless subprocess.run
micro-agent: TDD loop (generate test → iterate until pass) in TypeScript with explicit non-goals (no multi-file, no installs) These are the clearest demonstrations in the entire catalog that agent capability is mostly model capability — the harness is nearly irrelevant.

3. Three-Stage Pipeline as Common Pattern

RA.Aid and (less explicitly) codex-cli both use Research → Planning → Implementation pipelines. RA.Aid formalizes this as three LangGraph stages with separate prompt modules. This pattern recurs across the catalog (BMAD-METHOD, Kiro) but RA.Aid is the most explicit Python implementation.

4. GUI vs CLI Split

This batch spans both ends of the interface spectrum: codex-cli and gemini-cli are pure terminal binaries; hermes-ide is a pure desktop GUI; pi and tmuxai sit in between (terminal binary with optional rich output modes). The GUI approach (hermes-ide) is unique in the batch — it treats Claude's stream-json events as UI data, not as terminal text.

5. Provider Lock-in Patterns

Claude-only: hermes-ide (Agent mode), pi (any provider via packages/ai abstraction — but pi's identity is Claude-native)
Gemini-only: gemini-cli
OpenAI-only: codex-cli, micro-agent (though micro-agent supports any OpenAI-compatible endpoint)
Model-agnostic: ra-aid, mini-swe-agent, tmuxai

6. Memory Architecture Diversity

SQLite-backed typed repositories: ra-aid
Linear message history = trajectory = training data: mini-swe-agent (stateless, no persistence)
Knowledge Base + context pins: tmuxai, hermes-ide
No memory: micro-agent (single TDD session per invocation)
Context files (AGENTS.md, GEMINI.md, CLAUDE.md): codex-cli, gemini-cli, pi

Most Interesting Find

RA.Aid's embedded self-critique in the planning prompt: The planning system prompt contains the line "You have often been criticized for creating overly complex solutions when simpler, more targeted fixes would suffice." This is a direct injection of known model failure modes into the system prompt as a corrective. It is a rare example of a framework explicitly encoding model weaknesses as first-class prompt content, rather than relying on instruction-following to avoid them.

Runner-up: mini-swe-agent treating the linear message history as both trajectory and training data. The framework generates SWE-bench trajectories as a side effect of normal operation, making every production run a potential training sample without any extra tooling.

Cross-References

codex-cli is the upstream reference for the .codex/skills/ skill format — several other batch agents (codex-native bridges, batch 22) derive their skill system from this.
gemini-cli introduces the TOML command file format (.gemini/commands/) and !{cmd} shell interpolation — compare with .claude/commands/ in Anthropic's Claude CLI.
ra-aid is the most complete Python/LangGraph coding agent in the catalog. Cross-reference with batch 28 (LangChain/Pydantic harnesses).
mini-swe-agent is a reference implementation for the SWE-agent family. Cross-reference with the original SWE-agent (batch 5) and SWE-bench evaluation papers.
hermes-ide is the only desktop GUI in the batch. Cross-reference with Kiro (VS Code extension, batch 11) and pi (Tauri desktop, also batch 21 — but pi ships a CLI binary, hermes does not).
tmuxai's tmux screen context pattern is unique in the catalog. No other framework reads raw terminal screen content as context.

Duplicates Encountered

None. All 10 assigned slugs were distinct frameworks. nanocoder was unavailable (404) but is not a duplicate of any other framework.

Architectural Observations for Master Catalog

The "passthrough renderer" archetype (hermes-ide) is underrepresented in seed frameworks — none of the 11 seeds are GUI desktop apps. Hermes is the primary example.
The "TDD loop" pattern (micro-agent) is the cleanest single-purpose agent pattern in the catalog: one input (failing test or task), one output (code that passes the test), one loop.
The "bash-only tool" pattern (mini-swe-agent) is worth tracking — it achieves near-SOTA on SWE-bench without file read/write/edit primitives, relying entirely on shell commands.
The A2A (Agent-to-Agent) protocol in gemini-cli is the only inter-agent communication protocol in this batch. It is experimental but signals a direction for multi-agent coordination in the Gemini ecosystem.