Claude Bootstrap + Maggy

claude-bootstrap · alinaqi/claude-bootstrap · ★ 670 · last commit 2026-05-26

Route every task to the cheapest capable model while enforcing TDD, ADR compliance, iCPG intent tracking, and Telos intent fidelity — making token economics a first-class engineering discipline.

Best when: model cost should be engineered rather than left to default, with 80% of tasks running on DeepSeek ($0.14/M) instead of Claude ($3+/M).
Skip if: using an expensive model for trivial tasks; no coverage threshold enforcement
vs seeds
claude-flow (large MCP toolserver + multi-agent) but uniquely ships 13-tier cost-aware model routing (T0 Qwen3 local to T12 Claude O…
Primitive shape 123 total
Commands 23 Skills 67 Subagents 6 Hooks 12 MCP tools 15
00

Summary

Claude Bootstrap + Maggy — Summary

Claude Bootstrap is a two-component framework: (1) a Claude Code config pack (installed into ~/.claude/ by install.sh) shipping 67 skills, 12 hooks, 23 commands, and 1 agent, with TDD enforcement and iCPG/Mnemos memory; (2) Maggy, an optional local FastAPI server with 13-tier model routing, a Cortex MCP code-intelligence server (15 MCP tools, SQLite graph), a vanilla-JS web dashboard on port 8080, Docker-isolated parallel agent execution (Polyphony), and a plugin system. The routing is cost-aware: semantic blast scoring sends trivial tasks to local Qwen3 ($0), bulk work to DeepSeek ($0.14/M), and architecture decisions to Claude Opus ($15-25/M). The iCPG (Intent-Augmented Code Property Graph) stores why code exists and adds 6-dimension drift detection; Mnemos provides fatigue-aware memory with typed checkpoints that survive context compaction. The repository includes 1,100+ tests.

Differs from seeds: The most complex framework in the batch. It is closest to claude-flow (MCP toolserver + multi-agent) but differs in scale and focus: claude-flow has 305 MCP tools, while Claude Bootstrap has 15 Cortex tools plus 67 skills and 13-tier model routing. The Telos testing framework (IFS = F1×F2×F3 across Conformance×Validation×Integrity) and iCPG (a code property graph with 10 edge types) are novel in the corpus. Unlike any seed, Claude Bootstrap explicitly ships cost-aware 13-tier model routing as a first-class feature, with pricing tables in CLAUDE.md.

01

Overview

Claude Bootstrap + Maggy — Overview

Origin

GitHub: alinaqi/claude-bootstrap (670 stars, 55 forks, MIT, Python). Author: alinaqi. 1,100+ tests, version 6.37.0, actively updated (pushed 2026-05-26 — same day as analysis).

Philosophy

From README:

"Turn Claude Code into a self-reviewing, test-enforced engineering system that remembers context across sessions — then route work across 13 models from a single dashboard."

The two-tier offering:

"Start with Bootstrap; add Maggy when you need the harness."

Core problems named:

  • "It picks the most expensive model for everything, including trivial tasks"
  • "Context fills up, state is lost, you re-explain yourself every session"
  • "There's no enforcement: code quality, test coverage, and ADR compliance only happen if you remember to ask"
  • "Running multiple agents on the same repo causes file conflicts"
  • "You have no visibility into what Claude is actually doing inside your codebase"

Key Systems

  1. TDD enforcement: Stop hooks — tests must pass before Claude considers a task done
  2. 13-tier model routing: Semantic blast score (1-10) routes to cheapest capable model (Qwen3 local → DeepSeek → Kimi → Gemini → Grok → Codex → Claude)
  3. iCPG: Intent-Augmented Code Property Graph — stores why code exists, 6-dimension drift detection, prevents duplicate implementations
  4. Mnemos: Task-scoped memory with 4-dimension fatigue model, survives context compaction
  5. ADR enforcement: Non-trivial changes require Architectural Decision Record
  6. Telos: Testing beyond TDD — IFS (Intent Fidelity Scale) = F1×F2×F3
  7. Cortex MCP: Code intelligence with 15 tools, cyclomatic complexity, FTS5 search
  8. Polyphony: Docker-isolated parallel agents

Explicit Antipatterns

From CLAUDE.md:

  • Using expensive model (Claude) for trivial tasks
  • No coverage threshold enforcement
  • No ADR for architecture changes
  • Context loss at compaction

Two-Tier Architecture

  • Bootstrap (30 seconds): ./install.sh → copies to ~/.claude/
  • Maggy (5 minutes): pip install -e . + maggy serve → web dashboard

02

Architecture

Claude Bootstrap + Maggy — Architecture

Distribution

  • Type: Claude plugin + Python FastAPI server
  • Install: git clone + ./install.sh (Bootstrap) + pip install -e . (Maggy)
  • License: MIT
  • Language: Python (backend), JavaScript (dashboard)
  • Dashboard: http://localhost:8080 (Maggy only)

Component Diagram

Claude Bootstrap (installed to ~/.claude/)
├── skills/         67 domain skills
├── hooks/          12 lifecycle hooks
├── commands/       23 slash commands
├── rules/          Conditional rules by file glob
└── templates/      settings.json, CLAUDE.md, ADR template

Maggy (local server, optional)
├── pipeline/       ChatPipeline orchestrator
├── skills/         Skill injection + YAML protocol engine
├── api/            REST API (chat, routing, plugins, pipeline logs)
├── static/         Web dashboard (vanilla JS, no build step)
└── services/       Routing, memory, execution, Mnemos

cortex-mcp/         Code intelligence MCP server
├── src/cortex/
│   ├── structure/  AST extraction, edge types, cyclomatic complexity
│   └── storage/    SQLite graph store, FTS5 index

plugins/
├── build-in-public/ Auto-posts to LinkedIn/X
├── telos/          Intent Fidelity Scale testing
└── providers/      GitHub, Asana, Monday integrations

Install Paths

Bootstrap (30 seconds)

git clone https://github.com/alinaqi/maggy.git
cd maggy && ./install.sh
# Copies to ~/.claude/

Full Harness

cd maggy && pip install -e .
maggy serve   # Dashboard at localhost:8080

13-Tier Model Routing

| Tier | Model | Cost | Role |
|------|-------|------|------|
| T0 | Qwen3 (local) | $0 | Classification, triage, free bulk ops |
| T1 | Gemini Flash-Lite | $0.10/M | Bulk extraction, CIG pipelines |
| T2 | DeepSeek Flash | $0.14/M | Docs, tests, scaffolding |
| T3 | Gemini Flash | $0.15/M | Multimodal, vision, audio |
| T4 | DeepSeek Pro | $0.435/M | Complex coding, multi-file refactors |
| T5 | Gemini CLI | ~$0.25-1.25/M | Multi-file agentic coding |
| T6 | AGY | Google tier | End-to-end implementation |
| T7 | Kimi | $0.60/M | Long-context analysis |
| T8 | Gemini Pro Search | $1.25/M | Deep research, 2M context |
| T9 | Grok | $5/M | Competitor intel, deep reasoning |
| T10 | Codex | varies | Bulk generation, security tasks |
| T11 | Claude Sonnet | $3-5/M | Quality-critical code |
| T12 | Claude Opus | $15-25/M | Architecture, security, ADRs |

Routing is semantic (Qwen3 as local classifier), fatigue-aware, budget-capped, and cascading.
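
A minimal sketch of how budget-capped, cascading tier selection could work. The tier names and prices come from the table above (lower bounds where a range is given); the `Tier` class, the `cascade` function, and especially the `max_blast` ceilings are hypothetical illustrations, not values taken from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    usd_per_m: float  # price per million tokens, from the table above
    max_blast: int    # HYPOTHETICAL: highest blast score this tier may handle


# Subset of the 13 tiers; the max_blast values are illustrative assumptions.
TIERS = [
    Tier("T0 Qwen3 (local)", 0.0, 2),
    Tier("T2 DeepSeek Flash", 0.14, 4),
    Tier("T4 DeepSeek Pro", 0.435, 6),
    Tier("T11 Claude Sonnet", 3.0, 8),
    Tier("T12 Claude Opus", 15.0, 10),
]


def cascade(blast: int, budget_usd_per_m: float) -> list[Tier]:
    """Return every capable tier within budget, cheapest first.

    The first element is the routing choice; later elements are the
    escalation order if the cheaper model fails.
    """
    if not 1 <= blast <= 10:
        raise ValueError("blast score must be in 1..10")
    capable = [
        t for t in TIERS
        if t.max_blast >= blast and t.usd_per_m <= budget_usd_per_m
    ]
    return sorted(capable, key=lambda t: t.usd_per_m)
```

Under these assumed ceilings, `cascade(8, 5.0)` returns only the Sonnet tier, and `cascade(9, 5.0)` returns an empty list because the only capable tier exceeds the budget cap.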

Required Runtime

  • Python 3.11+ (Maggy)
  • Claude Code (Bootstrap)
  • Optional: Docker (Polyphony parallel agents)
  • Optional: [vectors] for embeddings
03

Components

Claude Bootstrap + Maggy — Components

Skills (67 — in .claude/skills/)

Domain categories: aeo-optimization, agent-teams, agentic-development, ai-models, android-java, android-kotlin, autonomous-testing, aws-aurora, aws-dynamodb, azure-cosmosdb, base, build-in-public, cloudflare-d1, code-deduplication, code-graph, code-review, codex-review, commit-hygiene, cpg-analysis, credentials, cross-agent-delegation, database-schema, existing-repo, external-model-delegation, firebase, flutter, gemini-review, icpg, iterative-development, klaviyo, llm-patterns, maggy, medusa, mnemos, model-routing, ms-teams-apps, nodejs-backend, playwright-testing, polyphony, posthog-analytics, project-tooling, pwa-development, python, react-native, react-web, reddit-ads, reddit-api, security, session-management, shopify-apps, site-architecture, supabase-nextjs, supabase-node, supabase-python, supabase, team-coordination, ticket-craft, typescript, ui-mobile, ui-testing, ui-web, user-journeys, visual-validation, web-content, web-payments, woocommerce, workspace

Commands (23 — in .claude/commands/)

| Command | Purpose |
|---------|---------|
| analyze-repo.md | Analyze repository structure |
| analyze-workspace.md | Workspace analysis |
| build-in-public.md | Auto social media post |
| check-contributors.md | Contributor analysis |
| icpg-bootstrap.md | Initialize iCPG |
| icpg-drift.md | Detect iCPG drift |
| icpg-impact.md | Impact analysis |
| icpg-intent.md | Intent recording |
| icpg-why.md | Why-code lookup |
| initialize-project.md | Project setup |
| maggy-init.md | Initialize Maggy harness |
| maggy.md | Maggy operations |
| mnemos-checkpoint.md | Manual memory checkpoint |
| mnemos-status.md | Memory status |
| polyphony-init.md | Initialize Docker isolation |
| polyphony-spawn.md | Spawn parallel agent |
| polyphony-status.md | Agent status |
| set-tracker.md | Set task tracker |
| spawn-team.md | Create agent team |
| sync-agents.md | Sync agent state |
| sync-contracts.md | Contract sync |
| update-code-index.md | Update Cortex index |
| usage-summary.md | Cost/usage summary |

Hooks (12 — in .claude/hooks/)

| Hook | Purpose |
|------|---------|
| auto-review-hook | Auto code review on changes |
| icpg-inject-context | Inject iCPG context |
| icpg-record-intent | Record intent before changes |
| mid-task-escalation | Escalate if complexity rises |
| mnemos-session-start.sh | Load Mnemos memory at session start |
| plugin-trigger | Trigger plugins on events |
| polyphony-auto-isolate | Auto Docker isolation |
| post-commit-graph | Update graph after commit |
| pre-push | Pre-push quality gates |
| route-task-hook | Route task to appropriate model tier |
| usage-summary-hook | Update usage metrics |
| workspace | Workspace management |

Agents (1 — in .claude/agents/)

  • explore.md — Exploration agent

Agent Teams (6 — via skills/agent-teams)

  • Lead, Quality, Security, Review, Merger, Feature

Cortex MCP Server (15 tools)

Code intelligence tools:

  • analyze_codebase — Full structural analysis
  • get_architecture_overview — Architecture summary
  • get_call_graph — Function call graph
  • orient() — Smart entry point for context
  • 11 more tools for graph traversal, drift, complexity

Plugins

  • build-in-public — Auto-posts to LinkedIn/X
  • telos — Intent Fidelity Scale testing
  • Provider plugins: GitHub, Asana, Monday

Telos Testing Framework

IFS (Intent Fidelity Scale) = F1 × F2 × F3

  • F1 Conformance: passed/total tests (pytest/vitest)
  • F2 Validation: drift severity (Cortex drift_events)
  • F3 Integrity: orphan symbols, empty contracts, stale reasons, scope sprawl

Zero in any plane collapses IFS to zero. 100% test pass rate + severe architectural drift = 0 score.
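
A minimal sketch of the multiplicative score. The source defines F1 as passed/total tests but does not say how F2 and F3 are scaled, so this example assumes both are already normalised to [0, 1], where 1 means no drift and no integrity findings. The function name `ifs` is illustrative.

```python
def ifs(passed: int, total: int, validation: float, integrity: float) -> float:
    """Intent Fidelity Scale as the product F1 * F2 * F3.

    ASSUMPTION: `validation` (F2) and `integrity` (F3) are scores in [0, 1];
    the source does not specify their scaling.
    """
    if total <= 0 or not 0 <= passed <= total:
        raise ValueError("need 0 <= passed <= total and total > 0")
    for score in (validation, integrity):
        if not 0.0 <= score <= 1.0:
            raise ValueError("F2 and F3 must lie in [0, 1]")
    conformance = passed / total  # F1
    return conformance * validation * integrity
```

With every test passing but a validation score of zero, `ifs(50, 50, 0.0, 1.0)` is zero. That is the collapse property described above: no plane can compensate for another.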

05

Prompts

Claude Bootstrap + Maggy — Prompts

Excerpt 1: CLAUDE.md — Model Routing Table

Technique: Cost-aware routing as standing instruction. Priced decision table forces model economy at skill level.

## Model Routing — Token Economy (13-Tier)

| Tier | Model | Cost (in/out per M) | Role |
|------|-------|---------------------|------|
| 0 | Qwen3 (local) | $0 | File reads, quick edits, boilerplate, offline |
| 1 | Gemini 2.5 Flash-Lite | $0.10 / $0.40 | Bulk extraction, classification, CIG pipelines |
| 2 | DeepSeek V4 Flash | $0.14 / $0.28 | Sub-agents, cheap internal calls |
| 4 | DeepSeek V4 Pro | $0.435 / $0.87 | Main coding workhorse — ~80% of work |
| 11 | Claude Sonnet | $3-5 / $15-25 | Quality-critical code, complex debugging |
| 12 | Claude Opus | | Architecture, security review, ADR decisions |

### Use qwen3 (local) for:
- grep, find, awk, sed, jq questions
- Shell one-liners and regex
- Quick syntax lookups, log reading, file search

Invoke: qwen3 ""

Excerpt 2: CLAUDE.md — TDD Iron Law

Technique: Unconditional mandate with explicit step sequence. Pattern matches superpowers' Iron Law approach.

## TDD — Non-Negotiable

ALWAYS write tests before implementation. No exceptions.

1. Read the requirement
2. Write failing tests that describe the expected behaviour
3. Run tests — confirm they fail for the right reason
4. Write the minimum implementation to pass
5. Refactor, keep tests green
6. Never write a function, class, or API endpoint without a corresponding test

If asked to add a feature without tests, write the tests first then ask to proceed.
If a PR diff shows untested code paths, flag them before continuing.

Excerpt 3: Telos IFS Formula

Technique: Quantitative intent verification — novel pattern. Multiplied dimensions collapse to zero if any fail.

## Telos: Testing Beyond TDD

IFS (Intent Fidelity Scale) = F1 × F2 × F3

F1 — Conformance:  passed / total tests            (pytest / vitest)
F2 — Validation:   drift severity                  (Cortex drift_events)
F3 — Integrity:    IF-3 orphan symbols              (no reason edges)
                   IF-4 empty contracts             (no pre/post/invariants)
                   IF-6 stale reasons               (proposed >7d, never fulfilled)
                   IF-7 scope sprawl                (reason scopes >10 files)

A zero in any plane collapses IFS to zero. 100% test pass rate with severe architectural drift = score of 0.

Excerpt 4: Skill Protocol Routing (from Maggy ChatPipeline)

Technique: Intent matching + YAML workflow execution. "Push to git" becomes a multi-step protocol automatically.

Routing a task:
  You: "review the auth middleware for timing attacks"
  → Blast score: 8/10 (security + architecture)
  → Routed to: Claude (Tier 11)
  → ADR gate: found docs/adr/0003-jwt-strategy.md → injected as context
  → Review runs with full architectural context

Skill Protocol execution:
  You: "push to git"
  → Intent matched: git-push protocol
  → ✅ lint → ✅ typecheck → ✅ tests → ✅ stage → ✅ commit → ✅ push
09

Uniqueness

Claude Bootstrap + Maggy — Uniqueness

Differs from Seeds

Most similar to claude-flow (large MCP toolserver + multi-agent + external memory) but architecturally distinct in five dimensions:

  1. 13-tier cost-aware model routing is the most elaborate in any framework analyzed. Semantic blast scoring routes each task to the cheapest capable model, from local Qwen3 ($0) through DeepSeek ($0.14/M) to Claude Opus ($15-25/M), with pricing tables embedded in CLAUDE.md as standing instructions.
  2. Telos IFS (the multiplicative F1×F2×F3 intent fidelity scale) is a novel testing paradigm whose score collapses to zero on severe architectural drift even with a 100% test pass rate. It is not present in any seed.
  3. iCPG (Intent-Augmented Code Property Graph), with 10 edge types, drift detection, and 15 Cortex MCP tools, goes deeper than ccmemory's Neo4j approach by storing why code exists alongside what it does.
  4. Docker-isolated parallel agents (Polyphony) are distinct from claude-flow's worktree isolation and superpowers' Task tool spawning.
  5. The 67 domain-specific skills, covering 30+ technology stacks (Supabase, Shopify, Firebase, Flutter, React Native, Playwright, etc.), form the most domain-comprehensive skill library in the corpus.

Positioning

  • Target user: Solo engineers and teams who want aggressive cost optimization, TDD enforcement, and architectural memory without vendor lock-in
  • Key differentiator: 13-tier routing makes token cost a first-class engineering concern — unique in the corpus
  • Comprehensive coverage: 67 skills, 12 hooks, and 23 commands, plus Telos, iCPG, Polyphony, and the web dashboard, make this the most complete harness in the batch

Observable Failure Modes

  1. Complexity overhead: The combination of iCPG, Mnemos, Telos, 13-tier routing, and Cortex is substantial. The Bootstrap layer alone is manageable, but the full Maggy harness requires Python 3.11+, Docker, and 5+ API keys
  2. Local Qwen3 dependency: T0 routing requires a local Qwen3 model installation — adds GPU/memory requirements for the routing layer
  3. Build-in-Public plugin: Auto-posting to LinkedIn/X is a social media side-effect that runs from code change context — potential for unintended disclosures
  4. Dashboard vanilla JS: No build step is convenient but limits dashboard UI complexity and maintainability
  5. 13 models to maintain: Cost routing breaks as model pricing changes; the pricing table in CLAUDE.md will require updates

Novel Patterns

  • Telos IFS: multiplicative intent fidelity score — most rigorous test quality metric in the corpus
  • 13-tier routing with embedded pricing: treating model cost as a first-class engineering constraint
  • Mnemos 4-dimension fatigue: session memory health as a trackable engineering metric
  • iCPG 10-edge-type intent graph: why-oriented code property graph with contract tracking
04

Workflow

Claude Bootstrap + Maggy — Workflow

Bootstrap-Only Workflow

Install: ./install.sh
    ↓
New Claude Code session:
  mnemos-session-start.sh hook fires → load memory
  icpg-inject-context hook → inject code property graph context
    ↓
Code change:
  TDD: write tests first (enforced by Stop hooks)
  Max 20 lines/function, 3 params, 2 nesting levels (quality gate)
    ↓
Commit:
  pre-push hook → lint + typecheck + tests
  post-commit-graph hook → update Cortex graph
    ↓
Architecture change:
  ADR gate: requires Architectural Decision Record
  icpg-record-intent → why this code exists

Full Maggy Workflow (with routing)

User message
    ↓
ChatPipeline: blast score (1-10)
    ↓
route-task-hook: select cheapest capable model (T0-T12)
    ↓
Skill protocol check: match intent → YAML protocol
    ↓
Execute skill protocol (e.g., git-push: lint → test → stage → commit → push)
    ↓
Usage tracking + fatigue score update

Skill Protocol Execution Example

You: "push to git"
→ Intent matched: git-push protocol
→ ✅ lint       (2.1s)
→ ✅ typecheck   (4.3s)
→ ✅ tests       (11.2s)
→ ✅ stage
→ ✅ commit      [AI-generated message]
→ ✅ push
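
The fail-fast behaviour implied by the example above can be sketched as a small step runner. This is an illustration only: the source says protocols are defined in YAML, and the `run_protocol` function and the callable-based step format here are hypothetical stand-ins for that engine.

```python
from typing import Callable

# A step is a display name plus an action that reports success or failure.
Step = tuple[str, Callable[[], bool]]


def run_protocol(steps: list[Step]) -> list[tuple[str, bool]]:
    """Run steps in order and stop at the first failure.

    Later steps (for example `push`) never run if an earlier gate
    (for example `tests`) fails.
    """
    results: list[tuple[str, bool]] = []
    for name, action in steps:
        ok = action()
        results.append((name, ok))
        if not ok:
            break
    return results
```

Stopping at the first failed step is what makes the ordering lint → typecheck → tests → stage → commit → push act as a gate rather than a checklist.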

Approval Gates

  1. TDD gate: Tests must pass before Stop (hook enforcement)
  2. Quality gate: Max lines/function, params, nesting — per-file
  3. ADR gate: Architecture changes require ADR document
  4. Pre-push gate: lint + typecheck + tests
  5. iCPG gate: Drift detection at pre-commit
  6. Telos IFS gate: IFS = 0 if any plane is zero
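
The first gate above could be sketched as a Claude Code Stop hook script. This assumes Claude Code's hook contract, in which the hook receives a JSON payload on stdin and exit code 2 blocks the stop and returns stderr to the model. The `pytest -q` command and the names `decide` and `main` are illustrative and not taken from the project's actual hooks.

```python
import json
import subprocess
import sys

ALLOW = 0
BLOCK = 2  # exit code 2 from a Stop hook asks Claude Code to keep working


def decide(tests_returncode: int, stop_hook_active: bool) -> int:
    """Block the stop when tests fail.

    If the session is already continuing because of a stop hook,
    allow the stop to avoid an endless loop.
    """
    if stop_hook_active:
        return ALLOW
    return ALLOW if tests_returncode == 0 else BLOCK


def main() -> int:
    payload = json.load(sys.stdin)  # hook input supplied by Claude Code
    proc = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    code = decide(proc.returncode, bool(payload.get("stop_hook_active")))
    if code == BLOCK:
        # stderr is what gets shown to the model when the stop is blocked
        print("Tests are failing; fix them before finishing.", file=sys.stderr)
        print(proc.stdout[-2000:], file=sys.stderr)
    return code
```

A script registered as a Stop hook would end with `sys.exit(main())`.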

Phase to Artifact Map

| Phase | Artifact |
|-------|----------|
| Session start | Mnemos memory loaded, iCPG context injected |
| Intent recording | iCPG intent node created |
| Implementation | Source code + tests (TDD first) |
| Commit | Post-commit graph update, ADR if needed |
| Session end | Mnemos checkpoint, usage summary |
| Delivery | Build-in-public plugin (optional social post) |

Fatigue-Aware Memory

Session fatigue: 0.61 (PRE-SLEEP)
→ Mnemos: auto-checkpoint written
→ Micro-consolidation: 3 ResultNodes compressed
→ iCPG context injected: 2 ReasonNodes, 1 constraint
→ Context freed: ~18k tokens

4-dimension fatigue model tracks session health and auto-compacts context before quality degrades.
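
A minimal sketch of a threshold check like the one in the example above. The source names neither the four dimensions nor how they are combined, so this example takes four anonymous readings in [0, 1] and assumes an unweighted mean; only the 0.61 PRE-SLEEP threshold comes from the text.

```python
from statistics import fmean

PRE_SLEEP = 0.61  # threshold quoted in the example above


def fatigue_score(dimensions: tuple[float, float, float, float]) -> float:
    """Combine four per-dimension readings into one score.

    ASSUMPTION: an unweighted mean; the real model may weight dimensions.
    """
    if len(dimensions) != 4:
        raise ValueError("expected exactly four dimensions")
    if any(not 0.0 <= d <= 1.0 for d in dimensions):
        raise ValueError("each dimension must lie in [0, 1]")
    return fmean(dimensions)


def should_checkpoint(dimensions: tuple[float, float, float, float]) -> bool:
    """True once the combined score reaches the PRE-SLEEP threshold."""
    return fatigue_score(dimensions) >= PRE_SLEEP
```

Checkpointing at a threshold below 1.0 leaves headroom to write memory out before the context is actually exhausted.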

06

Memory Context

Claude Bootstrap + Maggy — Memory & Context

Mnemos (Task-Scoped Memory)

  • Type: File-based with 4-dimension fatigue model
  • Persistence: Project + cross-session
  • 7 amnesia types tracked: uninitialized, premature, sequential, contextual, temporal, semantic, terminal
  • Typed checkpoints: ResultNode, ReasonNode, ConstraintNode
  • Engram: Cross-session memory persisting architectural knowledge across weeks

Session start: mnemos-session-start.sh hook loads prior memory. During session: fatigue score tracked in 4 dimensions. At 0.61 fatigue (PRE-SLEEP threshold): auto-checkpoint + micro-consolidation.

iCPG (Intent-Augmented Code Property Graph)

  • Type: Code property graph (10 edge types)
  • Backend: SQLite
  • Features:
    • Why code exists (intent edges)
    • 6-dimension drift detection
    • Cyclomatic complexity
    • Bidirectional traversal
    • FTS5 full-text search
    • Prevents duplicate implementations
  • Cortex MCP: 15 tools expose the graph via MCP protocol

iCPG stores: ReasonNodes (why code was written), ConstraintNodes (architectural constraints), drift events, pre/post/invariant contracts.
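
To make the "why code exists" idea concrete, here is a minimal sketch of reason nodes linked to symbols in SQLite. The schema, table names, and functions are hypothetical and far simpler than a graph with 10 edge types; they illustrate only the why-lookup and the orphan-symbol check (symbols with no reason edge).

```python
import sqlite3


def open_graph() -> sqlite3.Connection:
    """Create an in-memory store with a HYPOTHETICAL three-table schema."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE symbols (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
        CREATE TABLE reason_nodes (id INTEGER PRIMARY KEY, why TEXT NOT NULL);
        CREATE TABLE reason_edges (
            symbol_id INTEGER NOT NULL REFERENCES symbols(id),
            reason_id INTEGER NOT NULL REFERENCES reason_nodes(id)
        );
    """)
    return db


def record_intent(db: sqlite3.Connection, symbol: str, why_text: str) -> None:
    """Attach a reason to a symbol, creating the symbol if needed."""
    db.execute("INSERT OR IGNORE INTO symbols(name) VALUES (?)", (symbol,))
    symbol_id = db.execute(
        "SELECT id FROM symbols WHERE name = ?", (symbol,)
    ).fetchone()[0]
    reason_id = db.execute(
        "INSERT INTO reason_nodes(why) VALUES (?)", (why_text,)
    ).lastrowid
    db.execute(
        "INSERT INTO reason_edges(symbol_id, reason_id) VALUES (?, ?)",
        (symbol_id, reason_id),
    )


def why(db: sqlite3.Connection, symbol: str) -> list[str]:
    """Return every recorded reason for a symbol, oldest first."""
    rows = db.execute(
        """SELECT r.why FROM reason_nodes r
           JOIN reason_edges e ON e.reason_id = r.id
           JOIN symbols s ON s.id = e.symbol_id
           WHERE s.name = ? ORDER BY r.id""",
        (symbol,),
    )
    return [row[0] for row in rows]


def orphans(db: sqlite3.Connection) -> list[str]:
    """Symbols that have no reason edge at all."""
    rows = db.execute(
        """SELECT s.name FROM symbols s
           LEFT JOIN reason_edges e ON e.symbol_id = s.id
           WHERE e.symbol_id IS NULL ORDER BY s.name"""
    )
    return [row[0] for row in rows]
```

Keeping reasons in their own table lets one symbol accumulate several intents over time, and makes "code with no recorded reason" a single query.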

State Files

| Path | Purpose |
|------|---------|
| Cortex SQLite DB | Call graph, clusters, iCPG edges |
| docs/adr/ | Architectural Decision Records |
| Mnemos checkpoints | Session memory snapshots |
| .claude/rules/ | Conditional rules by file glob |

Compaction Handling

Yes — Mnemos 4-dimension fatigue model triggers auto-checkpoint at fatigue thresholds:

  • Fatigue 0.61 (PRE-SLEEP): auto-checkpoint + micro-consolidation (~18k tokens freed)
  • The equivalent of a pre-compact.sh step is provided via the mnemos-session-start.sh hook

Cross-Session Handoff

Yes, via Mnemos. Architectural knowledge persists across weeks via Engram, and the iCPG graph persists in SQLite across sessions.

  • Cortex MCP orient() — graph-native, replaces file reads with targeted graph queries
  • FTS5 full-text search in SQLite
  • Semantic search (optional, if embeddings enabled)
07

Orchestration

Claude Bootstrap + Maggy — Orchestration

Multi-Agent: Yes (via Polyphony + Agent Teams)

  • Polyphony: Docker-isolated parallel agents. Second session auto-provisions a workspace.
  • Agent Teams: 6 named roles (Lead, Quality, Security, Review, Merger, Feature)
  • Cross-agent delegation: Skills for delegating to external models
  • Spawn mechanism: Claude Code Task tool + Docker containers (Polyphony)

Orchestration Pattern: Hierarchical

  • Lead agent coordinates Agent Teams
  • Quality/Security/Review agents run in parallel on specific concerns
  • Merger agent handles conflict resolution

Isolation Mechanism: Container (Polyphony)

Docker-isolated parallel agent execution. Each parallel agent runs in a Docker container. polyphony-auto-isolate hook triggers automatically.

Execution Mode: Interactive-loop

Session-based. Bootstrap layer fires on session events; Maggy adds a REST API for programmatic control.

Multi-Model: Yes (13 tiers)

The most extensive multi-model routing in the batch:

| Classification | Model |
|----------------|-------|
| Triage/classification | Qwen3 (local, $0) |
| Bulk work | DeepSeek Flash (T2, $0.14/M) |
| Main coding | DeepSeek Pro (T4, ~80% of work) |
| Quality/security | Claude Sonnet (T11) |
| Architecture/ADR | Claude Opus (T12) |

Routing: semantic blast score (1-10) + task type classification by local Qwen3 + budget cap + cascade.
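
The economics behind this split can be checked with a short calculation. The prices are the per-million-token figures quoted in this document (lower bound of the Sonnet range); the 80/20 split between DeepSeek Pro and Claude Sonnet is an assumption for illustration, since the source only states that about 80% of work goes to DeepSeek Pro.

```python
def blended_cost(shares: dict[str, float], prices: dict[str, float]) -> float:
    """Average price per million tokens for a given traffic split."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(share * prices[model] for model, share in shares.items())


PRICES = {"DeepSeek Pro": 0.435, "Claude Sonnet": 3.0}  # USD per M tokens

# ASSUMED split: 80% on DeepSeek Pro, remaining 20% on Claude Sonnet.
routed = blended_cost({"DeepSeek Pro": 0.8, "Claude Sonnet": 0.2}, PRICES)
baseline = blended_cost({"Claude Sonnet": 1.0}, PRICES)
saving = 1 - routed / baseline
```

Under these assumptions the blended price is about $0.948 per million tokens against $3.00 for Sonnet alone, a reduction of roughly 68%.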

Prompt Chaining: Yes

Intent recording (iCPG) → implementation → drift detection → Telos validation. Each stage feeds the next.

Auto-Validators

  • TDD: Stop hooks enforce tests before "done"
  • Quality gates: Max 20 lines/function, 3 params, 2 nesting levels (per-file)
  • ADR gate: Architecture changes require ADR
  • Pre-push: lint + typecheck + tests
  • iCPG drift: Pre-commit drift detection
  • Telos IFS: F1×F2×F3 intent fidelity check
  • Cortex: cyclomatic complexity, orphan symbol detection
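
A per-file quality gate with the limits listed above (20 lines per function, 3 parameters, 2 nesting levels) could be sketched with Python's standard `ast` module. How the project actually measures these limits is not shown in the source, so the counting rules here are assumptions: function length includes the `def` line, parameters exclude `*args`/`**kwargs`, and nesting counts `if`/`for`/`while`/`with`/`try` blocks.

```python
import ast

MAX_LINES, MAX_PARAMS, MAX_NESTING = 20, 3, 2
_BLOCKS = (ast.If, ast.For, ast.While, ast.With, ast.Try)
_DEFS = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)


def _depth(node: ast.AST, current: int = 0) -> int:
    """Deepest control-flow nesting beneath `node`.

    Note: `elif` chains parse as nested `If` nodes, so this simple
    counter treats each `elif` as one extra level.
    """
    deepest = current
    for child in ast.iter_child_nodes(node):
        if isinstance(child, _DEFS):
            continue  # nested definitions are checked on their own
        step = 1 if isinstance(child, _BLOCKS) else 0
        deepest = max(deepest, _depth(child, current + step))
    return deepest


def violations(source: str) -> list[str]:
    """Return one message per limit exceeded by any function in `source`."""
    found: list[str] = []
    for fn in ast.walk(ast.parse(source)):
        if not isinstance(fn, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        lines = fn.end_lineno - fn.lineno + 1
        args = fn.args
        params = len(args.posonlyargs) + len(args.args) + len(args.kwonlyargs)
        depth = _depth(fn)
        if lines > MAX_LINES:
            found.append(f"{fn.name}: {lines} lines")
        if params > MAX_PARAMS:
            found.append(f"{fn.name}: {params} params")
        if depth > MAX_NESTING:
            found.append(f"{fn.name}: nesting {depth}")
    return found
```

A hook could call `violations()` on each changed file and refuse to proceed when the returned list is non-empty.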

Git Automation

  • Skill protocol for git-push: lint → test → stage → commit → push (auto when invoked)
  • post-commit-graph hook updates Cortex after each commit
  • PR creation: not automated

Consensus Mechanism: None

No multi-agent consensus protocol. Agent Teams use a Merger agent for conflict resolution, not a quorum vote.

Cross-Tool Portability: Medium

Bootstrap targets Claude Code. Maggy's REST API and YAML skill protocols are more portable. The 67 skills are Claude Code-specific.

08

UI / CLI Surface

Claude Bootstrap + Maggy — UI / CLI Surface

CLI Binary

  • Name: maggy
  • Is thin wrapper: No — full FastAPI server with routing, memory, and plugins
  • Subcommands: serve (main; others via REST API)
maggy serve   # Dashboard at localhost:8080

Local Web Dashboard

  • Exists: Yes (Maggy only)
  • Type: web-dashboard
  • Port: 8080
  • Tech stack: Vanilla JS (no build step), FastAPI backend

Dashboard features (from README):

  • Multi-model chat routing visualization
  • Pipeline logs (tool calls, routing decisions)
  • Usage metrics and cost tracking
  • Agent status and coordination view
  • Skill protocol execution logs

Dashboard is vanilla JS with no build step — runs directly via FastAPI's static file serving.

Bootstrap Install

git clone https://github.com/alinaqi/maggy.git
cd maggy && ./install.sh
# Installs to ~/.claude/ in ~30 seconds

Cortex MCP

The Cortex MCP server provides code intelligence:

  • orient() — entry point replacing file reads
  • 15 graph-native tools via stdio MCP protocol
  • SQLite backend, FTS5 index
{
  "mcpServers": {
    "cortex": {
      "command": "python",
      "args": ["-m", "cortex_mcp"]
    }
  }
}

Plugin System

Drop-in plugins in plugins/:

  • Build-in-Public: Auto-posts session summaries to LinkedIn/X
  • Telos: IFS testing integration
  • Provider plugins: GitHub, Asana, Monday task integration

Plugin installation: drop .yaml in maggy/skills/protocols/ for skill protocols.

IDE Integration

  • Claude Code: Full support via ~/.claude/ install
  • Codex: codex-review skill
  • Gemini CLI: gemini-review skill
  • External models: external-model-delegation skill

REST API (Maggy)

FastAPI endpoints:

  • POST /chat — Route message through model tiers
  • GET /routing — Get routing decision
  • GET /plugins — List active plugins
  • GET /pipeline-logs — Pipeline execution history

Observability

  • Usage summary via usage-summary-hook and /usage-summary command
  • Pipeline logs in Maggy dashboard
  • Cortex: cyclomatic complexity, call graph centrality
  • Mnemos: fatigue scores and checkpoint history
