Claude Bootstrap + Maggy

claude-bootstrap · alinaqi/claude-bootstrap · ★ 670 · last commit 2026-05-26

Route every task to the cheapest capable model while enforcing TDD, ADR compliance, iCPG intent tracking, and Telos intent fidelity — making token economics a first-class engineering discipline.

Best when: Model cost should be engineered, not left to default; 80% of tasks should run on DeepSeek ($0.14/M), not Claude ($3+/M).

Skip if: Using expensive model for trivial tasks, No coverage threshold enforcement

vs seeds

claude-flow (large MCP toolserver + multi-agent) but uniquely ships 13-tier cost-aware model routing (T0 Qwen3 local to T12 Claude O…

Primitive shape 123 total

Commands 23 Skills 67 Subagents 6 Hooks 12 MCP tools 15

Summary

Claude Bootstrap + Maggy — Summary

Claude Bootstrap is a two-component framework: (1) a Claude Code config pack (installed to ~/.claude/ via install.sh) shipping 67 skills, 12 hooks, 23 commands, and 1 agent with TDD enforcement and iCPG/Mnemos memory; (2) Maggy, an optional local FastAPI server with 13-tier model routing, a Cortex MCP code-intelligence server (15 MCP tools, SQLite graph), a vanilla-JS web dashboard on port 8080, Docker-isolated parallel agent execution (Polyphony), and a plugin system. The 13-tier routing is cost-aware: semantic blast scoring routes trivial tasks to local Qwen3 ($0), bulk work to DeepSeek ($0.14/M), and architecture decisions to Claude Opus ($15-25/M). The iCPG (Intent-Augmented Code Property Graph) stores why code exists and adds 6-dimension drift detection; Mnemos provides fatigue-aware memory with typed checkpoints that survive context compaction. The repository includes 1,100+ tests.

Differs from seeds: Most complex framework in the batch. Closest to claude-flow (MCP toolserver + multi-agent) but distinctly different: claude-flow has 305 MCP tools; Claude Bootstrap has 15 Cortex tools plus 67 skills plus 13-tier model routing. The Telos testing framework (IFS = F1×F2×F3 across Conformance×Validation×Integrity) and iCPG (code property graph with 10 edge types) are novel in the corpus. Unlike any seed, Claude Bootstrap explicitly ships cost-aware 13-tier model routing as a first-class feature, with pricing tables in CLAUDE.md.

Overview

Claude Bootstrap + Maggy — Overview

Origin

GitHub: alinaqi/claude-bootstrap (670 stars, 55 forks, MIT, Python). Author: alinaqi. 1,100+ tests, version 6.37.0, actively updated (pushed 2026-05-26 — same day as analysis).

Philosophy

From README:

"Turn Claude Code into a self-reviewing, test-enforced engineering system that remembers context across sessions — then route work across 13 models from a single dashboard."

The two-tier offering:

"Start with Bootstrap; add Maggy when you need the harness."

Core problems named:

"It picks the most expensive model for everything, including trivial tasks"
"Context fills up, state is lost, you re-explain yourself every session"
"There's no enforcement: code quality, test coverage, and ADR compliance only happen if you remember to ask"
"Running multiple agents on the same repo causes file conflicts"
"You have no visibility into what Claude is actually doing inside your codebase"

Key Systems

TDD enforcement: Stop hooks — tests must pass before Claude considers a task done
13-tier model routing: Semantic blast score (1-10) routes to cheapest capable model (Qwen3 local → DeepSeek → Kimi → Gemini → Grok → Codex → Claude)
iCPG: Intent-Augmented Code Property Graph — stores why code exists, 6-dimension drift detection, prevents duplicate implementations
Mnemos: Task-scoped memory with 4-dimension fatigue model, survives context compaction
ADR enforcement: Non-trivial changes require Architectural Decision Record
Telos: Testing beyond TDD — IFS (Intent Fidelity Scale) = F1×F2×F3
Cortex MCP: Code intelligence with 15 tools, cyclomatic complexity, FTS5 search
Polyphony: Docker-isolated parallel agents

Explicit Antipatterns

From CLAUDE.md:

Using expensive model (Claude) for trivial tasks
No coverage threshold enforcement
No ADR for architecture changes
Context loss at compaction

Two-Tier Architecture

Bootstrap (30 seconds): ./install.sh → copies to ~/.claude/
Maggy (5 minutes): pip install -e . + maggy serve → web dashboard

Architecture

Claude Bootstrap + Maggy — Architecture

Distribution

Type: Claude plugin + Python FastAPI server
Install: git clone + ./install.sh (Bootstrap) + pip install -e . (Maggy)
License: MIT
Language: Python (backend), JavaScript (dashboard)
Dashboard: http://localhost:8080 (Maggy only)

Component Diagram

Claude Bootstrap (installed to ~/.claude/)
├── skills/         67 domain skills
├── hooks/          12 lifecycle hooks
├── commands/       23 slash commands
├── rules/          Conditional rules by file glob
└── templates/      settings.json, CLAUDE.md, ADR template

Maggy (local server, optional)
├── pipeline/       ChatPipeline orchestrator
├── skills/         Skill injection + YAML protocol engine
├── api/            REST API (chat, routing, plugins, pipeline logs)
├── static/         Web dashboard (vanilla JS, no build step)
└── services/       Routing, memory, execution, Mnemos

cortex-mcp/         Code intelligence MCP server
├── src/cortex/
│   ├── structure/  AST extraction, edge types, cyclomatic complexity
│   └── storage/    SQLite graph store, FTS5 index

plugins/
├── build-in-public Auto-posts to LinkedIn/X
├── telos/          Intent Fidelity Scale testing
└── providers/      GitHub, Asana, Monday integrations

Install Paths

Bootstrap (30 seconds)

git clone https://github.com/alinaqi/maggy.git
cd maggy && ./install.sh
# Copies to ~/.claude/

Full Harness

cd maggy && pip install -e .
maggy serve   # Dashboard at localhost:8080

13-Tier Model Routing

Tier	Model	Cost	Role
T0	Qwen3 (local)	$0	Classification, triage, free bulk ops
T1	Gemini Flash-Lite	$0.10/M	Bulk extraction, CIG pipelines
T2	DeepSeek Flash	$0.14/M	Docs, tests, scaffolding
T3	Gemini Flash	$0.15/M	Multimodal, vision, audio
T4	DeepSeek Pro	$0.435/M	Complex coding, multi-file refactors
T5	Gemini CLI	~$0.25-1.25/M	Multi-file agentic coding
T6	AGY	Google tier	End-to-end implementation
T7	Kimi	$0.60/M	Long-context analysis
T8	Gemini Pro Search	$1.25/M	Deep research, 2M context
T9	Grok	$5/M	Competitor intel, deep reasoning
T10	Codex	varies	Bulk generation, security tasks
T11	Claude Sonnet	$3-5/M	Quality-critical code
T12	Claude Opus	$15-25/M	Architecture, security, ADRs

Routing is semantic (Qwen3 as local classifier), fatigue-aware, budget-capped, and cascading.
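
The selection rule described above (pick the cheapest tier that can handle the task) can be sketched in a few lines. This is a minimal illustration, not Maggy's actual router: the `Tier` class, the `route` function, and the `max_blast` thresholds are hypothetical, only a subset of the tiers is listed, and each price uses the lower bound of the range shown in the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    model: str
    cost_per_m: float  # USD per million tokens, taken from the table above
    max_blast: int     # ASSUMPTION: highest blast score this tier is trusted with


# Subset of the 13 tiers. The max_blast thresholds are illustrative guesses.
TIERS = [
    Tier("T0", "Qwen3 (local)", 0.0, 2),
    Tier("T2", "DeepSeek Flash", 0.14, 4),
    Tier("T4", "DeepSeek Pro", 0.435, 6),
    Tier("T11", "Claude Sonnet", 3.0, 8),
    Tier("T12", "Claude Opus", 15.0, 10),
]


def route(blast_score: int, tiers: list[Tier] = TIERS) -> Tier:
    """Return the cheapest tier whose threshold covers the blast score."""
    if not 1 <= blast_score <= 10:
        raise ValueError("blast score must be in 1..10")
    capable = [t for t in tiers if t.max_blast >= blast_score]
    return min(capable, key=lambda t: t.cost_per_m)


print(route(3).model)  # DeepSeek Flash
print(route(8).model)  # Claude Sonnet
```

Filtering by capability first and minimizing cost second keeps the two concerns separate, so a price change only touches the table and never the routing logic.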

Required Runtime

Python 3.11+ (Maggy)
Claude Code (Bootstrap)
Optional: Docker (Polyphony parallel agents)
Optional: [vectors] for embeddings

Components

Claude Bootstrap + Maggy — Components

Skills (67 — in .claude/skills/)

Domain categories: aeo-optimization, agent-teams, agentic-development, ai-models, android-java, android-kotlin, autonomous-testing, aws-aurora, aws-dynamodb, azure-cosmosdb, base, build-in-public, cloudflare-d1, code-deduplication, code-graph, code-review, codex-review, commit-hygiene, cpg-analysis, credentials, cross-agent-delegation, database-schema, existing-repo, external-model-delegation, firebase, flutter, gemini-review, icpg, iterative-development, klaviyo, llm-patterns, maggy, medusa, mnemos, model-routing, ms-teams-apps, nodejs-backend, playwright-testing, polyphony, posthog-analytics, project-tooling, pwa-development, python, react-native, react-web, reddit-ads, reddit-api, security, session-management, shopify-apps, site-architecture, supabase-nextjs, supabase-node, supabase-python, supabase, team-coordination, ticket-craft, typescript, ui-mobile, ui-testing, ui-web, user-journeys, visual-validation, web-content, web-payments, woocommerce, workspace

Commands (23 — in .claude/commands/)

Command	Purpose
`analyze-repo.md`	Analyze repository structure
`analyze-workspace.md`	Workspace analysis
`build-in-public.md`	Auto social media post
`check-contributors.md`	Contributor analysis
`icpg-bootstrap.md`	Initialize iCPG
`icpg-drift.md`	Detect iCPG drift
`icpg-impact.md`	Impact analysis
`icpg-intent.md`	Intent recording
`icpg-why.md`	Why-code lookup
`initialize-project.md`	Project setup
`maggy-init.md`	Initialize Maggy harness
`maggy.md`	Maggy operations
`mnemos-checkpoint.md`	Manual memory checkpoint
`mnemos-status.md`	Memory status
`polyphony-init.md`	Initialize Docker isolation
`polyphony-spawn.md`	Spawn parallel agent
`polyphony-status.md`	Agent status
`set-tracker.md`	Set task tracker
`spawn-team.md`	Create agent team
`sync-agents.md`	Sync agent state
`sync-contracts.md`	Contract sync
`update-code-index.md`	Update Cortex index
`usage-summary.md`	Cost/usage summary

Hooks (12 — in .claude/hooks/)

Hook	Purpose
`auto-review-hook`	Auto code review on changes
`icpg-inject-context`	Inject iCPG context
`icpg-record-intent`	Record intent before changes
`mid-task-escalation`	Escalate if complexity rises
`mnemos-session-start.sh`	Load Mnemos memory at session start
`plugin-trigger`	Trigger plugins on events
`polyphony-auto-isolate`	Auto Docker isolation
`post-commit-graph`	Update graph after commit
`pre-push`	Pre-push quality gates
`route-task-hook`	Route task to appropriate model tier
`usage-summary-hook`	Update usage metrics
`workspace`	Workspace management

Agents (1 — in .claude/agents/)

explore.md — Exploration agent

Agent Teams (6 — via skills/agent-teams)

Lead, Quality, Security, Review, Merger, Feature

Cortex MCP Server (15 tools)

Code intelligence tools:

analyze_codebase — Full structural analysis
get_architecture_overview — Architecture summary
get_call_graph — Function call graph
orient() — Smart entry point for context
11 more tools for graph traversal, drift, complexity

Plugins

build-in-public — Auto-posts to LinkedIn/X
telos — Intent Fidelity Scale testing
Provider plugins: GitHub, Asana, Monday

Telos Testing Framework

IFS (Intent Fidelity Scale) = F1 × F2 × F3

F1 Conformance: passed/total tests (pytest/vitest)
F2 Validation: drift severity (Cortex drift_events)
F3 Integrity: orphan symbols, empty contracts, stale reasons, scope sprawl

Zero in any plane collapses IFS to zero. 100% test pass rate + severe architectural drift = 0 score.
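
The multiplicative collapse can be made concrete with a short sketch. Only the formula IFS = F1 × F2 × F3 and the F1 definition (passed/total) come from the description above; how Telos normalizes drift severity and integrity findings into the [0, 1] range is not stated, so the `intent_fidelity` function and both normalizations below are assumptions for illustration.

```python
def intent_fidelity(
    passed: int,
    total: int,
    drift_severity: float,
    integrity_violations: int,
    integrity_checks: int,
) -> float:
    """Compute IFS = F1 * F2 * F3 with every factor in [0, 1]."""
    # F1 Conformance: share of passing tests.
    f1 = passed / total if total else 0.0
    # ASSUMPTION: drift severity is normalized to [0, 1], where 1.0 is severe.
    f2 = 1.0 - min(max(drift_severity, 0.0), 1.0)
    # ASSUMPTION: integrity is the share of checks that found no violation.
    if integrity_checks:
        f3 = max(0.0, 1.0 - integrity_violations / integrity_checks)
    else:
        f3 = 1.0
    return f1 * f2 * f3


# All tests pass, but drift is severe: the product collapses to zero.
print(intent_fidelity(10, 10, drift_severity=1.0,
                      integrity_violations=0, integrity_checks=4))  # 0.0
```

A product, unlike an average, cannot be rescued by one strong factor, which is the property the framework relies on.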

Prompts

Claude Bootstrap + Maggy — Prompts

Excerpt 1: CLAUDE.md — Model Routing Table

Technique: Cost-aware routing as standing instruction. Priced decision table forces model economy at skill level.

## Model Routing — Token Economy (13-Tier)

| Tier | Model | Cost (in/out per M) | Role |
|------|-------|---------------------|------|
| 0 | Qwen3 (local) | $0 | File reads, quick edits, boilerplate, offline |
| 1 | Gemini 2.5 Flash-Lite | $0.10 / $0.40 | Bulk extraction, classification, CIG pipelines |
| 2 | DeepSeek V4 Flash | $0.14 / $0.28 | Sub-agents, cheap internal calls |
| 4 | DeepSeek V4 Pro | $0.435 / $0.87 | Main coding workhorse — ~80% of work |
| 11 | Claude Sonnet | $3-5 / $15-25 | Quality-critical code, complex debugging |
| 12 | Claude Opus | (not shown) | Architecture, security review, ADR decisions |

### Use qwen3 (local) for:
- grep, find, awk, sed, jq questions
- Shell one-liners and regex
- Quick syntax lookups, log reading, file search

Invoke: qwen3 ""

Excerpt 2: CLAUDE.md — TDD Iron Law

Technique: Unconditional mandate with explicit step sequence. Pattern matches superpowers' Iron Law approach.

## TDD — Non-Negotiable

ALWAYS write tests before implementation. No exceptions.

1. Read the requirement
2. Write failing tests that describe the expected behaviour
3. Run tests — confirm they fail for the right reason
4. Write the minimum implementation to pass
5. Refactor, keep tests green
6. Never write a function, class, or API endpoint without a corresponding test

If asked to add a feature without tests, write the tests first then ask to proceed.
If a PR diff shows untested code paths, flag them before continuing.

Excerpt 3: Telos IFS Formula

Technique: Quantitative intent verification — novel pattern. Multiplied dimensions collapse to zero if any fail.

## Telos: Testing Beyond TDD

IFS (Intent Fidelity Scale) = F1 × F2 × F3

F1 — Conformance:  passed / total tests            (pytest / vitest)
F2 — Validation:   drift severity                  (Cortex drift_events)
F3 — Integrity:    IF-3 orphan symbols              (no reason edges)
                   IF-4 empty contracts             (no pre/post/invariants)
                   IF-6 stale reasons               (proposed >7d, never fulfilled)
                   IF-7 scope sprawl                (reason scopes >10 files)

A zero in any plane collapses IFS to zero. 100% test pass rate with severe architectural drift = score of 0.

Excerpt 4: Skill Protocol Routing (from Maggy ChatPipeline)

Technique: Intent matching + YAML workflow execution. "Push to git" becomes a multi-step protocol automatically.

Routing a task:
  You: "review the auth middleware for timing attacks"
  → Blast score: 8/10 (security + architecture)
  → Routed to: Claude (Tier 11)
  → ADR gate: found docs/adr/0003-jwt-strategy.md → injected as context
  → Review runs with full architectural context

Skill Protocol execution:
  You: "push to git"
  → Intent matched: git-push protocol
  → ✅ lint → ✅ typecheck → ✅ tests → ✅ stage → ✅ commit → ✅ push

Uniqueness

Claude Bootstrap + Maggy — Uniqueness

Differs from Seeds

Most similar to claude-flow (large MCP toolserver + multi-agent + external memory) but architecturally distinct in five dimensions:

(1) 13-tier cost-aware model routing is the most elaborate in any framework analyzed: semantic blast scoring routes each task to the cheapest capable model, from local Qwen3 ($0) through DeepSeek ($0.14/M) to Claude Opus ($15-25/M), with pricing tables embedded in CLAUDE.md as standing instructions.
(2) Telos IFS (the multiplicative F1×F2×F3 intent fidelity scale) is a novel testing paradigm that collapses to zero on architectural drift even with a 100% test pass rate; it is not present in any seed.
(3) iCPG (Intent-Augmented Code Property Graph) with 10 edge types, drift detection, and 15 Cortex MCP tools goes deeper than ccmemory's Neo4j approach by storing why code exists alongside what it does.
(4) Docker-isolated parallel agents (Polyphony) differ from claude-flow's worktree isolation and superpowers' Task tool spawning.
(5) The 67 domain-specific skills covering 30+ technology stacks (Supabase, Shopify, Firebase, Flutter, React Native, Playwright, etc.) form the most domain-comprehensive skill library in the corpus.

Positioning

Target user: Solo engineers and teams who want aggressive cost optimization, TDD enforcement, and architectural memory without vendor lock-in
Key differentiator: 13-tier routing makes token cost a first-class engineering concern — unique in the corpus
Comprehensive coverage: 67 skills, 12 hooks, and 23 commands, plus Telos, iCPG, Polyphony, and a web dashboard, make this the most complete harness in the batch

Observable Failure Modes

Complexity overhead: The combination of iCPG + Mnemos + Telos + 13-tier routing + Cortex is substantial — Bootstrap layer alone is manageable, but full Maggy harness requires Python 3.11+, Docker, and 5+ API keys
Local Qwen3 dependency: T0 routing requires a local Qwen3 model installation — adds GPU/memory requirements for the routing layer
Build-in-Public plugin: Auto-posting to LinkedIn/X is a social media side-effect that runs from code change context — potential for unintended disclosures
Dashboard vanilla JS: No build step is convenient but limits dashboard UI complexity and maintainability
13 models to maintain: Cost routing breaks as model pricing changes; the pricing table in CLAUDE.md will require updates

Novel Patterns

Telos IFS: multiplicative intent fidelity score — most rigorous test quality metric in the corpus
13-tier routing with embedded pricing: treating model cost as a first-class engineering constraint
Mnemos 4-dimension fatigue: session memory health as a trackable engineering metric
iCPG 10-edge-type intent graph: why-oriented code property graph with contract tracking

Workflow

Claude Bootstrap + Maggy — Workflow

Bootstrap-Only Workflow

Install: ./install.sh
    ↓
New Claude Code session:
  mnemos-session-start.sh hook fires → load memory
  icpg-inject-context hook → inject code property graph context
    ↓
Code change:
  TDD: write tests first (enforced by Stop hooks)
  Max 20 lines/function, 3 params, 2 nesting levels (quality gate)
    ↓
Commit:
  pre-push hook → lint + typecheck + tests
  post-commit-graph hook → update Cortex graph
    ↓
Architecture change:
  ADR gate: requires Architectural Decision Record
  icpg-record-intent → why this code exists

Full Maggy Workflow (with routing)

User message
    ↓
ChatPipeline: blast score (1-10)
    ↓
route-task-hook: select cheapest capable model (T0-T12)
    ↓
Skill protocol check: match intent → YAML protocol
    ↓
Execute skill protocol (e.g., git-push: lint → test → stage → commit → push)
    ↓
Usage tracking + fatigue score update

Skill Protocol Execution Example

You: "push to git"
→ Intent matched: git-push protocol
→ ✅ lint       (2.1s)
→ ✅ typecheck   (4.3s)
→ ✅ tests       (11.2s)
→ ✅ stage
→ ✅ commit      [AI-generated message]
→ ✅ push
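
The execution pattern in this example is an ordered list of steps that halts at the first failure. The sketch below shows that pattern only; Maggy defines its protocols in YAML, whose schema is not shown here, so the `run_protocol` function and the representation of steps as Python callables are hypothetical stand-ins.

```python
from typing import Callable

# A step is a name plus an action that reports success or failure.
Step = tuple[str, Callable[[], bool]]


def run_protocol(steps: list[Step]) -> list[tuple[str, bool]]:
    """Run steps in order and stop at the first one that fails."""
    results = []
    for name, action in steps:
        ok = action()
        results.append((name, ok))
        if not ok:
            break  # later steps (stage, commit, push) must not run
    return results


# Simulated git-push protocol in which the test step fails.
steps = [
    ("lint", lambda: True),
    ("typecheck", lambda: True),
    ("tests", lambda: False),
    ("stage", lambda: True),
]
for name, ok in run_protocol(steps):
    print(("ok  " if ok else "FAIL"), name)
```

Because the loop breaks on failure, nothing is staged or pushed unless lint, typecheck, and tests have all succeeded.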

Approval Gates

TDD gate: Tests must pass before Stop (hook enforcement)
Quality gate: Max lines/function, params, nesting — per-file
ADR gate: Architecture changes require ADR document
Pre-push gate: lint + typecheck + tests
iCPG gate: Drift detection at pre-commit
Telos IFS gate: IFS = 0 if any plane is zero
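
The TDD gate can be pictured as a small Stop hook script. The sketch below is not the script shipped by Claude Bootstrap. It assumes the Claude Code hook convention that a Stop hook exiting with code 2 blocks the stop and returns its stderr to the model, and it assumes pytest is the test runner. The names `stop_hook_exit_code` and `run_stop_hook` are hypothetical.

```python
import subprocess
import sys


def stop_hook_exit_code(test_returncode: int) -> int:
    """Map the test runner's result to a hook exit code.

    0 lets the session stop; 2 blocks it (assumed Claude Code convention).
    """
    return 0 if test_returncode == 0 else 2


def run_stop_hook() -> int:
    """Run the test suite and decide whether the task may be declared done."""
    result = subprocess.run(
        [sys.executable, "-m", "pytest", "-q"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # stderr is the channel the agent would read the reason from.
        print("Tests are failing; fix them before finishing.", file=sys.stderr)
    return stop_hook_exit_code(result.returncode)

# A real hook script would end with: sys.exit(run_stop_hook())
```

Keeping the exit-code decision in its own function makes the gate testable without running a test suite.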

Phase to Artifact Map

Phase	Artifact
Session start	Mnemos memory loaded, iCPG context injected
Intent recording	iCPG intent node created
Implementation	Source code + tests (TDD first)
Commit	Post-commit graph update, ADR if needed
Session end	Mnemos checkpoint, usage summary
Delivery	Build-in-public plugin (optional social post)

Fatigue-Aware Memory

Session fatigue: 0.61 (PRE-SLEEP)
→ Mnemos: auto-checkpoint written
→ Micro-consolidation: 3 ResultNodes compressed
→ iCPG context injected: 2 ReasonNodes, 1 constraint
→ Context freed: ~18k tokens

4-dimension fatigue model tracks session health and auto-compacts context before quality degrades.

Memory & Context

Claude Bootstrap + Maggy — Memory & Context

Mnemos (Task-Scoped Memory)

Type: File-based with 4-dimension fatigue model
Persistence: Project + cross-session
7 amnesia types tracked: uninitialized, premature, sequential, contextual, temporal, semantic, terminal
Typed checkpoints: ResultNode, ReasonNode, ConstraintNode
Engram: Cross-session memory persisting architectural knowledge across weeks

Session start: mnemos-session-start.sh hook loads prior memory. During session: fatigue score tracked in 4 dimensions. At 0.61 fatigue (PRE-SLEEP threshold): auto-checkpoint + micro-consolidation.
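
A threshold check of this kind can be sketched as follows. Only the 0.61 PRE-SLEEP threshold comes from the text; the four dimension names and the use of a plain mean to combine them are assumptions, since the source does not name the dimensions or the aggregation rule.

```python
from statistics import fmean

PRE_SLEEP_THRESHOLD = 0.61  # threshold quoted in the text above


def fatigue_score(dimensions: dict[str, float]) -> float:
    """Combine per-dimension fatigue values in [0, 1] into one score.

    ASSUMPTION: a plain mean; the real aggregation is not documented here.
    """
    return fmean(dimensions.values())


def should_checkpoint(dimensions: dict[str, float]) -> bool:
    """True once the session reaches the PRE-SLEEP threshold."""
    return fatigue_score(dimensions) >= PRE_SLEEP_THRESHOLD


# Hypothetical dimension names, for illustration only.
session = {
    "context_fill": 0.9,
    "turn_count": 0.6,
    "error_rate": 0.5,
    "topic_drift": 0.6,
}
if should_checkpoint(session):
    print("PRE-SLEEP: write checkpoint and consolidate")
```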

iCPG (Intent-Augmented Code Property Graph)

Type: Code property graph (10 edge types)
Backend: SQLite
Features:
- Why code exists (intent edges)
- 6-dimension drift detection
- Cyclomatic complexity
- Bidirectional traversal
- FTS5 full-text search
- Prevents duplicate implementations
Cortex MCP: 15 tools expose the graph via MCP protocol

iCPG stores: ReasonNodes (why code was written), ConstraintNodes (architectural constraints), drift events, pre/post/invariant contracts.
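
To illustrate what a why-oriented graph in SQLite can look like, here is a minimal sketch using Python's built-in `sqlite3` module. The two-table schema, the `'reason'` edge type name, and the helper functions are hypothetical and far simpler than the real 10-edge-type graph; they only show how a "why does this symbol exist" lookup and an orphan-symbol check reduce to joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (
        id    INTEGER PRIMARY KEY,
        kind  TEXT NOT NULL,   -- e.g. 'Symbol', 'ReasonNode', 'ConstraintNode'
        label TEXT NOT NULL
    );
    CREATE TABLE edges (
        src  INTEGER NOT NULL REFERENCES nodes(id),
        dst  INTEGER NOT NULL REFERENCES nodes(id),
        type TEXT NOT NULL     -- hypothetical edge type name, e.g. 'reason'
    );
""")


def add_node(kind: str, label: str) -> int:
    cur = conn.execute("INSERT INTO nodes (kind, label) VALUES (?, ?)", (kind, label))
    return cur.lastrowid


def why(symbol: str) -> list[str]:
    """Return the recorded reasons attached to a code symbol."""
    rows = conn.execute(
        """
        SELECT r.label
        FROM nodes s
        JOIN edges e ON e.src = s.id AND e.type = 'reason'
        JOIN nodes r ON r.id = e.dst AND r.kind = 'ReasonNode'
        WHERE s.kind = 'Symbol' AND s.label = ?
        """,
        (symbol,),
    ).fetchall()
    return [label for (label,) in rows]


def orphan_symbols() -> list[str]:
    """Symbols with no reason edge (compare the IF-3 integrity check)."""
    rows = conn.execute(
        """
        SELECT s.label FROM nodes s
        WHERE s.kind = 'Symbol'
          AND NOT EXISTS (
              SELECT 1 FROM edges e WHERE e.src = s.id AND e.type = 'reason')
        ORDER BY s.label
        """
    ).fetchall()
    return [label for (label,) in rows]


verify = add_node("Symbol", "verify_jwt")
add_node("Symbol", "parse_header")
reason = add_node("ReasonNode", "Tokens must be validated on every request")
conn.execute("INSERT INTO edges (src, dst, type) VALUES (?, ?, 'reason')", (verify, reason))

print(why("verify_jwt"))
print(orphan_symbols())
```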

State Files

Path	Purpose
Cortex SQLite DB	Call graph, clusters, iCPG edges
`docs/adr/`	Architectural Decision Records
Mnemos checkpoints	Session memory snapshots
`.claude/rules/`	Conditional rules by file glob

Compaction Handling

Yes — Mnemos 4-dimension fatigue model triggers auto-checkpoint at fatigue thresholds:

Fatigue 0.61 (PRE-SLEEP): auto-checkpoint + micro-consolidation (~18k tokens freed)
Pre-compact.sh-equivalent behavior is provided via the mnemos-session-start.sh hook

Cross-Session Handoff

Yes via Mnemos. Architectural knowledge persists across weeks via Engram. iCPG graph persists in SQLite across sessions.

Search

Cortex MCP orient() — graph-native, replaces file reads with targeted graph queries
FTS5 full-text search in SQLite
Semantic search (optional, if embeddings enabled)

Orchestration

Claude Bootstrap + Maggy — Orchestration

Multi-Agent: Yes (via Polyphony + Agent Teams)

Polyphony: Docker-isolated parallel agents. Second session auto-provisions a workspace.
Agent Teams: 6 named roles (Lead, Quality, Security, Review, Merger, Feature)
Cross-agent delegation: Skills for delegating to external models
Spawn mechanism: Claude Code Task tool + Docker containers (Polyphony)

Orchestration Pattern: Hierarchical

Lead agent coordinates Agent Teams
Quality/Security/Review agents run in parallel on specific concerns
Merger agent handles conflict resolution

Isolation Mechanism: Container (Polyphony)

Docker-isolated parallel agent execution. Each parallel agent runs in a Docker container. polyphony-auto-isolate hook triggers automatically.
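
One way to picture container-per-agent isolation is a helper that builds a `docker run` command for each agent, mounting that agent's own workspace copy. This is a sketch under stated assumptions, not Polyphony's implementation: the `docker_run_args` function, the container naming scheme, the `/workspace` mount point, and the image name in the example are all hypothetical. The function only builds the argument list and does not start a container.

```python
def docker_run_args(
    agent_id: str,
    workspace_path: str,
    image: str,
    command: list[str],
) -> list[str]:
    """Build a `docker run` argv that confines one agent to its own workspace."""
    return [
        "docker", "run", "--rm",
        "--name", f"polyphony-{agent_id}",   # hypothetical naming scheme
        "-v", f"{workspace_path}:/workspace",  # only this agent's copy is mounted
        "-w", "/workspace",
        image,
        *command,
    ]


args = docker_run_args(
    agent_id="feature-1",
    workspace_path="/tmp/workspaces/feature-1",
    image="example/agent-image:latest",  # placeholder image name
    command=["pytest", "-q"],
)
print(" ".join(args))
```

Because each container sees only its own mounted directory, two agents cannot overwrite each other's files, which is the conflict the framework sets out to avoid.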

Execution Mode: Interactive-loop

Session-based. Bootstrap layer fires on session events; Maggy adds a REST API for programmatic control.

Multi-Model: Yes (13 tiers)

The most extensive multi-model routing in the batch:

Task type	Model
Triage/classification	Qwen3 (local, $0)
Bulk work	DeepSeek Flash (T2, $0.14/M)
Main coding	DeepSeek Pro (T4, ~80% of work)
Quality/security	Claude Sonnet (T11)
Architecture/ADR	Claude Opus (T12)

Routing: semantic blast score (1-10) + task type classification by local Qwen3 + budget cap + cascade.
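
The budget cap and cascade parts of this rule can be sketched as a loop that starts at the cheapest model and escalates only when an attempt fails and the budget allows it. The `cascade` function, the `attempt` callback (returning `None` to signal failure), and the flat per-attempt cost estimate are assumptions for illustration; the source does not describe how Maggy detects a failed attempt or accounts for spend.

```python
from typing import Callable, Optional


def cascade(
    tiers: list[tuple[str, float]],
    attempt: Callable[[str], Optional[str]],
    est_tokens_m: float,
    budget_usd: float,
) -> tuple[Optional[str], Optional[str], float]:
    """Try models from cheapest to most expensive until one succeeds.

    Returns (model, answer, spent_usd). Model and answer are None when the
    budget runs out before any model produces an answer.
    """
    spent = 0.0
    for model, price_per_m in sorted(tiers, key=lambda t: t[1]):
        cost = price_per_m * est_tokens_m
        if spent + cost > budget_usd:
            break  # budget cap: do not escalate further
        spent += cost
        answer = attempt(model)
        if answer is not None:
            return model, answer, spent
    return None, None, spent


# Prices from the tier table; the solver below is a stand-in for real calls.
tiers = [("DeepSeek Pro", 0.435), ("Qwen3 (local)", 0.0), ("DeepSeek Flash", 0.14)]


def fake_attempt(model: str) -> Optional[str]:
    return "patch" if model == "DeepSeek Pro" else None


model, answer, spent = cascade(tiers, fake_attempt, est_tokens_m=1.0, budget_usd=1.0)
print(model, answer, round(spent, 3))
```

Note that failed cheap attempts still count toward spend, so the cap bounds the total cost of the whole escalation chain rather than just the final call.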

Prompt Chaining: Yes

Intent recording (iCPG) → implementation → drift detection → Telos validation. Each stage feeds the next.

Auto-Validators

TDD: Stop hooks enforce tests before "done"
Quality gates: Max 20 lines/function, 3 params, 2 nesting levels (per-file)
ADR gate: Architecture changes require ADR
Pre-push: lint + typecheck + tests
iCPG drift: Pre-commit drift detection
Telos IFS: F1×F2×F3 intent fidelity check
Cortex: cyclomatic complexity, orphan symbol detection
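
The per-file quality gate (max 20 lines per function, 3 parameters, 2 nesting levels) is the kind of check that Python's standard `ast` module can express directly. The sketch below is an independent illustration for Python sources, not the framework's own checker; `check_source` and `nesting_depth` are hypothetical names, and counting an `elif` as one extra nesting level is a simplification.

```python
import ast

MAX_LINES, MAX_PARAMS, MAX_NESTING = 20, 3, 2  # limits quoted in the list above
NESTING_NODES = (ast.If, ast.For, ast.While, ast.With, ast.Try)
DEFINITIONS = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)


def nesting_depth(node: ast.AST, depth: int = 0) -> int:
    """Deepest level of nested control-flow blocks below `node`."""
    deepest = depth
    for child in ast.iter_child_nodes(node):
        if isinstance(child, DEFINITIONS):
            continue  # nested definitions are checked on their own
        child_depth = depth + 1 if isinstance(child, NESTING_NODES) else depth
        deepest = max(deepest, nesting_depth(child, child_depth))
    return deepest


def check_source(source: str) -> list[str]:
    """Return one message per violated limit, function by function."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        length = node.end_lineno - node.lineno + 1
        a = node.args
        params = len(a.posonlyargs) + len(a.args) + len(a.kwonlyargs)
        depth = nesting_depth(node)
        if length > MAX_LINES:
            violations.append(f"{node.name}: {length} lines (max {MAX_LINES})")
        if params > MAX_PARAMS:
            violations.append(f"{node.name}: {params} params (max {MAX_PARAMS})")
        if depth > MAX_NESTING:
            violations.append(f"{node.name}: nesting depth {depth} (max {MAX_NESTING})")
    return violations


SAMPLE = '''
def ok(a, b):
    return a + b

def too_many(a, b, c, d):
    for x in a:
        if x:
            while b:
                b -= 1
    return c + d
'''
for message in check_source(SAMPLE):
    print(message)
```

For the sample, the checker reports `too_many` twice: once for its four parameters and once for its three levels of nesting.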

Git Automation

Skill protocol for git-push: lint → test → stage → commit → push (auto when invoked)
post-commit-graph hook updates Cortex after each commit
PR creation: not automated

Consensus Mechanism: None

No multi-agent consensus protocol. Agent Teams use a Merger agent for conflict resolution, not a quorum vote.

Cross-Tool Portability: Medium

Bootstrap targets Claude Code. Maggy's REST API and YAML skill protocols are more portable. The 67 skills are Claude Code-specific.

UI / CLI Surface

Claude Bootstrap + Maggy — UI / CLI Surface

CLI Binary

Name: maggy
Is thin wrapper: No — full FastAPI server with routing, memory, and plugins
Subcommands: serve (main; others via REST API)

maggy serve   # Dashboard at localhost:8080

Local Web Dashboard

Exists: Yes (Maggy only)
Type: web-dashboard
Port: 8080
Tech stack: Vanilla JS (no build step), FastAPI backend

Dashboard features (from README):

Multi-model chat routing visualization
Pipeline logs (tool calls, routing decisions)
Usage metrics and cost tracking
Agent status and coordination view
Skill protocol execution logs

Dashboard is vanilla JS with no build step — runs directly via FastAPI's static file serving.

Bootstrap Install

git clone https://github.com/alinaqi/maggy.git
cd maggy && ./install.sh
# Installs to ~/.claude/ in ~30 seconds

Cortex MCP

The Cortex MCP server provides code intelligence:

orient() — entry point replacing file reads
15 graph-native tools via stdio MCP protocol
SQLite backend, FTS5 index

{
  "mcpServers": {
    "cortex": {
      "command": "python",
      "args": ["-m", "cortex_mcp"]
    }
  }
}

Plugin System

Drop-in plugins in plugins/:

Build-in-Public: Auto-posts session summaries to LinkedIn/X
Telos: IFS testing integration
Provider plugins: GitHub, Asana, Monday task integration

Skill protocol installation: drop a .yaml file into maggy/skills/protocols/.

IDE Integration

Claude Code: Full support via ~/.claude/ install
Codex: codex-review skill
Gemini CLI: gemini-review skill
External models: external-model-delegation skill

REST API (Maggy)

FastAPI endpoints:

POST /chat — Route message through model tiers
GET /routing — Get routing decision
GET /plugins — List active plugins
GET /pipeline-logs — Pipeline execution history

Observability

Usage summary via usage-summary-hook and /usage-summary command
Pipeline logs in Maggy dashboard
Cortex: cyclomatic complexity, call graph centrality
Mnemos: fatigue scores and checkpoint history

Distribution

Type: claude-plugin
License: MIT
Install: multi-step
Version: 6.37.0

Surfaces

CLI binary: maggy
CLI subcmds: 1
Local UI: web-dashboard
UI port: 8080
Tech stack: Vanilla JS (no build step), FastAPI

Components

Commands: 23
Skills: 67
Subagents: 6
Hooks: 12
MCP servers: 1
MCP tools: 15
Scripts: 1
Templates: 4

Workflow

Phases: 6
Approval gates: 6
Spec format: markdown
Spec storage: per-feature-folder
Delta or full: whole-file

Orchestration

Multi-agent: Yes
Pattern: hierarchical
Isolation: container
Consensus: none
Prompt chaining: Yes

Multi-model

Multi-model: Yes
BYOK: Yes
Modal: text

Execution

Mode: interactive-loop
Crash recovery: Yes
Compaction: Yes
Session handoff: Yes
Streaming: Yes

Memory

Type: hybrid
Persistence: global
Search: hybrid
State files: 3 files

Quality

TDD: Yes
TDD mechanism: post-hook-test-runner
Validators: 7
Self-review: adversarial-subagent

Git / Observability

Auto commit: Yes
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: Yes
Audit format: sqlite
Replay: No

Tools

Primary: claude-code
Targets: 4
Portability: medium

Signals

Stars: 670
Last commit: 2026-05-26
Maintainer: active
Quality score: 8.6/10