
Spartan AI Toolkit

spartan-ai-toolkit · c0x12c/ai-toolkit · ★ 72 · last commit 2026-05-11

Engineering discipline layer delivering 73 commands, 9 reviewer agents, 8 stack profiles, and mandatory git worktree + quality gate enforcement for every AI-generated feature.

Best when: never self-review; always spawn a dedicated reviewer agent before shipping any code.
Skip if: jumping straight to code without spec, agent self-review (never allowed)
Primitive shape 118 total
Commands 73 Skills 34 Subagents 9 Hooks 2
00

Summary

Spartan AI Toolkit — Summary

Spartan AI Toolkit is an npm-installable Claude Code plugin that adds an engineering discipline layer: 73 slash commands (70 under spartan:* plus meta), 34 skills, 9 specialized reviewer agents, 28 coding rules across 8 stack profiles, 5 quality gates, and a 3-layer agent memory system. Each workflow leader command (build, debug, startup, onboard, research) runs its own end-to-end pipeline. The build command, for example, runs spec → design → plan → TDD → code review → PR, with a mandatory quality gate between steps and a git worktree per feature. It marks worktree isolation as MANDATORY, spawns dedicated reviewer agents rather than allowing self-review, and requires conventional commits. A PostToolUse statusline hook and a SessionStart update-check hook are the only Claude Code lifecycle hooks shipped.

Compared to seeds: Spartan is architecturally similar to openspec (mirrored commands + quality gates) but substantially richer, with 73 commands vs openspec's 11, enforced worktree isolation, stack-specific profiles, and a reviewer-agent separation that is unique among seeds and batch members.

01

Overview

Spartan AI Toolkit — Overview

Origin

Repo: c0x12c/ai-toolkit (GitHub org: c0x12c). Display name: "Spartan AI Toolkit". npm: @c0x12c/ai-toolkit v1.26.0. JavaScript, no license specified in package.json. 72 stars. 7 contributors. Author: Khoa Tran (c0x12c). Created March 2026.

Philosophy

From README:

"Stop AI coding agents from shipping sloppy code."

"AI coding agents are fast. They're also careless. They skip tests, ignore your coding standards, push without review, and forget everything between sessions."

"Spartan fixes that. One command runs a full engineering workflow — spec, plan, TDD, code review, PR — with quality gates between each step. Your rules, your standards, enforced every time."

Key beliefs:

  1. Workflow leaders not just prompts — commands run pipelines, not one-off actions
  2. Quality gates before progression — nothing advances without passing the gate
  3. No self-review — always spawn a dedicated reviewer agent (never "let Claude check its own work")
  4. Worktree isolation — every feature in its own worktree (MANDATORY in build command)
  5. Stack-specific rules — coding rules should match your actual tech stack

From README: Before/After

"Without Spartan: 'Build this feature' → jumps straight to code, no plan, no tests, pushes broken code"

"With Spartan: /spartan:build → writes spec, plans tasks, TDD for each, code review, then PR"

Plugin Description

From plugin.json:

"Engineering discipline layer for Claude Code — 5 workflows, 69 commands, 21 rules, 29 skills, 9 agents organized in 12 packs"

02

Architecture

Spartan AI Toolkit — Architecture

Distribution

  • npm: npx @c0x12c/ai-toolkit@latest --local (interactive menu)
  • Claude Code plugin: toolkit/.claude-plugin/
  • Installer installs commands into .claude/commands/spartan/

Install Methods

npx @c0x12c/ai-toolkit@latest --local          # Interactive menu
npx @c0x12c/ai-toolkit@latest --local --packs=backend-micronaut,frontend-react
npx @c0x12c/ai-toolkit@latest --local --packs=all

Directory Structure (repo)

c0x12c/ai-toolkit/
  toolkit/
    .claude-plugin/
      plugin.json          # Claude Code plugin manifest
      marketplace.json
    commands/
      spartan.md           # Router command
      spartan/             # 70 individual spartan:* commands
    skills/                # 34 skills (domain knowledge packs)
    agents/                # 9 specialized reviewer/planner agents
    hooks/
      spartan-statusline.js     # PostToolUse hook
      spartan-check-update.js   # SessionStart hook
    rules/                 # Stack-specific coding rules
    profiles/              # 8 stack profiles
    packs/                 # Pack bundles
    frameworks/            # Per-framework utilities
    lib/                   # Shared library
    templates/             # Project templates
    scripts/               # Build scripts
    claude-md/             # CLAUDE.md templates
    codex/                 # Codex compatibility
    bin/                   # CLI binary
    bridges/               # External integrations
    experiments/           # Experimental features
  .claude/                 # Example Claude config
  .claude-plugin/          # Root-level plugin
  .codex/                  # Codex config

Required Runtime

  • Node.js (npm/npx install)
  • Claude Code (primary target)
  • Codex (secondary support)

Target AI Tools

Claude Code (primary), Codex (secondary), Cursor, Windsurf, Copilot (via rules being plain markdown).

03

Components

Spartan AI Toolkit — Components

Commands (73 total)

5 Workflow Leaders

| Command | Pipeline |
|---------|----------|
| /spartan:build | spec → design → plan → TDD → code review → PR |
| /spartan:debug | reproduce → root cause → test-first fix → PR |
| /spartan:startup | brainstorm → validate → research → pitch |
| /spartan:onboard | scan codebase → map architecture → save to memory |
| /spartan:research | frame question → gather sources → analyze → report |

Key Individual Commands (70 under spartan:*)

Including: brainstorm, brownfield, build, careful, codex, commit-message, commit-message-with-codex, content, context-save, contribute, daily, debug, deep-dive, deploy, e2e, env-setup, epic, fe-review, figma-to-code, freeze, fundraise, gate-review, guard, init-project, init-rules, interview, js-security, kickoff, kotlin-service, lean-canvas, lint-rules, magic-doc, memory-consolidate, migration, next-app, next-feature, onboard, ops-investigate-alert, ops-oncall-log, outreach, pitch, plan, pr-ready, qa, research, review, scan-rules, sessions, ship-pr, ship-pr-codex, spec, startup, teardown, testcontainer, tf-cost, tf-deploy, tf-drift, tf-import, tf-module, tf-plan, tf-review, tf-scaffold, tf-security, think, unfreeze, update, ux, validate, web-to-prd, write, and more.

Meta Command

/spartan — smart router that figures out what the user needs.

Skills (34 domain knowledge packs)

Examples: api-endpoint-creator, backend-api-design, brainstorm, browser-qa, ci-cd-patterns, competitive-teardown, database-patterns, deep-research, design-intelligence, idea-validation, investor-materials, js-security-audit, kotlin-best-practices, market-research, python-api-endpoint-creator, python-best-practices, python-testing-strategies, security-checklist, service-debugging, startup-pipeline, terraform-best-practices, terraform-module-creator, terraform-review, terraform-security-audit, terraform-service-scaffold, testing-strategies, ui-ux-pro-max, web-to-prd.

Agents (9)

| Agent | Purpose |
|-------|---------|
| ai-designer | UI/UX design intelligence |
| design-critic | Critical design feedback |
| idea-killer | Challenge ideas adversarially |
| infrastructure-expert | Infrastructure review |
| micronaut-backend-expert | Kotlin/Micronaut specialist |
| phase-reviewer | Gate review between workflow phases |
| research-planner | Research planning specialist |
| solution-architect-cto | Architecture decisions |
| sre-architect | Reliability engineering |

Hooks (2)

| Hook | Event | Purpose |
|------|-------|---------|
| spartan-statusline.js | PostToolUse | Status line showing model, context, branch |
| spartan-check-update.js | SessionStart | Check for Spartan updates, write to cache |

Coding Rules (28+)

Stack-specific rules in rules/ — organized by language/framework.

Stack Profiles (8)

kotlin-micronaut, react-nextjs, go-standard, python-django, python-fastapi, java-spring, typescript-node, custom.

Quality Gates (5)

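Built into the workflow leaders as five numbered gates: Gate 1 (post-spec), Gate 2 (post-plan), Gate 3 (post-TDD), Gate 3.5 (post-code-review), and Gate 4 (pre-PR). A separate Design Gate sits between spec and plan and applies only when the feature involves UI.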
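
The gate ordering can be pictured as a loop that stops at the first failing gate. The following is a minimal sketch under stated assumptions, not Spartan's actual implementation (in the toolkit, gates are enforced through command prompts rather than code). The names `BUILD_PIPELINE`, `run_pipeline`, and `gate_checks` are hypothetical, and pairing each build phase with one gate is a simplification.

```python
from typing import Callable, Dict, List, Tuple

# Build phases paired with their gates, in pipeline order.
# Simplification (assumption): a phase counts as done once its gate passes.
BUILD_PIPELINE: List[Tuple[str, str]] = [
    ("spec", "Gate 1"),
    ("design", "Design Gate"),
    ("plan", "Gate 2"),
    ("TDD", "Gate 3"),
    ("code review", "Gate 3.5"),
    ("PR", "Gate 4"),
]


def run_pipeline(
    gate_checks: Dict[str, Callable[[], bool]], has_ui: bool = True
) -> List[str]:
    """Return the phases that completed, stopping at the first failing gate.

    gate_checks is a hypothetical mapping from gate name to a zero-argument
    function returning True when that gate passes. A gate with no registered
    check is treated as failed, so nothing advances by omission.
    """
    completed: List[str] = []
    for phase, gate in BUILD_PIPELINE:
        if phase == "design" and not has_ui:
            continue  # the design phase applies only when the feature has UI
        check = gate_checks.get(gate)
        if check is None or not check():
            break  # a failed gate blocks every later phase
        completed.append(phase)
    return completed
```

With every check returning True, the function returns all six phases; if the check for Gate 2 fails, it returns only the spec and design phases.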

05

Prompts

Spartan AI Toolkit — Prompts

Verbatim Excerpt 1: /spartan:build command (Mandatory Stages table)

### Mandatory Stages

| Stage | Can skip? | Agent Teams behavior |
|-------|-----------|----------------------|
| 1 Spec | NO | single session |
| 2 Design | Only if pure data change (no UI) | single session |
| 3 Workspace + Plan | NO | single session |
| 4 Implement | NO | **MUST `TeamCreate` ONCE** as `spartan-{feature-slug}` when `AGENT_TEAMS=on` |
| 5 Review | **NEVER** — spawn review agent, never self-review | **REUSE the Stage 4 team** |
| 6 Ship | NO | single session + ONE `TeamDelete` for the shared team |

Technique: mandatory/skip matrix. The command uses a table with a dedicated "Can skip?" column to make the required vs optional distinction explicit and unambiguous for the agent. The ALL-CAPS "NEVER" for self-review here, like the "MANDATORY" annotation for worktrees in the pipeline diagram, is a Claude-targeted emphasis signal.

Verbatim Excerpt 2: /spartan:build pipeline diagram

SINGLE FEATURE:
  Context → Spec → Design? → Workspace → Plan → Implement → Review → Ship
                                  ↑                            ↑
                            git worktree                  Spawn agent
                            (MANDATORY)                   (MANDATORY)

PARALLEL (multiple terminals — each gets its own worktree):
  Terminal 1: /spartan:build auth     → .worktrees/auth/     → PR #1
  Terminal 2: /spartan:build payments → .worktrees/payments/  → PR #2

Technique: ASCII architecture diagram — the workflow is represented as a visual pipeline with arrows, not prose. The (MANDATORY) annotations are placed at the decision points to prevent the agent from skipping them.

Verbatim Excerpt 3: README Quality Gates


spec → design (if UI) → plan → TDD → code review → PR
  |         |              |      |         |          |
Gate 1   Design Gate    Gate 2  Gate 3   Gate 3.5   Gate 4


Nothing ships without passing every gate.

Technique: gate-as-invariant framing — "Nothing ships without passing every gate" is an Iron Law statement. The gate positions in the pipeline diagram are labeled with numbers, making them countable and auditable. This is the same "Iron Law with position-in-pipeline context" technique used by superpowers.

09

Uniqueness

Spartan AI Toolkit — Uniqueness & Positioning

Differs from Seeds

Spartan is most architecturally similar to openspec (mirrored command structure, quality-gate enforcement, spec-first workflow) but with dramatically larger scope: 73 commands vs openspec's 11, git worktree isolation (MANDATORY) vs openspec's none, 9 reviewer agents vs openspec's 0, 8 stack-specific profiles vs openspec's single approach. The mandatory reviewer-agent separation (never self-review) is a distinctive opinion not present in any seed. The 3-layer memory system (index → topics → transcript archive) resembles taskmaster-ai's file-based state but is explicitly structured for cross-session continuity rather than task tracking. The spec-kit seed is the closest comparator for the hook + command + quality-gate pattern, but spec-kit has 18 hooks vs Spartan's 2, and Spartan has far more domain commands. BMAD-METHOD is closest for the agent specialization (6 personas), but Spartan's 9 agents are narrower-scoped reviewers rather than full personas.

Positioning

Signal type: workflow-enforcing quality gates + mandatory reviewer spawning
Intervention point: phase gate (prompts enforce each step before proceeding) + PostToolUse statusline
Unique features: 73 commands, 8 stack profiles, mandatory git worktree per feature, mandatory reviewer-agent (never self-review), 3-layer memory, 5 named quality gates
Target user: professional dev teams who want systematic engineering discipline applied to AI-generated code

Observable Failure Modes

  • Mandatory reviewer spawning costs additional context/tokens per session
  • 73 commands means high cognitive load (the /spartan router mitigates this)
  • Worktree isolation requires clean git repo state
  • Stack profiles must be configured per project (init-rules command)
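  • plugin.json says "69 commands, 21 rules, 29 skills" but the repo has 70+ commands, 28 rules, and 34 skills, so the manifest description lags the repo (minor inconsistency)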
  • No license in package.json is a concern for enterprise adoption
  • 72 stars — decent traction but not mainstream

Relationship to Batch 31

Spartan is the only framework in Batch 31 that is primarily a workflow-discipline layer rather than a safety/guardrail layer. Where clauder, Sponsio, DashClaw, and pi-steering-hooks prevent bad things from happening at runtime, Spartan enforces good engineering practices from the start of each feature. The closest peers in this batch are vibelint (audit) and ctxlint (pre-session lint) but Spartan's intervention is active workflow enforcement, not passive analysis.

04

Workflow

Spartan AI Toolkit — Workflow

Build Workflow (Primary)

spec → design (if UI) → plan → TDD → code review → PR
  |         |              |      |         |          |
Gate 1   Design Gate    Gate 2  Gate 3   Gate 3.5   Gate 4

Each gate must pass before the next phase begins. From the build command:

"Nothing ships without passing every gate."

Mandatory Stages (from build.md source)

| Stage | Can skip? | Notes |
|-------|-----------|-------|
| 1 Spec | NO | Define spec before any code |
| 2 Design | Only for pure data changes | Required if UI involved |
| 3 Workspace + Plan | NO | git worktree creation MANDATORY |
| 4 Implement | NO | Must TeamCreate once if AGENT_TEAMS=on |
| 5 Review | NEVER (no self-review) | Spawn dedicated review agent |
| 6 Ship | NO | PR creation |
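
The "Can skip?" column above reduces to a small lookup: only the Design stage is ever skippable, and only for a pure data change. This is an illustrative sketch, not code from the toolkit; `SKIP_POLICY` and `can_skip` are hypothetical names.

```python
from typing import Dict

# "Can skip?" values from the Mandatory Stages table.
# "conditional" marks the single stage that may be skipped in one case.
SKIP_POLICY: Dict[str, str] = {
    "Spec": "no",
    "Design": "conditional",
    "Workspace + Plan": "no",
    "Implement": "no",
    "Review": "never",
    "Ship": "no",
}


def can_skip(stage: str, pure_data_change: bool = False) -> bool:
    """Return True only when the table allows skipping this stage.

    Raises KeyError for a stage name that is not in the table, so a typo
    cannot silently turn into "skippable".
    """
    policy = SKIP_POLICY[stage]
    return policy == "conditional" and pure_data_change
```

For example, `can_skip("Design", pure_data_change=True)` is True, while `can_skip("Review", pure_data_change=True)` stays False.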

Worktree Isolation (MANDATORY)

From build.md:

SINGLE FEATURE:
  Context → Spec → Design? → Workspace → Plan → Implement → Review → Ship
                                  ↑                            ↑
                            git worktree                  Spawn agent
                            (MANDATORY)                   (MANDATORY)

PARALLEL (multiple terminals — each gets its own worktree):
  Terminal 1: /spartan:build auth     → .worktrees/auth/     → PR #1
  Terminal 2: /spartan:build payments → .worktrees/payments/  → PR #2

Session Memory

~/.spartan/sessions/   # Active session tracking (per PPID)

Phases and Artifacts

| Phase | Artifact |
|-------|----------|
| Spec | spec.md in feature worktree |
| Design | design.md (optional) |
| Plan | task list |
| TDD | failing tests first |
| Implement | passing code |
| Review | review report (from dedicated agent) |
| Ship | PR created |

Debug Workflow

reproduce → root cause → test-first fix → PR

06

Memory Context

Spartan AI Toolkit — Memory & Context

State Storage

3-layer memory system:

| Layer | Path | Content |
|-------|------|---------|
| Index | ~/.spartan/memory/index.md | High-level project index |
| Topics | ~/.spartan/memory/topics/*.md | Per-topic detailed context |
| Transcripts | ~/.spartan/memory/transcripts/ | Session transcripts (grep-only archive) |
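
The three layers map onto a fixed directory layout, sketched below. The paths come from the table above; the helper names (`memory_layout`, `topic_path`), the `READ_ORDER` tuple, and the one-file-per-topic naming `<topic>.md` are assumptions made for illustration.

```python
from pathlib import PurePosixPath
from typing import Dict

# "~" is kept literal here; a real tool would expand it to the home directory.
MEMORY_ROOT = PurePosixPath("~/.spartan/memory")

# Coarse-to-fine order: index first, then topics, transcripts last
# (the transcripts layer is described as a grep-only archive).
READ_ORDER = ("index", "topics", "transcripts")


def memory_layout(root: PurePosixPath = MEMORY_ROOT) -> Dict[str, PurePosixPath]:
    """Return the path of each memory layer under the given root."""
    return {
        "index": root / "index.md",
        "topics": root / "topics",
        "transcripts": root / "transcripts",
    }


def topic_path(topic: str, root: PurePosixPath = MEMORY_ROOT) -> PurePosixPath:
    """Return the file for one topic, assuming a `<topic>.md` naming scheme."""
    if not topic or "/" in topic or topic in (".", ".."):
        raise ValueError("topic must be a plain file stem")
    return memory_layout(root)["topics"] / f"{topic}.md"
```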

Session Tracking

Active session tokens stored at ~/.spartan/sessions/<PPID>:

  • Created on session start
  • Cleaned up after 120 minutes
  • Used for concurrent session counting
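
The cleanup and counting rules above can be sketched as a pure function over token timestamps. This is a hedged illustration, not the toolkit's code: `partition_sessions` is a hypothetical name, and measuring a token's age from a single last-seen timestamp is an assumption (the source states only the 120-minute cleanup).

```python
from typing import Dict, List, Tuple

SESSION_TTL_SECONDS = 120 * 60  # tokens are cleaned up after 120 minutes


def partition_sessions(
    last_seen: Dict[int, float], now: float
) -> Tuple[List[int], List[int]]:
    """Split session tokens, keyed by PPID, into (active, stale) lists.

    last_seen maps a PPID to the timestamp (seconds) of its token file.
    A token older than the TTL is stale and would be removed; the length
    of the active list is the concurrent session count.
    """
    active: List[int] = []
    stale: List[int] = []
    for ppid, timestamp in sorted(last_seen.items()):
        if now - timestamp > SESSION_TTL_SECONDS:
            stale.append(ppid)
        else:
            active.append(ppid)
    return active, stale
```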

Cross-Session Continuity

The memory system (/spartan:onboard and /spartan:memory-consolidate) captures:

  • Architecture decisions
  • Codebase map
  • Per-session decisions and context

Worktree State

Each feature worktree (.worktrees/<feature-slug>/) contains:

  • Feature spec.md
  • Feature design.md
  • Branch: spartan/<feature-slug>

Context Save

/spartan:context-save — explicit command for saving context to memory.

Compaction

Not explicitly handled. Memory consolidation via /spartan:memory-consolidate.

07

Orchestration

Spartan AI Toolkit — Orchestration

Multi-Agent

Yes — mandatory reviewer agent spawning. The /spartan:build command mandates:

  • Stage 5 (Review): NEVER self-review, always spawn a dedicated review agent
  • Phase-reviewer agent reviews between stages
  • If AGENT_TEAMS=on: TeamCreate for the feature team, TeamDelete on completion

Orchestration Pattern

Sequential with mandatory phase-gating + parallel-fan-out for multi-feature parallel builds.

Parallel mode: multiple terminals each running /spartan:build <feature> with separate worktrees, separate branches, separate PRs.

Isolation Mechanism

Git worktrees (MANDATORY per feature). Each feature gets its own .worktrees/<feature-slug>/ directory and branch spartan/<feature-slug>.
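
The directory and branch naming above corresponds to a single `git worktree add` invocation. The sketch below only builds the argument list and does not run git. The slugging rule in `feature_slug` is an assumption, since the source does not say how Spartan derives `<feature-slug>`, and both function names are hypothetical.

```python
import re
from typing import List


def feature_slug(name: str) -> str:
    """Lowercase the name and collapse non-alphanumeric runs into hyphens.

    Assumed rule for illustration; Spartan's own slugging may differ.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    if not slug:
        raise ValueError("feature name must contain a letter or digit")
    return slug


def worktree_command(feature: str) -> List[str]:
    """Build the git command that creates the feature's worktree and branch.

    Uses the documented form `git worktree add -b <new-branch> <path>`.
    """
    slug = feature_slug(feature)
    return ["git", "worktree", "add", "-b", f"spartan/{slug}", f".worktrees/{slug}"]
```

Passing the returned list to a process runner from the repository root would create `.worktrees/auth/` on branch `spartan/auth` for the feature `auth`.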

Execution Mode

Interactive-loop (within a session, sequential phase execution with quality gates).

Multi-Model

No explicit multi-model routing in config.

Cross-Tool Portability

Medium — primarily Claude Code, with Codex secondary support and rule files usable by Cursor/Windsurf/Copilot (plain markdown).

Subagent Spawn Mechanism

Claude Code's native Task tool / Claude Managed Agents API (TeamCreate/TeamDelete) when AGENT_TEAMS=on.

Consensus

None — phase-reviewer makes the gate decision; no consensus mechanism.

08

UI CLI Surface

Spartan AI Toolkit — UI & CLI Surface

CLI Binary (Installer)

Exists: yes (installer only)
Name: @c0x12c/ai-toolkit (npx)
Install command: npx @c0x12c/ai-toolkit@latest --local
Is thin wrapper: no (own installer runtime)

This is the install CLI, not a runtime CLI: after install, Spartan operates via Claude Code slash commands, not via a CLI binary.

Local UI

None beyond Claude Code's built-in interface.

Statusline Hook

spartan-statusline.js (PostToolUse) shows:

  • Active model name
  • Current directory
  • Context window usage (normalized to usable %, accounting for Claude Code's 16.5% autocompact buffer)
  • Writes context metrics to $TMPDIR/claude-ctx-<session>.json for the context-monitor hook
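
The "usable %" normalization can be sketched as follows. The 16.5% buffer figure comes from the text above; the exact formula in spartan-statusline.js is not shown in the source, so dividing by the non-reserved part of the window and clamping to 0..100 are assumptions, and `usable_context_percent` is a hypothetical name.

```python
AUTOCOMPACT_BUFFER = 0.165  # share of the window reserved for autocompact


def usable_context_percent(used_tokens: int, window_tokens: int) -> float:
    """Return context usage as a percentage of the usable window.

    The usable window is assumed to be the full window minus the
    autocompact buffer. The result is clamped to the range 0..100.
    """
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    usable_tokens = window_tokens * (1.0 - AUTOCOMPACT_BUFFER)
    percent = 100.0 * used_tokens / usable_tokens
    return max(0.0, min(100.0, percent))
```

Under these assumptions, 83,500 tokens used in a 200,000-token window reads as roughly 50% rather than about 42%, because only 167,000 tokens count as usable.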

Update Check Hook

spartan-check-update.js (SessionStart) checks for Spartan updates in the background:

  • Writes result to cache dir
  • Supports multiple config dir locations: .claude/, .opencode/, .gemini/

IDE Integration

Claude Code plugin at toolkit/.claude-plugin/plugin.json.

Codex Support

toolkit/codex/ and toolkit/commands/spartan/ship-pr-codex.md, commit-message-with-codex.md for Codex-specific workflow variants.

Stack Profiles (Config)

# .spartan/config.yaml
stack: go-standard
architecture: clean
rules:
  backend:
    - rules/go/ERROR_HANDLING.md
    - rules/custom/OUR_AUTH_RULES.md
commands:
  test:
    backend: "go test ./..."
  lint:
    backend: "golangci-lint run"
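
To show how such a profile might be consumed, here is a minimal sketch that looks up a command from the config once it has been parsed into a dictionary. The `CONFIG` dict mirrors the YAML above by hand (no YAML parser is used), and `resolve_command` is a hypothetical helper, not part of the toolkit.

```python
from typing import Any, Dict, Optional

# Hand-written mirror of the .spartan/config.yaml example above.
CONFIG: Dict[str, Any] = {
    "stack": "go-standard",
    "architecture": "clean",
    "rules": {
        "backend": [
            "rules/go/ERROR_HANDLING.md",
            "rules/custom/OUR_AUTH_RULES.md",
        ],
    },
    "commands": {
        "test": {"backend": "go test ./..."},
        "lint": {"backend": "golangci-lint run"},
    },
}


def resolve_command(config: Dict[str, Any], kind: str, area: str) -> Optional[str]:
    """Return the configured command for a kind ("test", "lint") and area.

    Returns None when the profile does not define that command, so callers
    can decide whether a missing command is an error.
    """
    try:
        return config["commands"][kind][area]
    except KeyError:
        return None
```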
