Spec-Driver (Greenfield Spec-Driven Development)

spec-driver · davidlee/spec-driver · ★ 25 · last commit 2026-05-01

Prevents spec rot in AI-assisted development by making implementation changes flow back into evergreen, authoritative specs via audits, drift ledgers, and design revisions.

Best when: Specs are not an input consumed at project start but an evergreen truth system that implementation must continuously conform to; the direction is code → spe…
Skip if: Disposable specifications (generate-and-forget); skipping design revision because 'it's too simple' (named as a trap)
Primitive shape: 26 total (Skills 24, Hooks 2)
00

Summary

spec-driver — Summary

spec-driver (by David Lee) is a Python CLI + TUI toolkit that installs a structured specification workspace into any Git repo and drives Claude Code (or Codex) to maintain it automatically. It solves the problem of spec rot in AI-assisted development: without enforcement, agents produce code that drifts from stated intent; spec-driver closes that loop by making specs the runtime authority that code must conform to, not disposable scaffolding thrown away after generation. What makes it distinct is its philosophy that specs are evergreen truth systems — implementation change flows back into specs via audits, drift ledgers, and design revisions, so truth accumulates rather than decays. It targets senior developers and architects running greenfield or legacy projects with Claude Code or Codex who need repeatable governance at varying ceremony levels (pioneer / settler / town_planner). As of May 2026, the project is in Beta (v0.9.7), approaching 1.0, with active development and 1,133+ commits.

01

Overview

spec-driver — Overview

Tagline

"You install it. Claude drives the CLI tool. At first it might seem like too much. Eventually, nothing less will make sense."

Origin Story

spec-driver was created by David Lee (GitHub: davidlee) and first committed in October 2025. The project is self-referential: ADR-001 in the repo states "use spec-driver to build spec-driver", so the framework has been dog-fooded from day one. Lee describes his motivation on the companion site supekku.dev as a Socratic dialogue working through the fundamental problems with naive spec-driven development.

Problem Framing (in the author's words)

"Spec-driver is not a framework for development from disposable specifications. It's for driving trustworthy, evergreen specifications out of implementation. A machine, if you will, for converting change into truth."

"The machinery required is more complex than naive approaches (research -> specify -> plan -> execute) - but it's a very conscious tradeoff."

"Because: when machines can code, what remains is system design. This is a system designed to take care of the rest."

The core insight is that most "spec-driven" approaches treat specs as an input consumed at project start. Lee inverts this: implementation changes are deltas that must flow back into the canonical spec record. Every code change lives inside a delta artefact; deltas trigger design revisions, implementation plans, phase sheets, and audits; audits write back into specs. The lifecycle is: delta → DR → IP → phase sheet(s) → implement → audit → revision → patch specs → close.

Philosophy

The README describes the operating model:

"Instead of expensive throwaway research, maintain verifiably accurate, evergreen specs." "Cheap, fast, deterministically generated docs to help those messy, stochastic agents."

Spec-driver explicitly distinguishes three operator modes called "ceremony levels":

  • Pioneer: lightweight, low-ceremony, no mandatory governance
  • Settler: moderate structure with optional governance layers
  • Town Planner: full governance — ADRs, policies, standards, mandatory delta/DR/IP workflow

Token efficiency is stated as a design goal: the README claims < 3k tokens to boot with everything activated. Skills are described as "designed for token-efficiency".

"What We Believe"

The project's internal dogma file (.agents/spec-driver-boot.md) states three principles every agent must internalize:

  • No implementation without a spec-driver artefact
  • Guide the user invisibly in the correct use of spec-driver
  • Pursue correctness, compact token-efficiency, and crisp, pragmatic rigour

Manifesto Aesthetic

The repo's tagline adopts a military aphorism: "Slow is smooth. Smooth is fast." It frames the overhead of spec governance as the mechanism that actually accelerates sustainable delivery.

Acknowledgements

Lee explicitly credits superpowers by obra as a source of intentionally borrowed ideas, and gives a shout-out to lazyspec by jkaloger for TUI ideas.

02

Architecture

spec-driver — Architecture

Distribution Type

cli-tool — published as a Python package on PyPI; also available via Homebrew and Nix Flake.

Install Methods

macOS (Homebrew)

brew update
brew tap davidlee/spec-driver
brew install spec-driver

PyPI (uv)

# Try before install
uvx spec-driver install

# Install as UV tool (recommended)
uv tool install spec-driver
spec-driver install

# As project dependency
uv init
uv add spec-driver
uv run spec-driver install

From GitHub (development)

uv add git+https://github.com/davidlee/spec-driver
uvx --from git+https://github.com/davidlee/spec-driver spec-driver --help

Nix Flake

nix profile install github:davidlee/spec-driver
# or as flake input:
inputs.spec-driver.url = "github:davidlee/spec-driver";

Post-Install Project Setup

spec-driver install   # creates workspace in current git repo
spec-driver doctor    # health check
spec-driver tui       # browse documentation TUI
spec-driver sync      # sync all registries
spec-driver validate  # validate all registries

Install Footprint (per the README)

# spec-driver install creates:
# - .spec-driver/     (YAML registries & configuration)
# - .claude/          (project-local settings & skills)
# - .agents/          (project-local settings & skills)
# - CLAUDE.md         (adds a line to invoke boot script)
# - AGENTS.md         (adds a line to invoke boot script)

All install locations are project-local; no changes to ~/.claude or system config.
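
The footprint listed above lends itself to a quick sanity check. The following is a minimal sketch, not part of spec-driver itself: `missing_footprint` is a hypothetical helper that only tests whether the five paths named in the README exist under a repo root. For a real health check, `spec-driver doctor` is the documented command.

```python
from pathlib import Path

# Paths listed in the README's install footprint (relative to the repo root).
EXPECTED_DIRS = (".spec-driver", ".claude", ".agents")
EXPECTED_FILES = ("CLAUDE.md", "AGENTS.md")


def missing_footprint(repo_root: Path) -> list[str]:
    """Return the footprint entries that are absent under repo_root.

    Hypothetical helper for illustration; it checks existence only and says
    nothing about whether the contents are valid.
    """
    missing = [d + "/" for d in EXPECTED_DIRS if not (repo_root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (repo_root / f).is_file()]
    return missing


if __name__ == "__main__":
    gaps = missing_footprint(Path.cwd())
    print("footprint complete" if not gaps else f"missing: {', '.join(gaps)}")
```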

Repository Top-Level Directory Layout

spec-driver/
├── .agents/          # Agent-facing skills + boot context
│   ├── skills/       # 24 SKILL.md files (symlinked from .spec-driver/skills/)
│   └── spec-driver-boot.md
├── .claude/          # Claude Code integration
│   ├── agents/       # Skill symlinks
│   ├── hooks/        # SessionStart + PostToolUse Python/shell hooks
│   ├── rules/
│   ├── settings.json # Hook config
│   └── skills/       # Skill symlinks
├── .contracts/       # Auto-generated API contract docs
├── .cursor/rules/    # Cursor IDE rules
├── .spec-driver/     # Canonical workspace
│   ├── agents/       # Agent guidance files (boot.md, workflow.md, routing.md, policy.md, memory.md)
│   ├── audits/
│   ├── backlog/      # issues/, problems/, improvements/, risks/
│   ├── decisions/    # ADRs
│   ├── deltas/       # Change bundles (DE-xxx/)
│   ├── drift/        # Drift ledgers
│   ├── hooks/        # doctrine.md
│   ├── memory/       # Indexed markdown memory files
│   ├── policies/
│   ├── product/      # PROD-xxx product specs
│   ├── registry/     # YAML registries
│   ├── revisions/    # RE-xxx spec revisions
│   ├── skills/       # 24 subdirs, each with SKILL.md
│   ├── standards/
│   ├── tech/         # SPEC-xxx tech specs
│   ├── skills.allowlist
│   └── workflow.toml
├── spec_driver/      # Python package source
├── scripts/
├── supekku/          # TUI app source
├── wub/              # Additional tooling
├── tests/
├── CLAUDE.md
├── AGENTS.md
├── pyproject.toml
├── flake.nix
├── shell.nix
├── Justfile
└── VERSION           # 0.9.7

Required Dependencies

  • Python >= 3.12 (required runtime)
  • uv (recommended package manager; also supports pip)
  • jinja2 >= 3.1.0
  • pydantic >= 2.0, < 3.0
  • pyyaml >= 6.0.3
  • python-frontmatter >= 1.1.0
  • textual >= 8.0, < 9.0 (TUI)
  • tomlkit >= 0.13.0
  • typer >= 0.15.0 (CLI framework)
  • watchfiles >= 1.0, < 2.0 (live agent follow mode)

Optional Contract Generators (for code doc extraction)

  • gomarkdoc — Go: go install github.com/princjef/gomarkdoc/cmd/gomarkdoc@latest
  • zigmarkdoc — Zig: github.com/davidlee/zigmarkdoc
  • ts-doc-extract — TypeScript/JS: npm install -g ts-doc-extract

Configuration Files

| File | Purpose |
| --- | --- |
| .spec-driver/workflow.toml | Master project configuration (ceremony mode, tool exec, sync options, enabled features) |
| .claude/settings.json | Claude Code hook bindings (SessionStart, PostToolUse) |
| CLAUDE.md | Injected @.spec-driver/agents/boot.md reference |
| AGENTS.md | Injected @.spec-driver/AGENTS.md and @.spec-driver/agents/boot.md references |
| .agents/spec-driver-boot.md | Pre-generated boot context for cache-optimised sessions |
| .spec-driver/hooks/doctrine.md | Project-specific doctrine for agent hooks |
03

Components

spec-driver — Components

Commands (slash-commands / skills)

24 skills under .spec-driver/skills/ (also symlinked into .claude/skills/ and .agents/skills/):

| Skill | Purpose |
| --- | --- |
| /boot | Mandatory onboarding; validates pre-generated boot context; every agent invokes on startup |
| /using-spec-driver | Mandatory routing skill for any substantive work; forces agent to choose governing workflow skill before acting |
| /spec-driver | Interact with spec-driver CLI entities (ADRs, specs, deltas, revisions, audits, memories, backlog items) |
| /preflight | Bounded up-front research before implementation; surfaces unknowns and confirms readiness |
| /retrieving-memory | Retrieves indexed project memories before assumptions; mandatory before modifying unfamiliar subsystems |
| /scope-delta | Convert intent into a concrete change bundle (delta + DR + IP) |
| /draft-design-revision | Draft or refine a design revision (DR) for a delta; adversarial review included |
| /plan-phases | Turn design intent into an executable phase plan (IP + phase sheets) |
| /execute-phase | Mandatory execution skill for any delta/IP implementation phase |
| /audit-change | Post-implementation reconciliation: create AUD artefact, disposition findings, patch specs |
| /close-change | Formal delta closure: coverage gates, complete delta, lifecycle updates |
| /shape-revision | Revision-first governance path (concession path, not default) |
| /capturing-memory | Create new project memory from implementation discoveries |
| /maintaining-memory | Update or supersede existing project memories |
| /reviewing-memory | Review and audit memory health/staleness |
| /dispatch | Dispatch work to sub-agents |
| /sub-driver | Sub-agent driver skill |
| /implement | Well-defined implementation already grounded in correct artefacts |
| /next | Determine next governing workflow step |
| /notes | Record implementation notes during execution |
| /consult | Surface design ambiguity or policy conflict for user input |
| /continuation | Resume work across context/session boundaries |
| /doctrine | Load and apply project-specific governance conventions |
| /update-delta-docs | Reconcile and update delta/IP/DR/DE state after execution changes |

Subagents / Personas / Roles

The framework defines ceremony-level operating modes as conceptual roles rather than named subagents:

  • Pioneer — lightweight, minimal ceremony
  • Settler — moderate structure
  • Town Planner — full governance

/dispatch and /sub-driver skills suggest multi-agent dispatch capability, but no distinct named subagent personas are defined beyond the skill names.

Hooks

Defined in .claude/settings.json:

Event Matcher Action
SessionStart startup|clear Runs .claude/hooks/startup.sh — generates boot context via spec-driver admin preboot, injects SessionStart additional context instructing the agent to invoke /boot immediately
PostToolUse Read|Edit|Write Runs artifact_event.py asynchronously — tracks which spec-driver artefacts are touched during the session for live TUI follow mode

MCP Servers

(none) — spec-driver does not bundle or require any MCP servers.

Scripts / Binaries

| Name | Description |
| --- | --- |
| spec-driver | Unified Python CLI (installed via PyPI / Homebrew / Nix); all commands accessed through this binary |
| spec-driver install | Initialize workspace structure in a git repo |
| spec-driver sync | Synchronize all registries |
| spec-driver validate | Validate all registries |
| spec-driver doctor | Health check |
| spec-driver tui | Launch TUI for browsing documentation |
| spec-driver admin preboot | Generate static boot context file for cache-optimised sessions |
| spec-driver create <kind> "<title>" | Create a new entity (adr, spec, delta, revision, audit, memory, policy, standard, backlog) |
| spec-driver list <kind> | List entities with filters |
| spec-driver show <kind> <id> | Show a single entity |
| spec-driver complete <kind> <id> | Execute lifecycle completion action |
| spec-driver edit <kind> <id> | Edit entity via $EDITOR or inline flags |
| spec-driver find <kind> <pattern> | Find entities by glob pattern |
| .claude/hooks/startup.sh | SessionStart hook script |
| .claude/hooks/artifact_event.py | PostToolUse artifact tracking hook |
05

Prompts

spec-driver — Prompts

This file quotes verbatim the most important prompt files from the spec-driver repository.


Excerpt 1: using-spec-driver SKILL.md (mandatory routing skill)

Source: .spec-driver/skills/using-spec-driver/SKILL.md

---
name: using-spec-driver
description: Mandatory routing skill for ANY substantive work in a spec-driver project. Choose the governing skill before acting, and do not start implementation until the required delta/design/plan/phase artefacts exist.
---

You MUST choose the governing workflow skill before doing substantive work.

This skill is not optional process fluff. It is the routing layer for work in a
spec-driver repo.

If there is a reasonable chance that another spec-driver skill governs the
task, you MUST route through that skill before proceeding.

Do not respond, explore, inspect files, run commands, or start implementation
until you have decided which skill governs the task.

Do not rationalize your way around this.

If you skip routing because the task feels familiar, simple, urgent, or
"probably fine", you are doing it wrong.

Red-flag thoughts:

- "I can just inspect files first."
- "I already know the command shape."
- "This is small enough that I do not need workflow routing."
- "I will gather context first and decide later."

Those are routing failures. Stop and choose the governing skill.

Technique demonstrated: Anti-reasoning attack. The prompt explicitly names the mental shortcuts an LLM uses to rationalize skipping governance ("familiar", "small", "urgent") and labels them as failure modes. Forces the agent into a declared routing decision before any action.


Excerpt 2: retrieving-memory SKILL.md (mandatory context retrieval)

Source: .spec-driver/skills/retrieving-memory/SKILL.md

---
name: retrieving-memory
description: |
  Invoke this skill before making non-trivial assumptions in a large codebase. Mandatory triggers: (1) you are about to modify a subsystem you have not touched in this run; (2) you are about to run, change, or suggest a command pipeline (tests, builds, releases, migrations); (3) you see conflicting cues in code/docs; (4) you are asked "what is the right way here?"; (5) you are debugging a recurring failure mode; (6) you are about to answer with "probably/usually/likely".
  Default rule: if you cannot cite a source-of-truth file/doc/ADR/SPEC from the repo, you must consult memories first and then proceed.
---

Retrieval procedure (fast → thorough):

1. Contextual list (preferred):
   - Use scope matching to get the most relevant memories:
     `spec-driver list memories -p <path>... -c "<command tokens>" --match-tag <tag>...`
   - Before working on a specific subsystem, build this query from the concrete
     files you expect to read or edit first. Memories with `scope.globs` still
     match those `-p` paths, so you do not need a separate glob flag.
   - For planned code changes, prefer at least one exact file path plus the
     command context you are about to run. Example:
     `spec-driver list memories -p supekku/scripts/lib/skills/sync.py -c "uv run pytest" --match-tag skills`

Decision framework (what to trust):

- Prefer memories with higher `priority.severity`, higher `priority.weight`, higher scope specificity, and more recent `verified/updated` (the list ordering already encodes this).
- If a memory lacks `provenance.sources` for a claim, treat it as advisory only and verify against code/docs before acting.
- If retrieved memories disagree, do not "average"; escalate to maintenance (update/supersede) before proceeding with consequential changes.

Output discipline:

- When responding or planning, cite the relevant memory IDs and their linked sources (paths/ADRs/specs). If you cannot, retrieve again with tighter `--path/--command/--match-tag` until you can.
- **Staleness awareness** — when presenting a retrieved memory, surface its
  verification state qualitatively:
  - No `verified_sha`: "this memory has not been attested against the codebase"
  - High staleness (many commits since attestation): "many commits have affected
    its scope since last attestation — treat with caution"

Technique demonstrated: Mandatory trigger list with enumerated conditions. Turns the agent's use of hedging language ("probably", "usually", "likely") into a circuit-breaker that forces memory retrieval before proceeding.
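
The decision framework in this excerpt amounts to a sort order plus a provenance check. The sketch below shows one way to express it; it is not spec-driver's implementation (the skill notes that `spec-driver list memories` already encodes the ordering). The `Memory` fields, the severity labels, and the integer `scope_specificity` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date

# ASSUMPTION: severity is an ordered label; the real schema may differ.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


@dataclass
class Memory:
    id: str                 # hypothetical ID format
    severity: str           # stands in for priority.severity
    weight: int             # stands in for priority.weight
    scope_specificity: int  # hypothetical measure, larger = more specific
    verified: date          # stands in for the verified/updated date
    has_sources: bool = True  # whether provenance.sources is populated


def rank(memories: list[Memory]) -> list[Memory]:
    """Order memories most-preferred first, following the skill's preference order."""
    return sorted(
        memories,
        key=lambda m: (SEVERITY_RANK[m.severity], m.weight, m.scope_specificity, m.verified),
        reverse=True,
    )


def trust_label(memory: Memory) -> str:
    """Memories without provenance sources are advisory only."""
    return "sourced" if memory.has_sources else "advisory: verify against code/docs"
```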


Excerpt 3: execute-phase SKILL.md (mandatory implementation skill)

Source: .spec-driver/skills/execute-phase/SKILL.md

---
name: execute-phase
description: Mandatory execution skill for any delta/IP implementation phase. Use it before code changes, move the owning delta to in-progress, keep notes current, reconcile structured execution docs, and surface blockers early.
---

This skill is mandatory for implementation work under a delta or implementation
plan.

Do not start coding, editing tests, or updating implementation docs for a
delta/IP phase until you have entered through `/execute-phase`.

If the delta still says `draft`, that is not harmless bookkeeping. Change it to
`in-progress` before implementation continues so the lifecycle truth matches the
actual state of work.

Process:

1. Confirm entry criteria are met for the active phase.
2. Read DR + IP + phase sheet before coding and use `/preflight` to surface
   confirmed inputs, assumptions, unresolved questions, and tensions before
   implementation.
3. Identify the concrete files or components you expect to touch first and run
   `/retrieving-memory` against those paths before deep reading or editing so
   any `scope.globs` gotchas or patterns surface early.
4. Ensure the owning delta frontmatter says `status: in-progress` before implementation work proceeds. If it still says `draft`, update it first.
5. Implement phase tasks (code/tests/docs) in small coherent units.
6. After each meaningful unit, run `/notes`.
7. If that unit produced a durable gotcha, pattern, or subsystem fact worth
   future retrieval, run `/capturing-memory` or `/maintaining-memory` before
   moving on.

Technique demonstrated: Lifecycle truth enforcement. The skill frames a "draft" status as "not harmless bookkeeping" — it forces agents to treat artefact state as authoritative ground truth, not a label to update lazily. Every implementation step is coupled to a memory/notes action.


Excerpt 4: preflight SKILL.md (bounded research skill)

Source: .spec-driver/skills/preflight/SKILL.md (first 50 lines)

---
name: preflight
description: Use after routing has already happened, when the next step is bounded up-front research...
---

This is NOT the first routing skill for work in a spec-driver repo.

If you have not already chosen the governing skill path, stop and use
`/using-spec-driver` first.

You're about to begin work on something: $ARGUMENTS

First: you need to understand what it entails. Your immediate task is to
correctly decide how much effort to spend on this preliminary research.

Too little is as bad as too much. You must estimate where the goldilocks zone is,
and arrive with maximum tokens intact.

If you chase the questions which arise as you go, you'll disappear like a
helium balloon into the open sky.

Instead:

1. Read the material in front of you
2. Take stock of relevant `/retrieving-memories` and `/doctrine`
3. Confirm your stopping conditions before you expand the search
4. Decide **up front**, out loud:

- what, concretely, you need to know.
- when, concretely, you will stop even if you don't have all the answers.

7. Before declaring readiness, produce a critical assessment with these headings:

- confirmed inputs
- assumptions you would carry into the next step
- unresolved questions, risks, or dependencies
- tensions or ambiguities (including any apparent contradictions between artifacts, or design & implementation surface)

Technique demonstrated: Token budget awareness baked into research direction. The "helium balloon" metaphor is a memorable anti-pattern guard. The skill forces a declared stopping condition before research begins, preventing open-ended context consumption.


Excerpt 5: Boot context Dogma (loaded into every session)

Source: .agents/spec-driver-boot.md (pre-generated at session start)

## Dogma

# Spec-Driver Dogma

- No implementation without a spec-driver artefact
- Guide the user invisibly in the correct use of spec-driver
- Pursue correctness, compact token-efficiency, and crisp, pragmatic rigour

Technique demonstrated: Three-axiom constraint injection at session start. "Guide the user invisibly" is the UX principle — the agent should steer users into correct process without making the governance feel like friction. Combined with the routing system, this creates an invisible governance layer.


Excerpt 6: scope-delta SKILL.md (change scoping)

Source: .spec-driver/skills/scope-delta/SKILL.md

You MUST run `/draft-design-revision` before `/plan-phases` unless **explicitly instructed** by the user to skip a DR.
You do not treat IP or phase creation as a substitute for missing design. `/plan-phases` comes after DR work, not instead of it.

Technique demonstrated: Explicit sequencing gate. The skill names the temptation (skip design, jump to planning) and requires explicit user override to do so. The "unless explicitly instructed" carve-out prevents the agent from becoming brittle while maintaining the default enforcement.


Excerpt 7: draft-design-revision — Anti-Pattern Guard

Source: .spec-driver/skills/draft-design-revision/SKILL.md

## Anti-Pattern: it's too simple ...

It's a trap. Follow the process.

Technique demonstrated: Terse pattern naming as a trap warning. The brevity is intentional — three words name the exact rationalization ("it's simple") that would lead to skipping design work, then dismiss it immediately with "Follow the process."

09

Uniqueness

spec-driver — Uniqueness & Positioning

GSD Disambiguation: spec-driver vs. glittercowboy/get-shit-done

IMPORTANT: There are two frameworks that use the "GSD" abbreviation in the spec-driven development space:

| | spec-driver (this framework) | glittercowboy/get-shit-done |
| --- | --- | --- |
| Author | David Lee (davidlee) | TÂCHES / glittercowboy |
| Full name | Spec-Driver / Greenfield Spec-Driven Dev | Get Shit Done (GSD) by TÂCHES |
| GitHub | davidlee/spec-driver | glittercowboy/get-shit-done |
| Nature | Python CLI + TUI + workflow governance system | Task/card methodology |
| Primary tool | Claude Code, Codex | (different) |

spec-driver is the "GSD" in the sense of Greenfield Spec-Driven Development. It is NOT the same as glittercowboy/get-shit-done. These are entirely different frameworks that happen to share an abbreviation. This document covers davidlee/spec-driver only.


What spec-driver Does That No Other Seed Framework Does

  1. Evergreen spec loop — spec-driver is the only framework that explicitly treats implementation as evidence that must flow back into specs. Every delta closes with an audit that dispositions findings as spec_patch, revision, follow_up_delta, or tolerated_drift. Specs accumulate truth; they do not decay.

  2. Ceremony levels as a first-class dimension — Pioneer / Settler / Town Planner modes are configurable in workflow.toml, allowing the same framework to scale from a solo kanban board to full ADR + policy + standard + delta + DR + IP governance without forking the tooling.

  3. File-based agentic memory with provenance and scope — Memories carry scope.globs, provenance.sources, verified_sha, and priority metadata. The /retrieving-memory skill is mandatory before assumptions. No other seed framework ships this level of structured memory governance.

  4. Pre-generated boot context for cache optimization: spec-driver admin preboot generates a single compact static file containing all session-critical context. The README claims < 3k tokens to boot with everything activated. Designed for cost-efficiency in multi-session agentic workflows.

  5. Contract generation integration — Auto-generated API documentation from TypeScript, Go, and Zig code stored in .contracts/ as a derived-but-canonical corpus. Specs validate against contracts, not raw code.

  6. TUI with live session follow mode — A Textual-based TUI (spec-driver tui) can follow which spec-driver artefacts a running Claude Code session touches in real time. Unique among spec-driven frameworks.

  7. Drift ledgers and audits as first-class artifacts: DL-xxx drift ledgers and AUD-xxx audits have dedicated CLI commands, lifecycle management, and can feed back into requirement status. Most frameworks treat these as informal notes.


What spec-driver Explicitly Drops

  • No throwaway specs — The README explicitly states: "Spec-driver is not a framework for development from disposable specifications." Single-pass generate-and-forget is rejected by design.
  • No global config — "All install locations are project-local - no change to your system ~/.claude etc"
  • No Windows support — "It probably doesn't work on Windows, but what does?"
  • No MCP servers — No bundled or required MCP servers; pure CLI + file system approach
  • No vendor lock-in — FSL license converts to Apache2 in 2 years; artifacts in plain Markdown + YAML

One-Sentence Positioning

spec-driver is a Python CLI + governance system that makes AI-generated code conform to evolving specs by treating implementation change as evidence that must flow back into authoritative, evergreen truth — inverting the typical "spec → code" direction into a continuous "code → spec → code" loop.


Compared to Other Frameworks in the Seed List

| Framework | Approach | vs. spec-driver |
| --- | --- | --- |
| BMAD Method | Persona-based multi-agent orchestration | BMAD focuses on agent choreography; spec-driver focuses on artefact lifecycle governance |
| superpowers | Skills + commands for Claude Code | spec-driver explicitly credits superpowers as inspiration; operates at higher governance abstraction |
| claude-flow | Multi-agent coordination | claude-flow coordinates agents; spec-driver governs what agents are allowed to do and how truth is maintained |
| openspec | Spec templating | openspec generates specs; spec-driver manages their entire lifecycle including post-implementation reconciliation |

Failure Modes / Criticisms

No Reddit or HN criticism found (25 stars, limited public visibility). Potential criticisms based on the framework itself:

  1. High ceremony overhead — The full town_planner path (delta → DR → IP → phase sheets → execute → audit → patch specs → close) is 8+ steps per feature. The author acknowledges this: "The machinery required is more complex than naive approaches."

  2. Python + uv dependency — Requires Python 3.12+; adds a language runtime dependency to non-Python projects.

  3. Opinionated installation — "It's opinionated. It installs claude hooks and cross-platform skills."

  4. Beta status — Version 0.9.7, "Approaching 1.0". Data formats are promised stable but the API may still change.

  5. Solo maintainer risk — 1 primary human contributor; the project depends on David Lee's continued involvement.

04

Workflow

spec-driver — Workflow

Workflow Phases

The canonical default narrative is delta-first:

| Phase | Governing Skill | Artifact(s) Produced |
| --- | --- | --- |
| 1. Boot | /boot | Agent verifies spec-driver-boot.md context is loaded |
| 2. Route | /using-spec-driver | Routing decision (no artifact) |
| 3. Scope | /scope-delta | DE-XXX.md (delta bundle) + optional backlog item link |
| 4. Design | /draft-design-revision | DR-XXX.md (design revision); adversarial review included |
| 5. Plan | /plan-phases | IP-XXX.md (implementation plan) + phase-0N.md (phase sheets) |
| 6. Execute | /execute-phase | Code changes, tests, notes |
| 7. Audit | /audit-change | AUD-XXX.md (audit artefact) with dispositioned findings |
| 8. Spec Patch | (within audit) | Updated SPEC-XXX.md or RE-XXX.md (spec revision) |
| 9. Close | /close-change | DE-XXX marked complete; registries synced |

The workflow.toml states the narrative explicitly:

delta -> DR -> IP -> phase sheet(s) -> implement -> audit -> revision -> patch specs -> close

A revision-first concession path exists (/shape-revision) for town-planner governance, but is not the default.

Ceremony Modes (adjustable in workflow.toml)

  • pioneer — lightweight; kanban cards as primary unit of work
  • settler — moderate structure
  • town_planner — full governance; mandatory DR/IP/phase sheets; coverage gates enforced at complete delta

Human Approval Gates

The user must explicitly confirm/provide input at:

  1. After preflight: /preflight ends with surfaced unknowns; agent must receive user acknowledgement before implementation
  2. Design approval: /draft-design-revision presents design sections sequentially, requires user approval after each section
  3. Ambiguity / consult: /consult is invoked whenever policy ambiguity or design tension cannot be resolved autonomously
  4. --force on complete delta: if coverage gates fail, the user must justify use of --force

TDD Enforcement

Optional — Evidence from CLAUDE.md:

"Write tests BEFORE marking work complete" "Tests written and passing (just test)"

From execute-phase SKILL.md: verification artifacts (VT/VA/VH) are tracked through phases and must be present at delta closure. However, TDD itself is a recommended practice guided by the CLAUDE.md, not a runtime-enforced gate.

Multi-Agent Execution

Yes: /dispatch and /sub-driver skills exist. The execute-phase skill references "preserving your own context by delegating to sub-agents" as an expected pattern. The boot system (spec-driver admin preboot) is explicitly designed for "cache-optimised agent sessions" in a multi-agent context.

Git Worktrees / Isolated Workspaces

No — not referenced in any fetched documentation.

Spec Format

Markdown + YAML frontmatter — all artefacts are markdown files with structured YAML frontmatter. The workflow.toml is TOML. No Gherkin or JSON spec format.

Files Generated Per Feature (Delta)

.spec-driver/deltas/DE-XXX/
├── DE-XXX.md           # Delta bundle (scope, applies-to, risks, context_inputs)
├── DR-XXX.md           # Design revision (current vs target architecture)
├── IP-XXX.md           # Implementation plan (phases, entry/exit criteria)
└── phases/
    ├── phase-01.md     # Phase sheet (tasks, verification steps, VT/VA/VH)
    └── phase-0N.md

.spec-driver/audits/AUD-XXX.md      # Post-implementation audit
.spec-driver/revisions/RE-XXX.md    # Spec revision (if specs are patched)

Additionally, linked tech specs are updated:

.spec-driver/tech/SPEC-XXX/
├── SPEC-XXX.md                     # Technical specification (patched after audit)
└── SPEC-XXX.tests.md               # Testing companion (optional)
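
The per-delta tree above follows a regular naming pattern, so the expected file set can be derived from a delta number. This is a sketch only: `delta_bundle_paths` is a hypothetical helper, and it assumes the DR and IP files share the delta's number, which the XXX placeholders suggest but the source does not state outright.

```python
from pathlib import PurePosixPath

WORKSPACE = PurePosixPath(".spec-driver")


def delta_bundle_paths(number: str, phases: int) -> list[str]:
    """List the per-delta files from the tree above for delta `DE-<number>`.

    ASSUMPTION: DR and IP reuse the delta's number.
    """
    bundle = WORKSPACE / "deltas" / f"DE-{number}"
    paths = [bundle / f"{kind}-{number}.md" for kind in ("DE", "DR", "IP")]
    paths += [bundle / "phases" / f"phase-{n:02d}.md" for n in range(1, phases + 1)]
    return [str(p) for p in paths]
```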

Verification Artifact Types

  • VT (Verification Test): Automated test artifact
  • VA (Verification by Agent): Agent-generated test report or analysis
  • VH (Verification by Human): Manual verification requiring user attestation

Backlog → Delta Promotion

spec-driver create delta --from-backlog ISSUE-XXX
# auto-populates context_inputs and relations from source item

Key Lifecycle States

Delta statuses: draft → in-progress → (audit) → completed

The execute-phase skill explicitly enforces: if delta status is still draft when implementation begins, the agent must update it to in-progress first.
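
The status rule can be pictured as a tiny state machine. The sketch below is an illustration under assumptions: it models only the three statuses named above (the audit step sits between in-progress and completed but is not a status), and `transition` and `ensure_in_progress` are hypothetical names, not spec-driver functions.

```python
# Allowed status moves, taken from the lifecycle line above.
ALLOWED_TRANSITIONS = {
    "draft": {"in-progress"},
    "in-progress": {"completed"},
    "completed": set(),
}


def transition(current: str, target: str) -> str:
    """Return `target` if the move is allowed, else raise ValueError."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal delta transition: {current} -> {target}")
    return target


def ensure_in_progress(current: str) -> str:
    """Apply the execute-phase rule before implementation work starts."""
    if current == "in-progress":
        return current
    return transition(current, "in-progress")
```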

06

Memory Context

spec-driver — Memory & Context

Memory Model

file-based — Memories are indexed Markdown files stored in .spec-driver/memory/. Each memory has YAML frontmatter with structured metadata including:

  • priority.severity and priority.weight (for ranking)
  • scope.globs (path patterns that this memory applies to)
  • provenance.sources (cited sources for claims)
  • verified_sha (commit hash when memory was last attested against the codebase)
  • Tags for topic filtering
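
To illustrate how these fields could work together, the sketch below selects memories whose scope.globs match a file path and ranks them by severity and weight. Everything here is an assumption for illustration: the Memory class, the numeric encoding of severity and weight, and the use of fnmatch-style matching (the real glob semantics may differ).

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class Memory:
    name: str
    globs: list[str]        # mirrors scope.globs
    severity: int = 0       # mirrors priority.severity (numeric scale assumed)
    weight: int = 0         # mirrors priority.weight
    verified_sha: str = ""  # commit the memory was last attested against
    tags: list[str] = field(default_factory=list)


def memories_for(path: str, memories: list[Memory]) -> list[Memory]:
    """Return memories scoped to `path`, highest priority first."""
    hits = [m for m in memories if any(fnmatchcase(path, g) for g in m.globs)]
    return sorted(hits, key=lambda m: (m.severity, m.weight), reverse=True)


# Hypothetical memories; names and patterns are made up for the example.
store = [
    Memory("general-src", ["src/*"], severity=1, weight=5),
    Memory("cli-gotcha", ["src/cli/*"], severity=2, weight=1),
    Memory("docs-style", ["docs/*"], severity=3),
]

print([m.name for m in memories_for("src/cli/main.py", store)])
# ['cli-gotcha', 'general-src']
```

Note that fnmatch lets `*` match across `/`, so "src/*" also covers nested files; a stricter matcher would change which memories are returned.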

Persistence Scope

project — All memory files live under .spec-driver/memory/ within the project repository. There is no global or cross-project memory layer.

Context Compaction Strategy

spec-driver uses a pre-generated boot context strategy to handle context loss:

spec-driver admin preboot

This command generates a static file (.agents/spec-driver-boot.md) containing all critical context (glossary, workflow, skill routing, governance, accepted ADRs, required policies/standards, project doctrine) in a single pre-rendered markdown document. The SessionStart hook regenerates this file automatically.

The README states: "< 3k tokens to boot with everything activated". The boot context is deliberately compact to minimise the token cost of reloading project context.

Cross-Session Handoffs

The system uses three mechanisms for cross-session continuity:

  1. Pre-generated boot file (.agents/spec-driver-boot.md) — loaded via @-reference in CLAUDE.md/AGENTS.md at every session start. Contains static snapshot of governance, workflow, routing, and ADRs.

  2. Artefact-based state — the .spec-driver/ workspace is the persistent source of truth. Delta status (draft/in-progress/completed), phase sheets, DR/IP docs, and audit records survive session boundaries by design.

  3. Agentic memory index: .spec-driver/memory/ contains scope-indexed memories. The /retrieving-memory skill must be run before the agent acts on assumptions; it routes the agent to query memories by file path context before touching any subsystem.

Memory Creation and Maintenance

Dedicated skills handle memory lifecycle:

  • /capturing-memory — create new memory from implementation discoveries
  • /maintaining-memory — update or supersede existing memories
  • /reviewing-memory — audit memory health and staleness

The execute-phase skill mandates: after each implementation unit, if a "durable gotcha, pattern, or subsystem fact worth future retrieval" was produced, run /capturing-memory or /maintaining-memory before moving on.

"Memory Bank" / "Knowledge Base" References

The README explicitly calls out:

"Agentic Memory designed for hard use and repairability in humid environments."

The boot context file states:

"Creating and maintaining compact, high-signal, interlinked memories to help agents and users orient effectively is an important part of your role."

The retrieving-memory skill description states: "Default rule: if you cannot cite a source-of-truth file/doc/ADR/SPEC from the repo, you must consult memories first and then proceed."

ADR-005 in the repo is titled: "Memories and skills are the canonical guidance layer" — establishing memories as first-class governance artifacts, not supplemental notes.

07

Target Tools

spec-driver — Target Tools

Officially Supported Tools

Claude Code (primary)

Evidence from README:

"greenfield spec-driven development with Claude Code & friends (Codex, etc)"

The install creates .claude/ project-local settings and skills. The SessionStart and PostToolUse hooks in .claude/settings.json are Claude Code-specific. CLAUDE.md is explicitly managed. The boot script calls Claude Code as claude.

README Getting Started:

"Boot up Claude Code or Codex."

Install notes:

"If you use Claude Code or Codex, your agent can manage the workflow"

Codex (secondary / explicitly supported)

Evidence from README:

"greenfield spec-driven development with Claude Code & friends (Codex, etc)"

The workflow.toml has a [skills] targets config option that lists ["claude", "codex"]. The spec_driver package generates skills for both Claude Code (.claude/) and Codex (.agents/) directories.

Topics on the GitHub repo include codex.

Cursor (supported, no installer)

Evidence from repository:

  • .cursor/rules/ directory present in the repo
  • Commit message: "chore: cursor bootstrap (no installer)" — Cursor support exists but the install command does not automate it

Caveat: Cursor support is present but manual; the spec-driver install command does not set up Cursor automatically.

Other Tools

The framework is agent-tool agnostic at the workflow level (markdown + git artifacts work with any tool that can read files). The README's notes "Zero Lock-In, Zero Cost" and "Things change fast, but if text in open formats goes out of fashion, all bets are off" suggest that this openness to other tools is intentional.

No explicit support for: aider, cline, copilot, opencode, goose, windsurf, gemini-cli, roo, kilo, qwen, jules, continue.

Compatibility Notes

  • All install locations are project-local — no changes to ~/.claude or system config, avoiding conflicts with global Claude Code configurations
  • The workflow.toml [tool] section allows overriding how spec-driver is invoked (exec = "uv run spec-driver" vs bare spec-driver), enabling compatibility with different Python environment setups
  • Python >= 3.12 is required; older Python environments will not work
  • "It probably doesn't work on Windows" — explicitly stated in README caveats

08

Signals

spec-driver — Signals

GitHub Stats

  • Stars: 25 (as of 2026-05-26, via gh api)
  • Forks: 3
  • Watchers: 1
  • Open Issues: 1
  • Pull Requests open: 1
  • Contributors: 2 (davidlee + dependabot, of which only davidlee is human; Claude Opus 4.6 appears as a co-author in commit messages)
  • Commits: 1,133+
  • Last commit date: 2026-05-01
  • Repository created: 2025-10-09
  • Version: 0.9.7

Maintainer Status

Active — The repository has 1,133+ commits, the last commit was May 1, 2026 (25 days before this analysis), and the commit history shows near-daily activity. The version is tracking toward 1.0 with Beta status declared in the README.

Reddit / HN Sentiment

Unknown — No Reddit or HN discussion found in the available seed materials (_index/wave-2b-reddit-discovery.md not referenced for this project). The low star count (25) suggests limited public visibility as of this analysis date.

Community Notes

  • The framework is self-hosted (uses spec-driver to build spec-driver, per ADR-001)
  • Commits frequently show Claude Opus 4.6 as co-author, consistent with the project being built through its own agent-driven workflow
  • Only 1 open issue; no public user discussions found
  • PyPI listing: https://pypi.org/project/spec-driver/
  • Companion website: https://supekku.dev/ (Socratic dialogue on spec-driven development philosophy)

Related frameworks

same archetype · same primary tool · same memory type

BMAD-METHOD ★ 48k

Provides a full agile delivery lifecycle with named expert-persona AI collaborators that elicit the human's best thinking rather…

Agent OS ★ 4.6k

Extracts implicit codebase conventions into token-efficient markdown standards files and injects them selectively into AI agent…

Claude Conductor ★ 367

Gives Claude Code a persistent, cross-linked, auto-analyzed documentation system so it retains codebase context across sessions.

Anthropic Knowledge Work Plugins ★ 16k

Role-specialized plugin bundles with live MCP connectors that turn Claude into a domain expert for enterprise knowledge workers.

Codex Integration for Claude Code (skill-codex) ★ 1.3k

Single Claude Code skill that handles Codex CLI invocation correctly (stdin blocking, thinking token suppression, session resume)…

mini-coding-agent ★ 882

A single-file zero-dependency Python coding agent that demonstrates the six core components of coding agents for educational…