
spec-compare (cameronsjo)

spec-compare · cameronsjo/spec-compare · ★ 39 · last commit 2026-02-25

Primitive shape
No installable primitives
00

Summary

spec-compare — Summary

spec-compare is a research and comparison documentation repository, not a deployable AI coding framework. It contains in-depth analysis of spec-driven development tools for AI-assisted coding, organized as 13 Markdown documents covering comparison matrices, 12-scenario use-case scoring, git worktree analysis, a landscape survey of 30+ tools, and practical decision frameworks. The original six tools compared are Spec-Kit, Spec Kitty, BMad Method, OpenSpec, Kiro, and Tessl; a February 2026 extension adds GSD, Ralph Loop, Zencoder, Kilo Code, Conductor, and PromptX. The repository's primary original contribution is the "SDD Maturity Levels" taxonomy — Spec-First (specs precede code but are discarded), Spec-Anchored (specs persist and evolve), and Spec-as-Source (only specs are edited, code auto-generates) — which provides a conceptual framework for categorizing all tools in the SDD space. With 39 stars and 2 contributors, spec-compare has more adoption signal than most deployable frameworks in this batch, reflecting its utility as a curated research entry point for practitioners choosing between SDD tools.

differs_from_seeds

spec-compare has no relationship to any seed framework — it does not extend, fork, or build on any of the five archetypes. It is a pure methodology documentation artifact: no commands, no skills, no agents, no MCP servers, no hooks, and no install script. Its value is analytical rather than operational: it tells you which seed-class framework to choose, rather than being a framework itself. In this batch it is the only Tier C item — it does not fit the "AI coding agent framework" definition, but it is included because it is the most comprehensive publicly available comparison of the exact tool category this research project covers.

01

Overview

spec-compare — Overview

Origin

spec-compare was created by cameronsjo, with one additional contributor; its last commit was on 2026-02-25. It is MIT-licensed with 39 stars. The repository has no source code and consists entirely of documentation.

Purpose

From the README:

A comprehensive research and comparison of spec-driven development (SDD) tools for AI-assisted coding, including analysis of git worktree support, architectural approaches, and practical recommendations.

The repository was originally created to compare six tools (Spec-Kit, Spec Kitty, BMad Method, OpenSpec, Kiro, Tessl) and was extended in February 2026 to add GSD, Ralph Loop, Zencoder, Kilo Code, Conductor, and PromptX.

Primary Original Contribution

The "SDD Maturity Levels" taxonomy, from the README:

  1. Spec-First: Specs precede coding but are discarded (Spec-Kit, Kiro, BMad)
  2. Spec-Anchored: Specs persist and evolve (OpenSpec, Spec Kitty)
  3. Spec-as-Source: Only specs are edited, code auto-generates (Tessl)
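
The three levels and their example tools can be captured as a small lookup structure. The following is a minimal sketch, not something the repository ships: the names `SddMaturity`, `TOOL_LEVELS`, `tools_at`, and `specs_persist` are hypothetical, and treating levels 2 and 3 as "specs persist" is one reading of the definitions above.

```python
from enum import Enum


class SddMaturity(Enum):
    """The three SDD maturity levels as listed in the README excerpt."""
    SPEC_FIRST = 1      # specs precede coding but are discarded
    SPEC_ANCHORED = 2   # specs persist and evolve
    SPEC_AS_SOURCE = 3  # only specs are edited, code auto-generates


# Tool assignments copied from the list above (hypothetical structure).
TOOL_LEVELS = {
    "Spec-Kit": SddMaturity.SPEC_FIRST,
    "Kiro": SddMaturity.SPEC_FIRST,
    "BMad": SddMaturity.SPEC_FIRST,
    "OpenSpec": SddMaturity.SPEC_ANCHORED,
    "Spec Kitty": SddMaturity.SPEC_ANCHORED,
    "Tessl": SddMaturity.SPEC_AS_SOURCE,
}


def tools_at(level):
    """Return the tool names assigned to a level, sorted alphabetically."""
    return sorted(name for name, lvl in TOOL_LEVELS.items() if lvl is level)


def specs_persist(tool):
    """Assumption: specs outlive the initial build at level 2 and above."""
    return TOOL_LEVELS[tool].value >= SddMaturity.SPEC_ANCHORED.value


print(tools_at(SddMaturity.SPEC_ANCHORED))
```

A structure like this makes it easy to ask which tools keep specs around after the first implementation pass, which is the distinction the taxonomy is built on.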

Key Findings

From the README:

Critical Gap: Most SDD tools excel when requirements are clear upfront but struggle with iterative changes like "change button from blue to green."

  • OpenSpec — Purpose-built for modifications with delta format (ADDED, MODIFIED, REMOVED)
  • Tessl — Spec-as-source enables edit-and-regenerate (but closed beta)
  • Spec-Kit — Requires /speckit.clarify workaround, not optimized for small changes
  • Kiro/BMad — "Sledgehammer to crack a nut" problem for trivial changes

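Among the original six tools, Spec Kitty is the only one with built-in git worktree support (Conductor, added in the February 2026 extension, also uses git worktrees). This enables: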

  • Automatic worktree creation per feature
  • Parallel feature isolation without branch switching
  • Automated cleanup on merge
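
The worktree-per-feature pattern in the list above can be sketched with plain `git worktree` commands. This is an illustration of the general technique, not Spec Kitty's actual implementation: the `feature/<name>` branch naming, the sibling-directory layout, and the helper names `worktree_add_cmd` and `cleanup_cmds` are all assumptions. The functions only build argument lists; running them (for example with `subprocess.run`) is left to the caller.

```python
from pathlib import PurePosixPath


def _worktree_path(repo, feature):
    # Assumed layout: a sibling directory named "<repo>-<feature>".
    return repo.parent / f"{repo.name}-{feature}"


def worktree_add_cmd(repo, feature):
    """Build the command that creates an isolated worktree on a new branch."""
    return [
        "git", "-C", str(repo), "worktree", "add",
        "-b", f"feature/{feature}",           # hypothetical branch naming
        str(_worktree_path(repo, feature)),
    ]


def cleanup_cmds(repo, feature):
    """Build the commands that remove the worktree and its merged branch."""
    return [
        ["git", "-C", str(repo), "worktree", "remove",
         str(_worktree_path(repo, feature))],
        ["git", "-C", str(repo), "branch", "-d", f"feature/{feature}"],
    ]


repo = PurePosixPath("/work/app")  # hypothetical repository location
print(" ".join(worktree_add_cmd(repo, "login")))
```

Because each feature gets its own directory, two features can be worked on side by side without switching branches in a single checkout.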

Key finding: OpenSpec v1.0 is the only tool that migrated from AGENTS.md/CLAUDE.md to the newer SKILL.md standard. All other open-source tools still generate AGENTS.md. The AGENTS.md standard itself (28.64% runtime reduction in evaluations) continues to gain adoption — OpenAI Codex ships 88 AGENTS.md files in its own repo.

02

Architecture

spec-compare — Architecture

Distribution Type

Methodology documentation (static Markdown files). No deployable artifact.

Repository Structure

spec-compare/
├── README.md                        # Entry point with key findings, quick comparison table, recommendations
├── CHANGELOG.md                     # Version history of the research
├── CONTRIBUTING.md                  # How to contribute findings
├── LICENSE                          # MIT
└── docs/
    ├── comparison.md                # Side-by-side feature comparison matrices
    ├── use-case-scoring.md          # 12-scenario graded scoring (5-star scale)
    ├── iterative-development.md     # Spec modification workflows analysis
    ├── git-worktree-support.md      # Detailed worktree analysis (updated with Beads, Conductor)
    ├── recommendations.md           # Decision frameworks by use case
    ├── critical-analysis.md         # Concerns, critiques, future outlook
    ├── landscape.md                 # 30+ multi-agent tools surveyed
    ├── beads.md                     # Agent memory, messaging, multi-agent villages
    ├── gaps.md                      # Zencoder, Kilo Code, Conductor, PromptX analyses
    ├── cheatsheet-beads-openspec.md # Practical Beads + OpenSpec setup cheatsheet
    ├── sources.md                   # All citations and references
    ├── research-beads-agentmail-for-skill-update.md  # Research notes
    └── tools/
        ├── spec-kit.md              # Individual profile
        ├── spec-kitty.md            # Individual profile
        ├── bmad-method.md           # Individual profile
        ├── openspec.md              # Individual profile
        ├── kiro.md                  # Individual profile
        ├── tessl.md                 # Individual profile
        ├── gsd.md                   # Individual profile
        └── ralph-loop.md            # Individual profile

No Install, No Runtime, No Configuration

spec-compare has no install method, no runtime dependencies, and no configuration files. It is read directly on GitHub or cloned for offline reference.

03

Components

spec-compare — Components

spec-compare contains no executable components. All content is static Markdown research documentation.

Document Index

| Document | Purpose |
|----------|---------|
| docs/comparison.md | Side-by-side feature matrices comparing 6 tools on 20+ dimensions, plus agent configuration support table |
| docs/use-case-scoring.md | 12 real-world scenarios graded on 5-star scale for each tool; expanded 11-tool heatmap |
| docs/iterative-development.md | Analysis of how each tool handles spec modifications vs. greenfield |
| docs/git-worktree-support.md | Detailed git worktree capability analysis including Beads and Conductor |
| docs/recommendations.md | Decision frameworks: by project type, team size, complexity, workflow preference |
| docs/critical-analysis.md | Honest concerns and critiques about each tool; future outlook |
| docs/landscape.md | 30+ multi-agent tools surveyed including Claude Code Agent Teams |
| docs/beads.md | Analysis of Beads, Agent Mail, and Gas Town (agent memory and messaging ecosystem) |
| docs/gaps.md | Analysis of Zencoder, Kilo Code, Conductor, PromptX |
| docs/cheatsheet-beads-openspec.md | Practical daily workflow cheatsheet for Beads + OpenSpec combination |
| docs/sources.md | All citations and references |
| docs/tools/spec-kit.md | Spec-Kit individual profile |
| docs/tools/spec-kitty.md | Spec Kitty individual profile |
| docs/tools/bmad-method.md | BMad Method individual profile |
| docs/tools/openspec.md | OpenSpec individual profile |
| docs/tools/kiro.md | Kiro individual profile |
| docs/tools/tessl.md | Tessl individual profile |
| docs/tools/gsd.md | GSD individual profile |
| docs/tools/ralph-loop.md | Ralph Loop individual profile |

Tools Analyzed (12 total)

Original Six

  1. GitHub Spec-Kit — Open-source CLI toolkit for greenfield projects
  2. Spec Kitty — Community fork with built-in git worktree orchestration
  3. BMad Method — Enterprise framework with 21 specialized AI agents
  4. OpenSpec — Lightweight change-management for brownfield projects
  5. Kiro — AWS-backed agentic IDE with multimodal input
  6. Tessl — Experimental spec-as-source platform

February 2026 Extensions

  1. GSD — Meta-prompting SDD system with wave-based context management (11.9K stars)
  2. Ralph Loop — Stateless iterative execution pattern by Geoffrey Huntley
  3. Zencoder/Zenflow — Commercial SDD-as-a-Service platform
  4. Kilo Code — Open-source agentic platform with Memory Bank ($8M seed, 1.5M users)
  5. Conductor — macOS parallel agent runner using git worktrees
  6. PromptX — AI agent context platform via MCP (gap entry)
05

Prompts

spec-compare — Prompts

spec-compare contains no prompt files, agent definitions, or skill documents. It is a research comparison repository. The sections below quote the most substantive analytical text from the documentation.

Verbatim: Use-case scoring for trivial modification

From docs/use-case-scoring.md:

## Use Case 1: Change Button Color (Trivial Modification)

**Scenario:** Change primary button background from blue (#0000FF) to green (#00FF00)

| Tool | Score | Reasoning |
|------|-------|-----------|
| **OpenSpec** | ⭐⭐⭐⭐⭐ | Lightweight delta format. Create `changes/button-color/` with MODIFIED section. Minimal overhead. |
| **Tessl** | ⭐⭐⭐⭐ | Edit spec directly, regenerate code. Elegant but closed beta limits access. |
| **Spec-Kit** | ⭐⭐⭐ | Must use `/speckit.clarify` workaround. `/speckit.specify` wants to create new feature. |
| **Spec Kitty** | ⭐⭐ | Same Spec-Kit issues + worktree overhead excessive for one-line change. |
| **Kiro** | ⭐⭐ | Generates 5-page spec for trivial change. "Sledgehammer to crack a nut" problem. |
| **BMad** | ⭐ | 21 agents, 30+ minute workflow for button color. Massive overkill. |

**Winner:** OpenSpec
**Avoid:** BMad, Kiro
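
To show how a star table like the one above turns into a ranking, here is a minimal sketch that parses the tool and score columns and sorts by star count. It is not part of spec-compare; `parse_row` and `rank` are hypothetical helpers, and the rows are abbreviated to the first two columns of the excerpt.

```python
STAR = "⭐"

# Tool and score columns copied from the excerpt (reasoning column omitted).
ROWS = [
    "| **OpenSpec** | ⭐⭐⭐⭐⭐ |",
    "| **Tessl** | ⭐⭐⭐⭐ |",
    "| **Spec-Kit** | ⭐⭐⭐ |",
    "| **Spec Kitty** | ⭐⭐ |",
    "| **Kiro** | ⭐⭐ |",
    "| **BMad** | ⭐ |",
]


def parse_row(row):
    """Split a Markdown table row into (tool name, star count)."""
    cells = [c.strip() for c in row.strip().strip("|").split("|")]
    return cells[0].strip("*"), cells[1].count(STAR)


def rank(rows):
    """Sort tools by descending score, breaking ties alphabetically."""
    scores = dict(parse_row(r) for r in rows)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


ranking = rank(ROWS)
winner = ranking[0][0]
print(winner, ranking)
```

The ranking reproduces the stated winner. The "Avoid" list in the excerpt is an editorial call rather than a pure threshold, since Spec Kitty shares Kiro's two-star score but is not listed.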

Verbatim: Feature matrix excerpt (comparison.md)

From docs/comparison.md:

| Feature | Spec-Kit | Spec Kitty | BMad | OpenSpec | Kiro | Tessl |
|---------|----------|------------|------|----------|------|-------|
| **Spec Maturity Level** | Spec-First | Spec-Anchored | Spec-First | Spec-Anchored | Spec-First | Spec-as-Source |
| **Git Worktrees** | No | **Yes** | No | No | No | No |
| **AGENTS.md** | ✅ Generated | ✅ Generated | ✅ Generated | ⚠️ Pre-1.0 only | ❌ | ❌ |
| **SKILL.md** | ❌ | ❌ | ❌ | ✅ v1.0 | ❌ | ❌ |
| **Slash Commands** | ✅ 8 | ✅ 13 | ✅ 50+ workflows | ✅ 10 (`/opsx:`) | ❌ | ❌ |
| **MCP Support** | ❌ | ❌ | ❌ | ❌ | ✅ | ✅ |
| **Multi-Agent** | ❌ | ✅ | ✅ | ❌ | ⚠️ | ❌ |
| **Dashboard** | ❌ | ✅ | ❌ | ⚠️ | ✅ (IDE) | ❌ |

Verbatim: AGENTS.md finding (comparison.md)

From docs/comparison.md:

**Key finding:** OpenSpec v1.0 is the only tool that migrated from AGENTS.md/CLAUDE.md to the 
newer SKILL.md standard. All other open-source tools still generate AGENTS.md. The AGENTS.md 
standard itself (28.64% runtime reduction in evaluations) continues to gain adoption — OpenAI 
Codex ships 88 AGENTS.md files in its own repo.

Verbatim: Modification problem (README)

From README.md:

### The Modification Problem

**Critical Gap:** Most SDD tools excel when requirements are clear upfront but struggle with 
iterative changes like "change button from blue to green."

- **OpenSpec** - Purpose-built for modifications with delta format (ADDED, MODIFIED, REMOVED)
- **Tessl** - Spec-as-source enables edit-and-regenerate (but closed beta)
- **Spec-Kit** - Requires `/speckit.clarify` workaround, not optimized for small changes
- **Kiro/BMad** - "Sledgehammer to crack a nut" problem for trivial changes
09

Uniqueness

spec-compare — Uniqueness and Positioning

differs_from_seeds

spec-compare has no relationship to any of the five seed archetypes — it does not extend, fork, or build on any seed framework. It is a pure methodology documentation artifact: a curated research collection designed to help practitioners choose between seed-class frameworks. No other repository in this batch serves this function. It is the only Tier C item in batch-01: included for completeness (and because its 39 stars reflect genuine practitioner adoption as a reference resource) but not analyzable as a deployable framework.

Positioning

  • Category: Research/comparison documentation — NOT a deployable AI coding framework
  • Primary value: Curated cross-tool analysis with graded use-case scoring and the "SDD Maturity Levels" taxonomy
  • Unique contribution: The 3-level SDD Maturity taxonomy (Spec-First / Spec-Anchored / Spec-as-Source) provides conceptual vocabulary for categorizing every other framework in this research project
  • Scope: Covers the widest span of any single document in the corpus — original 6 tools + 6 extended tools + 30+ landscape entries

Observable Limitations

  1. Not a framework: Cannot be installed, configured, or run. Zero operational value for practitioners who already know which tool they want.
  2. Point-in-time accuracy: Some entries (Kiro, Tessl) are based on preview/beta state as of early 2026 and may be stale.
  3. Author bias: Tool scoring reflects the author's priorities. Brownfield/modification workflows are ranked highly, which favors OpenSpec. Readers with greenfield priorities may weight the scoring differently.
  4. No code evidence: The comparison relies on documentation claims rather than reproducible benchmarks. The "30+ minute workflow for button color" for BMad is an estimate, not a measurement.
  5. Missing frameworks: Doesn't cover ralphy-openspec, goopspec, or the wave of OpenCode-targeted tools that emerged after February 2026.
04

Workflow

spec-compare — Workflow

spec-compare has no executable workflow. It is a reference artifact used to inform tool selection decisions.

How Practitioners Use It

  1. Read README for key findings and quick comparison table
  2. Consult docs/use-case-scoring.md to find tools that score highest for their specific scenario
  3. Read docs/recommendations.md for decision frameworks by project type and team profile
  4. Read individual tool profiles in docs/tools/ for deeper investigation
  5. Consult docs/comparison.md for detailed feature matrix
  6. Check docs/git-worktree-support.md if parallel development is a requirement
  7. Check docs/iterative-development.md if brownfield/modification workflows are primary

No Phases, No Artifacts, No Gates

spec-compare produces no code, no specs, no artifacts. The "workflow" is reading documentation.

Decision Framework Summary (from docs/recommendations.md)

| Scenario | Recommended Tool |
|----------|------------------|
| Trivial modifications (button color, text change) | OpenSpec |
| Greenfield project with full planning | Spec-Kit or BMad |
| Parallel feature development | Spec Kitty |
| Enterprise multi-agent workflow | BMad |
| Spec-as-source (regenerate from spec) | Tessl |
| IDE-integrated experience | Kiro |
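
The decision summary above is effectively a lookup from scenario to tool. A minimal sketch of that lookup follows; the dictionary keys are shortened scenario labels, and `RECOMMENDATIONS`, `recommend`, and `scenarios_for` are hypothetical names, not anything defined in docs/recommendations.md.

```python
# Scenario -> recommended tool(s), copied from the summary above.
RECOMMENDATIONS = {
    "trivial modifications": ["OpenSpec"],
    "greenfield project with full planning": ["Spec-Kit", "BMad"],
    "parallel feature development": ["Spec Kitty"],
    "enterprise multi-agent workflow": ["BMad"],
    "spec-as-source": ["Tessl"],
    "ide-integrated experience": ["Kiro"],
}


def recommend(scenario):
    """Return the recommended tools for a scenario, or [] if unlisted."""
    return RECOMMENDATIONS.get(scenario.strip().lower(), [])


def scenarios_for(tool):
    """Reverse lookup: every scenario in which a tool is recommended."""
    return sorted(s for s, tools in RECOMMENDATIONS.items() if tool in tools)


print(recommend("Parallel feature development"))
print(scenarios_for("BMad"))
```

The reverse lookup is useful when a team has already shortlisted a tool and wants to see which scenarios the summary considers it suited for.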

SDD Maturity Taxonomy (Key Analytical Output)

The primary analytical contribution of spec-compare:

| Level | Name | Definition | Examples |
|-------|------|------------|----------|
| 1 | Spec-First | Specs precede coding but are discarded after use | Spec-Kit, Kiro, BMad |
| 2 | Spec-Anchored | Specs persist and evolve alongside code | OpenSpec, Spec Kitty |
| 3 | Spec-as-Source | Only specs are edited; code auto-generates | Tessl |
06

Memory Context

spec-compare — Memory and Context

State Storage

None. spec-compare is static documentation. No state files, no session management, no persistence.

Context Approach

spec-compare is read as reference material prior to making tool selection decisions. It has no runtime context mechanism.

Cross-Session Handoff

Not applicable.

Memory Type

Not applicable — there is no executable component.

07

Orchestration

spec-compare — Orchestration

Multi-Agent

No. spec-compare has no agents, no orchestration, and no execution layer.

Orchestration Pattern

None.

Execution Mode

Not applicable. spec-compare is read as static documentation.

Note

spec-compare analyzes orchestration patterns in other tools — notably git worktree orchestration in Spec Kitty, 21-agent hierarchical orchestration in BMad, and parallel agent runners like Conductor. But it does not implement any orchestration itself.

08

UI CLI Surface

spec-compare — UI, CLI, and Observability

Dedicated CLI Binary

None.

Local Web Dashboard

None.

IDE Integration

None.

Observability

Not applicable — spec-compare is static documentation with no runtime behavior to observe.

Access Method

Read directly on GitHub at https://github.com/cameronsjo/spec-compare, or clone locally with git clone.
