Blueprint (JuliusBrussee) / cavekit

blueprint-juliusbrussee · JuliusBrussee/blueprint · ★ 948 · last commit 2026-05-17

Compresses the entire SDD workflow into a single SPEC.md file using caveman encoding (~75% fewer tokens) with automatic bug-to-invariant backpropagation on test failure.

Best whenSDD frameworks bury their value under ceremony — the only things worth keeping are a durable spec, caveman encoding for token efficiency, and a backprop refl…

Skip ifSub-agents and parallel workers, Dashboards (cat SPEC.md is the dashboard)

vs seeds

spec-driver(spec-first, test-enforced, single spec file) but cavekit adds caveman encoding (§G/§C/§I/§V/§T/§B notation, ~75% token …

Primitive shape 8 total

Commands 3 Skills 5

Summary

Blueprint (JuliusBrussee) / cavekit — Summary

Important disambiguation: The GitHub repository JuliusBrussee/blueprint is NOT a planning tool named "Blueprint." The repository was renamed to cavekit (v4.0.0) as of current analysis. The README header reads "cavekit" and the plugin.json name field is "ck". This is a completely different product from imbue-ai/blueprint despite sharing the same repository name slug.

cavekit 4 is a compressed spec-driven development framework for Claude Code that compresses the entire SDLC into three slash commands and one file (SPEC.md) using "caveman encoding" — a ~75% token-reduction notation that replaces prose with symbols, fragments, and pipe tables. Its three commands are: /ck:spec (sole mutator of SPEC.md — creates, amends, or backprops bugs), /ck:build (single-thread plan→execute against spec with auto-backprop on failure), and /ck:check (read-only drift report listing §V/§I/§T violations). The central innovation is backpropagation: every test failure becomes a §B bug entry and a §V invariant in the spec, building institutional memory that prevents recurrence. With 948 stars and a previous v3.1.0 "Hunt lifecycle" version (12+ commands, 12 subagents), cavekit 4 is a deliberate minimization to the essential core. Compared to seeds, it is closest to spec-driver (spec-first, test-enforced, single-file SPEC.md) but differs in caveman encoding, explicit backprop protocol, and refusing all orchestration.

Overview

cavekit (JuliusBrussee/blueprint) — Overview

Origin

Created by JuliusBrussee. MIT license. Repository: JuliusBrussee/blueprint (renamed cavekit v4). 948 stars. Pushed 2026-05-17. Previous version v3.1.0 is frozen at git tag.

Philosophy

From README:

"Plan-then-execute forgets. SDD remembers — but most SDD frameworks bury that value under agent swarms, dashboards, and ceremony that costs more tokens than it saves." "Cavekit 4 is a rewrite from the ground up. It keeps only what earns its place."

cavekit's philosophy is aggressive minimalism: remove everything that doesn't pay for itself in token efficiency. The three things it keeps:

Durable spec — SPEC.md at repo root survives context resets
Caveman encoding — ~75% fewer tokens than prose
Backprop reflex — every test failure becomes a §B entry; classes of bug become §V invariants the spec never forgets

Caveman encoding

From README:

"~75% fewer tokens than prose. Symbols, fragments, pipe tables for repeating records."

Section markers: §G (goal), §C (constraints), §I (interfaces), §V (invariants), §T (tasks, pipe table), §B (bugs, pipe table).

Example task table:

§T
| id | status | task | cites |
|----|--------|------|-------|
| T1 | .      | impl auth mw | V2,I.api |
| T2 | ~      | add rate limit | V3 |

Status: . (pending), ~ (in progress), x (done)

Previous version (v3.1.0)

The prior "Hunt lifecycle" had: 4-command cycle (/ck:sketch → /ck:map → /ck:make → /ck:check) + 8 more commands (ship, review, revise, status, design, research, init, config, resume, help = 12 total), 12 named subagents, per-task token budgets, stop-hook state machine, model-tier routing, auto-backpropagation, tool-result caching, Codex peer review, knowledge-graph integration, caveman compression, parallel wave execution, team mode.

v4 deliberately removes all of this. v3.1.0 is preserved at the tag and still fully functional.

Non-goals

From README (explicit):

"no sub-agents. Main Claude does the work." "no dashboards. cat SPEC.md is the dashboard." "no parallel workers. One thread, one spec, one diff." "no JSON / YAML spec bodies. Markdown + pipe tables." "no hooks, no orchestration binaries, no TypeScript helpers."

Architecture

cavekit — Architecture

Distribution

Claude Code plugin: JuliusBrussee/cavekit
plugin.json name: ck
Version: 4.0.0
Install options:
1. npx skills add JuliusBrussee/cavekit (agentskills.io skills CLI)
2. /plugin marketplace add juliusbrussee/cavekit && /plugin install ck@cavekit (Claude Code marketplace)
3. git clone https://github.com/juliusbrussee/cavekit.git ~/.claude/plugins/cavekit

Required runtime

None beyond Claude Code (or compatible AI host)

Directory tree

JuliusBrussee/blueprint/  (repo name, product is cavekit)
├── README.md
├── FORMAT.md             # SPEC.md schema + caveman encoding rules
├── plugin.json           # {"name": "ck", "version": "4.0.0"}
├── CHANGELOG.md
├── FORMAT.md
├── LAUNCH-POST.md
├── UPGRADE.md
├── commands/
│   ├── spec.md           # /ck:spec slash command
│   ├── build.md          # /ck:build slash command
│   └── check.md          # /ck:check slash command
└── skills/
    ├── spec/
    │   └── SKILL.md      # spec skill (mirrors commands/spec.md)
    ├── build/
    │   └── SKILL.md      # build skill (mirrors commands/build.md)
    ├── check/
    │   └── SKILL.md      # check skill (mirrors commands/check.md)
    ├── caveman/
    │   └── SKILL.md      # caveman encoding utility
    └── backprop/
        └── SKILL.md      # bug → spec backpropagation utility

SPEC.md sections (from FORMAT.md)

§G  goal (1 line, caveman)
§C  constraints
§I  interfaces (external surfaces)
§V  invariants (numbered V1, V2...)
§T  tasks (pipe table: id | status | task | cites)
§B  bugs (pipe table: id | date | cause | fix)

Output file

SPEC.md at repo root (single file, sole state artifact)

Target AI tools

Claude Code primary. agentskills.io-compatible tools via skills CLI.

Components

cavekit — Components

Commands (3)

Command	Purpose
`/ck:spec`	Sole mutator of SPEC.md — create (NEW), distill from code (DISTILL), amend section (AMEND), backprop bug (BACKPROP)
`/ck:build`	Single-thread plan→execute against SPEC.md; auto-backprops on failure
`/ck:check`	Read-only drift report — lists §V/§I/§T violations without modifying spec

Skills (5)

Name	Purpose
`spec`	Mirrors /ck:spec command (skill activation)
`build`	Mirrors /ck:build command (skill activation)
`check`	Mirrors /ck:check command (skill activation)
`caveman`	Encoding utility — applies caveman notation rules to any text
`backprop`	Bug → §B + §V protocol (6 steps)

Dispatch Modes in /ck:spec

Mode	Trigger	Action
NEW	No SPEC.md + idea args	Full spec from scratch
DISTILL	No SPEC.md + `from-code`	Walk repo → extract §G/§C/§I/§V/§T/§B from code
BACKPROP	SPEC.md exists + args start `bug:`	Add §B row + §V invariant
AMEND	SPEC.md exists + `amend §V.3`	Targeted section edit

Backprop Protocol (6 steps)

Parse bug description
Find root cause (read relevant code)
Decide: would a new invariant catch recurrence?
Append §B row: B<N>|<date>|<cause>|V<N>
Append §V invariant (if applicable)
If behavioral fix → add/update §T rows

Single State File

SPEC.md — the only persistent artifact. Tasks tracked as pipe table rows with . / ~ / x status.

Prompts

cavekit — Prompts

Excerpt 1: /ck:spec — BACKPROP Mode

Source: skills/spec/SKILL.md

## BACKPROP — bug → §B + §V

Input: `bug: <description>`.

Steps:
1. Parse bug description.
2. Find root cause (read relevant code).
3. Decide: would a new invariant catch recurrence? If yes → draft `V<next>`.
4. Append §B row: `B<next>|<date>|<cause>|V<N>`.
5. Append §V invariant (if applicable).
6. If fix also changes behavior → add/update §T rows.
7. Show diff. Apply only on user OK.

Rule: every bug gets a §B entry. Invariant optional but preferred.

Technique: Structured failure-to-invariant extraction. Every bug triggers a root cause analysis, then a decision about whether an invariant would prevent recurrence. The invariant is added to the spec, not just to code comments — the spec learns from failures and becomes a better guide for future implementation.

Excerpt 2: /ck:build — EXECUTE loop

Source: skills/build/SKILL.md

## EXECUTE

Per task in order:

1. Flip §T.n status cell `.` → `~`. Just write to SPEC.md.
2. Edit code per plan.
3. Run verification command.
4. **Pass** → flip `~` → `x`. Next task.
5. **Fail** → invoke backprop skill. Do NOT retry blindly.

## FAIL → BACKPROP

On test/build failure:

1. Read failure output.
2. Ask: is failure (a) my code bug, (b) spec wrong, or (c) unspecified edge case?
3. If (a) → fix code, re-run. No spec change.
4. If (b) or (c) → invoke spec skill with `bug: <cause>` first, let it update §V and §B, then resume build against updated spec.

Rule: never silently fix root-cause without considering backprop. §B is the memory that stops recurrence.

Technique: Failure classification gate. Before retrying, the agent must classify the failure type (my bug / spec wrong / edge case), preventing blind retry loops. Type (b) or (c) forces a spec update before any retry — the implementation cannot proceed against a known-wrong spec.

Excerpt 3: /ck:spec — OUTPUT RULES (Caveman Encoding)

Source: skills/spec/SKILL.md

## OUTPUT RULES

- Caveman format per `FORMAT.md`.
- Preserve identifiers, paths, code verbatim.
- Numbering monotonic — never reuse §V.N or §B.N.
- §T row `cites` column ! list §V/§I deps: `T5|.|impl auth mw|V2,I.api`.

Technique: Monotonic identifier protocol + dependency citation. Invariant numbers (V1, V2...) and bug numbers (B1, B2...) are never reused — deleted invariants leave gaps in the numbering. Task rows cite their §V and §I dependencies explicitly in the cites column, creating a traceability matrix within the spec.

Uniqueness

cavekit — Uniqueness

Differs from Seeds

Closest seed is spec-driver (spec-first, test-enforced, single CLAUDE.md/SPEC.md file workflow). cavekit differs from spec-driver in three ways: (1) caveman encoding — ~75% token reduction vs prose specs, with a defined notation system (§G/§C/§I/§V/§T/§B) rather than free markdown; (2) formal backprop protocol — every test failure triggers a structured 6-step root-cause-to-invariant extraction that updates the spec, whereas spec-driver does not mandate spec learning from failures; (3) explicit non-goals — no subagents, no dashboards, no parallel workers, no YAML/JSON bodies — these are stated design choices, not gaps. Compared to kiro (spec storage in .kiro/specs/requirements.md/design.md/tasks.md), cavekit uses a single flat SPEC.md and custom section syntax rather than multiple files. cavekit is the only framework in this batch or the seeds with a designed token compression encoding for spec files.

Two-Blueprint Disambiguation

JuliusBrussee/blueprint (this repo, slug: blueprint-juliusbrussee) is cavekit 4 — a compressed spec framework. imbue-ai/blueprint (slug: blueprint-imbue) is a planning Q&A tool. They share only the name "blueprint" as a previous identity. See batch notes.

Version history note

v3.1.0 (frozen at tag) is a completely different product — 12+ commands, 12 subagents, parallel waves, peer review. v4.0.0 is a deliberate minimization. Both are valid and functional; v4 is the "distilled core," v3.1.0 is the "autonomous loop." The README recommends choosing based on use case, not defaulting to v4.

Positioning

cavekit occupies the "maximum token efficiency, minimum ceremony" niche in SDD frameworks. It is the only framework that makes token budget a first-class design constraint — the encoding notation exists specifically to keep SPEC.md loadable in full on every turn.

Observable Failure Modes

Caveman encoding learning curve: Teams unfamiliar with §V/§B notation must read FORMAT.md before contributing to SPEC.md.
Single file as bottleneck: Large projects with many modules may outgrow the single-SPEC.md model (tasks.md + modules could be hundreds of rows).
Repository name confusion: JuliusBrussee/blueprint is the GitHub repo name but the product is cavekit — discoverability and naming are inconsistent.
Manual plan approval: /ck:build shows the plan and waits for user OK before executing, breaking autonomous pipelines.
No merge/PR automation: cavekit tracks task status but does not create branches, commits, or PRs automatically.

Workflow

cavekit — Workflow

Main Workflow

/ck:spec → SPEC.md created (NEW/DISTILL/AMEND/BACKPROP)
    ↓
user review of spec
    ↓
/ck:build [§T.n | --next | --all]
    → LOAD: Read SPEC.md + FORMAT.md
    → PLAN: Native plan mode per task (cite §V + §I; list files + tests)
    → User OK on plan (or auto mode)
    → EXECUTE per task:
         flip §T.n → ~ (in progress)
         edit code
         run verification command
         PASS → flip ~ → x, next task
         FAIL → BACKPROP → spec updated → resume
    ↓
/ck:check → drift report (violations listed, no changes made)

Approval Gates

NEW spec: "spec OK? suggest edits or invoke build"
build plan: show plan, wait for user OK (unless auto mode)
BACKPROP amend: show diff, apply only on user OK

Task Lifecycle

Status	Meaning
`.`	Pending
`~`	In progress
`x`	Complete

Fail → Backprop Loop

On test/build failure:

Read failure output
Ask: is failure (a) my code bug, (b) spec wrong, or (c) unspecified edge case?
(a) → fix code, re-run (no spec change)
(b) or (c) → invoke spec skill with bug: <cause> → spec updated → resume build against updated spec

Rule: Never silently fix root cause without considering backprop. §B is the memory that stops recurrence.

Artifacts

Phase	Artifact
/ck:spec	`SPEC.md` at repo root
/ck:build	committed code; §T status updates; §B + §V entries on failure
/ck:check	drift report (console output only)

Memory Context

cavekit — Memory & Context

State Storage

Single file: SPEC.md at repo root.

Memory Encoding

Caveman notation — ~75% fewer tokens than prose:

Goal (§G): 1 line, fragments and symbols
Constraints (§C): bulleted fragments
Interfaces (§I): external surface names
Invariants (§V): numbered, monotonic, never reused
Tasks (§T): pipe table with status dots and cites column
Bugs (§B): pipe table with date, cause, fix columns

The §B + §V sections are the institutional memory — bugs are never deleted, only added, creating an append-only failure log.

Persistence Scope

Project-level. SPEC.md is a single file at repo root — survives context resets by design.

Cross-Session Handoff

Yes — SPEC.md is the primary context recovery artifact. On any new session, reading SPEC.md gives full project state (goal, constraints, interfaces, invariants, task progress, bug history) in ~75% fewer tokens than equivalent prose.

Context Compaction

SPEC.md IS the compaction mechanism. The caveman encoding is explicitly designed to keep the full spec loadable within a single context window. §T status (./~/x) provides task progress at a glance.

Memory Type

File-based, single file. No database, no vector search.

Search

None. The spec is small enough to read in full.

Orchestration

cavekit — Orchestration

Multi-Agent

Explicitly no. From README:

"no sub-agents. Main Claude does the work." "no parallel workers. One thread, one spec, one diff."

Orchestration Pattern

Sequential. Single thread, single spec.

Isolation Mechanism

None.

Multi-Model

No. Single model (main Claude).

Execution Mode

Interactive-loop. User invokes commands explicitly.

Crash Recovery

Yes — SPEC.md task status (./~/x) is written per task. A crashed build resumes from the last ~ task.

Context Compaction

SPEC.md is the compaction artifact. The caveman encoding keeps the full project state loadable in minimal tokens.

Cross-Session Handoff

Yes — SPEC.md provides full state recovery.

Prompt Chaining

Yes — /ck:spec output (SPEC.md) is the primary input to /ck:build and /ck:check. Backprop updates SPEC.md, which is immediately available to the ongoing build session.

Subagent Definition Format

None.

Previous version (v3.1.0) comparison

v3.1.0 had 12 named subagents, parallel wave execution, team mode, Codex peer review. All of these were removed in v4.0.0. The current version is a deliberate retreat from orchestration complexity.

Ui Cli Surface

cavekit — UI & CLI Surface

CLI Binary

None. All interaction via Claude Code slash commands and skill activation.

Slash Commands

Command	Invocation
`/ck:spec`	Create/amend/backprop SPEC.md
`/ck:build`	Build against SPEC.md
`/ck:check`	Drift report

Local UI

From README:

"cat SPEC.md is the dashboard."

No web dashboard, no TUI, no metrics panel.

IDE Integration

Claude Code plugin via /plugin marketplace add juliusbrussee/cavekit. Also agentskills.io skills via npx skills add JuliusBrussee/cavekit.

Observability

SPEC.md §B section is the audit log (append-only bug table). No structured JSON/JSONL logging.

Previous version (v3.1.0)

v3.1.0 had a stop-hook state machine and token budget tracking per task. Both removed in v4.

Related frameworks

same archetype · same primary tool · same memory type

Context Mode ★ 16k

A19 Token-efficient encoding

Keeps raw tool output data out of the context window via sandbox execution and SQLite+FTS5 session indexing, reducing context…

lean-ctx ★ 2.2k

A19 Token-efficient encoding

A full-session context runtime that compresses file reads (10 modes), shell output (60+ patterns), and session memory (CCP) to…

Nemp Memory ★ 101

A19 Token-efficient encoding

Persists AI agent context across sessions as 100%-local plain JSON files with zero dependencies, zero cloud, and agent identity…

CogniLayer v4 ★ 28

A19 Token-efficient encoding

Provides AI coding agents with typed semantic memory, tree-sitter code intelligence, and a multi-agent coordination protocol to…

cursor-coding-agent-os (Mugiwara555343) ★ 3

A19 Token-efficient encoding

Lean/Verbose dual-mode Agent OS fork for solo developers on token budgets.

rtk (Real Token Killer) ★ 55k

A19 Token-efficient encoding

Intercepts Claude Code's Bash tool calls at the PreToolUse hook and compresses verbose CLI output (git status, test runners,…

Distribution

Type: claude-plugin
License: MIT
Install: one-liner
Version: 4.0.0

Surfaces

CLI binary: No
CLI subcmds: 0
Local UI: No
Tech stack: none

Components

Commands: 3
Skills: 5
Subagents: 0
Hooks: 0
MCP servers: 0
MCP tools: 0
Scripts: 0
Templates: 1

Workflow

Phases: 3
Approval gates: 2
Spec format: proprietary
Spec storage: flat-files
Delta or full: whole-file

Orchestration

Multi-agent: No
Pattern: sequential
Max concurrent: 1
Isolation: none
Consensus: none
Prompt chaining: Yes

Multi-model

Multi-model: No
BYOK: Yes
Modal: text

Execution

Mode: interactive-loop
Crash recovery: Yes
Compaction: Yes
Session handoff: Yes
Streaming: No

Memory

Type: file-based
Persistence: project
Search: none
State files: 1 file

Quality

TDD: Yes
TDD mechanism: post-hook-test-runner
Validators: 1
Self-review: none

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: Yes
Audit format: structured-md
Replay: No

Tools

Primary: claude-code
Targets: 1
Portability: low

Signals

Stars: 948
Last commit: 2026-05-17
Maintainer: active
Quality score: 5.4/10