
Chachamaru Claude Code Harness

chachamaru-cc-harness · Chachamaru127/claude-code-harness · ★ 1.6k · last commit 2026-05-26

Turns raw Claude Code sessions into a disciplined Plan→Work→Review→Release delivery loop with spec contracts and worktree-isolated agent teams.

Best when: The Go binary enforcing constraints at the OS level, outside the LLM loop, is the only reliable way to prevent agent drift.
Skip if: Promoting unobserved data into claims (not_observed != absent); starting work before the plan contract is approved.
vs seeds
superpowers (skills-as-behavioral-framework, Claude Code plugin) but adds a compiled Go enforcement binary, named worktree-isolated …
Primitive shape 53 total
Commands 7 Skills 34 Subagents 4 Hooks 8
00

Summary

Chachamaru Claude Code Harness — Summary

Claude Code Harness by Chachamaru is the most architecturally complete single-author harness in this batch. It combines a Go-native binary (bin/harness), a TOML-driven config generator (harness.toml), a Claude Code plugin, Codex CLI and OpenCode compatibility paths, 34+ named skills, 4 named subagents, and a runtime hook system covering the PreToolUse, PermissionRequest, and Setup lifecycle events.

The central thesis is "Plan→Work→Review→Release as a disciplined delivery loop": the 5-verb surface (harness-plan, harness-work, harness-review, harness-sync, harness-release) is an enforced workflow gate, not a collection of optional helpers. The bin/harness binary supplies permission enforcement, inbox checking, and migration reporting outside the Claude Code runtime, making this the only harness in the batch that ships language-native compiled tooling. Worktree-per-agent isolation is declared in the agent definitions (worker.md declares isolation: worktree). TDD is enforced at the skill level; skipping it requires the --tdd-bypass flag and leaves a mandatory audit trail, which places the harness in the "persona-instruction" enforcement camp, but with an escape valve.

Differs from seeds: closest to superpowers (a skills-only behavioral framework), but adds a CLI binary, a TOML config layer, named subagents, and multi-tool compatibility that superpowers lacks. It also resembles kiro as a holistic delivery harness, but is distributed as a Claude Code plugin rather than a closed IDE.

01

Overview

Chachamaru Claude Code Harness — Overview

Origin

GitHub: https://github.com/Chachamaru127/claude-code-harness
Version analyzed: v4.12.3 (from harness.toml)
Language: Shell (primary), Go (binary), with Japanese-first documentation.
Stars: 1,611 as of 2026-05-26.

Philosophy

From the README:

"Claude Code is powerful, but raw agent work drifts: plans live in chat, tests become optional, review happens too late, and release evidence gets rebuilt by memory. Harness turns that into one repeatable operating path."

The stated goal is to shift the user's job from "hand-write the plan" to "approve or correct the generated contract before execution continues." Harness occupies the meta-layer: it frames every agent session as a contract lifecycle (spec.md → Plans.md → implementation → verification → release evidence).

Manifesto-Style Quotes

From README:

"After install, the default changes from 'ask the agent to code' to: 1. write the spec and plan, 2. implement only the approved slice, 3. verify the result, 4. review independently, 5. package evidence for PR or release."

From harness-plan SKILL.md:

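"harness-plan は、spec.md product contract and Plans.md task contract の co-required planning output を作る planning surface である." (harness-plan is the planning surface that produces the co-required planning output: the spec.md product contract and the Plans.md task contract.)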

From agent worker.md:

"推測で要件を足さない。未確認事項は 'missing-input' として明示する." (Do not add requirements by inference. Flag unchecked items as 'missing-input'.)

From harness-work SKILL.md:

"明示的なモードフラグは常にオートモードを上書きする" (Explicit mode flags always override auto-mode selection.)

Key Design Decisions

  1. TOML as source of truth: harness.toml is the project config; bin/harness sync regenerates all plugin files from it.
  2. Go binary for guardrails: compiled harness binary handles hook dispatch, permission enforcement, and migration reporting without requiring Node.js.
  3. 5-verb discipline: the only valid entry points are plan/work/review/sync/release — no catch-all "chat with agent" path.
  4. not_observed != absent: missing local proof means "not proven here", not "impossible". A documented epistemic stance.
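
The not_observed != absent stance in item 4 amounts to a three-valued status rather than a boolean. A minimal Python sketch of that idea follows; the enum and function names are hypothetical illustrations, not identifiers from the harness.

```python
from enum import Enum


class Evidence(Enum):
    """Three-valued observation status (hypothetical names, not from the harness)."""
    OBSERVED = "observed"          # proof was found locally
    NOT_OBSERVED = "not_observed"  # no local proof; says nothing about existence
    ABSENT = "absent"              # a check ran and positively ruled it out


def may_claim_absent(status: Evidence) -> bool:
    """Only a positive negative result supports an 'it does not exist' claim."""
    return status is Evidence.ABSENT


def describe(status: Evidence) -> str:
    """Render the status in the wording used by the design decision above."""
    if status is Evidence.OBSERVED:
        return "proven here"
    if status is Evidence.NOT_OBSERVED:
        return "not proven here"
    return "ruled out here"
```

Keeping NOT_OBSERVED and ABSENT as separate values means a report cannot silently collapse "nobody checked" into "it is not there".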
02

Architecture

Chachamaru Claude Code Harness — Architecture

Distribution

  • Primary: Claude Code plugin marketplace (/plugin marketplace add Chachamaru127/claude-code-harness)
  • Codex CLI: scripts/setup-codex.sh --user
  • OpenCode: scripts/setup-opencode.sh
  • Cursor / GitHub Copilot CLI: candidate paths (no install evidence yet)

Install Complexity

Multi-step: marketplace add + plugin install + /harness-setup initialization.

Directory Tree

claude-code-harness/
├── .claude-code-harness.config.yaml  # runtime config
├── .claude-plugin/                   # Claude Code plugin manifest
├── .claude/                          # memory/, output-styles/, rules/
├── .codex-plugin/                    # Codex plugin support
├── .cursor/                          # Cursor profile
├── agents/                           # 4 subagent definitions
│   ├── advisor.md
│   ├── reviewer.md
│   ├── scaffolder.md
│   └── worker.md
├── bin/                              # Go binaries
│   ├── harness                       # Universal launcher
│   ├── harness-darwin-amd64
│   ├── harness-darwin-arm64
│   ├── harness-linux-amd64
│   └── harness-windows-amd64.exe
├── codex/                            # Codex-specific files
├── go/                               # Go source
├── hooks/
│   └── hooks.json                    # PreToolUse/PermissionRequest/Setup hooks
├── opencode/                         # OpenCode-specific files
├── output-styles/                    # Output format templates
├── scripts/                          # Setup and migration scripts
├── skills/                           # 34+ skill directories
├── templates/                        # Project templates
├── tests/                            # Test suite
├── workflows/                        # Workflow definitions
├── harness.toml                      # Master config (v4.12.3)
└── spec.md                           # Self-describing spec
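
The bin/ listing shows one launcher plus four platform builds. How the launcher chooses a build is not documented here; the Python sketch below shows one plausible selection rule, assuming it keys on operating system and CPU architecture. The function name pick_binary and the alias table are hypothetical; only the file names come from the tree above.

```python
from typing import Dict, Tuple

# File names exactly as they appear in the bin/ listing.
BINARIES: Dict[Tuple[str, str], str] = {
    ("darwin", "amd64"): "harness-darwin-amd64",
    ("darwin", "arm64"): "harness-darwin-arm64",
    ("linux", "amd64"): "harness-linux-amd64",
    ("windows", "amd64"): "harness-windows-amd64.exe",
}

# Common platform.machine() spellings mapped to Go-style arch names (assumption).
ARCH_ALIASES = {"x86_64": "amd64", "amd64": "amd64", "arm64": "arm64", "aarch64": "arm64"}


def pick_binary(system: str, machine: str) -> str:
    """Map an OS name and machine string to a bundled binary name."""
    arch = ARCH_ALIASES.get(machine.lower(), machine.lower())
    key = (system.lower(), arch)
    if key not in BINARIES:
        raise LookupError(f"no bundled binary for {key[0]}/{key[1]}")
    return BINARIES[key]
```

In practice the inputs would come from platform.system() and platform.machine(). Under this sketch, a platform outside the four listed builds (for example Linux on arm64) raises instead of falling back to a wrong binary.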

Required Runtime

  • Claude Code v2.1+ (supported path)
  • Prebuilt Go binary (bundled; no Go toolchain install needed)
  • No Node.js required

Target AI Tools

Tool                Tier
Claude Code         supported
Codex CLI           internal-compatible
Codex app           candidate
OpenCode            internal-compatible
Cursor              candidate
GitHub Copilot CLI  candidate
03

Components

Chachamaru Claude Code Harness — Components

Skills (34+)

Name                Purpose
harness-plan        Core planning: spec.md + Plans.md generation with quality scoring
harness-work        Task execution: solo/parallel/breezing/codex modes
harness-review      Independent review phase, major findings block completion
harness-release     Release preflight and evidence packaging
harness-setup       One-time project initialization
harness-sync        Synchronize Plans.md with implementation state
harness-accept      Acceptance criteria verification
harness-loop        Continuous loop management
harness-plan-brief  Lightweight plan variant
harness-progress    Progress tracking
session-control     Session lifecycle management
session-init        Session initialization
session-memory      Cross-session memory management
session-state       State persistence
session             General session handling
memory              Memory read/write primitives
principles          Behavioral principles injection
routing-rules       Tool routing configuration
agent-browser       Browser automation skill
auth                Authentication patterns
breezing            Full-team parallel execution
cc-cursor-cc        Cross-harness bridging
cc-update-review    Update and review cycle
ci                  CI failure recovery
crud                CRUD pattern enforcement
deploy              Deployment patterns
generate-slide      Presentation generation
generate-video      Video generation
gogcli-ops          Go CLI operations
maintenance         Maintenance protocols
notebookLM          NotebookLM integration
ui                  UI implementation patterns
vibecoder-guide     Vibe coding guidance
workflow-guide      Workflow reference

Agents (4 subagents)

Name           Role
worker.md      Implements tasks in TDD→impl→preflight→verify→commit-prep cycle; isolation: worktree
reviewer.md    Independent review of worker output
advisor.md     Strategic advice before implementation
scaffolder.md  Project scaffolding

Hooks

File: hooks/hooks.json

Event              Matcher                                     Action
PreToolUse         Write|Edit|MultiEdit|Bash|Read              bin/harness hook pre-tool (Go binary)
PreToolUse         AskUserQuestion                             bin/harness hook ask-user-question-normalize
PreToolUse         Write|Edit                                  bin/harness hook inbox-check + agent review for secrets/TODO stubs
PreToolUse         mcp__chrome-devtools__.|mcp__playwright__.  bin/harness hook browser-guide
PermissionRequest  Edit|Write|MultiEdit                        bin/harness hook permission
PermissionRequest  Bash (git/npm/test)                         bin/harness hook permission (auto-allow pattern)
Setup              init                                        bin/harness hook setup-init
Setup              maintenance                                 maintenance hook
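
To make the Event/Matcher/Action rows concrete, here is a minimal Python sketch of how a dispatcher could pick actions for a tool call. It assumes matchers are regular expressions that must match the whole tool name; that semantics, the HOOKS list shape, and the function name actions_for are assumptions for illustration, and only a subset of the rows is included.

```python
import re
from typing import List

# (event, matcher, action) rows copied from the hooks table; a subset for brevity.
HOOKS = [
    ("PreToolUse", r"Write|Edit|MultiEdit|Bash|Read", "hook pre-tool"),
    ("PreToolUse", r"AskUserQuestion", "hook ask-user-question-normalize"),
    ("PreToolUse", r"Write|Edit", "hook inbox-check"),
    ("PermissionRequest", r"Edit|Write|MultiEdit", "hook permission"),
]


def actions_for(event: str, tool_name: str) -> List[str]:
    """Return, in table order, every action whose event and matcher both apply."""
    return [
        action
        for hook_event, matcher, action in HOOKS
        if hook_event == event and re.fullmatch(matcher, tool_name)
    ]
```

Under these assumptions a Write call triggers both the pre-tool and inbox-check actions, while a Read call triggers only pre-tool, because the second Write|Edit row does not list Read.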

Scripts

Path                       Purpose                                    Trigger
scripts/setup-codex.sh     Codex CLI installation                     manual
scripts/setup-opencode.sh  OpenCode installation                      manual
bin/harness                Go binary for hooks/permissions/migration  hook/manual

Config Files

  • harness.toml — master configuration, generates plugin files on harness sync
  • .claude-code-harness.config.yaml — runtime override config
  • claude-code-harness.config.example.json — example config
  • claude-code-harness.config.schema.json — JSON schema for config validation

Templates

Located in templates/ — project templates for common patterns.

05

Prompts

Chachamaru Claude Code Harness — Prompts

Excerpt 1: harness-plan SKILL.md (quality flow)

Technique: Iron-law prompt with rationalization table + precedence hierarchy

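(Translated from the Japanese original.)

harness-plan is the planning surface that produces the co-required planning output: the spec.md product contract and the Plans.md task contract.
Keep the precedence as `spec.md > sub-spec > Plans.md`.
Plans.md is the task ledger and the root `spec.md` is the product contract; do not invert this hierarchy.
Do not drop the information you are given into Plans.md as-is.
When creating a plan or adding a large task, check the latest information, existing specs, memory, and multi-perspective discussion,
and convert only the elements this product should adopt into the task contract.
`/harness-plan create` returns a `Spec delta` or a `Spec skip reason` together with the generated `Plans.md` tasks.
The output must always include a `Spec delta` or a `Spec skip reason`.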

This uses a "dual-output requirement" pattern: every plan creation must produce either a spec delta OR an explicit spec-skip reason — never just code tasks.
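
The dual-output requirement can be checked mechanically. The following Python sketch assumes the two markers appear literally in the plan output; the function name check_plan_output and the returned labels are hypothetical, and substring matching is a simplification of whatever the skill actually does.

```python
def check_plan_output(text: str) -> str:
    """Classify a plan output by which spec marker it carries.

    Raises ValueError when neither marker is present, mirroring the rule
    that a plan may not consist of tasks alone.
    """
    has_delta = "Spec delta" in text
    has_skip = "Spec skip reason" in text
    if not (has_delta or has_skip):
        raise ValueError("plan output needs 'Spec delta' or 'Spec skip reason'")
    # If both markers appear, the delta wins in this sketch (an assumption).
    return "spec-delta" if has_delta else "spec-skip"
```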

Excerpt 2: harness-work SKILL.md (auto-selection)

Technique: Decision table encoding in prompt + explicit override protocol

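(Translated from the Japanese original.)

## Execution Mode Auto Selection (automatic choice when no flag is given)

When no explicit mode flag (`--parallel`, `--breezing`, `--codex`) is given,
the best mode is selected automatically based on the number of target tasks:

| Target tasks | Auto-selected mode | Reason |
|-------------|---------------|------|
| **1** | Solo | Minimal overhead. Direct implementation is fastest |
| **2-3** | Parallel (Task tool) | Threshold where Worker separation starts to pay off |
| **4 or more** | Breezing | The three-way split of Lead coordination + parallel Workers + independent Reviewer is effective |

### Rules
1. **Explicit flags always override auto mode**
   - `--parallel N` → Parallel mode (regardless of task count)
   - `--breezing` → Breezing mode (regardless of task count)
   - `--codex` → Codex mode (regardless of task count)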

Excerpt 3: worker agent (isolation declaration)

Technique: Structured JSON input contract + ordered startup checklist

model: claude-sonnet-4-6
isolation: worktree

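initialPrompt: |
  After the session starts, first confirm the following 4 points in this order.
  1. task and task_id
  2. files that may be changed
  3. paths to the DoD and the sprint-contract
  4. path to the canonical spec, or spec_skip_reason
  5. verification commands to run
  Then proceed in this order: TDD decision -> implementation -> preflight -> verification -> commit preparation.
  Do not add requirements by guessing. Mark unconfirmed items explicitly as "missing-input".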

Excerpt 4: hooks.json — inline agent review hook

Technique: In-hook LLM review with structured deny/allow output

{
  "type": "agent",
  "prompt": "Review the following code change for quality issues. Check if the change: (1) introduces hardcoded secrets or credentials, (2) leaves TODO/FIXME stubs without implementation, (3) has obvious security vulnerabilities (SQL injection, XSS, command injection). If any issue is found, return JSON with permissionDecision: 'deny' and permissionDecisionReason explaining the issue. If the change looks acceptable, return nothing (exit 0). Input: $ARGUMENTS",
  "model": "haiku",
  "timeout": 30
}
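
The prompt above defines a small output protocol: no output means allow, and a JSON object with permissionDecision set to 'deny' means block. A minimal Python sketch of a consumer for that protocol follows. The function name interpret_review is hypothetical, and treating unparseable output as a denial is an assumption (fail closed), not documented harness behavior.

```python
import json
from typing import Tuple


def interpret_review(stdout: str) -> Tuple[bool, str]:
    """Return (allowed, reason) for the reviewer hook's raw output."""
    if not stdout.strip():
        return True, ""  # empty output: the change looked acceptable
    try:
        payload = json.loads(stdout)
    except json.JSONDecodeError:
        # Assumption: fail closed when the reviewer output cannot be parsed.
        return False, "unparseable reviewer output"
    if isinstance(payload, dict) and payload.get("permissionDecision") == "deny":
        return False, str(payload.get("permissionDecisionReason", ""))
    return True, ""
```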
09

Uniqueness

Chachamaru Claude Code Harness — Uniqueness

differs_from_seeds

Closest to superpowers (skills-as-behavioral-framework, Claude Code plugin) but differs in three significant ways: (1) a compiled Go binary that runs outside the LLM loop as an enforcement engine — superpowers has no binary; (2) named subagent definitions with worktree isolation per worker — superpowers has no named agents; (3) a TOML-driven config system (harness.toml) that generates all plugin files, making the harness itself a meta-level generator. Also resembles kiro in the sense of treating spec.md + Plans.md as a binding contract, but kiro is a closed IDE while Chachamaru distributes as a Claude Code plugin with optional Codex/OpenCode paths. Unlike BMAD-METHOD, the subagents here are not personas but functional roles with strict input/output contracts.

Positioning

This is the most technically complete Claude Code harness in the batch. It is the only framework in the batch that ships a compiled binary, so it can enforce constraints at the OS level (process exit codes, file permission checks) rather than relying entirely on the LLM to follow instructions.

Observable Failure Modes

  1. Token-intensive: The 5-verb workflow with quality checks at every stage will consume significant context budget. Large Plans.md files may exhaust context before reaching implementation.
  2. Japanese-first documentation: Most skill files are bilingual (JP+EN) but detailed reasoning is in Japanese — non-Japanese users may miss nuances.
  3. Codex/OpenCode parity gap: "internal-compatible" is weaker than "supported" — the README explicitly warns not to inherit support claims from other projects.
  4. TOML indirection: harness.toml as config requires harness sync to apply changes — users who edit plugin files directly will see their changes overwritten.
  5. Go binary trust: The compiled binary is distributed as a pre-built artifact; users cannot audit its behavior without reading the Go source.
04

Workflow

Chachamaru Claude Code Harness — Workflow

Phases

Stage        Skill             Output                   Gate
Investigate  (ad-hoc)          Evidence + unknowns      Do not promote unobserved data into claims
Plan         /harness-plan     spec.md + Plans.md       User approves or corrects the generated contract
Work         /harness-work     Code + tests             TDD required when task says so; --tdd-bypass requires explicit audit reason
Review       /harness-review   Independent verdict      Major findings block completion
PR           (manual)          Evidence pack            PR-ready is not release-ready
Release      /harness-release  Tag + release artifacts  Release preflight must pass

Execution Mode Auto-Selection (harness-work)

Task count  Auto mode
1           Solo (direct implementation)
2-3         Parallel (Task tool worker separation)
4+          Breezing (Lead + Worker + Reviewer triad)

Explicit flags override: --parallel N, --breezing, --codex.
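
The auto-selection table and the flag override rule reduce to a few lines of logic. The Python sketch below encodes them under the thresholds stated above; the function name select_mode and the lowercase mode strings are hypothetical, and the N argument of --parallel is left out for brevity.

```python
from typing import Optional

EXPLICIT_MODES = ("parallel", "breezing", "codex")


def select_mode(task_count: int, flag: Optional[str] = None) -> str:
    """Pick an execution mode; an explicit flag always wins over the task count."""
    if flag is not None:
        if flag not in EXPLICIT_MODES:
            raise ValueError(f"unknown mode flag: {flag}")
        return flag
    if task_count < 1:
        raise ValueError("task_count must be at least 1")
    if task_count == 1:
        return "solo"
    if task_count <= 3:
        return "parallel"
    return "breezing"
```

For example, select_mode(5) returns "breezing", while select_mode(5, flag="codex") returns "codex" regardless of the count.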

Approval Gates

  1. Plan approval: user must approve or correct spec.md + Plans.md before work begins.
  2. TDD gate: tests written before implementation (bypassable with explicit reason written to audit log).
  3. Review blocker: major findings from /harness-review halt progress.
  4. Release preflight: /harness-release runs readiness checklist before tagging.

Artifact Map

Phase    Artifact
Plan     spec.md (product contract), Plans.md (task contract)
Work     implementation code, test files
Review   review artifact (verdict + findings)
Release  CHANGELOG boundary, tag, evidence package
Audit    .claude/state/contracts/<task>.sprint-contract.json
06

Memory Context

Chachamaru Claude Code Harness — Memory & Context

State Storage

  • spec.md: product contract (root truth)
  • Plans.md: task ledger with completion markers (cc:完了, i.e. "cc:done")
  • .claude/state/contracts/<task>.sprint-contract.json: per-task execution contracts
  • .claude/state/active-plan.json: tracks the currently active named plan
  • plans/manifest.json: registry of named plans (supports multiple parallel plan sets)
  • output/: harness-managed output directory (structured evidence)
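
The per-task contract path follows a fixed pattern, so it can be derived from a task identifier. A minimal Python sketch, assuming the task identifier is a single path-safe token; the helper name contract_path and the validation rule are hypothetical.

```python
from pathlib import PurePosixPath

CONTRACT_DIR = PurePosixPath(".claude/state/contracts")


def contract_path(task: str) -> str:
    """Build the sprint-contract path for a task, rejecting path-like identifiers."""
    if not task or "/" in task or task in (".", ".."):
        raise ValueError(f"invalid task identifier: {task!r}")
    return str(CONTRACT_DIR / f"{task}.sprint-contract.json")
```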

Memory Type

File-based, project-scoped. No external database required.

Session Persistence

  • harness-sync reconciles Plans.md against implementation state across sessions
  • --resume <id|latest> flag on harness-work restarts interrupted tasks
  • /recap command re-summarizes context after long absence before resuming
  • Optional: harness-mem for extended memory (referenced in README, separate repo)

Context Compaction

The skill system uses Claude Code's native compaction. harness-plan tracks spec.md > sub-spec > Plans.md precedence to prevent plan drift across compact events.
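
The precedence rule spec.md > sub-spec > Plans.md can be read as "when sources disagree, the highest-ranked one present wins". A minimal Python sketch of that reading; the function name authoritative and the dictionary input shape are hypothetical.

```python
from typing import Dict, Tuple

# Highest authority first, as stated by harness-plan.
PRECEDENCE = ["spec.md", "sub-spec", "Plans.md"]


def authoritative(sources: Dict[str, str]) -> Tuple[str, str]:
    """Return (source name, statement) from the highest-precedence source present."""
    for name in PRECEDENCE:
        if name in sources:
            return name, sources[name]
    raise KeyError("no known source supplied")
```

After a compaction event, re-deriving a disputed requirement this way keeps Plans.md from overriding the product contract.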

Handoffs

  • Sprint contracts (.claude/state/contracts/) provide structured handoff context for parallel workers
  • Named plan switching (.claude/state/active-plan.json) enables multi-feature parallel development
  • bin/harness doctor --migration-report inventories old state without deletion for migration handoffs

Audit Trail

The Go binary writes structured logs for hook invocations. TDD bypass events require HARNESS_TDD_BYPASS_REASON environment variable or explicit inline reason, written to audit log.
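
The bypass rule can be sketched as "no reason, no bypass". The Python example below reads the HARNESS_TDD_BYPASS_REASON variable named above; the function names, the preference for an inline reason over the environment variable, and the JSON shape of the audit record are all assumptions for illustration.

```python
import json
import os
from typing import Mapping, Optional


def resolve_bypass_reason(inline_reason: Optional[str] = None,
                          env: Mapping[str, str] = os.environ) -> str:
    """Return the bypass reason, or refuse the bypass when none is given."""
    reason = (inline_reason or env.get("HARNESS_TDD_BYPASS_REASON", "")).strip()
    if not reason:
        raise PermissionError("TDD bypass requested without a reason")
    return reason


def audit_line(task_id: str, reason: str) -> str:
    """Serialize one audit record as a JSON line (hypothetical format)."""
    return json.dumps({"event": "tdd_bypass", "task_id": task_id, "reason": reason},
                      sort_keys=True)
```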

07

Orchestration

Chachamaru Claude Code Harness — Orchestration

Multi-Agent Architecture

Yes. The breezing mode instantiates a team with three roles:

  • Lead: coordinates task execution and progress tracking
  • Worker: implements assigned tasks (4 workers max by default, configurable via --parallel N)
  • Reviewer: independent review at task completion

All agents are defined as .md files in agents/ with YAML frontmatter specifying model, isolation, skills, and initialPrompt.

Orchestration Pattern

task-decomposition-tree: Plans.md provides the task tree; harness-work decomposes it into solo/parallel/breezing execution based on task count.

The spawn mechanism is Claude Code's Task tool — workers are spawned via the Task tool from within harness-work.

Isolation Mechanism

git-worktree: the worker agent declares isolation: worktree. Each worker operates in its own worktree, preventing concurrent edits to shared files.
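
For readers unfamiliar with worktrees, the Python sketch below builds the kind of git command that gives one worker its own checkout and branch. The isolation itself is declared in worker.md and handled by the runtime; the .worktrees/ directory layout, the worker/<task> branch naming, and the function name worktree_command are hypothetical.

```python
from pathlib import PurePosixPath
from typing import List


def worktree_command(repo_root: str, task_id: str) -> List[str]:
    """Build a `git worktree add` command for one worker (not executed here)."""
    path = PurePosixPath(repo_root) / ".worktrees" / task_id  # hypothetical layout
    branch = f"worker/{task_id}"                              # hypothetical naming
    return ["git", "-C", repo_root, "worktree", "add", "-b", branch, str(path)]
```

Because each worker gets a separate directory and branch, two workers editing the same file do not touch the same working copy; conflicts surface later, at merge time.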

Multi-Model

Yes (limited): the inline hook review uses model: haiku for fast code review at the PreToolUse hook level. The worker is pinned to claude-sonnet-4-6 in the worker.md frontmatter; the main agent uses the session model.

Execution Mode

Interactive-loop: driven by slash commands (/harness-plan, /harness-work, etc.). Not a background daemon; each command invokes a discrete agent session.

Crash Recovery

--resume <id|latest> flag on harness-work restarts interrupted sessions. The sprint contract file preserves task state across interruptions.

Cross-Tool Portability

Medium: Claude Code is the fully supported path. Codex CLI and OpenCode are "internal-compatible" (install scripts exist, runtime parity not claimed). Cursor is a candidate.

Consensus

None. The reviewer agent produces a verdict that blocks or allows completion; no voting or quorum mechanism.
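
With a single reviewer, the gate reduces to one rule: any major finding blocks completion. A minimal Python sketch of that rule; the Finding type, the severity labels, and the verdict strings are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Finding:
    severity: str  # "major" or "minor" (hypothetical labels)
    message: str


def verdict(findings: List[Finding]) -> str:
    """Single-reviewer gate: one major finding is enough to block."""
    if any(f.severity == "major" for f in findings):
        return "blocked"
    return "approved"
```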

08

UI CLI Surface

Chachamaru Claude Code Harness — UI & CLI Surface

Dedicated CLI Binary

Yes. bin/harness is a compiled Go binary (Darwin arm64/amd64, Linux amd64, Windows amd64).

It is NOT a thin wrapper over claude/codex CLI — it is an independent runtime for:

  • Hook dispatch (hook pre-tool, hook permission, hook setup-init)
  • Inbox checking (hook inbox-check)
  • Migration reporting (doctor --migration-report)
  • Browser guidance (hook browser-guide)
  • Harness sync (harness sync — regenerates plugin files from harness.toml)

CLI Subcommands

Subcommand                        Action
hook pre-tool                     PreToolUse lifecycle handler
hook permission                   PermissionRequest handler
hook setup-init                   Initialization hook
hook inbox-check                  Check for pending inbox items
hook browser-guide                Browser automation guidance
hook ask-user-question-normalize  Normalize question format
doctor --migration-report         Inventory stale caches and old state
sync                              Regenerate plugin files from harness.toml

Local UI

None (no web dashboard or TUI).

Slash Commands (Claude Code)

  • /harness-setup — initialization
  • /harness-plan — planning
  • /harness-work — implementation
  • /harness-review — review
  • /harness-release — release preflight
  • /recap — context re-summary
  • /undo (/rewind alias) — rollback last plan update

Observability

  • The Go binary writes structured hook execution logs
  • TDD bypass events are written to the audit log with mandatory reason
  • bin/harness doctor --migration-report provides inventory output

IDE Integration

  • Claude Code: native plugin (primary)
  • Cursor: .cursor/ profile (candidate)
  • No IDE fork or extension

Related frameworks

same archetype · same primary tool · same memory type

Liza ★ 227

Hardened multi-agent coding system with code-enforced role boundaries, adversarial doer/reviewer pairs, and 55+ failure mode…

DotForge ★ 6

Declare behavioral policies for Claude Code and compile them into enforcing PreToolUse hooks, with cross-project audit and sync…

Superpowers ★ 207k

Enforces spec-first, TDD, and subagent-reviewed development as mandatory automatic workflows rather than optional practices.

Claude-Flow / Ruflo ★ 55k

Eliminates single-agent context limits and sequential bottlenecks by orchestrating fault-tolerant swarms of specialized AI agents…

BMAD-METHOD ★ 48k

Provides a full agile delivery lifecycle with named expert-persona AI collaborators that elicit the human's best thinking rather…

Agent OS ★ 4.6k

Extracts implicit codebase conventions into token-efficient markdown standards files and injects them selectively into AI agent…