Agent Harness Skill (Harness Engineering)

ldzhouquan-agent-harness-skill · ldzhouquan/agent-harness-skill · ★ 9 · last commit 2026-03-17

Systematic Harness Engineering methodology with Iron Laws, Reflexion Loop, and 5-second session-amnesia recovery protocol.

Best when: Architecture as Law with a custom linter is the only way to prevent architecture rot; suggestions are not enough.

Skip if: Retrying without observation (STOP protocol violation); leaving broken code at session end (Clean State violation)

vs seeds

superpowers (Archetype 1: skills-only behavioral framework, Iron Laws TDD enforcement) but adds explicit session-amnesia recovery vi…

Primitive shape: 11 total

Commands: 1 · Skills: 10

Summary

ldzhouquan Agent Harness Skill — Summary

Agent Harness Skill by ldzhouquan is a Claude Code skill that implements "Harness Engineering" as a structured development methodology. A single entry point, SKILL.md, dispatches to 9 workflow modules (initialization, feature management, development workflow, bug-fix protocol, architecture enforcement, code merge, autonomous development, technical debt, progress tracking) based on task context. The central thesis is to shift focus from "writing code" to "designing feedback loops": the Reflexion Loop (Design→Code→Review→Test→Fix) is treated as the main execution engine and, per SKILL.md, "is NOT an optional step." Seven Iron Laws govern all work: Clean State, One Feature At A Time, Knowledge Must Be In Repository, Architecture Is Law, Observability First, TDD, and Boring Technology. The framework treats CI/lint enforcement as non-negotiable (Linter Error = Fix Instruction), architecture violation detection as continuous, and progress tracking as a mandatory file-based protocol.

Differs from seeds: closest to superpowers (Archetype 1: skills-only behavioral framework with Iron Laws) but adds structured module dispatch, explicit architecture-as-law enforcement, and the progress.txt/feature_list.json context-recovery protocol for session amnesia.

Overview

Origin

GitHub: https://github.com/ldzhouquan/agent-harness-skill
Stars: 9
License: unknown (none declared)
Language: Shell
Last commit: 2026-03-17

Philosophy

From README:

"A systematic implementation of Harness Engineering, enabling agents to work in a stable, controllable, and verifiable environment. It shifts focus from 'writing code' to 'designing feedback loops'."

Four pain points addressed:

Amnesia: agents forget context between sessions → Context Discovery Protocol (5-second recovery via progress.txt + feature_list.json)
Hallucinations: agents write non-running code → TDD + Reflexion Loop (no test evidence = code does not exist)
Architecture Rot: circular dependencies as projects grow → Architecture as Law (custom linter as unbreakable red line)
Blind coding: agents can't see runtime errors → Observability First (logs + screenshots before modifying code)

Manifesto-Style Quotes

From SKILL.md:

"⚠️ Critical Warning: The Reflexion Loop (Design -> Code -> Review -> Test -> Fix) is NOT an optional step. It IS the workflow."
"Violation of the 'STOP' protocol (retrying without observation) is a critical failure."
"Violation of the 'Clean State' protocol (leaving broken code) is a critical failure."

Iron Laws (7)

Clean State — code must be runnable at end of each session
One Feature At A Time — never allow handling multiple features simultaneously
Knowledge Must Be In Repository — if not in the repo, it doesn't exist
Architecture Is Law — layering and Providers pattern must be enforced
Observability First — agent must "see" system state before modifying code
Test-Driven Development — write the test/reproduction script FIRST
Choose Boring Technology — prefer mature, stable tech stacks

Architecture

Distribution

Manual install:

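git clone https://github.com/ldzhouquan/agent-harness-skill.git
mkdir -p ~/.claude/skills
# Absolute target: a relative one would resolve against ~/.claude/skills and dangle
ln -s "$(pwd)/agent-harness-skill/Harness" ~/.claude/skills/Harness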

Or tell Claude directly:

"Use the Harness skill from https://github.com/ldzhouquan/agent-harness-skill"

Directory Tree

agent-harness-skill/
├── .claude/
│   └── settings.local.json   # Local Claude Code settings
├── Harness/
│   ├── SKILL.md              # Entry point — dispatches to modules
│   ├── workflow.md           # Detailed step-by-step execution guide
│   ├── Tools/                # Tool scripts
│   ├── scripts/              # Automation scripts
│   ├── references/           # Reference materials
│   │   ├── initialization/
│   │   ├── bugfix/
│   │   └── development/
│   └── modules/
│       ├── initialization.md           # Setup + Golden Spike
│       ├── feature-management.md       # Planning + Specs
│       ├── development-workflow.md     # The Loop
│       ├── bug-fix-protocol.md         # TDD bug fix
│       ├── architecture-enforcement.md # Architecture as Law
│       ├── code-merge.md               # Review + Merge
│       ├── autonomous-development.md   # E2E Autonomy
│       ├── technical-debt.md           # Cleanup Protocols
│       └── progress-tracking.md        # Progress tracking
└── README.md

Required Runtime

Claude Code (implied by skill format)
No additional dependencies

Target AI Tools

Claude Code (.claude/ directory structure).

Components

Skills (1 entry point + 9 modules)

Name	Dispatch trigger	Purpose
Harness/SKILL.md	All work	Entry point — routes to appropriate module
modules/initialization.md	New project	Setup CI/Lint/Test + Golden Spike (Hello World)
modules/feature-management.md	Planning	Deconstruct requirements into feature_list.json
modules/development-workflow.md	Implementation	Locate→Ground→Recall→Verify→Claim loop
modules/bug-fix-protocol.md	Bug fix	TDD bug fix (reproduction script first)
modules/architecture-enforcement.md	Architecture check	Layering + Providers pattern enforcement
modules/code-merge.md	Review/merge/PR	Verify clean state → merge
modules/autonomous-development.md	Full autonomy	E2E autonomous mode
modules/technical-debt.md	Cleanup	Technical debt protocols
modules/progress-tracking.md	Status	progress.txt + feature_list.json management
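
The routing in the table above can be pictured as a lookup from task context to module file. This is only an illustrative sketch: in the skill itself the dispatch is prose in SKILL.md that the model interprets, not code. The trigger keys and the `pick_module` helper are hypothetical names; only the module paths come from the table.

```python
# Hypothetical sketch: the real dispatch is prose in SKILL.md read by the model.
# Module paths come from the table; the trigger keys are illustrative labels.
DISPATCH = {
    "new project": "modules/initialization.md",
    "planning": "modules/feature-management.md",
    "implementation": "modules/development-workflow.md",
    "bug fix": "modules/bug-fix-protocol.md",
    "architecture check": "modules/architecture-enforcement.md",
    "merge": "modules/code-merge.md",
    "full autonomy": "modules/autonomous-development.md",
    "cleanup": "modules/technical-debt.md",
    "status": "modules/progress-tracking.md",
}


def pick_module(task_context: str) -> str:
    """Return the module for a task context, falling back to the entry point."""
    return DISPATCH.get(task_context.strip().lower(), "Harness/SKILL.md")
```

An unrecognized context falls back to the entry point, which mirrors the "All work" trigger listed for Harness/SKILL.md.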

State Files (mandatory protocol)

File	Purpose
progress.txt	Current task state — enables 5-second context recovery
feature_list.json	Decomposed requirements
AGENTS.md	Project agent configuration
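
The source does not document the layout of feature_list.json, so the following is a sketch under an assumed schema (`id`, `title`, `status` fields, all hypothetical). It shows how a file of decomposed requirements could be checked against Iron Law 2, One Feature At A Time.

```python
import json

# Hypothetical schema: the actual layout of feature_list.json is not documented here.
SAMPLE_FEATURE_LIST = """
[
  {"id": "F1", "title": "login form", "status": "done"},
  {"id": "F2", "title": "password reset", "status": "in_progress"},
  {"id": "F3", "title": "audit log", "status": "todo"}
]
"""


def active_features(raw_json: str) -> list[str]:
    """Return the ids of features currently marked in progress."""
    return [f["id"] for f in json.loads(raw_json) if f["status"] == "in_progress"]


def obeys_one_feature_rule(raw_json: str) -> bool:
    """Iron Law 2: at most one feature may be in progress at a time."""
    return len(active_features(raw_json)) <= 1
```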

Scripts

Located in Harness/scripts/: automation scripts (contents not enumerated).

Tools

Located in Harness/Tools/: tool scripts (contents not enumerated).

Prompts

Excerpt 1: SKILL.md — Unbreakable Iron Laws

Technique: Numbered iron laws with zero-tolerance language

## Unbreakable Iron Laws

1. **Clean State** - Code must be runnable at end of each session.
2. **One Feature At A Time** - Never allow handling multiple features simultaneously.
3. **Knowledge Must Be In Repository** - If it's not in the repo (file/doc), it doesn't exist.
4. **Architecture Is Law** - Layering and Providers pattern must be enforced.
5. **Observability First** - Agent must "see" system state (logs, metrics, screenshots) to verify itself.
6. **Test-Driven Development** - Write the test/reproduction script FIRST. Prove failure before fixing.
7. **Choose Boring Technology** - Prefer mature, stable tech stacks that agents understand best.

Excerpt 2: SKILL.md — Critical Warning

Technique: Emergency-tone critical failure declaration

> **⚠️ Critical Warning:**
> The Reflexion Loop (Design -> Code -> Review -> Test -> Fix) is NOT an optional step. It IS the workflow.
> - **Violation of the "STOP" protocol** (retrying without observation) is a critical failure.
> - **Violation of the "Clean State" protocol** (leaving broken code) is a critical failure.

Excerpt 3: SKILL.md — Architecture Enforcement Protocol

Technique: Zero-tolerance declarative rule

## Architecture & Linter Enforcement Protocol

1. **Linter Error = Fix Instruction**: Do not ask permission. Fix immediately.
2. **Zero Tolerance**: Task is NOT complete until all linter checks pass.
3. **Autonomous Refactoring**: Resolve architecture violations autonomously.
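
The repository's own linter is not shown in this profile. As a rough illustration of what "layering enforced by a custom linter" can mean, the sketch below flags imports that reach from a lower layer into a higher one. The layer names and the `layer_violations` helper are assumptions made for the example, not the project's actual rules.

```python
import ast

# Hypothetical layer order, lowest first; the source does not name the project's layers.
LAYERS = ["data", "service", "ui"]


def layer_violations(source: str, module_layer: str) -> list[str]:
    """List imports that reach from module_layer up into a higher layer."""
    rank = LAYERS.index(module_layer)
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]  # relative imports have no module name
        else:
            continue
        for name in names:
            top = name.split(".")[0]
            if top in LAYERS and LAYERS.index(top) > rank:
                violations.append(f"{module_layer} must not import {name}")
    return violations
```

Under the protocol above, a non-empty result would be read as a fix instruction rather than a suggestion: the task is not complete until the list is empty.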

Uniqueness

Differs from seeds

Closest to superpowers (Archetype 1: skills-only behavioral framework with Iron Laws, prompt-iron-law TDD enforcement), but adds two things superpowers lacks: (1) explicit session-amnesia recovery via progress.txt + feature_list.json (5-second context recovery); (2) architecture-as-law, with custom linter enforcement as a non-negotiable red line. Unlike superpowers, which triggers on autonomous signals, this skill has a single explicit entry point (SKILL.md) that dispatches to 9 workflow modules. It is also similar to spec-driver (24 skills, modular dispatch), but spec-driver ships more skills while this one ships deeper behavioral guidance per module.

Distinctive Feature

The "Context Discovery Protocol" (5-second recovery of full project context via standardized files) is a direct answer to LLM session amnesia — more explicit than any seed's approach.

Observable Failure Modes

Chinese/English mixed docs: main modules may be in Chinese (README is bilingual, modules may not be).
No license: unknown licensing.
Dormant: last commit 2026-03-17; not actively maintained.
No hooks: no automatic enforcement — relies on the LLM honoring the Iron Laws.
Golden Spike coupling: the Golden Spike ceremony is required before feature development — this can be friction for projects that already have a working baseline.

Workflow

Overview

Init → Plan → Dev → Reflexion Loop → Merge

Phases

Phase	Module	Artifact	Gate
Init	initialization.md	CI/Lint/Test config + AGENTS.md + Golden Spike (runnable Hello World)	Golden Spike must pass
Plan	feature-management.md	feature_list.json + specs	user approval implied
Dev	development-workflow.md	code + progress.txt updates	Reflexion Loop required
Bug Fix	bug-fix-protocol.md	reproduction test + fix	test proves failure before fix
Architecture Check	architecture-enforcement.md	linter output	zero tolerance
Merge	code-merge.md	PR	clean state required
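
The Bug Fix gate in the table ("test proves failure before fix") can be sketched as a two-sided check: the reproduction must fail against the old code and pass against the new code. The function names and the toy `mean` bug below are invented for illustration and do not come from the repository.

```python
from typing import Callable


def fix_is_proven(repro: Callable[[Callable], bool], buggy: Callable, fixed: Callable) -> bool:
    """Accept a fix only if the reproduction fails on the old code and passes on the new."""
    failed_before = not repro(buggy)
    passes_after = repro(fixed)
    return failed_before and passes_after


def repro_mean_of_empty(mean: Callable) -> bool:
    """Reproduction script: mean([]) should return 0.0 instead of raising."""
    try:
        return mean([]) == 0.0
    except ZeroDivisionError:
        return False


def buggy_mean(xs):
    return sum(xs) / len(xs)  # raises ZeroDivisionError on an empty list


def fixed_mean(xs):
    return sum(xs) / len(xs) if xs else 0.0
```

A reproduction that already passes before the change proves nothing, which is why `fix_is_proven(repro_mean_of_empty, fixed_mean, fixed_mean)` is rejected.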

Reflexion Loop (The Main Execution Engine)

Design → Code → Review → Test → Fix

Rules:

NOT optional — it IS the workflow
STOP protocol: if verification fails, stop and analyze root cause; do NOT retry without observation
Clean State: code must be runnable at each session end
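
One way to read the STOP rule is that every retry must be grounded in something that was actually observed. The sketch below is an interpretation, not the skill's implementation: `reflexion_step`, its callbacks, and the `max_rounds` limit are all assumptions made for the example.

```python
from typing import Callable, Optional


def reflexion_step(verify: Callable[[], tuple[bool, str]],
                   fix: Callable[[str], None],
                   max_rounds: int = 3) -> Optional[str]:
    """Run verify; on failure, hand the observed output to fix before trying again.

    Returns None on success, or the last observation if the rounds run out.
    """
    observation = ""
    for _ in range(max_rounds):
        ok, observation = verify()
        if ok:
            return None
        if not observation.strip():
            # STOP: nothing was observed, so a retry would be blind.
            raise RuntimeError("STOP: verification failed with no observable output")
        fix(observation)  # each retry is driven by what was observed
    return observation
```

Returning the last observation instead of looping forever leaves a concrete artifact to record in the state files when the rounds are exhausted.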

Development Workflow Loop

Locate → Ground → Recall → Verify → Claim

Locate: find the relevant code
Ground: understand what exists before modifying
Recall: check progress.txt and feature_list.json
Verify: run tests/linter to confirm working state
Claim: only claim completion when verified
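
The last two steps, Verify and Claim, amount to a gate: completion is reported only when every check command exits cleanly. The following is a minimal sketch, assuming the checks are ordinary commands run with `subprocess`; `PASSING` and `FAILING` are stand-ins for a project's real test and linter commands, which this profile does not specify.

```python
import subprocess
import sys

# Stand-in commands for illustration; a real project would list its test and lint commands.
PASSING = [sys.executable, "-c", "raise SystemExit(0)"]
FAILING = [sys.executable, "-c", "raise SystemExit(1)"]


def verified(commands: list[list[str]]) -> bool:
    """Return True only if there is at least one check and every check exits with 0."""
    if not commands:
        return False  # no evidence at all does not count as verification
    for cmd in commands:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            return False
    return True


def claim(feature_id: str, commands: list[list[str]]) -> str:
    """Claim completion only after verification, mirroring the Verify and Claim steps."""
    return f"{feature_id}: done" if verified(commands) else f"{feature_id}: not verified"
```

Treating an empty check list as unverified follows the framework's stance that code without test evidence does not exist.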

Approval Gates

Golden Spike must pass before moving to feature development
Linter errors are mandatory fixes (zero tolerance)
Architecture violations are mandatory fixes (not suggestions)

Memory & Context

State Storage

File-based, project-scoped:

progress.txt — current task state. Enables 5-second context recovery after session restart.
feature_list.json — decomposed requirements
AGENTS.md — project agent config

Context Recovery Protocol

On session start, the agent reads progress.txt and feature_list.json in the first 5 seconds to recover full project context. This directly addresses the "amnesia" problem.
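
In the skill, recovery is performed by the agent reading files, not by a script. As a sketch of the same idea in code, the hypothetical `recover_context` helper below loads the two named state files from a project root and tolerates their absence. It assumes feature_list.json holds a JSON array, which the source does not confirm.

```python
import json
from pathlib import Path


def recover_context(project_root: str) -> dict:
    """Read the two state files the protocol names; missing files yield empty values."""
    root = Path(project_root)
    progress_path = root / "progress.txt"
    features_path = root / "feature_list.json"
    progress = progress_path.read_text(encoding="utf-8") if progress_path.exists() else ""
    # Assumption: feature_list.json contains a JSON array of features.
    features = json.loads(features_path.read_text(encoding="utf-8")) if features_path.exists() else []
    return {"progress": progress, "features": features}
```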

Iron Law 3: Knowledge Must Be In Repository

Any information not stored in a repository file does not exist. This enforces complete externalization of all state.

Memory Type

File-based, project-scoped.

Cross-Session Persistence

Via progress.txt and feature_list.json. The protocol is explicit: update these files before ending each session.
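
The end-of-session update could look like the following sketch. The three-line layout of progress.txt (`date`, `current`, `next`) is an assumption made for illustration, since the source does not show the file's format, and `write_progress` is a hypothetical helper.

```python
from datetime import date
from pathlib import Path


def write_progress(project_root: str, current: str, next_step: str, today: date) -> str:
    """Overwrite progress.txt with a short handoff note and return the text written."""
    # Hypothetical layout: the real progress.txt format is not documented here.
    note = f"date: {today.isoformat()}\ncurrent: {current}\nnext: {next_step}\n"
    Path(project_root, "progress.txt").write_text(note, encoding="utf-8")
    return note
```

Passing the date in explicitly, instead of calling `date.today()` inside the function, keeps the helper deterministic and easy to test.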

Orchestration

Multi-Agent

The autonomous-development.md module describes an "E2E Autonomy" mode, but its specifics are not public. The main workflow is single-agent (Reflexion Loop).

Orchestration Pattern

Sequential: Reflexion Loop is the dominant pattern. The development workflow loop (Locate→Ground→Recall→Verify→Claim) is sequential per task.

Isolation Mechanism

None described explicitly.

Multi-Model

No.

Execution Mode

Interactive-loop: user invokes Harness skill; agent dispatches to appropriate module.

Cross-Tool Portability

Low to medium: targets Claude Code (.claude/skills/) but the skill could theoretically be used with other Claude Code-compatible tools.

UI & CLI Surface

Dedicated CLI Binary

No.

Local UI

No.

Slash Command

/Harness — triggers the main SKILL.md entry point (or invoke directly by description).

Observability

progress.txt — current task state (readable by humans)
CI/linter output (external, referenced by architecture enforcement module)

Related frameworks

same archetype · same primary tool · same memory type

BMAD-METHOD ★ 48k

A4 Markdown scaffold

Provides a full agile delivery lifecycle with named expert-persona AI collaborators that elicit the human's best thinking rather…

Agent OS ★ 4.6k

A4 Markdown scaffold

Extracts implicit codebase conventions into token-efficient markdown standards files and injects them selectively into AI agent…

Claude Conductor ★ 367

A4 Markdown scaffold

Gives Claude Code a persistent, cross-linked, auto-analyzed documentation system so it retains codebase context across sessions.

Spec-Driver (Greenfield Spec-Driven Development) ★ 25

A4 Markdown scaffold

Prevents spec rot in AI-assisted development by making implementation changes flow back into evergreen, authoritative specs via…

Anthropic Knowledge Work Plugins ★ 16k

A4 Markdown scaffold

Role-specialized plugin bundles with live MCP connectors that turn Claude into a domain expert for enterprise knowledge workers.

Codex Integration for Claude Code (skill-codex) ★ 1.3k

A4 Markdown scaffold

Single Claude Code skill that handles Codex CLI invocation correctly (stdin blocking, thinking token suppression, session resume)…

Distribution

Type: skill-pack
Install: clone-and-configure

Surfaces

CLI binary: No
CLI subcmds: 0
Local UI: No
Tech stack: none

Components

Commands: 1
Skills: 10
Subagents: 0
Hooks: 0
MCP servers: 0
MCP tools: 0
Scripts: 0
Templates: 0

Workflow

Phases: 5
Approval gates: 2
Spec format: markdown
Spec storage: flat-files
Delta or full: whole-file

Orchestration

Multi-agent: No
Pattern: sequential
Max concurrent: 1
Isolation: none
Consensus: none
Prompt chaining: Yes

Multi-model

Multi-model: No
BYOK: Yes
Modal: text

Execution

Mode: interactive-loop
Crash recovery: Yes
Compaction: Yes
Session handoff: Yes
Streaming: No

Memory

Type: file-based
Persistence: project
Search: none
State files: 3 files

Quality

TDD: Yes
TDD mechanism: prompt-iron-law
Validators: 1
Self-review: inline-self

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: No
Audit format: none
Replay: No

Tools

Primary: claude-code
Targets: 1
Portability: low

Signals

Stars: 9
Last commit: 2026-03-17
Contributors: 1
Maintainer: dormant
Quality score: 4.2/10