Agent Harness Skill (Harness Engineering)

ldzhouquan-agent-harness-skill · ldzhouquan/agent-harness-skill · ★ 9 · last commit 2026-03-17

Systematic Harness Engineering methodology with Iron Laws, Reflexion Loop, and 5-second session-amnesia recovery protocol.

Best when: Architecture as Law with custom linter is the only way to prevent architecture rot; suggestions are not enough.
Skip if: Retrying without observation (STOP protocol violation), Leaving broken code at session end (Clean State violation)
vs seeds
superpowers(Archetype 1: skills-only behavioral framework, Iron Laws TDD enforcement) but adds explicit session-amnesia recovery vi…
Primitive shape 11 total
Commands 1 Skills 10
00

Summary

ldzhouquan Agent Harness Skill — Summary

Agent Harness Skill by ldzhouquan is a Claude Code skill that implements "Harness Engineering" as a structured development methodology. A single entry-point SKILL.md dispatches, based on task context, to 9 workflow modules: initialization, feature management, development workflow, bug-fix protocol, architecture enforcement, code merge, autonomous development, technical debt, and progress tracking.

The central thesis is "shift focus from writing code to designing feedback loops". The Reflexion Loop (Design→Code→Review→Test→Fix) is declared "the Main Execution Engine, not an optional step." Seven Iron Laws govern all work: Clean State, One Feature At A Time, Knowledge Must Be In Repository, Architecture Is Law, Observability First, TDD, and Boring Technology. The framework treats CI/lint enforcement as non-negotiable (Linter Error = Fix Instruction), architecture violation detection as continuous, and progress tracking as a mandatory file-based protocol.

Differs from seeds: closest to superpowers (Archetype 1: skills-only behavioral framework with Iron Laws) but adds structured module dispatch, explicit architecture-as-law enforcement, and the progress.txt/feature_list.json context-recovery protocol for session amnesia.

01

Overview

ldzhouquan Agent Harness Skill — Overview

Origin

GitHub: https://github.com/ldzhouquan/agent-harness-skill
Stars: 9
License: unknown (none declared)
Language: Shell
Last commit: 2026-03-17

Philosophy

From README:

"A systematic implementation of Harness Engineering, enabling agents to work in a stable, controllable, and verifiable environment. It shifts focus from 'writing code' to 'designing feedback loops'."

Four pain points addressed:

  1. Amnesia: agents forget context between sessions → Context Discovery Protocol (5-second recovery via progress.txt + feature_list.json)
  2. Hallucinations: agents write non-running code → TDD + Reflexion Loop (no test evidence = code does not exist)
  3. Architecture Rot: circular dependencies as projects grow → Architecture as Law (custom linter as unbreakable red line)
  4. Blind coding: agents can't see runtime errors → Observability First (logs + screenshots before modifying code)

Manifesto-Style Quotes

From SKILL.md:

"⚠️ Critical Warning: The Reflexion Loop (Design -> Code -> Review -> Test -> Fix) is NOT an optional step. It IS the workflow." "Violation of the 'STOP' protocol (retrying without observation) is a critical failure." "Violation of the 'Clean State' protocol (leaving broken code) is a critical failure."

Iron Laws (7)

  1. Clean State — code must be runnable at end of each session
  2. One Feature At A Time — never allow handling multiple features simultaneously
  3. Knowledge Must Be In Repository — if not in the repo, it doesn't exist
  4. Architecture Is Law — layering and Providers pattern must be enforced
  5. Observability First — agent must "see" system state before modifying code
  6. Test-Driven Development — write the test/reproduction script FIRST
  7. Choose Boring Technology — prefer mature, stable tech stacks
02

Architecture

ldzhouquan Agent Harness Skill — Architecture

Distribution

Manual install:

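git clone https://github.com/ldzhouquan/agent-harness-skill.git
mkdir -p ~/.claude/skills
# Use an absolute target: a relative symlink target would resolve against ~/.claude/skills
ln -s "$PWD/agent-harness-skill/Harness" ~/.claude/skills/Harness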

Or tell Claude directly:

"Use the Harness skill from https://github.com/ldzhouquan/agent-harness-skill"

Directory Tree

agent-harness-skill/
├── .claude/
│   └── settings.local.json   # Local Claude Code settings
├── Harness/
│   ├── SKILL.md              # Entry point — dispatches to modules
│   ├── workflow.md           # Detailed step-by-step execution guide
│   ├── Tools/                # Tool scripts
│   ├── scripts/              # Automation scripts
│   ├── references/           # Reference materials
│   │   ├── initialization/
│   │   ├── bugfix/
│   │   └── development/
│   └── modules/
│       ├── initialization.md           # Setup + Golden Spike
│       ├── feature-management.md       # Planning + Specs
│       ├── development-workflow.md     # The Loop
│       ├── bug-fix-protocol.md         # TDD bug fix
│       ├── architecture-enforcement.md # Architecture as Law
│       ├── code-merge.md               # Review + Merge
│       ├── autonomous-development.md   # E2E Autonomy
│       ├── technical-debt.md           # Cleanup Protocols
│       └── progress-tracking.md        # Progress tracking
└── README.md

Required Runtime

  • Claude Code (implied by skill format)
  • No additional dependencies

Target AI Tools

Claude Code (.claude/ directory structure).

03

Components

ldzhouquan Agent Harness Skill — Components

Skills (1 entry point + 9 modules)

| Name | Dispatch trigger | Purpose |
|---|---|---|
| Harness/SKILL.md | All work | Entry point; routes to appropriate module |
| modules/initialization.md | New project | Setup CI/Lint/Test + Golden Spike (Hello World) |
| modules/feature-management.md | Planning | Deconstruct requirements into feature_list.json |
| modules/development-workflow.md | Implementation | Locate→Ground→Recall→Verify→Claim loop |
| modules/bug-fix-protocol.md | Bug fix | TDD bug fix (reproduction script first) |
| modules/architecture-enforcement.md | Architecture check | Layering + Providers pattern enforcement |
| modules/code-merge.md | Review/merge/PR | Verify clean state → merge |
| modules/autonomous-development.md | Full autonomy | E2E autonomous mode |
| modules/technical-debt.md | Cleanup | Technical debt protocols |
| modules/progress-tracking.md | Status | progress.txt + feature_list.json management |
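
The dispatch table above can be read as a lookup from task context to module file. The following is a minimal sketch under stated assumptions: the trigger keys and the `dispatch` function are hypothetical, and only the module file names come from the repository. SKILL.md itself routes through natural-language instructions, not code.

```python
# Hypothetical illustration of the routing described in the table above.
# The trigger keys are assumptions; only the module paths come from the repository.
MODULES = {
    "new project": "modules/initialization.md",
    "planning": "modules/feature-management.md",
    "implementation": "modules/development-workflow.md",
    "bug fix": "modules/bug-fix-protocol.md",
    "architecture check": "modules/architecture-enforcement.md",
    "merge": "modules/code-merge.md",
    "full autonomy": "modules/autonomous-development.md",
    "cleanup": "modules/technical-debt.md",
    "status": "modules/progress-tracking.md",
}


def dispatch(trigger: str) -> str:
    """Return the module file for a task context, or raise if none matches."""
    key = trigger.strip().lower()
    if key not in MODULES:
        raise ValueError(f"no module registered for trigger: {trigger!r}")
    return MODULES[key]
```

Raising on an unknown trigger, instead of falling back to a default module, mirrors the skill's preference for explicit routing over guessing.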

State Files (mandatory protocol)

| File | Purpose |
|---|---|
| progress.txt | Current task state; enables 5-second context recovery |
| feature_list.json | Decomposed requirements |
| AGENTS.md | Project agent configuration |

Scripts

Located in Harness/scripts/: automation scripts (contents not enumerated here).

Tools

Located in Harness/Tools/: tool scripts (contents not enumerated here).

05

Prompts

ldzhouquan Agent Harness Skill — Prompts

Excerpt 1: SKILL.md — Unbreakable Iron Laws

Technique: Numbered iron laws with zero-tolerance language

## Unbreakable Iron Laws

1. **Clean State** - Code must be runnable at end of each session.
2. **One Feature At A Time** - Never allow handling multiple features simultaneously.
3. **Knowledge Must Be In Repository** - If it's not in the repo (file/doc), it doesn't exist.
4. **Architecture Is Law** - Layering and Providers pattern must be enforced.
5. **Observability First** - Agent must "see" system state (logs, metrics, screenshots) to verify itself.
6. **Test-Driven Development** - Write the test/reproduction script FIRST. Prove failure before fixing.
7. **Choose Boring Technology** - Prefer mature, stable tech stacks that agents understand best.

Excerpt 2: SKILL.md — Critical Warning

Technique: Emergency-tone critical failure declaration

> **⚠️ Critical Warning:**
> The Reflexion Loop (Design -> Code -> Review -> Test -> Fix) is NOT an optional step. It IS the workflow.
> - **Violation of the "STOP" protocol** (retrying without observation) is a critical failure.
> - **Violation of the "Clean State" protocol** (leaving broken code) is a critical failure.

Excerpt 3: SKILL.md — Architecture Enforcement Protocol

Technique: Zero-tolerance declarative rule

## Architecture & Linter Enforcement Protocol

1. **Linter Error = Fix Instruction**: Do not ask permission. Fix immediately.
2. **Zero Tolerance**: Task is NOT complete until all linter checks pass.
3. **Autonomous Refactoring**: Resolve architecture violations autonomously.
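
To make "Architecture Is Law" and "Linter Error = Fix Instruction" concrete, here is a minimal sketch of a layering check. It is not the repository's linter: the layer names in `LAYERS`, the rule that a layer may only import from itself or lower layers, and the helper names are all assumptions for illustration.

```python
import ast
from typing import List, Optional

# Hypothetical layer order, lowest first. A module may import from its own layer
# or from lower layers only. These names are assumptions, not the project's layout.
LAYERS = ["data", "service", "ui"]


def layer_of(module_name: str) -> Optional[int]:
    """Return the layer index of a dotted module name, or None if it is not layered."""
    top = module_name.split(".")[0]
    return LAYERS.index(top) if top in LAYERS else None


def find_violations(module_name: str, source: str) -> List[str]:
    """List absolute imports in `source` that reach upward from `module_name`'s layer."""
    own = layer_of(module_name)
    if own is None:
        return []
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            targets = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            targets = [node.module]
        else:
            continue
        for target in targets:
            other = layer_of(target)
            if other is not None and other > own:
                # Each message doubles as a fix instruction for the agent.
                violations.append(
                    f"{module_name}:{node.lineno}: '{LAYERS[own]}' must not import "
                    f"'{LAYERS[other]}' ({target}). Fix: move the dependency down or invert it."
                )
    return violations
```

A CI step could fail the build whenever `find_violations` returns a non-empty list, which is one way to treat a violation as a mandatory fix and not a suggestion.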
09

Uniqueness

ldzhouquan Agent Harness Skill — Uniqueness

differs_from_seeds

Closest to superpowers (Archetype 1: skills-only behavioral framework with Iron Laws, prompt-iron-law TDD enforcement) but adds two innovations superpowers lacks: (1) explicit session-amnesia recovery via progress.txt + feature_list.json (5-second context recovery); (2) architecture-as-law with custom linter enforcement as a non-negotiable red line. Unlike superpowers, which triggers on autonomous signals, this skill has a single explicit entry point (SKILL.md) that dispatches to 9 workflow modules. It is also similar to spec-driver (24 skills, modular dispatch), but spec-driver ships more skills while this one ships deeper behavioral guidance per module.

Distinctive Feature

The "Context Discovery Protocol" (5-second recovery of full project context via standardized files) is a direct answer to LLM session amnesia — more explicit than any seed's approach.

Observable Failure Modes

  1. Chinese/English mixed docs: main modules may be in Chinese (README is bilingual, modules may not be).
  2. No license: unknown licensing.
  3. Dormant: last commit 2026-03-17; not actively maintained.
  4. No hooks: no automatic enforcement — relies on the LLM honoring the Iron Laws.
  5. Golden Spike coupling: the Golden Spike ceremony is required before feature development — this can be friction for projects that already have a working baseline.
04

Workflow

ldzhouquan Agent Harness Skill — Workflow

Overview

Init → Plan → Dev → Reflexion Loop → Merge

Phases

| Phase | Module | Artifact | Gate |
|---|---|---|---|
| Init | initialization.md | CI/Lint/Test config + AGENTS.md + Golden Spike (runnable Hello World) | Golden Spike must pass |
| Plan | feature-management.md | feature_list.json + specs | user approval implied |
| Dev | development-workflow.md | code + progress.txt updates | Reflexion Loop required |
| Bug Fix | bug-fix-protocol.md | reproduction test + fix | test proves failure before fix |
| Architecture Check | architecture-enforcement.md | linter output | zero tolerance |
| Merge | code-merge.md | PR | clean state required |
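
The Bug Fix gate ("test proves failure before fix") can be illustrated with a small example. The `last_n` functions and the bug are invented for illustration and do not come from the repository; the point is only the order of events: the reproduction fails first, then the fix makes it pass.

```python
from typing import Callable, List


def last_n_buggy(items: List[int], n: int) -> List[int]:
    """Hypothetical buggy code: items[-0:] is the whole list, not an empty one."""
    return items[-n:]


def last_n_fixed(items: List[int], n: int) -> List[int]:
    """Fix written only after the reproduction below was seen to fail."""
    return items[-n:] if n > 0 else []


def reproduces_bug(impl: Callable[[List[int], int], List[int]]) -> bool:
    """Reproduction script: True while the bug is still present in `impl`."""
    return impl([1, 2, 3], 0) != []


def fix_is_proven(before: Callable, after: Callable) -> bool:
    """The gate: the bug must reproduce before the fix and not after it."""
    return reproduces_bug(before) and not reproduces_bug(after)
```

If `reproduces_bug` returns False against the unfixed code, the reproduction is not exercising the bug, and the protocol says to stop and rewrite the reproduction before touching the implementation.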

Reflexion Loop (The Main Execution Engine)

Design → Code → Review → Test → Fix

Rules:

  • NOT optional — it IS the workflow
  • STOP protocol: if verification fails, stop and analyze root cause; do NOT retry without observation
  • Clean State: code must be runnable at each session end
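
The STOP rule above can be sketched as a loop that refuses to retry unless the failed verification produced something to look at. This is a hypothetical illustration: `reflexion_loop`, the `(passed, observation)` return shape, and `max_rounds` are assumptions, not the skill's implementation.

```python
from typing import Callable, List, Tuple

# A verification step returns (passed, observation). The observation is whatever
# the agent can "see": test output, log lines, a linter report.
Verify = Callable[[], Tuple[bool, str]]
Fix = Callable[[str], None]


def reflexion_loop(verify: Verify, fix: Fix, max_rounds: int = 3) -> List[str]:
    """Run Test -> Fix rounds and return the observations that drove each fix.

    Raises RuntimeError (STOP) if a failure yields no observation, or if the
    code still fails after max_rounds fixes.
    """
    observations: List[str] = []
    for _ in range(max_rounds):
        passed, observation = verify()
        if passed:
            return observations
        if not observation.strip():
            raise RuntimeError("STOP: verification failed with nothing observed; gather logs first")
        observations.append(observation)
        fix(observation)  # the fix is driven by what was observed, not by a blind retry
    passed, _ = verify()  # verify the last fix before giving up
    if passed:
        return observations
    raise RuntimeError("STOP: still failing after max_rounds; analyze root cause instead of retrying")
```

Bounding the number of rounds is a design choice of this sketch; the source only requires that each retry be preceded by observation and root-cause analysis.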

Development Workflow Loop

Locate → Ground → Recall → Verify → Claim
  1. Locate: find the relevant code
  2. Ground: understand what exists before modifying
  3. Recall: check progress.txt and feature_list.json
  4. Verify: run tests/linter to confirm working state
  5. Claim: only claim completion when verified
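
The Verify and Claim steps above amount to a gate: run the checks, and claim completion only if all of them pass. A minimal sketch, assuming each check is a command whose exit status signals success; the function names are hypothetical and the actual check commands depend on the project.

```python
import subprocess
from typing import Dict, List


def verify_before_claim(checks: Dict[str, List[str]]) -> Dict[str, bool]:
    """Run each named check command and record whether it exited with status 0."""
    results = {}
    for name, command in checks.items():
        completed = subprocess.run(command, capture_output=True, text=True)
        results[name] = completed.returncode == 0
    return results


def may_claim(results: Dict[str, bool]) -> bool:
    """Completion may be claimed only when at least one check ran and all passed."""
    return bool(results) and all(results.values())
```

For example, a project might pass `{"tests": [...], "lint": [...]}` with its own test and linter commands. An empty set of checks deliberately does not count as verified.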

Approval Gates

  1. Golden Spike must pass before moving to feature development
  2. Linter errors are mandatory fixes (zero tolerance)
  3. Architecture violations are mandatory fixes (not suggestions)
06

Memory & Context

ldzhouquan Agent Harness Skill — Memory & Context

State Storage

File-based, project-scoped:

  • progress.txt — current task state. Enables 5-second context recovery after session restart.
  • feature_list.json — decomposed requirements
  • AGENTS.md — project agent config

Context Recovery Protocol

On session start, the agent reads progress.txt and feature_list.json in the first 5 seconds to recover full project context. This directly addresses the "amnesia" problem.
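
As an illustration of this recovery step, the sketch below reads both state files and derives what to work on next. The schema is an assumption: the source does not document the layout of either file, so the JSON list of objects with `name` and `status` keys, and the `recover_context` function itself, are hypothetical.

```python
import json
from pathlib import Path
from typing import Any, Dict


def recover_context(project_root: str) -> Dict[str, Any]:
    """Read the two state files and return what a new session needs to resume."""
    root = Path(project_root)
    progress = (root / "progress.txt").read_text(encoding="utf-8")
    features = json.loads((root / "feature_list.json").read_text(encoding="utf-8"))
    # Assumed schema: a JSON list of objects with "name" and "status" keys.
    pending = [f["name"] for f in features if f.get("status") != "done"]
    return {
        "progress": progress.strip(),
        "next_feature": pending[0] if pending else None,  # One Feature At A Time
        "pending_count": len(pending),
    }
```

Returning a single `next_feature` instead of the full pending list reflects the One Feature At A Time law.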

Iron Law 3: Knowledge Must Be In Repository

Any information not stored in a repository file does not exist. This enforces complete externalization of all state.

Memory Type

File-based, project-scoped.

Cross-Session Persistence

Via progress.txt and feature_list.json. The protocol is explicit: update these files before ending each session.
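
One way to honor the "update before ending each session" rule is to rewrite progress.txt atomically, so an interrupted session cannot leave a half-written file. This is a sketch under assumptions: the three-field layout and the `save_progress` function are hypothetical, since the source does not specify the file's format.

```python
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def save_progress(project_root: str, current_task: str, next_step: str) -> Path:
    """Atomically rewrite progress.txt with the current task and the next step."""
    target = Path(project_root) / "progress.txt"
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    # Assumed layout: three "key: value" lines. The real format is not documented.
    body = f"updated: {stamp}\ncurrent_task: {current_task}\nnext_step: {next_step}\n"
    # Write to a temporary file in the same directory, then rename it over the target.
    fd, tmp_name = tempfile.mkstemp(dir=str(target.parent), prefix=".progress-", text=True)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(body)
    os.replace(tmp_name, target)
    return target
```

Because `os.replace` swaps the file in one step, a reader at the next session start sees either the old state or the new one, never a partial write.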

07

Orchestration

ldzhouquan Agent Harness Skill — Orchestration

Multi-Agent

The autonomous-development.md module describes "E2E Autonomy" mode but the specifics are not public. The main workflow is single-agent (Reflexion Loop).

Orchestration Pattern

Sequential: Reflexion Loop is the dominant pattern. The development workflow loop (Locate→Ground→Recall→Verify→Claim) is sequential per task.

Isolation Mechanism

None described explicitly.

Multi-Model

No.

Execution Mode

Interactive-loop: user invokes Harness skill; agent dispatches to appropriate module.

Cross-Tool Portability

Low to medium: targets Claude Code (.claude/skills/) but the skill could theoretically be used with other Claude Code-compatible tools.

08

UI & CLI Surface

ldzhouquan Agent Harness Skill — UI & CLI Surface

Dedicated CLI Binary

No.

Local UI

No.

Slash Command

/Harness — triggers the main SKILL.md entry point (or invoke directly by description).

Observability

  • progress.txt — current task state (readable by humans)
  • CI/linter output (external, referenced by architecture enforcement module)
