Harness-Engineering-skills

harness-engineering-skills-phlegon · Phlegonlabs/Harness-Engineering-skills · ★ 17 · last commit 2026-03-29

PRD-to-code repo-backed delivery orchestration with three ceremony levels, 19 named agents, filesystem-enforced phase gates, and cross-session state persistence.

Best when: Planning and execution state must survive chat sessions; everything is written to versioned repository artifacts, not just conversation history.
Skip if: You want free-form prompt chains without phase gates, or expect phases to be marked complete without the required artifacts on disk.
vs seeds
taskmaster-ai (file-based milestone delivery) but implemented as a skills pack (not MCP server) with 3-tier ceremony levels and config…
Primitive shape 21 total
Skills 2 Subagents 19
00

Summary

Harness Engineering Skills (Phlegonlabs) — Summary

A two-skill TypeScript/Bun package that provides repo-backed PRD-to-code orchestration for Claude and Codex. The primary skill (harness-engineering-orchestrator) turns an idea or existing codebase into a controlled delivery loop that writes state to docs/PRD.md, docs/ARCHITECTURE.md, docs/PROGRESS.md, .harness/state.json, AGENTS.md, and CLAUDE.md. It operates at three ceremony levels (Lite/Standard/Full), auto-detected from project complexity or set explicitly. The secondary skill (harness-engineering-structure) provides a production-ready Bun+Turborepo monorepo scaffold with machine-readable validation rules, a 6-layer dependency model, and planning commands. The orchestrator skill ships 14 named role agents (orchestrator, PRD architect, tech-stack advisor, code reviewer, market researcher, etc.) and the structure skill adds 5 more, for 19 in total. The system also provides harness runtime scripts (bun harness:orchestrate, bun harness:advance, bun harness:approve) and explicit approval gates between delivery phases. At v1.8.6, the repo is actively maintained, with GitHub Actions CI and release workflows.

differs_from_seeds: Closest to taskmaster-ai (file-based delivery with milestone/task decomposition) but implemented as a skills pack rather than an MCP server. The three-level ceremony model (Lite/Standard/Full) with configurable guardian overrides is more sophisticated than that of any seed. The agent roster (19 named agents across both skills, 14 of them in the orchestrator) rivals BMAD (34 skills + 6 personas) in scope while being more phase-gate focused. It writes to six distinct files as the source of truth, creating an auditable delivery trail that no seed except kiro approaches.

01

Overview

Harness Engineering Skills (Phlegonlabs) — Overview

Origin

Created by Phlegonlabs. The README states the core thesis explicitly:

"Harness Engineering is built around one idea: planning and execution should survive chat sessions. Instead of keeping important decisions inside prompt history, the workflow writes them back into versioned repository artifacts such as docs/PRD.md, docs/ARCHITECTURE.md, docs/PROGRESS.md, AGENTS.md, CLAUDE.md, and .harness/state.json. That makes delivery stateful, resumable, and auditable across humans, Claude, and Codex."

Philosophy

From the SKILL.md:

"Use it when you want Claude or Codex to operate inside a controlled engineering workflow rather than free-form prompting."

The framework targets teams that want:

  • PRD-first planning instead of chat-only planning
  • Milestone and task execution tied back to repo state
  • Explicit phase gates before implementation advances
  • Staged delivery (V1 → deploy review → V2) instead of one drifting backlog
  • Resumable collaboration across sessions, agents, and humans

Ceremony Levels

A key design choice: the framework auto-detects project complexity (or accepts a user-specified level) and applies proportionate ceremony:

| Level | When | Discovery Pacing | Approval Stops |
|-------|------|------------------|----------------|
| Lite | Small projects, quick prototypes | Batch 1-2 Q/turn | Fast Path, delivery phase, blockers |
| Standard | Most projects (default) | Groups 2-3 Q/turn | Plan approval, delivery phase, blockers |
| Full | Enterprise/compliance | Sequential Q0-Q9 | All Standard + deploy review |
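
To make the pacing column concrete, here is a minimal TypeScript sketch that splits a list of discovery questions into per-turn batches. The batch sizes take the upper bound of each range in the table. The names `MAX_QUESTIONS_PER_TURN` and `planTurns` are hypothetical, and it is an assumption that all three levels draw from the same Q0-Q9 list (the source only names that range for Full).

```typescript
type Level = "lite" | "standard" | "full";

// Upper bound of each level's pacing range from the table above
// (Lite 1-2, Standard 2-3, Full one question at a time).
const MAX_QUESTIONS_PER_TURN: Record<Level, number> = {
  lite: 2,
  standard: 3,
  full: 1,
};

// Split an ordered question list into per-turn batches for the given level.
function planTurns(questions: readonly string[], level: Level): string[][] {
  const size = MAX_QUESTIONS_PER_TURN[level];
  const turns: string[][] = [];
  for (let i = 0; i < questions.length; i += size) {
    turns.push(questions.slice(i, i + size));
  }
  return turns;
}

// Q0-Q9 is the question range the table names for the Full level.
const questions = Array.from({ length: 10 }, (_, i) => `Q${i}`);

console.log(planTurns(questions, "full").length);     // 10 turns, one question each
console.log(planTurns(questions, "standard").length); // 4 turns: 3 + 3 + 3 + 1
```

The real skill groups "related" questions at the Standard level, which a fixed-size chunker like this does not capture.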

Config-Driven Defaults

Teams can pre-set project defaults via config.json:

  • defaults.harnessLevel (lite/standard/full)
  • defaults.teamSize (solo/small/large)
  • defaults.ecosystem (bun/python/go)
  • guardianOverrides.warnOnly / guardianOverrides.disabled
  • phaseSkips.skipMarketResearch

This makes the framework team-configurable, not just personal.
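
As a rough illustration, the fields listed above could be modeled in TypeScript as follows. Only the field names and enumerated values come from the listing. The `HarnessConfig` and `ResolvedConfig` types, the `withDefaults` helper, and the `false` fallback for `skipMarketResearch` are assumptions. The `solo` and empty-array defaults follow the Team Configuration table quoted from the SKILL.md.

```typescript
// Hypothetical model of config.json; type and helper names are illustrative,
// not the skill's actual code.
type HarnessLevel = "lite" | "standard" | "full";
type TeamSize = "solo" | "small" | "large";
type Ecosystem = "bun" | "python" | "go";

interface HarnessConfig {
  defaults?: {
    harnessLevel?: HarnessLevel; // omitted means auto-detect
    teamSize?: TeamSize;
    ecosystem?: Ecosystem; // omitted means auto-detect
  };
  guardianOverrides?: { warnOnly?: string[]; disabled?: string[] };
  phaseSkips?: { skipMarketResearch?: boolean };
}

interface ResolvedConfig {
  harnessLevel: HarnessLevel | "auto-detect";
  teamSize: TeamSize;
  ecosystem: Ecosystem | "auto-detect";
  warnOnly: string[];
  disabled: string[];
  skipMarketResearch: boolean;
}

// Fill in every omitted field with its documented (or assumed) default.
function withDefaults(config: HarnessConfig): ResolvedConfig {
  return {
    harnessLevel: config.defaults?.harnessLevel ?? "auto-detect",
    teamSize: config.defaults?.teamSize ?? "solo",
    ecosystem: config.defaults?.ecosystem ?? "auto-detect",
    warnOnly: config.guardianOverrides?.warnOnly ?? [],
    disabled: config.guardianOverrides?.disabled ?? [],
    skipMarketResearch: config.phaseSkips?.skipMarketResearch ?? false, // assumed default
  };
}

const resolved = withDefaults({
  defaults: { harnessLevel: "standard" },
  guardianOverrides: { warnOnly: ["G2"] },
});
console.log(resolved.teamSize); // "solo"
```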

02

Architecture

Harness Engineering Skills (Phlegonlabs) — Architecture

Distribution

  • GitHub: Phlegonlabs/Harness-Engineering-skills
  • Install: npx skills add https://github.com/Phlegonlabs/Harness-Engineering-skills --skill harness-engineering-orchestrator
  • Use in target repo: bun <path>/scripts/harness-setup.ts
  • License: MIT
  • Primary language: TypeScript
  • CI: GitHub Actions (CI + Release workflows)
  • Current release target: v1.8.6

Directory Structure

Harness-Engineering-skills/
├── README.md, README.en.md, README.zh-CN.md  # English + Chinese docs
├── LICENSE, CONTRIBUTING.md, SECURITY.md
├── harness-engineering-orchestrator/
│   ├── SKILL.md                      # Main skill contract (orchestrator)
│   ├── agents/
│   │   ├── orchestrator.md           # Orchestrator role
│   │   ├── prd-architect.md          # PRD writing agent
│   │   ├── tech-stack-advisor.md     # Stack recommendation
│   │   ├── design-reviewer.md        # Design review
│   │   ├── code-reviewer.md          # Code review
│   │   ├── market-research.md        # Market research
│   │   ├── scaffold-generator.md     # Project scaffolding
│   │   ├── execution-engine.md       # Task execution
│   │   ├── execution-engine/         # (sub-dir)
│   │   ├── harness-validator.md      # Validation agent
│   │   ├── context-compactor.md      # Context management
│   │   ├── entropy-scanner.md        # Quality/entropy scanning
│   │   ├── fast-path-bootstrap.md    # Lite level fast path
│   │   ├── frontend-designer.md      # UI design
│   │   ├── project-discovery.md      # Discovery phase
│   │   └── openai.yaml               # OpenAI agent config
│   ├── references/                   # Templates, helper docs, type defs
│   ├── scripts/
│   │   ├── harness-setup.ts          # Setup + greenfield/hydration
│   │   ├── harness-upgrade-runtime.ts
│   │   ├── check-skill-contract.mjs
│   │   ├── contract-manifest.json
│   │   ├── run-tracked-tests.mjs
│   │   ├── e2e/                      # E2E test scripts
│   │   └── setup/                    # Setup helpers
│   └── templates/                    # Scaffold files for target projects
├── harness-engineering-structure/
│   ├── SKILL.md                      # Structure validation skill
│   ├── agents/                       # doctor, validator, scaffolder, planner, evaluator
│   ├── references/                   # Machine-readable rules (JSON)
│   ├── scripts/
│   └── templates/
└── scripts/                          # Root-level CI scripts

Runtime Requirements

  • bun (required for harness scripts and Turborepo tasks)
  • git
  • npx skills (install method)
  • Claude Code or Codex (skill consumer)

State Files Written in Target Projects

  • docs/PRD.md — product requirements
  • docs/ARCHITECTURE.md — system design
  • docs/PROGRESS.md — milestone/task status
  • .harness/state.json — runtime machine state
  • AGENTS.md — agent operating instructions
  • CLAUDE.md — Claude-specific instructions
  • docs/ai/ — 6 detailed AI instruction modules
  • docs/adr/ — architecture decision records
03

Components

Harness Engineering Skills (Phlegonlabs) — Components

Skills (2)

| Skill | Purpose |
|-------|---------|
| harness-engineering-orchestrator | Turns idea/existing repo into repo-backed delivery workflow: discovery → PRD → architecture → milestones → tasks → execution → validation |
| harness-engineering-structure | Production-ready Bun+Turbo monorepo scaffold with machine-readable validation rules, 6-layer dependency model, and planning commands |

Agents — harness-engineering-orchestrator (14)

| Agent | Role |
|-------|------|
| orchestrator.md | Master coordinator: manages ceremony level, pacing, phase transitions |
| prd-architect.md | Writes docs/PRD.md from discovery Q&A |
| tech-stack-advisor.md | Recommends tech stack based on project type/ecosystem |
| design-reviewer.md | Reviews docs/ARCHITECTURE.md |
| code-reviewer.md | Code quality review during execution |
| market-research.md | Market research phase (skippable via config) |
| scaffold-generator.md | Generates project scaffold from templates |
| execution-engine.md | Executes individual milestone tasks |
| harness-validator.md | Validates phase completion, runs check-skill-contract.mjs |
| context-compactor.md | Manages context compaction for long sessions |
| entropy-scanner.md | Quality/entropy scanning; prevents goal drift |
| fast-path-bootstrap.md | Lite level Fast Path (2-turn project setup) |
| frontend-designer.md | UI design for frontend projects |
| project-discovery.md | Discovery phase Q&A orchestration |

Agents — harness-engineering-structure (5)

| Agent | Role |
|-------|------|
| doctor | Diagnoses structure issues |
| validator | Validates against machine-readable rules |
| scaffolder | Generates monorepo scaffold |
| planner | Plans structural improvements |
| evaluator | Evaluates compliance with 6-layer model |

Scripts

| Script | Purpose |
|--------|---------|
| harness-setup.ts | Greenfield or existing-repo initialization |
| harness-upgrade-runtime.ts | Upgrade harness runtime in existing projects |
| check-skill-contract.mjs | Validate skill contract compliance |
| contract-manifest.json | Machine-readable contract definition |
| run-tracked-tests.mjs | Run tests with structured tracking output |
| e2e/ | End-to-end test scripts |

Harness Commands (in target projects)

| Command | Purpose |
|---------|---------|
| bun harness:orchestrate | Start/resume orchestration |
| bun harness:advance | Advance to next phase (only if current phase outputs exist) |
| bun harness:approve --plan | Approve overall plan |
| bun harness:approve --phase V1 | Approve phase V1 for deployment |
| bun harness:autoflow | Automated phase advancement (only advances if outputs exist on disk) |
| bun harness:sync-backlog | Sync PRD changes to backlog/progress |
| bun harness:scope-change --apply | Apply scope change (PRD update + backlog sync) |
| bun harness:hooks:install | Install local-only harness files after clone/reset |
05

Prompts

Harness Engineering Skills (Phlegonlabs) — Prompts

Prompt File 1: harness-engineering-orchestrator SKILL.md (verbatim — ceremony levels)

Technique: Tiered-ceremony table with explicit guardian system and precedence chain.

## Harness Levels

The skill operates at three levels of ceremony, auto-detected or user-specified:

| Level | When | Discovery Pacing | Active Guardians | Approval Stops |
|-------|------|-----------------|------------------|----------------|
| **Lite** | Small projects, quick prototypes | Batch 1-2 Qs/turn | Core (G1,G3,G4,G6,G8; G2/G10 warn-only; G5/G7 off) | Fast Path summary, delivery phase completion, blockers |
| **Standard** | Most projects (default) | Groups of 2-3 Qs/turn | All (G1–G8,G10 active) | Overall plan approval, delivery phase completion, blockers |
| **Full** | Enterprise / compliance projects | Sequential Q0-Q9 | All (G1–G8,G10 active) | Overall plan approval, delivery phase completion, blockers, deploy review |

## Team Configuration

Teams can pre-set project defaults by placing a `config.json` in the installed skill directory:

**Supported fields**:
| Field | Default | Description |
|---|---|---|
| `defaults.harnessLevel` | auto-detect | Starting harness level |
| `defaults.teamSize` | `solo` | Team size (solo/small/large) |
| `defaults.ecosystem` | auto-detect | Toolchain ecosystem |
| `guardianOverrides.warnOnly` | `[]` | Guardians to downgrade to warn |
| `guardianOverrides.disabled` | `[]` | Guardians to disable |

**Precedence chain (highest → lowest)**:
CLI flags → config.json defaults → interactive discovery → state.json (canonical)

Technique: Guardian system with configurable enforcement levels (block vs warn vs disabled). The G1-G10 guardian numbering suggests a named constraint system where each guardian enforces a specific project invariant.
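
The precedence chain in the excerpt above can be read as "the first source that defines a value wins, scanning from highest to lowest". The sketch below shows one way such a resolver might look. The `Source` labels, the `Settings` shape, and `resolveSetting` are hypothetical names, and the excerpt does not spell out how the runtime actually reconciles `state.json` (marked canonical) with the higher-precedence sources.

```typescript
// Sources ordered highest → lowest precedence, per the chain quoted above.
type Source = "cli" | "config" | "discovery" | "state";
const PRECEDENCE: readonly Source[] = ["cli", "config", "discovery", "state"];

// Illustrative subset of settings; the real set of keys is not documented here.
type Settings = { harnessLevel?: string; teamSize?: string; ecosystem?: string };

// Return the first defined value for `key`, along with the source it came from.
function resolveSetting(
  key: keyof Settings,
  sources: Partial<Record<Source, Settings>>,
): { value: string; from: Source } | undefined {
  for (const source of PRECEDENCE) {
    const value = sources[source]?.[key];
    if (value !== undefined) return { value, from: source };
  }
  return undefined;
}

const picked = resolveSetting("harnessLevel", {
  config: { harnessLevel: "standard" },
  cli: { harnessLevel: "full" },
  state: { harnessLevel: "lite" },
});
console.log(picked); // { value: "full", from: "cli" }
```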

Prompt File 2: harness-engineering-orchestrator SKILL.md (verbatim — orchestrator contract)

Technique: Explicit "Orchestrator Contract" with file-as-truth authority and advancement rules.

## Orchestrator Contract

When this skill runs, act as the **Orchestrator**.

- Use level-aware discovery pacing: Lite batches 1-2 questions per turn, Standard groups
  2-3 related questions per turn, and Full asks one question per turn
- Keep runtime state, documents, backlog, and gates synchronized
- Treat `docs/PRD.md` and `docs/ARCHITECTURE.md` as the only planning source of truth
- Advance phases through the runtime (`bun harness:advance`); do not fake completion
- `bun harness:autoflow` may only advance after the current phase's required outputs
  exist on disk; missing scaffold/runtime artifacts must keep the workflow on the
  current phase
- If the user adds scope outside the current task or milestone, write it back into the
  PRD first
- When `pendingScopeChanges` exist with `status: "pending"`, surface them before
  dispatching any agent

Technique: Explicit anti-fake-completion rules. "Do not fake completion" and "missing artifacts must keep the workflow on the current phase" are Iron Law constraints preventing the agent from claiming a phase is done without filesystem evidence.

09

Uniqueness

Harness Engineering Skills (Phlegonlabs) — Uniqueness

differs_from_seeds

Closest to taskmaster-ai (file-based delivery with milestone/task decomposition) but implemented as a skills pack rather than an MCP server, and with a more sophisticated ceremony model (3 levels with configurable guardians). The G1-G10 guardian system with block/warn/disabled states and config.json team presets is more configurable than any seed. The context-compactor.md dedicated agent for context management is unique in the corpus — no seed has a named agent for this role. The filesystem-enforced phase gate (bun harness:advance only advances when required files exist on disk) is closer to kiro's spec-driven task gates than any of the pure-prompt seeds. At 19 total agents, it rivals BMAD (34 skills + 6 personas) in orchestration depth while being more delivery-phase-focused.

Most Distinctive Feature

Filesystem-enforced phase state machine: bun harness:advance checks for required files on disk before allowing phase transitions. This is not a prompt-based gate ("please confirm") but a code-enforced gate — missing artifacts prevent the harness from advancing regardless of what the agent says. No other framework in the 11 seeds implements this pattern.

Second: The context-compactor.md dedicated agent. Other frameworks treat context compaction as an afterthought or leave it to the model; this framework employs a named specialist agent responsible for managing context during long sessions.

Positioning

  • PRD-first delivery for teams wanting AI agents in a controlled workflow
  • Three-tier ceremony model serves both solo developers (Lite) and enterprise teams (Full)
  • Multilingual docs (English + Chinese) suggest a broader audience
  • TypeScript/Bun-native with CI, release automation, and contract validation
  • v1.8.6 with active GitHub Actions pipeline — most professionally maintained skill repo in this batch

Observable Failure Modes

  1. Bun required: The harness scripts require bun, which may not be installed in all environments. Python or npm alternatives would broaden adoption.
  2. Complex setup: harness-setup.ts generates many files (docs/PRD.md, docs/ARCHITECTURE.md, .harness/state.json, AGENTS.md, CLAUDE.md, etc.) — high initial setup cost.
  3. Guardian opacity: The G1-G10 naming is opaque — the SKILL.md doesn't enumerate what each guardian enforces in the examined content.
  4. Scope change ceremony: Every scope addition must go through bun harness:scope-change --apply or manual PRD edit + bun harness:sync-backlog — friction for exploratory development.
  5. PRD-first is a commitment: Teams that prefer emergent design (discover requirements through implementation) may find the PRD-first approach constraining.
04

Workflow

Harness Engineering Skills (Phlegonlabs) — Workflow

Delivery Loop

Project Plan → Delivery Phase → Milestone → Task → Validation

Standard Level Workflow

bun harness:orchestrate

DISCOVERY phase
  ↓ project-discovery agent: 2-3 Q groups per turn
  ↓ market-research agent: competitive landscape (skippable)
  ↓ tech-stack-advisor agent: stack recommendation
  ↓ [APPROVAL GATE: Plan approval]

PRD + ARCHITECTURE phase
  ↓ prd-architect agent writes docs/PRD.md
  ↓ design-reviewer agent writes docs/ARCHITECTURE.md
  ↓ [APPROVAL GATE: Architecture approval]

SCAFFOLD phase
  ↓ scaffold-generator agent creates project structure
  ↓ harness-validator validates scaffold exists on disk
  ↓ bun harness:advance (only advances if scaffold outputs exist)

EXECUTING phase
  ↓ Milestones decomposed from PRD → docs/PROGRESS.md
  ↓ execution-engine agent executes tasks milestone by milestone
  ↓ code-reviewer agent reviews code during execution
  ↓ entropy-scanner detects goal drift
  ↓ context-compactor manages context for long sessions
  ↓ [APPROVAL GATE: Delivery phase completion]
  ↓ [APPROVAL GATE: blockers (any time)]

DEPLOY REVIEW phase (Full level only)
  ↓ [APPROVAL GATE: bun harness:approve --phase V1]

COMPLETE

Fast Path (Lite Level)

Compresses DISCOVERY through SCAFFOLD into 2 user turns (steps 1 and 2), after which the skill scaffolds without further input:

  1. User describes project in one message
  2. User confirms inferred plan
  3. Skill scaffolds immediately → enters EXECUTING

Phase-to-Artifact Map

| Phase | Artifact |
|-------|----------|
| Discovery | Q&A output (in session) |
| PRD | docs/PRD.md |
| Architecture | docs/ARCHITECTURE.md |
| Scaffold | Project structure on disk, .harness/state.json |
| Execution | docs/PROGRESS.md (milestone/task status), code changes |
| Complete | Deployed application |

Approval Gates

| Gate | Type | When |
|------|------|------|
| Overall plan approval | yes-no | After PRD + Architecture |
| Delivery phase completion | yes-no | After each V1/V2/etc. |
| Scope change approval | file-review | When PRD is modified mid-project |
| Deploy review | typed-confirm | Full level only |
| Blockers | freetext-clarify | Any time |
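
Which of these gates apply depends on the ceremony level. The following sketch encodes the per-level "Approval Stops" from the Harness Levels table as a lookup. The string identifiers and the `requiresApproval` helper are hypothetical, and scope-change approval is left out because the levels table does not list it per level.

```typescript
type Level = "lite" | "standard" | "full";
type Stop =
  | "fast-path-summary"
  | "plan-approval"
  | "delivery-phase-completion"
  | "blockers"
  | "deploy-review";

// Approval stops per level, transcribed from the Harness Levels table.
const APPROVAL_STOPS: Record<Level, readonly Stop[]> = {
  lite: ["fast-path-summary", "delivery-phase-completion", "blockers"],
  standard: ["plan-approval", "delivery-phase-completion", "blockers"],
  full: ["plan-approval", "delivery-phase-completion", "blockers", "deploy-review"],
};

// True when the workflow should pause for the user at this stop.
function requiresApproval(level: Level, stop: Stop): boolean {
  return APPROVAL_STOPS[level].includes(stop);
}

console.log(requiresApproval("full", "deploy-review"));     // true
console.log(requiresApproval("standard", "deploy-review")); // false
```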

State Machine

The bun harness:advance command enforces that phases cannot skip: required output files must exist on disk before the state machine advances. This is a filesystem-enforced phase gate — not just a prompt-based gate.
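
A filesystem-enforced gate of this kind can be sketched in a few lines of TypeScript. This is an illustration of the pattern, not the harness's actual implementation. The `REQUIRED_OUTPUTS` map is a guess based on the Phase-to-Artifact Map, and `canAdvance` and the phase identifiers are hypothetical names.

```typescript
// Hypothetical phase → required-artifact map, loosely following the
// Phase-to-Artifact Map above; the real harness may check other files.
type Phase = "prd" | "architecture" | "scaffold" | "executing";

const REQUIRED_OUTPUTS: Record<Phase, readonly string[]> = {
  prd: ["docs/PRD.md"],
  architecture: ["docs/ARCHITECTURE.md"],
  scaffold: [".harness/state.json"],
  executing: ["docs/PROGRESS.md"],
};

type AdvanceResult = { ok: true } | { ok: false; missing: string[] };

// `exists` is injected so the check can be exercised without touching disk;
// in a Bun or Node script it could be `existsSync` from "node:fs".
function canAdvance(phase: Phase, exists: (path: string) => boolean): AdvanceResult {
  const missing = REQUIRED_OUTPUTS[phase].filter((path) => !exists(path));
  return missing.length === 0 ? { ok: true } : { ok: false, missing };
}

// Simulated repository in which only the PRD has been written so far.
const onDisk = new Set(["docs/PRD.md"]);
const prdGate = canAdvance("prd", (p) => onDisk.has(p));
const archGate = canAdvance("architecture", (p) => onDisk.has(p));
console.log(prdGate.ok, archGate.ok); // true false
```

The point of the pattern is that the decision depends only on what is on disk, so an agent's claim that a phase is finished cannot move the state machine forward.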

06

Memory Context

Harness Engineering Skills (Phlegonlabs) — Memory & Context

State Storage (Primary)

.harness/state.json — the canonical runtime state file written by harness scripts. Contains:

  • projectInfo.harnessLevel (lite/standard/full)
  • projectInfo.teamSize
  • projectInfo.ecosystem
  • Current phase
  • Milestone and task status
  • pendingScopeChanges (scope additions waiting for PRD update)
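
For orientation, the contents listed above might map onto a TypeScript shape like the one below. The actual schema is not shown in the examined content, so treat this as a hypothetical sketch. `projectInfo.*` and `pendingScopeChanges` are named in the source, along with `status: "pending"`. The `phase` key, the `description` field, and the `"applied"` status are assumptions.

```typescript
// Hypothetical shape for .harness/state.json.
interface ScopeChange {
  description: string; // assumed field
  status: "pending" | "applied"; // "pending" is from the SKILL.md; "applied" is assumed
}

interface HarnessState {
  projectInfo: {
    harnessLevel: "lite" | "standard" | "full";
    teamSize: string;
    ecosystem: string;
  };
  phase: string; // assumed key name for the current phase
  pendingScopeChanges: ScopeChange[];
}

// Scope changes the orchestrator should surface before dispatching any agent.
function pendingChanges(state: HarnessState): ScopeChange[] {
  return state.pendingScopeChanges.filter((change) => change.status === "pending");
}

const example: HarnessState = {
  projectInfo: { harnessLevel: "standard", teamSize: "small", ecosystem: "bun" },
  phase: "executing",
  pendingScopeChanges: [
    { description: "Add CSV export", status: "pending" },
    { description: "Rename onboarding flow", status: "applied" },
  ],
};
const toSurface = pendingChanges(example);
console.log(toSurface.length); // 1
```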

Documentation Files (Human + Agent-Read)

| File | Content |
|------|---------|
| docs/PRD.md | Product requirements; together with docs/ARCHITECTURE.md, the only planning source of truth |
| docs/ARCHITECTURE.md | System design; together with docs/PRD.md, the only planning source of truth |
| docs/PROGRESS.md | Milestone and task status |
| AGENTS.md | Agent operating instructions (summary of docs/ai/) |
| CLAUDE.md | Claude-specific instructions |
| docs/ai/ | 6 detailed AI instruction modules (operating principles, project context, guardrails, task execution, commands, context health) |
| docs/adr/ | Architecture decision records |

Context Compaction Agent

The context-compactor.md agent is a dedicated role for managing context during long sessions. This is the only framework in the batch with a named agent for context compaction — it treats context management as a role, not a sidebar.

Cross-Session Handoff

Strong — the entire state is written to disk. Sessions can resume from .harness/state.json at any point:

bun harness:orchestrate  # reads .harness/state.json and resumes
bun harness:hooks:install  # restore local-only files after clone/reset

Memory Type

File-based + structured JSON. The combination of PRD.md (human-readable planning) + state.json (machine-readable state) + PROGRESS.md (status tracking) creates a three-layer memory system.

07

Orchestration

Harness Engineering Skills (Phlegonlabs) — Orchestration

Multi-Agent

Yes — 19 named agents across two skills (14 in orchestrator + 5 in structure).

Orchestration Pattern

Hierarchical. The orchestrator.md agent coordinates all other agents:

  • dispatches project-discovery for requirements
  • dispatches prd-architect for PRD writing
  • dispatches tech-stack-advisor for stack selection
  • dispatches scaffold-generator for scaffolding
  • dispatches execution-engine for task execution
  • dispatches harness-validator for phase validation
  • dispatches context-compactor when context pressure detected
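
The hierarchical pattern amounts to a routing decision made by the orchestrator on each step. The sketch below is one hypothetical way to express it. The agent names come from the list above, while the `Task` labels, the `Situation` shape, and the ordering of the scope-change and context-pressure checks are assumptions.

```typescript
type Task = "requirements" | "prd" | "stack" | "scaffold" | "execute" | "validate";

// Agent names as listed above; the task labels are illustrative.
const AGENT_FOR_TASK: Record<Task, string> = {
  requirements: "project-discovery",
  prd: "prd-architect",
  stack: "tech-stack-advisor",
  scaffold: "scaffold-generator",
  execute: "execution-engine",
  validate: "harness-validator",
};

interface Situation {
  task: Task;
  contextPressure: boolean;
  pendingScopeChanges: number;
}

type Decision =
  | { kind: "ask-user"; reason: string }
  | { kind: "dispatch"; agent: string };

function decide(s: Situation): Decision {
  // The SKILL.md requires pending scope changes to be surfaced before any dispatch.
  if (s.pendingScopeChanges > 0) {
    return { kind: "ask-user", reason: "pending scope changes" };
  }
  // Assumption: compaction takes priority over the regular task agent.
  if (s.contextPressure) return { kind: "dispatch", agent: "context-compactor" };
  return { kind: "dispatch", agent: AGENT_FOR_TASK[s.task] };
}

const decision = decide({ task: "prd", contextPressure: false, pendingScopeChanges: 0 });
console.log(decision); // { kind: "dispatch", agent: "prd-architect" }
```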

Subagent Definition Format

persona-md — each agent is an .md file in the agents/ directory.

Spawn Mechanism

Claude's Task tool (skill dispatches agents via Task).

Isolation Mechanism

None for code changes (in-place). However, the harness scripts (bun harness:advance, bun harness:autoflow) enforce filesystem-based phase gates — phases cannot advance without required files on disk. This is a novel form of state-machine isolation that prevents premature advancement.

Multi-Model

Partial. The openai.yaml file in the agents directory suggests possible OpenAI model configuration for some agents, and the README names both Claude and Codex as consumers of the workflow. The examined content does not confirm per-agent model routing.

Execution Mode

Interactive-loop for Standard/Full. One-shot for Lite Fast Path. The bun harness:orchestrate command provides a continuous loop.

Prompt Chaining

Yes — PRD → Architecture → Scaffold → Execution is an explicit prompt chain. Each phase's output is the next phase's input.

Consensus

No formal consensus mechanism, but scope changes require explicit user approval before any agent proceeds (pendingScopeChanges check).

Guardian System

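The G1-G10 guardian numbering suggests a named constraint system where each guardian enforces a project invariant. The examined SKILL.md content does not enumerate them, so any mapping (for instance, G1 = PRD exists before coding, or G2 = tests pass before advance) is a guess rather than documented behavior. G9 does not appear in the excerpted levels table. Guardians can be demoted to warn-only or disabled via config.json.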
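
The per-level guardian sets from the Harness Levels table, combined with the `guardianOverrides` fields, could be resolved roughly as follows. This is a sketch under stated assumptions. "Active" is mapped to a `block` mode, overrides are assumed only to loosen enforcement, and `disabled` is assumed to win over `warnOnly`. The names `guardianModes` and `LITE_OVERRIDES` are hypothetical.

```typescript
type Level = "lite" | "standard" | "full";
type Mode = "block" | "warn" | "disabled";

// G9 does not appear in the excerpted table, so it is omitted here.
const GUARDIANS = ["G1", "G2", "G3", "G4", "G5", "G6", "G7", "G8", "G10"] as const;
type Guardian = (typeof GUARDIANS)[number];

// Lite-level deviations from "all active", per the Harness Levels table.
const LITE_OVERRIDES: Partial<Record<Guardian, Mode>> = {
  G2: "warn",
  G10: "warn",
  G5: "disabled",
  G7: "disabled",
};

function guardianModes(
  level: Level,
  overrides: { warnOnly?: string[]; disabled?: string[] } = {},
): Record<Guardian, Mode> {
  const modes = {} as Record<Guardian, Mode>;
  for (const g of GUARDIANS) {
    let mode: Mode = level === "lite" ? (LITE_OVERRIDES[g] ?? "block") : "block";
    // Assumption: config overrides only loosen enforcement, never tighten it.
    if (overrides.warnOnly?.includes(g) && mode === "block") mode = "warn";
    if (overrides.disabled?.includes(g)) mode = "disabled";
    modes[g] = mode;
  }
  return modes;
}

const lite = guardianModes("lite");
console.log(lite.G1, lite.G2, lite.G5); // block warn disabled
```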

Crash Recovery

Strong — all state is file-based. Harness can be fully recovered from .harness/state.json after any interruption.

08

UI CLI Surface

Harness Engineering Skills (Phlegonlabs) — UI & CLI Surface

CLI Binary

Harness scripts form an operational CLI within target projects:

bun harness:orchestrate       # Start/resume orchestration loop
bun harness:advance           # Advance to next phase (enforces artifacts)
bun harness:approve --plan    # Approve overall delivery plan
bun harness:approve --phase V1 # Approve phase V1 for deployment
bun harness:autoflow          # Automated phase advancement
bun harness:sync-backlog      # Sync PRD changes to backlog
bun harness:scope-change --apply # Apply scope changes
bun harness:hooks:install     # Restore local harness files

These are Bun scripts in the target project (not in the skill repo itself). They are generated by harness-setup.ts during initialization.

Local UI

None. No web dashboard.

IDE Integration

  • Claude Code: Primary target (SKILL.md format)
  • Codex: Secondary target (referenced in description)

Observability

The most observable framework in this batch:

  • docs/PROGRESS.md — milestone/task status visible to humans and agents
  • .harness/state.json — machine-readable state
  • docs/adr/ — architecture decisions tracked
  • docs/ai/ — 6 detailed modules documenting the agent's operating context

CI/CD Integration

The repo includes GitHub Actions CI and Release workflows (for the skill repo itself). The skill's harness-setup.ts generates CI/CD configs for target projects.

Contract Validation

check-skill-contract.mjs validates that the skill's declared contract matches the actual SKILL.md content — a meta-validation layer for the skill itself. The contract-manifest.json defines what the skill promises.

Multilingual Documentation

Unusually, this repo ships its README in two languages across three files: English (README.en.md), Chinese (README.zh-CN.md), and a primary README.md that links to both. This suggests a broader user base than most personal skill repos.
