revfactory Claude Code Harness — Summary
This is a research repository, not a user-facing framework. It contains a controlled A/B experiment comparing Claude Code output quality with and without a pre-configured .claude/ harness across 15 software engineering tasks at three difficulty levels. The central finding, reported in a bundled paper, is that harness pre-configuration raises the average quality score from 49.5 to 79.3 (+29.8 points, about +60% relative), and that the effect grows with task complexity (+36.2 points at the Expert level). The harness actually delivered is small: 4 slash commands (/experiment, /evaluate, /report, /run-advanced-experiment), 3 skills (experiment-runner, output-evaluator, report-generator), and an experiments/ directory with YAML test cases, worktree-isolated baseline and harness agents, and JSON result tracking. There is no installable plugin or reusable end-user tool; the value is the research evidence. Differs from seeds: this is the only "methodology proof" in the catalog rather than a tool. It is analogous to agent-os (markdown scaffold plus proof of concept) but focused on empirical validation rather than opinionated guidelines. The experiment design resembles the parallel fan-out orchestration of superpowers and revfactory-harness, but instrumented for measurement.
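
The headline figure mixes two units: the overall gain is quoted as a relative percentage, while the Expert gain is quoted in points. A minimal sketch of that arithmetic, using only the two averages quoted above, is given here. The function name `improvement` is a hypothetical helper for illustration and is not taken from the repository.

```python
def improvement(baseline: float, treated: float) -> tuple[float, float]:
    """Return (absolute gain in points, relative gain in percent)."""
    absolute = treated - baseline
    relative = absolute / baseline * 100.0
    return absolute, relative


# Average quality scores quoted in the summary: without and with the harness.
absolute, relative = improvement(49.5, 79.3)
print(f"+{absolute:.1f} points, +{relative:.1f}%")  # prints "+29.8 points, +60.2%"
```

The "+60%" is therefore a relative gain over the baseline average, whereas the Expert figure (+36.2 points) is an absolute difference and is not directly comparable to it.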