trw-mcp (TRW Framework)

trw-mcp · wallter/trw-mcp · ★ 0 · last commit 2026-04-29

Prevent AI coding session state loss and quality drift by enforcing ceremony, capturing learnings, and orchestrating multi-agent teams through a 6-phase agile model.

Best whenOrchestrating (delegating, verifying, learning) produces higher quality than direct implementation — an agent's strategic oversight is more valuable than its…

Skip ifBackground agents (forbidden), Skipping ceremony phases

vs seeds

claude-flow(toolserver with 305 MCP tools), trw-mcp is methodology-first with only 24 targeted tools. The BSL 1.1 license and the p…

Primitive shape 73 total

Skills 24 Subagents 12 Hooks 13 MCP tools 24

Summary

trw-mcp — Summary

trw-mcp is the MCP server component of the TRW (The Real Work) framework, shipping a pip install trw-mcp package that bootstraps any git repository with persistent engineering memory, spec-driven PRD workflows, and a 6-phase agile execution model. It exposes 24 MCP tools covering the full session lifecycle — session start/checkpoint/deliver, knowledge capture/recall, build verification, PRD creation, and ceremony compliance — plus 24 skills (slash-command workflows) and 12 bundled agents for parallel Agent Team execution. State lives in a .trw/ directory and survives context compaction via checkpoint/deliver ceremony; a trw_session_start() call replays captured learnings at every new session. The CLI binary trw-mcp ships subcommands for init-project, update-project, audit, config-reference, and export, making deployment a one-liner. Internally, it is a Python FastMCP server with strict mypy + pytest coverage requirements (90%+) and a documented SWE-bench eval program (iter-0..10 showed null lift, iter-11+ ongoing).

Differs from seeds: Closest to ccmemory (MCP-anchored memory persistence) and spec-driver (skills-driven execution with quality gates), but trw-mcp is uniquely a full-stack agile harness — it combines MCP persistence, 12 named agent roles, a RESEARCH→PLAN→IMPLEMENT→VALIDATE→REVIEW→DELIVER phase model with adversarial auditing, AARE-F PRD format, and ceremony-enforcement hooks, none of which appear in ccmemory or spec-driver. Unlike taskmaster-ai (JSON task file + MCP tools), trw-mcp stores learnings as YAML entries with vector/keyword hybrid recall and auto-compacts via trw_checkpoint, giving it cross-session continuity absent from taskmaster.

Overview

trw-mcp — Overview

Origin

trw-mcp is the MCP server component of the TRW (The Real Work) Framework, authored by wallter (Nico Wallter) and released under BSL 1.1. It was built entirely by AI agents using TRW itself — a dogfooding signal the README cites with measured skepticism. The linked eval bench (docs/eval/iter-notes/) reports null performance lift on SWE-bench single-shot at n=40-47 for iterations 0-10, with iter-11+ prospector analysis ongoing.

Philosophy

The framework's philosophy is captured in its FRAMEWORK.md opening:

"v24.5_TRW — CLAUDE CODE ORCHESTRATED AGILE SWARM. Slim-Persist | Parallel-First | Formation-Driven | Interrupt-Safe | CLI/TDD | YAML-First | Sensible Defaults | MCP-Integrated | Skills-Driven | Agent-Teams"

And from AGENTS.md:

"Sessions where you orchestrate (delegate, verify, learn) rather than implement directly produce higher quality and fewer rework cycles — your strategic oversight is more valuable than your keystrokes."

The README states:

"Every AI coding tool resets to zero. TRW preserves state across sessions via the .trw/ directory — session-start recall replays prior learnings at every new session."

And on empirical claims:

"Whether this yields measurable task-completion lift is an open empirical question; iter-0..10 SWE-bench-single-shot measurements showed null (n=40/47)."

Core Ideas

Ceremony as discipline: The 6-phase model (RESEARCH→PLAN→IMPLEMENT→VALIDATE→REVIEW→DELIVER) enforces phase gates; hooks block delivery if build fails.
Dogfooding at scale: 336 PRDs, 90 sprints, 9,200+ tests, 90% coverage — the repo was built by its own agents.
AARE-F PRDs: A proprietary requirements format (AARE-F = Assumption, Acceptance, Requirements, Evidence, Functional) used for spec creation.
Formation-driven parallel work: The FRAMEWORK.md defines 4+5 named formations (e.g., research wave, implement+review team) that orchestrators select based on task complexity.
Value-oriented framing: All hook messages avoid "CRITICAL/ALWAYS/NEVER" language per Anthropic Claude best practices — instead explain consequences.

Antipatterns (explicit)

From FRAMEWORK.md's "Rationalization Watchlist":

Skipping ceremony (phase gates, build checks)
Starting without calling trw_session_start()
Delivering without trw_deliver()
Background agents (FORBIDDEN in the framework)
Multiple Task() calls without blocking

Architecture

trw-mcp — Architecture

Distribution

Type: MCP server + CLI tool (Python package)
Registry: PyPI (pip install trw-mcp)
Binary name: trw-mcp
License: BSL 1.1 (Business Source License — time-limited proprietary, not pure OSS)
Language: Python 3.10+

Install

pip install trw-mcp
# or with vector search support:
pip install 'trw-mcp[vectors]'

# Deploy TRW to a target project
trw-mcp init-project /path/to/repo

init-project bootstraps:

.trw/ — learning memory, run state, configuration
.mcp.json — MCP server connection
CLAUDE.md / AGENTS.md — project instructions
.claude/hooks/ — ceremony enforcement hooks
.claude/skills/ — workflow automation skills
.claude/agents/ — specialized sub-agents

Directory Tree (deployed project)

.trw/
  config.yaml           # embeddings_enabled, ceremony_mode, learning_max_entries
  learnings/            # YAML-format captured learnings
  runs/                 # Per-run state (run.yaml, phases, shards)
  frameworks/
    FRAMEWORK.md        # 6-phase methodology spec
  context/
    messages.yaml       # Value-oriented AI-facing message registry

.claude/
  hooks/
    session-start.sh
    session-end.sh
    pre-compact.sh
    stop-ceremony.sh
    task-completed.sh
    post-tool-event.sh
    pre-tool-deliver-gate.sh
    validate-prd-write.sh
    subagent-start.sh
    subagent-stop.sh
    teammate-idle.sh
    user-prompt-submit.sh
    phase-cycle-stop.sh
  skills/
    trw-sprint-init/    # Each skill is a directory with SKILL.md
    trw-deliver/
    trw-commit/
    ... (24 total)
  agents/
    trw-lead.md
    trw-implementer.md
    trw-tester.md
    trw-researcher.md
    trw-reviewer.md
    trw-adversarial-auditor.md
    trw-prd-groomer.md
    trw-requirement-writer.md
    trw-requirement-reviewer.md
    trw-traceability-checker.md
    trw-code-simplifier.md
    trw-auditor.md     (12 total)

Source Architecture

src/trw_mcp/
  server/           # FastMCP entry point, middleware chain
  bootstrap.py      # init-project deploy logic
  models/           # Pydantic v2 models
  tools/            # 24 MCP tool implementations
  state/            # Persistence, validation, analytics
  middleware/       # Ceremony, observation masking, response optimizer
  telemetry/        # Opt-in telemetry pipeline
  prompts/          # AARE-F prompts + messaging registry
  data/             # Bundled hooks, skills, agents (deployed by init-project)

Required Runtime

Python 3.10+
Optional: [vectors] extra for vector/hybrid search (sentence-transformers or compatible)

Target AI Tools

Claude Code (primary), Cursor, opencode, Codex — configurable via --ide flag on init-project. Multi-IDE because the bootstrap can generate IDE-specific config files.

Configuration

.trw/config.yaml controls all behavior:

embeddings_enabled: false
learning_max_entries: 5000
build_check_enabled: true
ceremony_mode: "full"  # "full" | "light" | "off"
progressive_disclosure: false

Environment variables with TRW_ prefix override config.

Components

trw-mcp — Components

MCP Tools (24)

Category	Tool	Purpose
Session	`trw_session_start`	Load prior learnings, recover active run
Session	`trw_init`	Initialize a new run
Session	`trw_status`	Show current run state
Session	`trw_checkpoint`	Atomic state snapshot
Session	`trw_pre_compact_checkpoint`	Save before context compaction
Session	`trw_progressive_expand`	Progressively reveal tools
Learning	`trw_learn`	Capture new learning/discovery
Learning	`trw_learn_update`	Update or resolve existing learning
Learning	`trw_recall`	Retrieve relevant learnings
Learning	`trw_knowledge_sync`	Sync knowledge to disk
Learning	`trw_claude_md_sync`	Regenerate CLAUDE.md from learnings
Quality	`trw_build_check`	Run pytest + mypy
Quality	`trw_review`	Code review tool
Quality	`trw_trust_level`	Assess code trust level
Quality	`trw_quality_dashboard`	View quality metrics
Quality	`trw_deliver`	Full delivery ceremony
Requirements	`trw_prd_create`	Create AARE-F PRD
Requirements	`trw_prd_validate`	Validate PRD completeness
Ceremony	`trw_ceremony_status`	Check ceremony compliance
Ceremony	`trw_ceremony_approve`	Approve ceremony step
Ceremony	`trw_ceremony_revert`	Revert ceremony state
Reporting	`trw_run_report`	Run metrics report
Reporting	`trw_analytics_report`	Analytics dashboard
Reporting	`trw_usage_report`	Cost/usage tracking

Skills (24 slash-command workflows)

Sprint & Delivery:

/trw-sprint-init — Initialize sprint with phase plan
/trw-deliver — Full delivery with build check + learning synthesis
/trw-commit — Ceremony-compliant git commit
/trw-sprint-finish — Close sprint, archive artifacts
/trw-sprint-team — Team sprint coordination
/trw-exec-plan — Execute plan from PRD

Requirements:

/trw-prd-new — Create new AARE-F PRD
/trw-prd-ready — Mark PRD as ready for implementation
/trw-prd-groom — Groom PRD backlog
/trw-prd-review — Review PRD quality

Quality:

/trw-audit — Audit TRW configuration
/trw-review-pr — PR review with ADR check
/trw-simplify — Code simplification pass
/trw-dry-check — DRY principle check
/trw-security-check — Security review
/trw-test-strategy — TDD test strategy

Framework:

/trw-framework-check — Verify framework compliance
/trw-project-health — Project health report
/trw-memory-audit — Audit learning quality
/trw-memory-optimize — Prune/optimize learnings
/trw-self-review — Agent self-review pass
/trw-team-playbook — Team coordination guide
/trw-ceremony-guide — Ceremony instructions
/trw-learn — Manual learning capture

Agents (12 named sub-agents)

Agent	Role	Model
`trw-lead`	Team orchestration, phase gating	Opus (high effort)
`trw-implementer`	Single-file/small edits	Sonnet
`trw-tester`	TDD test writing	Sonnet
`trw-researcher`	Evidence gathering, research waves	Sonnet
`trw-reviewer`	Code review, no production writes	Sonnet
`trw-adversarial-auditor`	Spec-vs-code gap finding	Sonnet
`trw-prd-groomer`	PRD backlog grooming	Sonnet
`trw-requirement-writer`	AARE-F requirement authoring	Sonnet
`trw-requirement-reviewer`	Requirement quality review	Sonnet
`trw-traceability-checker`	PRD-to-code traceability	Sonnet
`trw-code-simplifier`	Code health, DRY/complexity	Sonnet
`trw-auditor`	TRW configuration audit	Sonnet

Hooks (lifecycle enforcement)

All hooks are bash scripts in .claude/hooks/:

Hook	Event	Purpose
`session-start.sh`	SessionStart	Load learnings, emit ceremony guidance
`session-end.sh`	Stop	Prompt deliver ceremony
`pre-compact.sh`	PreCompact	Auto-checkpoint before compaction
`stop-ceremony.sh`	Stop	Block stop if ceremony incomplete
`task-completed.sh`	SubagentStop	Record completed task
`post-tool-event.sh`	PostToolUse	Log significant tool events
`pre-tool-deliver-gate.sh`	PreToolUse	Gate deliver before build pass
`validate-prd-write.sh`	PreToolUse(Write)	Validate PRD format on write
`subagent-start.sh`	SessionStart (subagent)	Load subagent context
`subagent-stop.sh`	SubagentStop	Persist subagent learnings
`teammate-idle.sh`	SubagentStop	Detect idle teammate
`user-prompt-submit.sh`	UserPromptSubmit	Inject ceremony context
`phase-cycle-stop.sh`	Stop	Prevent phase boundary stop

CLI Subcommands

trw-mcp <subcommand>:

init-project — Deploy TRW to target repo
update-project — Migrate existing TRW installation
check-instructions — Validate instruction-tool parity (exit 1 on mismatch)
audit — Audit TRW configuration
config-reference — Print all TRW_ env vars
export — Export learnings (JSON format)
uninstall — Remove TRW from project

Templates

Bundled in src/trw_mcp/data/:

prd_template.md — AARE-F PRD template
playbook-template.yaml — Team playbook template
settings.json — Claude Code settings template
aaref.md — AARE-F requirements guide
behavioral_protocol.yaml — Agent behavioral rules
framework.md — Full 6-phase framework spec

Prompts

trw-mcp — Prompts

Excerpt 1: FRAMEWORK.md — Phase Model (from `src/trw_mcp/data/framework.md`)

Technique: Structured constraint + rationalization watchlist (same pattern as superpowers' Iron Law, but with explicit anti-skip safeguards labeled as such).

## PHASES

RESEARCH -> PLAN -> IMPLEMENT -> VALIDATE -> REVIEW -> DELIVER

| Phase | Exit Criteria | Skills | Cap |
|-------|---------------|--------|-----|
| RESEARCH | plan.md draft, >=3 evidence paths, formation selected. | `/trw-framework-check` | 25% |
| PLAN | Acceptance criteria, shards planned, wave_manifest.yaml created. | `/trw-sprint-init`, `/trw-prd-new`, `/trw-prd-ready` | 15% |
| IMPLEMENT | Shards/waves complete OR checkpointed, tests written. | `/trw-test-strategy` | 35% |
| VALIDATE | Coverage >= target, gates pass, no P0. Run `trw_build_check(scope="full")`. | `/trw-test-strategy` | 10% |
| REVIEW | Critic reviewed, simplifications applied, reflection completed. | `/trw-memory-audit` | 10% |
| DELIVER | PR created OR archived, final.md, CLAUDE.md synced. | `/trw-deliver`, `/trw-sprint-finish` | 5% |

ORC tracks elapsed wall-clock against TIMEBOX_HOURS. ORC MUST NOT advance until exit criteria met OR cap exceeded with rationale. Refine plan until stable — fixing a plan is cheaper than rewriting code.

## RATIONALIZATION WATCHLIST

If you catch yourself thinking any of these, stop and follow the process — these are the exact thoughts that precede ceremony skips:

Excerpt 2: trw-lead agent definition (from `src/trw_mcp/data/agents/trw-lead.md`)

Technique: Persona-md with tool whitelist and behavioral framing. Explicit disallowedTools array is unusual — most persona-md frameworks only whitelist.

---
name: trw-lead
description: >
  Team orchestration lead. Use when coordinating multi-agent sprint execution —
  spawning teammates, routing file-ownership messages, managing worktrees,
  gating phase transitions across the 6-phase RESEARCH through DELIVER
  lifecycle. Not for single-file edits (use trw-implementer) or read-only
  reviews (use trw-reviewer). Does not write production code; stays in
  delegate mode during IMPLEMENT.
model: opus
effort: high
maxTurns: 200
memory: project
allowedTools:
  - Read
  - Edit
  - Write
  - Bash
  - Skill
  - TaskCreate
  - TaskUpdate
  - TaskList
  - TeamCreate
  - SendMessage
  - EnterWorktree
  - mcp__trw__trw_session_start
  - mcp__trw__trw_build_check
  - mcp__trw__trw_deliver
disallowedTools:
  - NotebookEdit
---

Excerpt 3: trw-deliver skill (from `src/trw_mcp/data/skills/trw-deliver/SKILL.md`)

Technique: Multi-step workflow with explicit rationalization watchlist — makes agent self-monitor for ceremony bypass.

## Rationalization Watchlist

| Thought | Why it's wrong | Consequence |
|---------|---------------|-------------|
| "The build passed earlier, I can skip the pre-flight check" | Build state changes between VALIDATE and DELIVER — new code may have been added | Shipping with a stale build check is the #1 cause of broken deliveries |
| "Team learning synthesis is optional, I'll skip it" | Duplicate/conflicting learnings from teammates pollute future recall | Next session loads contradictory advice — agents get confused and make worse decisions |
| "I'll just call trw_deliver directly, the skill is overkill" | The skill adds build verification + team synthesis that raw trw_deliver skips | Manual delivery skips validation steps — 3x more defects in audits |

Excerpt 4: session-start.sh hook (value-oriented framing)

Technique: Behavioral framing via hooks — explains value rather than commanding. Anthropic guidance on emphatic language avoidance implemented as shell script.

# PRD-INFRA-002-FR01: Unified SessionStart hook
# Framing: value-oriented — explains what each tool gives the agent.
# Research: Anthropic context engineering, motivation framing, self-interest framing.
# Fail-open: any error silently exits 0.
#
# Performance: ~23ms avg latency (benchmarked 2026-03-29, 3 runs).
# Fires once per session event, not on every tool call.

And from data/messages/messages.yaml (via messaging.py):

# Value-oriented framing examples:
# Do: "Call X to load your prior learnings"
# Don't: "You MUST call X"
# Do: "Without it, you're working without context"
# Don't: "CRITICAL: Skipping means you WILL fail"

Uniqueness

trw-mcp — Uniqueness

Differs from Seeds

Among the 11 seed frameworks, trw-mcp is closest to ccmemory (MCP server for persistent memory) and spec-driver (skills-driven with quality gates), but is architecturally distinct in five ways: (1) it ships a full 6-phase agile execution model with exit criteria, token caps, and formation selection — none of the seeds has this; (2) it ships 12 named agent roles with explicit model assignments (trw-lead on Opus, others on Sonnet) and a disallowedTools whitelist mechanism; (3) it implements worktree-based isolation for parallel agents, shared only with claude-flow among the seeds; (4) its ceremony enforcement hooks (13 lifecycle hooks including pre-tool-deliver-gate.sh and validate-prd-write.sh) are the most granular in the corpus; (5) the BSL 1.1 license makes it the only seed-comparable framework that is not purely open source. Unlike claude-flow (MCP toolserver with 305 tools, hive-mind consensus, IPFS incident), trw-mcp is a methodology-first harness where the MCP tools serve workflow tracking, not the reverse. Unlike taskmaster-ai (JSON tasks file, no memory persistence), trw-mcp compounds knowledge across sessions via YAML learnings with hybrid recall.

Positioning

Target user: Solo engineers or small agent teams who want an opinionated, ceremony-enforced agile workflow on top of Claude Code
Key differentiator: The rationalization watchlist pattern — every skill/hook explicitly enumerates the cognitive shortcuts agents use to bypass ceremony, making anti-patterns visible
Honest caveat: The README explicitly documents that SWE-bench eval showed null performance lift (n=47, iter 0-10), which is rare transparency in this space

Observable Failure Modes

BSL 1.1 licensing: Not purely OSS — users in commercial contexts must evaluate license terms; time-limited but not Apache/MIT
Ceremony overhead: 6 phases, 13 hooks, and mandatory tool calls add friction; ceremony_mode: "off" exists but defeats the framework's value proposition
MCP server startup latency: check-instructions warns about connection errors; ~23ms hook latency is noted in source
Null performance evidence: The framework's own eval shows no measured lift at iter 0-10 (small n, single-shot task type) — adoption requires trust in the methodology's long-term value rather than short-term benchmark improvement
Vector search opt-in: Without [vectors] extra, recall degrades to keyword-only; default config ships embeddings_enabled: false

Novel Patterns

Instruction-tool parity check (check-instructions): validates CLAUDE.md only references real MCP tools — prevents stale instruction drift
AARE-F PRD format: proprietary requirements notation with explicit Assumption/Acceptance/Requirements/Evidence/Functional fields
Value-oriented hook framing: hooks explain consequences rather than command; documented decision backed by Anthropic research on emphatic language causing overtriggering
Confidence-gated phase transitions: confidence levels (high/medium/low) with automatic Critic spawning at low confidence

Workflow

trw-mcp — Workflow

6-Phase Execution Model

RESEARCH → PLAN → IMPLEMENT → VALIDATE → REVIEW → DELIVER

Phase	Exit Criteria	Key Skills	Token Cap
RESEARCH	plan.md draft, ≥3 evidence paths, formation selected	`/trw-framework-check`	25%
PLAN	Acceptance criteria, shards planned, wave_manifest.yaml created	`/trw-sprint-init`, `/trw-prd-new`, `/trw-prd-ready`	15%
IMPLEMENT	Shards/waves complete OR checkpointed, tests written	`/trw-test-strategy`	35%
VALIDATE	Coverage ≥ target, gates pass, no P0; run `trw_build_check(scope="full")`	`/trw-test-strategy`	10%
REVIEW	Critic reviewed, simplifications applied, reflection completed	`/trw-memory-audit`	10%
DELIVER	PR created OR archived, final.md, CLAUDE.md synced	`/trw-deliver`, `/trw-sprint-finish`	5%

Artifacts Per Phase

Phase	Artifact
RESEARCH	`docs/{task}/reports/plan.md` (draft), `shards/wave_manifest.yaml`
PLAN	`docs/{task}/reports/plan.md` (final), `meta/run.yaml`, AARE-F PRD
IMPLEMENT	Source code, tests, per-shard completion markers
VALIDATE	Test coverage report, mypy output, `build_check` result
REVIEW	Adversarial audit report, simplification diff
DELIVER	`docs/{task}/reports/final.md`, updated CLAUDE.md, INDEX.md/ROADMAP.md

Approval Gates

PLAN gate: ORC must not advance to IMPLEMENT until exit criteria met OR cap exceeded with rationale
VALIDATE gate: Build check must pass (pytest + mypy); P0 defects block advance
DELIVER gate: pre-tool-deliver-gate.sh hook blocks the deliver tool call if build hasn't passed
PRD write gate: validate-prd-write.sh blocks PRD file writes that fail AARE-F format
Stop ceremony: stop-ceremony.sh prevents session end if ceremony is incomplete
Confidence gates: low confidence (<70%) blocks progression, spawns Critic subagent

Parallel Execution (Formations)

The FRAMEWORK.md defines named formations for multi-agent work:

Research wave: N parallel researchers, 1 synthesizer (Wave 3 must not parallelize)
Implement+Review team: Lead + implementers + reviewer in parallel shards
Consensus quorum: 2/3 (0.67) agreement threshold for multi-agent decisions
Correlation minimum: 0.7 inter-judge agreement for research findings

Defaults:

PARALLELISM_MAX: 10
MIN_SHARDS_TARGET: 3
MIN_SHARDS_FLOOR: 2
CONSENSUS_QUORUM: 0.67
CORRELATION_MIN: 0.7
TIMEBOX_HOURS: 8
MAX_CHILD_DEPTH: 2
MAX_RESEARCH_WAVES: 3

Confidence Levels

Level	Threshold	Gate
high	≥85%	Pass
medium	70–85%	Review
low	<70%	Block → spawn Critic

Session Boundaries

Every session: call trw_session_start() first → load prior learnings
After milestones: call trw_checkpoint(message)
Session end: call /trw-deliver → persist learnings for future sessions
Pre-compact: pre-compact.sh auto-checkpoints before context window compaction

Memory Context

trw-mcp — Memory & Context

State Storage

Type: File-based YAML (learnings) + YAML (run state) + optional vector index
Persistence scope: Project (.trw/ directory in the target repo)
Search: Keyword by default; hybrid (keyword + vector) if embeddings_enabled: true with [vectors] extra

State Files

Path	Purpose
`.trw/config.yaml`	Framework configuration
`.trw/learnings/`	YAML entries for captured learnings
`.trw/runs/{run_id}/meta/run.yaml`	Current run phase, status, timestamps
`.trw/runs/{run_id}/meta/events.jsonl`	Event audit log (significant events)
`.trw/runs/{run_id}/shards/wave_manifest.yaml`	Parallel wave status
`.trw/runs/{run_id}/reports/plan.md`	Current plan
`.trw/runs/{run_id}/reports/final.md`	Completed run summary
`.trw/context/messages.yaml`	AI-facing message registry
`.trw/frameworks/FRAMEWORK.md`	Full methodology spec
`CLAUDE.md`	Project instructions (auto-regenerated by `trw_claude_md_sync`)

Learning Lifecycle

Capture: trw_learn(summary, detail, tags, impact) during session
Recall: trw_recall(query, max_results) — keyword or hybrid search
Update: trw_learn_update(id, status="resolved") to mark duplicates
Deliver: trw_deliver() synthesizes and persists to disk
Replay: trw_session_start() loads relevant learnings at session open

Compaction Handling

pre-compact.sh hook fires on PreCompact event → calls trw_pre_compact_checkpoint()
Saves full run state before context window compaction
On resume, trw_session_start() with source="compact" recovers active run

Cross-Session Handoff

Yes — by design. The key handoff artifact is the CLAUDE.md file, which trw_claude_md_sync regenerates from high-impact learnings. This means every new session reads the accumulated knowledge even without MCP tool calls.

The trw_instructions_sync tool refreshes the IDE-specific instruction file (CLAUDE.md for Claude Code, AGENTS.md for Codex, etc.) with current learnings.

Audit Log

meta/events.jsonl — JSONL format, per-run, records significant events
Run YAML tracks phase transitions, timestamps, completion status
Not replay-capable (no event sourcing), but provides forensic evidence

Learning Pruning

learning_max_entries: 5000 — learnings auto-pruned when limit reached (FIFO by default). /trw-memory-optimize skill provides manual pruning via LLM-assisted deduplication.

Orchestration

trw-mcp — Orchestration

Multi-Agent: Yes

TRW ships 12 named agents with explicit role boundaries, model assignments, and tool allowlists. The trw-lead agent orchestrates via Agent Teams using TaskCreate, TeamCreate, SendMessage, and EnterWorktree.

Orchestration Pattern: Hierarchical + Parallel-Fan-Out

trw-lead (Opus) is the hierarchical coordinator
Subagents (Sonnet) run in parallel shards within implementation waves
Wave-based execution: wave N+1 doesn't start until wave N completes
Consensus quorum (0.67) gates multi-agent research findings

Isolation Mechanism

Git worktrees — trw-lead uses EnterWorktree to isolate parallel implementation work. Each agent team member works in a separate worktree, preventing file conflicts.

Execution Mode: Interactive-loop / continuous-ralph

Each session is bounded (triggered by trw_session_start()), but the framework is designed for multi-session continuity via checkpointing. Not a background daemon; requires active user invocation.

Consensus Mechanism

Quorum-based (0.67 = 2/3 judges agree). Applied in:

Research wave reconciliation (inter-judge correlation ≥ 0.7)
Multi-agent plan validation
Adversarial audit sign-off

Multi-Model

Agent	Model	Role
trw-lead	claude-opus	Orchestration, phase gating
All other agents	claude-sonnet	Implementation, review, test, research

model: opus is set in the trw-lead agent YAML; default for others is Sonnet. Multi-model routing is agent-role-based, not task-complexity semantic.

Prompt Chaining

Yes — the RESEARCH phase writes plan.md which becomes the explicit input for PLAN phase; PLAN writes wave_manifest.yaml which IMPLEMENT shards consume. Each phase's output is the next phase's prompt context.

Auto-Validators

trw_build_check(scope="full") runs pytest + mypy at VALIDATE phase
pre-tool-deliver-gate.sh enforces build pass before delivery
validate-prd-write.sh validates AARE-F format before PRD file writes
stop-ceremony.sh blocks session stop if ceremony incomplete
TDD enforced via /trw-test-strategy skill (tests-before-code mandate)

Git Automation

trw_deliver() does NOT auto-commit; it persists learnings and syncs instruction files
/trw-commit skill creates ceremony-compliant commits (manual invocation)
PR creation requires explicit user approval (not automated)
Worktree management: EnterWorktree for isolation, manual merge/close

Target Tools

Primary: Claude Code. Also supports Cursor, opencode, Codex via IDE-specific bootstrap templates.

Cross-Tool Portability: Medium

Core skills/hooks are Claude Code-specific (.claude/ directory). The init-project --ide codex flag generates Codex-compatible bootstrap, but hook mechanisms differ by IDE.

Ui Cli Surface

trw-mcp — UI / CLI Surface

CLI Binary

Name: trw-mcp
Is thin wrapper: No — it is the MCP server binary itself, with additional subcommands for project management
Entry point: trw_mcp.server:main
Subcommands: 7
- init-project — Deploy TRW to target repo
- update-project — Migrate existing TRW installation
- check-instructions — Validate instruction-tool parity (exit 1 on mismatch)
- audit — Audit TRW configuration
- config-reference — Print all TRW_ environment variables
- export --format json — Export learnings
- uninstall — Remove TRW from project

# Usage examples
trw-mcp init-project .              # current directory
trw-mcp init-project . --ide codex  # force Codex bootstrap
trw-mcp init-project . --force      # overwrite existing files

Local UI

None. No web dashboard, no TUI, no desktop app. Interaction is entirely through the MCP protocol (tool calls in Claude Code / other clients) and the CLI for setup.

The trw_quality_dashboard and trw_analytics_report MCP tools return structured text/markdown for in-chat display, but there is no standalone UI surface.

IDE Integration

Claude Code: Full support via .claude/hooks/, .claude/skills/, .claude/agents/
Cursor: Supported via .cursor/rules/ and cursor-specific config in data/cursor_ide/
opencode: Supported via data/opencode/ config templates
Codex: Supported via data/codex/ templates (init-project --ide codex)
Copilot: Supported via data/copilot/ templates
GitHub Actions: Not shipped

MCP Server Config

Clients add the server via:

{
  "mcpServers": {
    "trw": {
      "command": "trw-mcp",
      "args": ["--debug"]
    }
  }
}

Observability

trw_analytics_report MCP tool: session/run metrics
trw_usage_report: cost/usage tracking
meta/events.jsonl: per-run JSONL event log
trw_quality_dashboard: coverage and quality gate status
Opt-in telemetry pipeline (anonymized, configurable via .trw/config.yaml)

`check-instructions` Tool

Unique feature: validates that CLAUDE.md/AGENTS.md only references tools that exist in the MCP server. Exits 1 on mismatch — prevents stale instructions referencing removed tools. Used in CI.

Related frameworks

same archetype · same primary tool · same memory type

Taskmaster AI ★ 27k

A3 MCP-anchored

Converts a PRD into a dependency-ordered JSON task graph that AI coding agents execute one task at a time, eliminating context…

ccmemory ★ 1

A3 MCP-anchored

Accumulates decisions, corrections, and failed approaches from Claude Code sessions into a queryable Neo4j graph so each new…

Pimzino spec-workflow-mcp ★ 4.2k

A3 MCP-anchored

MCP server providing spec-driven development workflow with dashboard-backed approval gates, implementation logging, and VSCode…

MCP Shrimp Task Manager ★ 2.1k

A3 MCP-anchored

Convert natural language requests into structured AI development tasks with chain-of-thought enforcement, reflection gates, and…

Bernstein ★ 460

A3 MCP-anchored

Govern parallel CLI coding agents with a deterministic Python scheduler, HMAC-chained audit trail, and compliance-ready signed…

LeanSpec ★ 252

A3 MCP-anchored

Provides a unified spec CLI and MCP server over any existing spec backend (markdown, GitHub Issues, ADO), making spec-driven…

Distribution

Type: mcp-server
License: NOASSERTION (BSL 1.1)
Install: multi-step
Version: v24.5_TRW (from FRAMEWORK.md)

Surfaces

CLI binary: trw-mcp
CLI subcmds: 7
Local UI: No
Tech stack: none

Components

Commands: 0
Skills: 24
Subagents: 12
Hooks: 13
MCP servers: 1
MCP tools: 24
Scripts: 13
Templates: 5

Workflow

Phases: 6
Approval gates: 6
Spec format: markdown
Spec storage: per-feature-folder
Delta or full: whole-file

Orchestration

Multi-agent: Yes
Pattern: hierarchical
Max concurrent: 10
Isolation: git-worktree
Consensus: quorum
Prompt chaining: Yes

Multi-model

Multi-model: Yes
BYOK: Yes
Modal: text

Execution

Mode: interactive-loop
Crash recovery: Yes
Compaction: Yes
Session handoff: Yes
Streaming: No

Memory

Type: file-based
Persistence: project
Search: full-text
State files: 6 files

Quality

TDD: Yes
TDD mechanism: dedicated-skill
Validators: 3
Self-review: adversarial-subagent

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: Yes
Audit log: Yes
Audit format: jsonl
Replay: No

Tools

Primary: claude-code
Targets: 5
Portability: medium

Signals

Stars: 0
Last commit: 2026-04-29
Contributors: 1
Maintainer: active
Quality score: 9.1/10

Summary

trw-mcp — Summary

Overview

trw-mcp — Overview

Origin

Philosophy

Core Ideas

Antipatterns (explicit)

Architecture

trw-mcp — Architecture

Distribution

Install

Directory Tree (deployed project)

Source Architecture

Required Runtime

Target AI Tools

Configuration

Components

trw-mcp — Components

MCP Tools (24)

Skills (24 slash-command workflows)

Agents (12 named sub-agents)

Hooks (lifecycle enforcement)

CLI Subcommands

Templates

Prompts

trw-mcp — Prompts

Excerpt 1: FRAMEWORK.md — Phase Model (from src/trw_mcp/data/framework.md)

Excerpt 2: trw-lead agent definition (from src/trw_mcp/data/agents/trw-lead.md)

Excerpt 3: trw-deliver skill (from src/trw_mcp/data/skills/trw-deliver/SKILL.md)

Excerpt 4: session-start.sh hook (value-oriented framing)

Uniqueness

trw-mcp — Uniqueness

Differs from Seeds

Positioning

Observable Failure Modes

Novel Patterns

Workflow

trw-mcp — Workflow

6-Phase Execution Model

Artifacts Per Phase

Approval Gates

Parallel Execution (Formations)

Confidence Levels

Session Boundaries

Memory Context

trw-mcp — Memory & Context

State Storage

State Files

Learning Lifecycle

Compaction Handling

Cross-Session Handoff

Audit Log

Learning Pruning

Orchestration

trw-mcp — Orchestration

Multi-Agent: Yes

Orchestration Pattern: Hierarchical + Parallel-Fan-Out

Isolation Mechanism

Execution Mode: Interactive-loop / continuous-ralph

Consensus Mechanism

Multi-Model

Prompt Chaining

Auto-Validators

Git Automation

Target Tools

Cross-Tool Portability: Medium

Ui Cli Surface

trw-mcp — UI / CLI Surface

CLI Binary

Local UI

IDE Integration

MCP Server Config

Observability

check-instructions Tool

Related frameworks

Excerpt 1: FRAMEWORK.md — Phase Model (from `src/trw_mcp/data/framework.md`)

Excerpt 2: trw-lead agent definition (from `src/trw_mcp/data/agents/trw-lead.md`)

Excerpt 3: trw-deliver skill (from `src/trw_mcp/data/skills/trw-deliver/SKILL.md`)

`check-instructions` Tool