Skip to content
/

trw-mcp (TRW Framework)

trw-mcp · wallter/trw-mcp · ★ 0 · last commit 2026-04-29

Prevent AI coding session state loss and quality drift by enforcing ceremony, capturing learnings, and orchestrating multi-agent teams through a 6-phase agile model.

Best whenOrchestrating (delegating, verifying, learning) produces higher quality than direct implementation — an agent's strategic oversight is more valuable than its…
Skip ifBackground agents (forbidden), Skipping ceremony phases
vs seeds
claude-flow(toolserver with 305 MCP tools), trw-mcp is methodology-first with only 24 targeted tools. The BSL 1.1 license and the p…
Primitive shape 73 total
Skills 24 Subagents 12 Hooks 13 MCP tools 24
00

Summary

trw-mcp — Summary

trw-mcp is the MCP server component of the TRW (The Real Work) framework, shipping a pip install trw-mcp package that bootstraps any git repository with persistent engineering memory, spec-driven PRD workflows, and a 6-phase agile execution model. It exposes 24 MCP tools covering the full session lifecycle — session start/checkpoint/deliver, knowledge capture/recall, build verification, PRD creation, and ceremony compliance — plus 24 skills (slash-command workflows) and 12 bundled agents for parallel Agent Team execution. State lives in a .trw/ directory and survives context compaction via checkpoint/deliver ceremony; a trw_session_start() call replays captured learnings at every new session. The CLI binary trw-mcp ships subcommands for init-project, update-project, audit, config-reference, and export, making deployment a one-liner. Internally, it is a Python FastMCP server with strict mypy + pytest coverage requirements (90%+) and a documented SWE-bench eval program (iter-0..10 showed null lift, iter-11+ ongoing).

Differs from seeds: Closest to ccmemory (MCP-anchored memory persistence) and spec-driver (skills-driven execution with quality gates), but trw-mcp is uniquely a full-stack agile harness — it combines MCP persistence, 12 named agent roles, a RESEARCH→PLAN→IMPLEMENT→VALIDATE→REVIEW→DELIVER phase model with adversarial auditing, AARE-F PRD format, and ceremony-enforcement hooks, none of which appear in ccmemory or spec-driver. Unlike taskmaster-ai (JSON task file + MCP tools), trw-mcp stores learnings as YAML entries with vector/keyword hybrid recall and auto-compacts via trw_checkpoint, giving it cross-session continuity absent from taskmaster.

01

Overview

trw-mcp — Overview

Origin

trw-mcp is the MCP server component of the TRW (The Real Work) Framework, authored by wallter (Nico Wallter) and released under BSL 1.1. It was built entirely by AI agents using TRW itself — a dogfooding signal the README cites with measured skepticism. The linked eval bench (docs/eval/iter-notes/) reports null performance lift on SWE-bench single-shot at n=40-47 for iterations 0-10, with iter-11+ prospector analysis ongoing.

Philosophy

The framework's philosophy is captured in its FRAMEWORK.md opening:

"v24.5_TRW — CLAUDE CODE ORCHESTRATED AGILE SWARM. Slim-Persist | Parallel-First | Formation-Driven | Interrupt-Safe | CLI/TDD | YAML-First | Sensible Defaults | MCP-Integrated | Skills-Driven | Agent-Teams"

And from AGENTS.md:

"Sessions where you orchestrate (delegate, verify, learn) rather than implement directly produce higher quality and fewer rework cycles — your strategic oversight is more valuable than your keystrokes."

The README states:

"Every AI coding tool resets to zero. TRW preserves state across sessions via the .trw/ directory — session-start recall replays prior learnings at every new session."

And on empirical claims:

"Whether this yields measurable task-completion lift is an open empirical question; iter-0..10 SWE-bench-single-shot measurements showed null (n=40/47)."

Core Ideas

  1. Ceremony as discipline: The 6-phase model (RESEARCH→PLAN→IMPLEMENT→VALIDATE→REVIEW→DELIVER) enforces phase gates; hooks block delivery if build fails.
  2. Dogfooding at scale: 336 PRDs, 90 sprints, 9,200+ tests, 90% coverage — the repo was built by its own agents.
  3. AARE-F PRDs: A proprietary requirements format (AARE-F = Assumption, Acceptance, Requirements, Evidence, Functional) used for spec creation.
  4. Formation-driven parallel work: The FRAMEWORK.md defines 4+5 named formations (e.g., research wave, implement+review team) that orchestrators select based on task complexity.
  5. Value-oriented framing: All hook messages avoid "CRITICAL/ALWAYS/NEVER" language per Anthropic Claude best practices — instead explain consequences.

Antipatterns (explicit)

From FRAMEWORK.md's "Rationalization Watchlist":

  • Skipping ceremony (phase gates, build checks)
  • Starting without calling trw_session_start()
  • Delivering without trw_deliver()
  • Background agents (FORBIDDEN in the framework)
  • Multiple Task() calls without blocking
02

Architecture

trw-mcp — Architecture

Distribution

  • Type: MCP server + CLI tool (Python package)
  • Registry: PyPI (pip install trw-mcp)
  • Binary name: trw-mcp
  • License: BSL 1.1 (Business Source License — time-limited proprietary, not pure OSS)
  • Language: Python 3.10+

Install

pip install trw-mcp
# or with vector search support:
pip install 'trw-mcp[vectors]'

# Deploy TRW to a target project
trw-mcp init-project /path/to/repo

init-project bootstraps:

  • .trw/ — learning memory, run state, configuration
  • .mcp.json — MCP server connection
  • CLAUDE.md / AGENTS.md — project instructions
  • .claude/hooks/ — ceremony enforcement hooks
  • .claude/skills/ — workflow automation skills
  • .claude/agents/ — specialized sub-agents

Directory Tree (deployed project)

.trw/
  config.yaml           # embeddings_enabled, ceremony_mode, learning_max_entries
  learnings/            # YAML-format captured learnings
  runs/                 # Per-run state (run.yaml, phases, shards)
  frameworks/
    FRAMEWORK.md        # 6-phase methodology spec
  context/
    messages.yaml       # Value-oriented AI-facing message registry

.claude/
  hooks/
    session-start.sh
    session-end.sh
    pre-compact.sh
    stop-ceremony.sh
    task-completed.sh
    post-tool-event.sh
    pre-tool-deliver-gate.sh
    validate-prd-write.sh
    subagent-start.sh
    subagent-stop.sh
    teammate-idle.sh
    user-prompt-submit.sh
    phase-cycle-stop.sh
  skills/
    trw-sprint-init/    # Each skill is a directory with SKILL.md
    trw-deliver/
    trw-commit/
    ... (24 total)
  agents/
    trw-lead.md
    trw-implementer.md
    trw-tester.md
    trw-researcher.md
    trw-reviewer.md
    trw-adversarial-auditor.md
    trw-prd-groomer.md
    trw-requirement-writer.md
    trw-requirement-reviewer.md
    trw-traceability-checker.md
    trw-code-simplifier.md
    trw-auditor.md     (12 total)

Source Architecture

src/trw_mcp/
  server/           # FastMCP entry point, middleware chain
  bootstrap.py      # init-project deploy logic
  models/           # Pydantic v2 models
  tools/            # 24 MCP tool implementations
  state/            # Persistence, validation, analytics
  middleware/       # Ceremony, observation masking, response optimizer
  telemetry/        # Opt-in telemetry pipeline
  prompts/          # AARE-F prompts + messaging registry
  data/             # Bundled hooks, skills, agents (deployed by init-project)

Required Runtime

  • Python 3.10+
  • Optional: [vectors] extra for vector/hybrid search (sentence-transformers or compatible)

Target AI Tools

Claude Code (primary), Cursor, opencode, Codex — configurable via --ide flag on init-project. Multi-IDE because the bootstrap can generate IDE-specific config files.

Configuration

.trw/config.yaml controls all behavior:

embeddings_enabled: false
learning_max_entries: 5000
build_check_enabled: true
ceremony_mode: "full"  # "full" | "light" | "off"
progressive_disclosure: false

Environment variables with TRW_ prefix override config.

03

Components

trw-mcp — Components

MCP Tools (24)

Category Tool Purpose
Session trw_session_start Load prior learnings, recover active run
Session trw_init Initialize a new run
Session trw_status Show current run state
Session trw_checkpoint Atomic state snapshot
Session trw_pre_compact_checkpoint Save before context compaction
Session trw_progressive_expand Progressively reveal tools
Learning trw_learn Capture new learning/discovery
Learning trw_learn_update Update or resolve existing learning
Learning trw_recall Retrieve relevant learnings
Learning trw_knowledge_sync Sync knowledge to disk
Learning trw_claude_md_sync Regenerate CLAUDE.md from learnings
Quality trw_build_check Run pytest + mypy
Quality trw_review Code review tool
Quality trw_trust_level Assess code trust level
Quality trw_quality_dashboard View quality metrics
Quality trw_deliver Full delivery ceremony
Requirements trw_prd_create Create AARE-F PRD
Requirements trw_prd_validate Validate PRD completeness
Ceremony trw_ceremony_status Check ceremony compliance
Ceremony trw_ceremony_approve Approve ceremony step
Ceremony trw_ceremony_revert Revert ceremony state
Reporting trw_run_report Run metrics report
Reporting trw_analytics_report Analytics dashboard
Reporting trw_usage_report Cost/usage tracking

Skills (24 slash-command workflows)

Sprint & Delivery:

  • /trw-sprint-init — Initialize sprint with phase plan
  • /trw-deliver — Full delivery with build check + learning synthesis
  • /trw-commit — Ceremony-compliant git commit
  • /trw-sprint-finish — Close sprint, archive artifacts
  • /trw-sprint-team — Team sprint coordination
  • /trw-exec-plan — Execute plan from PRD

Requirements:

  • /trw-prd-new — Create new AARE-F PRD
  • /trw-prd-ready — Mark PRD as ready for implementation
  • /trw-prd-groom — Groom PRD backlog
  • /trw-prd-review — Review PRD quality

Quality:

  • /trw-audit — Audit TRW configuration
  • /trw-review-pr — PR review with ADR check
  • /trw-simplify — Code simplification pass
  • /trw-dry-check — DRY principle check
  • /trw-security-check — Security review
  • /trw-test-strategy — TDD test strategy

Framework:

  • /trw-framework-check — Verify framework compliance
  • /trw-project-health — Project health report
  • /trw-memory-audit — Audit learning quality
  • /trw-memory-optimize — Prune/optimize learnings
  • /trw-self-review — Agent self-review pass
  • /trw-team-playbook — Team coordination guide
  • /trw-ceremony-guide — Ceremony instructions
  • /trw-learn — Manual learning capture

Agents (12 named sub-agents)

Agent Role Model
trw-lead Team orchestration, phase gating Opus (high effort)
trw-implementer Single-file/small edits Sonnet
trw-tester TDD test writing Sonnet
trw-researcher Evidence gathering, research waves Sonnet
trw-reviewer Code review, no production writes Sonnet
trw-adversarial-auditor Spec-vs-code gap finding Sonnet
trw-prd-groomer PRD backlog grooming Sonnet
trw-requirement-writer AARE-F requirement authoring Sonnet
trw-requirement-reviewer Requirement quality review Sonnet
trw-traceability-checker PRD-to-code traceability Sonnet
trw-code-simplifier Code health, DRY/complexity Sonnet
trw-auditor TRW configuration audit Sonnet

Hooks (lifecycle enforcement)

All hooks are bash scripts in .claude/hooks/:

Hook Event Purpose
session-start.sh SessionStart Load learnings, emit ceremony guidance
session-end.sh Stop Prompt deliver ceremony
pre-compact.sh PreCompact Auto-checkpoint before compaction
stop-ceremony.sh Stop Block stop if ceremony incomplete
task-completed.sh SubagentStop Record completed task
post-tool-event.sh PostToolUse Log significant tool events
pre-tool-deliver-gate.sh PreToolUse Gate deliver before build pass
validate-prd-write.sh PreToolUse(Write) Validate PRD format on write
subagent-start.sh SessionStart (subagent) Load subagent context
subagent-stop.sh SubagentStop Persist subagent learnings
teammate-idle.sh SubagentStop Detect idle teammate
user-prompt-submit.sh UserPromptSubmit Inject ceremony context
phase-cycle-stop.sh Stop Prevent phase boundary stop

CLI Subcommands

trw-mcp <subcommand>:

  • init-project — Deploy TRW to target repo
  • update-project — Migrate existing TRW installation
  • check-instructions — Validate instruction-tool parity (exit 1 on mismatch)
  • audit — Audit TRW configuration
  • config-reference — Print all TRW_ env vars
  • export — Export learnings (JSON format)
  • uninstall — Remove TRW from project

Templates

Bundled in src/trw_mcp/data/:

  • prd_template.md — AARE-F PRD template
  • playbook-template.yaml — Team playbook template
  • settings.json — Claude Code settings template
  • aaref.md — AARE-F requirements guide
  • behavioral_protocol.yaml — Agent behavioral rules
  • framework.md — Full 6-phase framework spec
05

Prompts

trw-mcp — Prompts

Excerpt 1: FRAMEWORK.md — Phase Model (from src/trw_mcp/data/framework.md)

Technique: Structured constraint + rationalization watchlist (same pattern as superpowers' Iron Law, but with explicit anti-skip safeguards labeled as such).

## PHASES

RESEARCH -> PLAN -> IMPLEMENT -> VALIDATE -> REVIEW -> DELIVER

| Phase | Exit Criteria | Skills | Cap |
|-------|---------------|--------|-----|
| RESEARCH | plan.md draft, >=3 evidence paths, formation selected. | `/trw-framework-check` | 25% |
| PLAN | Acceptance criteria, shards planned, wave_manifest.yaml created. | `/trw-sprint-init`, `/trw-prd-new`, `/trw-prd-ready` | 15% |
| IMPLEMENT | Shards/waves complete OR checkpointed, tests written. | `/trw-test-strategy` | 35% |
| VALIDATE | Coverage >= target, gates pass, no P0. Run `trw_build_check(scope="full")`. | `/trw-test-strategy` | 10% |
| REVIEW | Critic reviewed, simplifications applied, reflection completed. | `/trw-memory-audit` | 10% |
| DELIVER | PR created OR archived, final.md, CLAUDE.md synced. | `/trw-deliver`, `/trw-sprint-finish` | 5% |

ORC tracks elapsed wall-clock against TIMEBOX_HOURS. ORC MUST NOT advance until exit criteria met OR cap exceeded with rationale. Refine plan until stable — fixing a plan is cheaper than rewriting code.

## RATIONALIZATION WATCHLIST

If you catch yourself thinking any of these, stop and follow the process — these are the exact thoughts that precede ceremony skips:

Excerpt 2: trw-lead agent definition (from src/trw_mcp/data/agents/trw-lead.md)

Technique: Persona-md with tool whitelist and behavioral framing. Explicit disallowedTools array is unusual — most persona-md frameworks only whitelist.

---
name: trw-lead
description: >
  Team orchestration lead. Use when coordinating multi-agent sprint execution —
  spawning teammates, routing file-ownership messages, managing worktrees,
  gating phase transitions across the 6-phase RESEARCH through DELIVER
  lifecycle. Not for single-file edits (use trw-implementer) or read-only
  reviews (use trw-reviewer). Does not write production code; stays in
  delegate mode during IMPLEMENT.
model: opus
effort: high
maxTurns: 200
memory: project
allowedTools:
  - Read
  - Edit
  - Write
  - Bash
  - Skill
  - TaskCreate
  - TaskUpdate
  - TaskList
  - TeamCreate
  - SendMessage
  - EnterWorktree
  - mcp__trw__trw_session_start
  - mcp__trw__trw_build_check
  - mcp__trw__trw_deliver
disallowedTools:
  - NotebookEdit
---

Excerpt 3: trw-deliver skill (from src/trw_mcp/data/skills/trw-deliver/SKILL.md)

Technique: Multi-step workflow with explicit rationalization watchlist — makes agent self-monitor for ceremony bypass.

## Rationalization Watchlist

| Thought | Why it's wrong | Consequence |
|---------|---------------|-------------|
| "The build passed earlier, I can skip the pre-flight check" | Build state changes between VALIDATE and DELIVER — new code may have been added | Shipping with a stale build check is the #1 cause of broken deliveries |
| "Team learning synthesis is optional, I'll skip it" | Duplicate/conflicting learnings from teammates pollute future recall | Next session loads contradictory advice — agents get confused and make worse decisions |
| "I'll just call trw_deliver directly, the skill is overkill" | The skill adds build verification + team synthesis that raw trw_deliver skips | Manual delivery skips validation steps — 3x more defects in audits |

Excerpt 4: session-start.sh hook (value-oriented framing)

Technique: Behavioral framing via hooks — explains value rather than commanding. Anthropic guidance on emphatic language avoidance implemented as shell script.

# PRD-INFRA-002-FR01: Unified SessionStart hook
# Framing: value-oriented — explains what each tool gives the agent.
# Research: Anthropic context engineering, motivation framing, self-interest framing.
# Fail-open: any error silently exits 0.
#
# Performance: ~23ms avg latency (benchmarked 2026-03-29, 3 runs).
# Fires once per session event, not on every tool call.

And from data/messages/messages.yaml (via messaging.py):

# Value-oriented framing examples:
# Do: "Call X to load your prior learnings"
# Don't: "You MUST call X"
# Do: "Without it, you're working without context"
# Don't: "CRITICAL: Skipping means you WILL fail"
09

Uniqueness

trw-mcp — Uniqueness

Differs from Seeds

Among the 11 seed frameworks, trw-mcp is closest to ccmemory (MCP server for persistent memory) and spec-driver (skills-driven with quality gates), but is architecturally distinct in five ways: (1) it ships a full 6-phase agile execution model with exit criteria, token caps, and formation selection — none of the seeds has this; (2) it ships 12 named agent roles with explicit model assignments (trw-lead on Opus, others on Sonnet) and a disallowedTools whitelist mechanism; (3) it implements worktree-based isolation for parallel agents, shared only with claude-flow among the seeds; (4) its ceremony enforcement hooks (13 lifecycle hooks including pre-tool-deliver-gate.sh and validate-prd-write.sh) are the most granular in the corpus; (5) the BSL 1.1 license makes it the only seed-comparable framework that is not purely open source. Unlike claude-flow (MCP toolserver with 305 tools, hive-mind consensus, IPFS incident), trw-mcp is a methodology-first harness where the MCP tools serve workflow tracking, not the reverse. Unlike taskmaster-ai (JSON tasks file, no memory persistence), trw-mcp compounds knowledge across sessions via YAML learnings with hybrid recall.

Positioning

  • Target user: Solo engineers or small agent teams who want an opinionated, ceremony-enforced agile workflow on top of Claude Code
  • Key differentiator: The rationalization watchlist pattern — every skill/hook explicitly enumerates the cognitive shortcuts agents use to bypass ceremony, making anti-patterns visible
  • Honest caveat: The README explicitly documents that SWE-bench eval showed null performance lift (n=47, iter 0-10), which is rare transparency in this space

Observable Failure Modes

  1. BSL 1.1 licensing: Not purely OSS — users in commercial contexts must evaluate license terms; time-limited but not Apache/MIT
  2. Ceremony overhead: 6 phases, 13 hooks, and mandatory tool calls add friction; ceremony_mode: "off" exists but defeats the framework's value proposition
  3. MCP server startup latency: check-instructions warns about connection errors; ~23ms hook latency is noted in source
  4. Null performance evidence: The framework's own eval shows no measured lift at iter 0-10 (small n, single-shot task type) — adoption requires trust in the methodology's long-term value rather than short-term benchmark improvement
  5. Vector search opt-in: Without [vectors] extra, recall degrades to keyword-only; default config ships embeddings_enabled: false

Novel Patterns

  • Instruction-tool parity check (check-instructions): validates CLAUDE.md only references real MCP tools — prevents stale instruction drift
  • AARE-F PRD format: proprietary requirements notation with explicit Assumption/Acceptance/Requirements/Evidence/Functional fields
  • Value-oriented hook framing: hooks explain consequences rather than command; documented decision backed by Anthropic research on emphatic language causing overtriggering
  • Confidence-gated phase transitions: confidence levels (high/medium/low) with automatic Critic spawning at low confidence
04

Workflow

trw-mcp — Workflow

6-Phase Execution Model

RESEARCH → PLAN → IMPLEMENT → VALIDATE → REVIEW → DELIVER
Phase Exit Criteria Key Skills Token Cap
RESEARCH plan.md draft, ≥3 evidence paths, formation selected /trw-framework-check 25%
PLAN Acceptance criteria, shards planned, wave_manifest.yaml created /trw-sprint-init, /trw-prd-new, /trw-prd-ready 15%
IMPLEMENT Shards/waves complete OR checkpointed, tests written /trw-test-strategy 35%
VALIDATE Coverage ≥ target, gates pass, no P0; run trw_build_check(scope="full") /trw-test-strategy 10%
REVIEW Critic reviewed, simplifications applied, reflection completed /trw-memory-audit 10%
DELIVER PR created OR archived, final.md, CLAUDE.md synced /trw-deliver, /trw-sprint-finish 5%

Artifacts Per Phase

Phase Artifact
RESEARCH docs/{task}/reports/plan.md (draft), shards/wave_manifest.yaml
PLAN docs/{task}/reports/plan.md (final), meta/run.yaml, AARE-F PRD
IMPLEMENT Source code, tests, per-shard completion markers
VALIDATE Test coverage report, mypy output, build_check result
REVIEW Adversarial audit report, simplification diff
DELIVER docs/{task}/reports/final.md, updated CLAUDE.md, INDEX.md/ROADMAP.md

Approval Gates

  1. PLAN gate: ORC must not advance to IMPLEMENT until exit criteria met OR cap exceeded with rationale
  2. VALIDATE gate: Build check must pass (pytest + mypy); P0 defects block advance
  3. DELIVER gate: pre-tool-deliver-gate.sh hook blocks the deliver tool call if build hasn't passed
  4. PRD write gate: validate-prd-write.sh blocks PRD file writes that fail AARE-F format
  5. Stop ceremony: stop-ceremony.sh prevents session end if ceremony is incomplete
  6. Confidence gates: low confidence (<70%) blocks progression, spawns Critic subagent

Parallel Execution (Formations)

The FRAMEWORK.md defines named formations for multi-agent work:

  • Research wave: N parallel researchers, 1 synthesizer (Wave 3 must not parallelize)
  • Implement+Review team: Lead + implementers + reviewer in parallel shards
  • Consensus quorum: 2/3 (0.67) agreement threshold for multi-agent decisions
  • Correlation minimum: 0.7 inter-judge agreement for research findings

Defaults:

PARALLELISM_MAX: 10
MIN_SHARDS_TARGET: 3
MIN_SHARDS_FLOOR: 2
CONSENSUS_QUORUM: 0.67
CORRELATION_MIN: 0.7
TIMEBOX_HOURS: 8
MAX_CHILD_DEPTH: 2
MAX_RESEARCH_WAVES: 3

Confidence Levels

Level Threshold Gate
high ≥85% Pass
medium 70–85% Review
low <70% Block → spawn Critic

Session Boundaries

  • Every session: call trw_session_start() first → load prior learnings
  • After milestones: call trw_checkpoint(message)
  • Session end: call /trw-deliver → persist learnings for future sessions
  • Pre-compact: pre-compact.sh auto-checkpoints before context window compaction
06

Memory Context

trw-mcp — Memory & Context

State Storage

  • Type: File-based YAML (learnings) + YAML (run state) + optional vector index
  • Persistence scope: Project (.trw/ directory in the target repo)
  • Search: Keyword by default; hybrid (keyword + vector) if embeddings_enabled: true with [vectors] extra

State Files

Path Purpose
.trw/config.yaml Framework configuration
.trw/learnings/ YAML entries for captured learnings
.trw/runs/{run_id}/meta/run.yaml Current run phase, status, timestamps
.trw/runs/{run_id}/meta/events.jsonl Event audit log (significant events)
.trw/runs/{run_id}/shards/wave_manifest.yaml Parallel wave status
.trw/runs/{run_id}/reports/plan.md Current plan
.trw/runs/{run_id}/reports/final.md Completed run summary
.trw/context/messages.yaml AI-facing message registry
.trw/frameworks/FRAMEWORK.md Full methodology spec
CLAUDE.md Project instructions (auto-regenerated by trw_claude_md_sync)

Learning Lifecycle

  1. Capture: trw_learn(summary, detail, tags, impact) during session
  2. Recall: trw_recall(query, max_results) — keyword or hybrid search
  3. Update: trw_learn_update(id, status="resolved") to mark duplicates
  4. Deliver: trw_deliver() synthesizes and persists to disk
  5. Replay: trw_session_start() loads relevant learnings at session open

Compaction Handling

  • pre-compact.sh hook fires on PreCompact event → calls trw_pre_compact_checkpoint()
  • Saves full run state before context window compaction
  • On resume, trw_session_start() with source="compact" recovers active run

Cross-Session Handoff

Yes — by design. The key handoff artifact is the CLAUDE.md file, which trw_claude_md_sync regenerates from high-impact learnings. This means every new session reads the accumulated knowledge even without MCP tool calls.

The trw_instructions_sync tool refreshes the IDE-specific instruction file (CLAUDE.md for Claude Code, AGENTS.md for Codex, etc.) with current learnings.

Audit Log

  • meta/events.jsonl — JSONL format, per-run, records significant events
  • Run YAML tracks phase transitions, timestamps, completion status
  • Not replay-capable (no event sourcing), but provides forensic evidence

Learning Pruning

learning_max_entries: 5000 — learnings auto-pruned when limit reached (FIFO by default). /trw-memory-optimize skill provides manual pruning via LLM-assisted deduplication.

07

Orchestration

trw-mcp — Orchestration

Multi-Agent: Yes

TRW ships 12 named agents with explicit role boundaries, model assignments, and tool allowlists. The trw-lead agent orchestrates via Agent Teams using TaskCreate, TeamCreate, SendMessage, and EnterWorktree.

Orchestration Pattern: Hierarchical + Parallel-Fan-Out

  • trw-lead (Opus) is the hierarchical coordinator
  • Subagents (Sonnet) run in parallel shards within implementation waves
  • Wave-based execution: wave N+1 doesn't start until wave N completes
  • Consensus quorum (0.67) gates multi-agent research findings

Isolation Mechanism

Git worktreestrw-lead uses EnterWorktree to isolate parallel implementation work. Each agent team member works in a separate worktree, preventing file conflicts.

Execution Mode: Interactive-loop / continuous-ralph

Each session is bounded (triggered by trw_session_start()), but the framework is designed for multi-session continuity via checkpointing. Not a background daemon; requires active user invocation.

Consensus Mechanism

Quorum-based (0.67 = 2/3 judges agree). Applied in:

  1. Research wave reconciliation (inter-judge correlation ≥ 0.7)
  2. Multi-agent plan validation
  3. Adversarial audit sign-off

Multi-Model

Agent Model Role
trw-lead claude-opus Orchestration, phase gating
All other agents claude-sonnet Implementation, review, test, research

model: opus is set in the trw-lead agent YAML; default for others is Sonnet. Multi-model routing is agent-role-based, not task-complexity semantic.

Prompt Chaining

Yes — the RESEARCH phase writes plan.md which becomes the explicit input for PLAN phase; PLAN writes wave_manifest.yaml which IMPLEMENT shards consume. Each phase's output is the next phase's prompt context.

Auto-Validators

  • trw_build_check(scope="full") runs pytest + mypy at VALIDATE phase
  • pre-tool-deliver-gate.sh enforces build pass before delivery
  • validate-prd-write.sh validates AARE-F format before PRD file writes
  • stop-ceremony.sh blocks session stop if ceremony incomplete
  • TDD enforced via /trw-test-strategy skill (tests-before-code mandate)

Git Automation

  • trw_deliver() does NOT auto-commit; it persists learnings and syncs instruction files
  • /trw-commit skill creates ceremony-compliant commits (manual invocation)
  • PR creation requires explicit user approval (not automated)
  • Worktree management: EnterWorktree for isolation, manual merge/close

Target Tools

Primary: Claude Code. Also supports Cursor, opencode, Codex via IDE-specific bootstrap templates.

Cross-Tool Portability: Medium

Core skills/hooks are Claude Code-specific (.claude/ directory). The init-project --ide codex flag generates Codex-compatible bootstrap, but hook mechanisms differ by IDE.

08

Ui Cli Surface

trw-mcp — UI / CLI Surface

CLI Binary

  • Name: trw-mcp
  • Is thin wrapper: No — it is the MCP server binary itself, with additional subcommands for project management
  • Entry point: trw_mcp.server:main
  • Subcommands: 7
    • init-project — Deploy TRW to target repo
    • update-project — Migrate existing TRW installation
    • check-instructions — Validate instruction-tool parity (exit 1 on mismatch)
    • audit — Audit TRW configuration
    • config-reference — Print all TRW_ environment variables
    • export --format json — Export learnings
    • uninstall — Remove TRW from project
# Usage examples
trw-mcp init-project .              # current directory
trw-mcp init-project . --ide codex  # force Codex bootstrap
trw-mcp init-project . --force      # overwrite existing files

Local UI

None. No web dashboard, no TUI, no desktop app. Interaction is entirely through the MCP protocol (tool calls in Claude Code / other clients) and the CLI for setup.

The trw_quality_dashboard and trw_analytics_report MCP tools return structured text/markdown for in-chat display, but there is no standalone UI surface.

IDE Integration

  • Claude Code: Full support via .claude/hooks/, .claude/skills/, .claude/agents/
  • Cursor: Supported via .cursor/rules/ and cursor-specific config in data/cursor_ide/
  • opencode: Supported via data/opencode/ config templates
  • Codex: Supported via data/codex/ templates (init-project --ide codex)
  • Copilot: Supported via data/copilot/ templates
  • GitHub Actions: Not shipped

MCP Server Config

Clients add the server via:

{
  "mcpServers": {
    "trw": {
      "command": "trw-mcp",
      "args": ["--debug"]
    }
  }
}

Observability

  • trw_analytics_report MCP tool: session/run metrics
  • trw_usage_report: cost/usage tracking
  • meta/events.jsonl: per-run JSONL event log
  • trw_quality_dashboard: coverage and quality gate status
  • Opt-in telemetry pipeline (anonymized, configurable via .trw/config.yaml)

check-instructions Tool

Unique feature: validates that CLAUDE.md/AGENTS.md only references tools that exist in the MCP server. Exits 1 on mismatch — prevents stale instructions referencing removed tools. Used in CI.

Related frameworks

same archetype · same primary tool · same memory type

Taskmaster AI ★ 27k

Converts a PRD into a dependency-ordered JSON task graph that AI coding agents execute one task at a time, eliminating context…

ccmemory ★ 1

Accumulates decisions, corrections, and failed approaches from Claude Code sessions into a queryable Neo4j graph so each new…

Pimzino spec-workflow-mcp ★ 4.2k

MCP server providing spec-driven development workflow with dashboard-backed approval gates, implementation logging, and VSCode…

MCP Shrimp Task Manager ★ 2.1k

Convert natural language requests into structured AI development tasks with chain-of-thought enforcement, reflection gates, and…

Bernstein ★ 460

Govern parallel CLI coding agents with a deterministic Python scheduler, HMAC-chained audit trail, and compliance-ready signed…

LeanSpec ★ 252

Provides a unified spec CLI and MCP server over any existing spec backend (markdown, GitHub Issues, ADO), making spec-driven…