
do-it

do-it · tdwhere123/do-it · ★ 20 · last commit 2026-05-24

Stop asking AI agents to remember process — install risk-routing, evidence-gated completion, and explicit subagent contracts as automatic hooks.

Best when: Done is an evidence claim, not agent confidence; every stop event requires fresh verification output before completing.
Skip if: Swallowing exceptions, weakening assertions to pass tests
Primitive shape 53 total
Skills 23 Subagents 23 Hooks 7
00

Summary

do-it — Summary

do-it is an installable AI coding workflow for Codex and Claude Code. It routes work by risk tier (Light/Standard/Heavy), delegates to sub-agents under explicit contracts that pin scope, write ownership, forbidden paths, and stop conditions, and requires fresh verification evidence before an agent can claim done. With 20 stars and active maintenance (last commit May 2026), it ships a CLI (do-it), 23 skills, 23 agent TOML definitions, and 7 hook scripts bound to 4 lifecycle events (UserPromptSubmit, PreToolUse, PostToolUse, Stop).

The workflow is hook-driven and automatic: router.sh and grill-prompt.sh fire on every prompt, grill-pretool.sh fires before edits, code-map-refresh.sh updates the code map after writes, comments-lint.sh and anti-patterns-lint.sh also run after writes, and verification-gate.sh runs at Stop. The framework treats "done" as an evidence claim: fresh verification output is required, not agent confidence.

Closest seed comparison: do-it resembles spec-driver (24 skills, workflow-enforcing) in philosophy, but adds automatic hook-driven routing and risk classification that spec-driver lacks, plus a subagent contract model (scope/write-ownership/forbidden-paths) that no other seed implements.

01

Overview

do-it — Overview

Origin

Created by tdwhere123. Single contributor. 20 stars. Active through May 2026. Self-described: "This is the workflow I use every day for real project work."

Philosophy

"Stop asking AI agents to remember process. Install it."

"do-it turns AI coding discipline into an installable workflow for Codex and Claude Code. It routes work by risk, delegates sub-agents with explicit contracts, and requires fresh evidence before an agent can claim done."

The Three Moves

  1. Route the work — Every prompt classified as Light, Standard, or Heavy before action
  2. Contract the delegation — Every delegated slice pins scope, write-ownership, forbidden-paths, must-verify facts, stop-condition, return-schema
  3. Prove the result — "Done" requires fresh verification output, not agent confidence

Risk Tiers

| Tier | Use when |
|---|---|
| Light | Small local edits, docs tweaks, one-off checks |
| Standard | Normal non-trivial engineering work |
| Heavy | Releases, architecture changes, cross-module policy, multi-agent delivery |

Subagent Contract Fields

| Field | Purpose |
|---|---|
| scope | Single bounded outcome the sub-agent owns |
| write ownership | Which paths the sub-agent is allowed to edit |
| forbidden paths | Which paths must not be touched, even if it would help |
| must-verify facts | Concrete claims to confirm before acting |
| stop condition | Exact event that ends the sub-agent's run |
| return schema | Structured shape of the final report |
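
The two path fields lend themselves to a mechanical check. The following is a minimal sketch, assuming a contract is held in memory as a plain record; `SubagentContract`, `may_write`, and the example paths are hypothetical names for illustration, not part of do-it itself. It shows forbidden paths taking precedence over write ownership.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class SubagentContract:
    """Hypothetical in-memory form of the six contract fields."""
    scope: str
    write_ownership: tuple[str, ...]
    forbidden_paths: tuple[str, ...]
    must_verify_facts: tuple[str, ...]
    stop_condition: str
    return_schema: tuple[str, ...]

    def may_write(self, path: str) -> bool:
        """Allow a write only inside an owned path and outside every forbidden path."""
        target = PurePosixPath(path)

        def under(root: str) -> bool:
            root_path = PurePosixPath(root)
            return target == root_path or root_path in target.parents

        # Forbidden paths win, even when they sit inside an owned directory.
        if any(under(root) for root in self.forbidden_paths):
            return False
        return any(under(root) for root in self.write_ownership)


contract = SubagentContract(
    scope="Add rate limiting to the login endpoint",
    write_ownership=("src/auth",),
    forbidden_paths=("src/auth/migrations",),
    must_verify_facts=("login handler lives in src/auth/login.py",),
    stop_condition="rate-limit tests pass",
    return_schema=("files_changed", "verification_output"),
)
```

With this contract, `contract.may_write("src/auth/login.py")` is allowed, while writes under `src/auth/migrations` or anywhere outside `src/auth` are refused.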

Integrity Principle

"A failure, error, or surprising result is a clue to investigate — not an obstacle to make disappear. Never make a symptom vanish without explaining it."

02

Architecture

do-it — Architecture

Distribution

  • CLI tool installed from GitHub tarball via npm
  • Codex plugin: codex plugin marketplace add tdwhere123/do-it
  • Claude Code plugin: /plugin marketplace add tdwhere123/do-it && /plugin install do-it

Install

npm install -g https://github.com/tdwhere123/do-it/archive/refs/heads/main.tar.gz
do-it setup

Or for Claude Code:

/plugin marketplace add tdwhere123/do-it
/plugin install do-it

Required Runtime

  • Node.js (npm-based install)
  • Codex CLI or Claude Code
  • Bash (hooks are shell scripts)

Directory Tree

do-it/
├── skills/
│   ├── do-it/
│   │   ├── do-it-router/         — Risk tier classification
│   │   ├── do-it-grill/          — Truth-check, converges Must Resolve items
│   │   ├── do-it-brainstorm/     — Multi-lens divergence (product+architecture cores)
│   │   ├── do-it-handbook/       — Project doc skeleton
│   │   ├── do-it-context/        — Context loading
│   │   ├── do-it-planning/       — Planning workflow
│   │   ├── do-it-slicing/        — Task slicing
│   │   ├── do-it-interface-drill/
│   │   ├── do-it-architecture-scan/
│   │   ├── do-it-domain-language/
│   │   ├── do-it-subagent-orchestration/
│   │   ├── do-it-tdd/
│   │   ├── do-it-debugging/
│   │   ├── do-it-review-loop/
│   │   ├── do-it-fix-loop/
│   │   ├── do-it-verification-gate/
│   │   ├── do-it-worktree-isolation/
│   │   ├── do-it-branch-closeout/
│   │   ├── do-it-visual-planning/
│   │   ├── do-it-skill-authoring/
│   │   ├── do-it-grill-log/
│   │   └── do-it-comments-discipline/
│   └── custom/                   — User custom skills
├── agents/                       — 23 TOML agent definitions
│   ├── tdd-red-writer.toml
│   ├── plan-challenger.toml
│   ├── reviewer.toml
│   ├── red-team-reviewer.toml
│   ├── architecture-strategist.toml
│   ├── product-strategist.toml
│   └── ...
├── hooks/
│   ├── hooks.json                — Claude Code hook definitions
│   ├── router.sh                 — UserPromptSubmit risk router
│   ├── grill-prompt.sh           — UserPromptSubmit truth-check
│   ├── grill-pretool.sh          — PreToolUse discipline gate
│   ├── code-map-refresh.sh       — PostToolUse code map update
│   ├── comments-lint.sh          — PostToolUse comment lint
│   ├── anti-patterns-lint.sh     — PostToolUse anti-pattern lint
│   ├── verification-gate.sh      — Stop evidence requirement
│   ├── lib/                      — Common shell functions
│   └── data/                     — Session state data
├── bin/                          — CLI binary
├── commands/                     — Slash commands
├── manifest.json                 — Package manifest
├── index.json                    — Machine-readable skill inventory
└── plugins/                      — Plugin bundles

Target AI Tools

  • Codex CLI (primary)
  • Claude Code (secondary, also shipped as plugin)

State Root

.do-it/ — session state, code map, brainstorm artifacts, grill state

03

Components

do-it — Components

CLI Binary: do-it

| Command | Purpose |
|---|---|
| do-it setup | Install skills, agents, and hooks; run doctor |
| do-it install | Copy managed skills, agents, hooks |
| do-it install --target=claude | Install to Claude Code |
| do-it doctor | Validate installed files against manifest.json |
| do-it doctor --target=claude | Claude-specific validation |

Skills (23)

| Skill | Purpose |
|---|---|
| do-it-router | Risk tier classification (Light/Standard/Heavy) + 5 DIM booleans |
| do-it-grill | Truth-check: converges Must Resolve items; anti-cover-up enforcement |
| do-it-brainstorm | Multi-lens divergence: product+architecture cores + task-fit supplements |
| do-it-handbook | Project doc skeleton (.do-it/handbook/code-map.md) |
| do-it-context | Context/codebase loading |
| do-it-planning | Planning workflow |
| do-it-slicing | Task decomposition into slices |
| do-it-interface-drill | Interface design verification |
| do-it-architecture-scan | Architecture analysis |
| do-it-domain-language | Domain language enforcement |
| do-it-subagent-orchestration | Contract-based subagent delegation |
| do-it-tdd | TDD workflow |
| do-it-debugging | Systematic debugging |
| do-it-review-loop | Review iteration loop |
| do-it-fix-loop | Bug fix iteration |
| do-it-verification-gate | Evidence-based completion check |
| do-it-worktree-isolation | Git worktree per feature |
| do-it-branch-closeout | Branch completion workflow |
| do-it-visual-planning | Visual planning (opt-in) |
| do-it-skill-authoring | Skill authoring discipline |
| do-it-grill-log | Grill session logging |
| do-it-comments-discipline | Comment quality enforcement |

Agents (23 TOML files)

architect-reviewer, architecture-strategist, architecture-taste-reviewer, ceo-reviewer, code-mapper, code-quality-cleaner, documentation-engineer, domain-language-reviewer, end-user-advocate, install-release-reviewer, ops-sre, plan-challenger, product-strategist, react-specialist, red-team-reviewer, reviewer, skill-quality-reviewer, spec-compliance-reviewer, sql-pro, tdd-red-writer

Hooks (hooks.json)

| Hook | Event | Trigger |
|---|---|---|
| router.sh | UserPromptSubmit | Every prompt (risk classification) |
| grill-prompt.sh | UserPromptSubmit | Every prompt (truth-check) |
| grill-pretool.sh | PreToolUse (Edit/Write/MultiEdit) | Before writes |
| code-map-refresh.sh | PostToolUse (Edit/Write/MultiEdit/NotebookEdit) | After writes |
| comments-lint.sh | PostToolUse (Edit/Write/MultiEdit/NotebookEdit) | After writes |
| anti-patterns-lint.sh | PostToolUse (Edit/Write/MultiEdit/NotebookEdit) | After writes |
| verification-gate.sh | Stop | At every Stop event |

Router DIM Dimensions (5 booleans)

| Dimension | Condition |
|---|---|
| dim_touches_code | Prompt names file/extension/snippet/technical noun |
| dim_crosses_packages | 2+ distinct top-level paths named |
| dim_breaks_interface | Breaking change, schema/API rewrite mentioned |
| dim_needs_tdd | Behavior-modifying intent + code object named |
| dim_needs_review_loop | tier=Heavy OR dim_breaks_interface=1 |
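
To make the five conditions concrete, here is a rough sketch of how such booleans could be derived from a prompt. It is an assumption-heavy illustration, not the logic in router.sh: the regular expressions, the keyword tuples, and the name `classify_dims` are all hypothetical, the "technical noun" check is omitted, and values are Python booleans rather than the 0/1 flags the source mentions. Only the last rule (Heavy tier OR breaking interface) is taken directly from the table.

```python
import re

# Crude stand-ins for the router's real heuristics (hypothetical).
PATH_RE = re.compile(r"[\w.-]+(?:/[\w.-]+)+")            # e.g. api/auth/login.py
EXT_RE = re.compile(r"\b\w+\.(?:py|ts|js|go|rs|sh|json|toml|md)\b")
BREAKING_PHRASES = ("breaking change", "schema rewrite", "api rewrite")
INTENT_WORDS = ("fix", "add", "change", "implement", "refactor")


def classify_dims(prompt: str, tier: str) -> dict[str, bool]:
    """Return the five DIM booleans for one prompt and its tier label."""
    lower = prompt.lower()
    paths = PATH_RE.findall(prompt)
    top_level = {p.split("/")[0] for p in paths}

    touches_code = bool(paths) or bool(EXT_RE.search(prompt)) or "```" in prompt
    breaks_interface = any(phrase in lower for phrase in BREAKING_PHRASES)
    has_intent = any(re.search(rf"\b{word}\b", lower) for word in INTENT_WORDS)

    return {
        "dim_touches_code": touches_code,
        "dim_crosses_packages": len(top_level) >= 2,
        "dim_breaks_interface": breaks_interface,
        "dim_needs_tdd": has_intent and touches_code,
        # This rule is stated explicitly in the table above.
        "dim_needs_review_loop": tier == "Heavy" or breaks_interface,
    }


dims = classify_dims(
    "Fix the token check in api/auth/login.py and update web/src/client.ts",
    tier="Standard",
)
```

The example prompt names two top-level segments (`api` and `web`), so `dim_crosses_packages` is set even though the tier is only Standard.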
05

Prompts

do-it — Prompts

Verbatim Excerpt 1: do-it-router/SKILL.md (Integrity + Anti-Cover-Up)

Prompting technique: Integrity-first routing; anti-cover-up taxonomy; dimensional analysis before tier

## Integrity

A failure, error, surprising result, or red flag is a clue to investigate — not an obstacle to make disappear. When something does not work:

1. Trace it to a root cause before changing anything.
2. Never make a symptom vanish without explaining it. These are cover-ups, not fixes:
   - swallowing an exception or emptying a `catch`;
   - weakening, loosening, or deleting an assertion so a check passes;
   - deleting, skipping, or `xfail`-ing a failing test instead of fixing the cause;
   - commenting out failing code or returning early past it;
   - adding a fallback or default that hides why the primary path failed;
   - editing the evidence (expected output, snapshot, fixture) instead of the behavior.
3. Report honestly. "I could not verify X" or "this still fails because Y" is a correct, useful answer; a false "done" is a defect.
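
One item from the cover-up list above, the swallowed exception, is mechanical enough to detect. The sketch below assumes Python sources and uses only the standard `ast` module; it is an illustration of the idea and not how anti-patterns-lint.sh works, since the source does not show that script. `find_swallowed_exceptions` is a hypothetical name.

```python
import ast


def find_swallowed_exceptions(source: str) -> list[int]:
    """Return line numbers of `except` handlers whose body is only `pass`."""
    tree = ast.parse(source)
    lines = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ExceptHandler):
            if all(isinstance(stmt, ast.Pass) for stmt in node.body):
                lines.append(node.lineno)
    return sorted(lines)


SAMPLE = (
    "try:\n"
    "    risky()\n"
    "except Exception:\n"
    "    pass\n"
)
flagged = find_swallowed_exceptions(SAMPLE)
```

A handler that logs or re-raises is not flagged; the check only catches the emptied handler named in the list.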

Verbatim Excerpt 2: do-it-router/SKILL.md (DIM Dimensions)

Prompting technique: Boolean dimensional analysis as router metadata; hook-layer vs agent-layer separation

## Orthogonal Dimensions

In addition to the single tier label, the router writes 5 boolean dimensions into per-session state:

| Dimension | Set when |
|---|---|
| `dim_touches_code` | prompt names a file path, extension, fenced snippet, or curated technical noun |
| `dim_crosses_packages` | ≥ 2 distinct top-level path segments named |
| `dim_breaks_interface` | prompt mentions breaking change, schema/API rewrite, endpoint rename/delete/deprecate |
| `dim_needs_tdd` | prompt names behaviour-modifying intent AND a code object |
| `dim_needs_review_loop` | tier is Heavy OR `dim_breaks_interface=1` |

DIM values live in per-session state written by the router. Two consumption paths:
- **Hook layer**: reads via `do_it_session_state_get "$SESSION_ID" <key>` in hooks/lib/common.sh
- **Agent layer**: agents judge from prompt content, not DIM state files

Verbatim Excerpt 3: hooks.json (Automatic Hook Firing)

Prompting technique: Every prompt automatically routed and truth-checked; every write automatically linted

{
  "hooks": {
    "UserPromptSubmit": [
      {"type": "command", "command": "${CLAUDE_PLUGIN_ROOT}/hooks/router.sh", "timeout": 25},
      {"type": "command", "command": "${CLAUDE_PLUGIN_ROOT}/hooks/grill-prompt.sh", "timeout": 25}
    ],
    "PreToolUse": [
      {"matcher": "Edit|Write|MultiEdit",
       "hooks": [{"type": "command", "command": "${CLAUDE_PLUGIN_ROOT}/hooks/grill-pretool.sh", "timeout": 10}]}
    ],
    "PostToolUse": [
      {"matcher": "Edit|Write|MultiEdit|NotebookEdit",
       "hooks": [
         {"command": "${CLAUDE_PLUGIN_ROOT}/hooks/code-map-refresh.sh"},
         {"command": "${CLAUDE_PLUGIN_ROOT}/hooks/comments-lint.sh"},
         {"command": "${CLAUDE_PLUGIN_ROOT}/hooks/anti-patterns-lint.sh"}
       ]}
    ],
    "Stop": [{"hooks": [{"command": "${CLAUDE_PLUGIN_ROOT}/hooks/verification-gate.sh"}]}]
  }
}
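
The excerpt mixes two entry shapes: bare command entries under UserPromptSubmit, and matcher groups with a nested `hooks` list elsewhere. A small sketch of flattening both into one row per script follows, which makes the "7 hooks" count easy to verify. The JSON is copied from the excerpt; `flatten_hooks` is a hypothetical helper, not something do-it ships.

```python
import json
from collections import Counter

# Copied from the hooks.json excerpt above.
HOOKS_JSON = """
{
  "hooks": {
    "UserPromptSubmit": [
      {"type": "command", "command": "${CLAUDE_PLUGIN_ROOT}/hooks/router.sh", "timeout": 25},
      {"type": "command", "command": "${CLAUDE_PLUGIN_ROOT}/hooks/grill-prompt.sh", "timeout": 25}
    ],
    "PreToolUse": [
      {"matcher": "Edit|Write|MultiEdit",
       "hooks": [{"type": "command", "command": "${CLAUDE_PLUGIN_ROOT}/hooks/grill-pretool.sh", "timeout": 10}]}
    ],
    "PostToolUse": [
      {"matcher": "Edit|Write|MultiEdit|NotebookEdit",
       "hooks": [
         {"command": "${CLAUDE_PLUGIN_ROOT}/hooks/code-map-refresh.sh"},
         {"command": "${CLAUDE_PLUGIN_ROOT}/hooks/comments-lint.sh"},
         {"command": "${CLAUDE_PLUGIN_ROOT}/hooks/anti-patterns-lint.sh"}
       ]}
    ],
    "Stop": [{"hooks": [{"command": "${CLAUDE_PLUGIN_ROOT}/hooks/verification-gate.sh"}]}]
  }
}
"""


def flatten_hooks(config: dict) -> list[tuple]:
    """Return (event, matcher, script, timeout) for every hook command."""
    rows = []
    for event, entries in config["hooks"].items():
        for entry in entries:
            # A matcher group nests its commands; a bare entry is its own command.
            handlers = entry.get("hooks", [entry])
            for handler in handlers:
                script = handler["command"].rsplit("/", 1)[-1]
                rows.append((event, entry.get("matcher"), script, handler.get("timeout")))
    return rows


ROWS = flatten_hooks(json.loads(HOOKS_JSON))
PER_EVENT = Counter(event for event, _, _, _ in ROWS)
```

`ROWS` holds seven entries, and only the three UserPromptSubmit and PreToolUse hooks declare a timeout in the excerpt.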
09

Uniqueness

do-it — Uniqueness

Differs From Seeds

Most similar to spec-driver (24 skills, workflow enforcement, TDD emphasis) and spec-kit (hooks on every operation, auto-validation) in the seeds. Key differences: do-it adds automatic hook-driven risk routing (router.sh fires on every prompt) that spec-driver lacks; the explicit subagent contract model (scope/write-ownership/forbidden-paths/stop-condition/return-schema) is not present in any seed; the 5-dimensional DIM boolean state written by the router and consumed by hooks is a novel coordination mechanism. Unlike spec-kit (which mirrors commands and skills), do-it has only skills + hooks — no slash commands beyond setup. The anti-cover-up taxonomy (explicit list of prohibited fix patterns) is more detailed than any seed's TDD enforcement language.

Positioning

  • Primary innovation: automatic hook-driven routing that fires on EVERY prompt without user invocation
  • Subagent contract model unique in batch and corpus — writes explicit contracts for all delegations
  • DIM boolean session state for hook coordination is novel vs all seeds
  • "Done as evidence claim" framing is the most explicit verification philosophy in the batch

Observable Failure Modes

  • Hook timeout risk: a prompt that leads to one write and then a stop can trigger all 7 hooks (router + grill-prompt + grill-pretool + 3 post-write hooks + verification-gate), so per-hook latency could accumulate
  • Single contributor: 1 contributor; no test suite visible for CLI itself
  • Codex feature flag: plugin hooks vs native hooks depends on Codex build (plugin_hooks not yet supported per README)
  • DO_IT_FORCE=1 escape hatch: could be misused to bypass managed file protection
  • Code map staleness: only refreshed on barrel/migration/route/workspace-manifest edits; other changes don't trigger a refresh
04

Workflow

do-it — Workflow

Automatic Flow (hook-triggered)

UserPromptSubmit
  → router.sh (classify Light/Standard/Heavy)
  → grill-prompt.sh (truth-check: converge Must Resolve items)
  → [multi-lens needed?] → do-it-brainstorm (product+architecture cores)
  → [premise stable?]
    → [no] → do-it-grill
    → [yes] → tier routing:
        Light: execute
        Standard: inline modification map → execute
        Heavy: plan → slicing → drills → subagent orchestration → review-loop → fix-loop

PreToolUse (Edit/Write/MultiEdit)
  → grill-pretool.sh (durable plan gate for Heavy or explicit)

PostToolUse (Edit/Write/MultiEdit/NotebookEdit)
  → code-map-refresh.sh (mark .do-it/handbook/code-map.md stale)
  → comments-lint.sh (comment quality check)
  → anti-patterns-lint.sh (anti-pattern lint)

Stop
  → verification-gate.sh (require fresh evidence before close)
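
The tier branch in the flow above amounts to a lookup from tier to an ordered list of steps, guarded by the premise check. A minimal sketch, with step names taken from the flow and everything else (`PIPELINES`, `steps_for`) hypothetical:

```python
# Ordered steps per tier, as listed in the automatic flow.
PIPELINES = {
    "Light": ["execute"],
    "Standard": ["inline modification map", "execute"],
    "Heavy": [
        "plan",
        "slicing",
        "drills",
        "subagent orchestration",
        "review-loop",
        "fix-loop",
    ],
}


def steps_for(tier: str, premise_stable: bool) -> list[str]:
    """Return the next steps: grill first if the premise is unstable, else the tier pipeline."""
    if not premise_stable:
        return ["do-it-grill"]
    return list(PIPELINES[tier])
```

For example, `steps_for("Standard", premise_stable=True)` yields the modification map followed by execution, while any tier with an unstable premise is sent back to do-it-grill.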

Phases and Artifacts

| Phase | Skill | Artifact | Gate |
|---|---|---|---|
| Classification | do-it-router | Session state (tier + DIM booleans) | auto (hook) |
| Truth-check | do-it-grill | `.do-it/grill/*.md` | Must Resolve convergence |
| Brainstorm (opt.) | do-it-brainstorm | `.do-it/brainstorm/<slug>.md` | Auto-triggered |
| Planning | do-it-planning | `.do-it/plans/<slug>.md` | User approval for Heavy |
| Slicing | do-it-slicing | Task cards | none |
| Execution | (executor) | Code changes | none |
| Subagent delegation | do-it-subagent-orchestration | Contract + return report | Contract schema |
| Verification | do-it-verification-gate | Fresh test/build output | Required at Stop |
| Branch close | do-it-branch-closeout | PR or merge | none |

Heavy Tier Subagent Contract

Every delegated slice requires:

scope: <single bounded outcome>
write ownership: <allowed paths>
forbidden paths: <must not touch, even if helpful>
must-verify facts: <concrete claims to confirm before acting>
stop condition: <exact event that ends the run>
return schema: <structured shape of final report>

Code Map

.do-it/handbook/code-map.md — code-map-refresh.sh marks stale on barrel/migration/route/workspace-manifest edits.

06

Memory Context

do-it — Memory & Context

Session State

.do-it/runtime/ — session state files:

  • pointer — active task slug (names current brainstorm/grill/plans file)
  • DIM boolean state (written by router, read by hooks)

Code Map

.do-it/handbook/code-map.md — refreshed by code-map-refresh.sh hook on barrel/migration/route/workspace-manifest edits

Task Artifacts

.do-it/brainstorm/<slug>.md — brainstorm artifacts
.do-it/grill/<slug>.md — grill/truth-check state
.do-it/plans/<slug>.md — plan artifacts

Hook-Layer State Access

hooks/lib/common.sh: do_it_session_state_get "$SESSION_ID" <key> helper
5-level env-var search: CLAUDE_PLUGIN_DATA → DO_IT_HOOK_DATA → CODEX_HOME/do-it-data → repo .do-it/runtime/ → ${TMPDIR}/do-it-sessions
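
Reading the search order as CLAUDE_PLUGIN_DATA, then DO_IT_HOOK_DATA, then CODEX_HOME/do-it-data, then the repo's .do-it/runtime/, then ${TMPDIR}/do-it-sessions, the lookup can be sketched as a first-match chain. This is an assumption about precedence only: the real helper lives in hooks/lib/common.sh and may also test whether directories exist. `resolve_state_dir` and the `/tmp` fallback are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Mapping, Optional


def resolve_state_dir(env: Mapping[str, str], repo_root: Optional[str]) -> PurePosixPath:
    """Pick the first available location in the assumed 5-level order."""
    if env.get("CLAUDE_PLUGIN_DATA"):
        return PurePosixPath(env["CLAUDE_PLUGIN_DATA"])
    if env.get("DO_IT_HOOK_DATA"):
        return PurePosixPath(env["DO_IT_HOOK_DATA"])
    if env.get("CODEX_HOME"):
        return PurePosixPath(env["CODEX_HOME"]) / "do-it-data"
    if repo_root:
        return PurePosixPath(repo_root) / ".do-it" / "runtime"
    # Last resort: a per-user temp location (the /tmp default is an assumption).
    return PurePosixPath(env.get("TMPDIR", "/tmp")) / "do-it-sessions"
```

Passing the environment as a mapping rather than reading `os.environ` directly keeps the function easy to exercise with different combinations of variables.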

Cross-Session Handoff

pointer file in .do-it/runtime/ names the active task slug → read at start of each router invocation
Plan artifacts persist to .do-it/plans/ for session resume

Context Compaction

No explicit handling — relies on artifact persistence in .do-it/

07

Orchestration

do-it — Orchestration

Multi-Agent: Yes (for Heavy tier)

do-it-subagent-orchestration skill + 23 TOML agent definitions
Delegation uses explicit contracts (scope/write-ownership/forbidden-paths/stop-condition/return-schema)
Parent agent remains responsible — no external orchestrator needed

Orchestration Pattern

sequential (Light/Standard) + task-decomposition-tree (Heavy with slicing + subagent delegation)

Multi-Model

No — no model routing documented

Isolation Mechanism

do-it-worktree-isolation — git worktree per feature for Heavy tier
Subagents have explicit write ownership and forbidden paths contract fields

Execution Mode

event-driven (hooks fire on UserPromptSubmit/PreToolUse/PostToolUse/Stop)

Consensus Mechanism

None — parent agent verifies subagent reports

Prompt Chaining

Yes: router output (tier + DIMs) → skill selection → execution → verification-gate is a chained pipeline
Hook chain: router.sh → grill-prompt.sh on every prompt

Context Compaction

Not explicitly handled

08

UI CLI Surface

do-it — UI & CLI Surface

CLI Binary: do-it

Binary: do-it
Install: npm install -g https://github.com/tdwhere123/do-it/...
Subcommands: setup, install, install --target=claude, doctor, doctor --target=claude

No Dashboard

No web dashboard.

Invocation

Within Codex: $do-it-router, $do-it-grill, etc.
Within Claude Code: /do-it:router, /do-it:grill, etc.

Automatic Workflow (no user invocation needed)

The hook system fires automatically:

  • router.sh — fires on every UserPromptSubmit
  • grill-prompt.sh — fires on every UserPromptSubmit
  • grill-pretool.sh — fires on every Edit/Write/MultiEdit
  • code-map-refresh.sh, comments-lint.sh, anti-patterns-lint.sh — fire on every write
  • verification-gate.sh — fires on every Stop

Users don't need to invoke skills manually — the hooks enforce the workflow.
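
The Stop-time rule, that "done" needs fresh verification output, can be pictured as a comparison between the last edit and the last recorded check. The sketch below is a simplified model under stated assumptions: the source does not describe how verification-gate.sh stores or compares evidence, so `Evidence`, `gate`, and the timestamp comparison are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Evidence:
    """One recorded verification run (hypothetical shape)."""
    command: str
    exit_code: int
    finished_at: float  # seconds since the epoch


def gate(last_edit_at: float, evidence: Optional[Evidence]) -> tuple[bool, str]:
    """Decide whether a 'done' claim is backed by fresh, passing output."""
    if evidence is None:
        return False, "no verification output recorded"
    if evidence.finished_at < last_edit_at:
        return False, "verification output is older than the last edit"
    if evidence.exit_code != 0:
        return False, f"`{evidence.command}` exited with {evidence.exit_code}"
    return True, "fresh passing verification output"


stale = gate(last_edit_at=200.0, evidence=Evidence("npm test", 0, finished_at=150.0))
fresh = gate(last_edit_at=200.0, evidence=Evidence("npm test", 0, finished_at=260.0))
```

In this model a passing run from before the last edit does not count, which is the distinction between evidence and agent confidence that the framework draws.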

Observability

  • .do-it/grill-log/ — grill session logs
  • .do-it/handbook/code-map.md — codebase map
  • manifest.json and index.json — machine-readable skill inventory

Codex-Specific

DO_IT_FORCE=1 env var to override managed file conflict check. CODEX_HOME override for testing.
