shinpr/codex-workflows

shinpr-codex-workflows · shinpr/codex-workflows · ★ 12 · last commit 2026-05-21

Codex-native 13-stage pipeline with context-isolated subagents that prevents assumption drift in large multi-file changes.

Best when: Every pipeline stage must run in its own isolated Codex context (fork_context=false); context bleed between stages is the primary source of LLM coding agent fa…
Skip if: One-shot scripts where speed matters more than traceability; repositories without tests, lint, builds
vs seeds: taskmaster-ai (task-decomposition pipeline), but runs through Codex CLI's native subagent system (26 TOML agents with fork_context=fal…
Primitive shape: 52 total (Skills 26, Subagents 26)
00

Summary

shinpr/codex-workflows — Summary

Elevator pitch: A Codex-CLI-specific extension of shinpr/agentic-code (same author) that adds TOML-defined subagents, recipe-based invocation ($recipe-*), and context-isolated execution for large refactors and migrations. It ships 26 Codex subagent TOML files and 22+ recipe skill files, installed via npx codex-workflows install. Each subagent (requirement-analyzer, technical-designer, task-executor, quality-fixer, etc.) runs in its own Codex session to prevent context carry-over between pipeline stages. The pipeline is a 13-stage task-decomposition tree from user request to commit-ready code, with explicit stopping points (BLOCKING gates) at each phase. Codex plays the planner, worker, reviewer, and diagnostician — all roles are Codex subagents, making this the most Codex-centric framework in the batch. Compared to seeds: most similar to taskmaster-ai in its task-decomposition-tree orchestration pattern, but differs by using Codex-native TOML subagent definitions instead of an MCP server, and by separating each pipeline stage into a dedicated subagent with isolated context.

01

Overview

shinpr/codex-workflows — Overview

Origin

By shinpr (same author as shinpr/agentic-code). Version 0.6.4, MIT license. 12 stars, 4 forks. Active: last commit 2026-05-21. Published as npm package.

Explicitly described as the Codex-specific companion to agentic-code:

"If you mainly use Codex, see codex-workflows for a more Codex-specific setup. It includes subagents-based isolation for review and verification tasks, improving reliability by separating context inside the same working session."

Philosophy

"Codex works well for short, focused tasks. The problems start when a change spans multiple files, needs design decisions to stay visible, or has to survive review, testing, and follow-up edits. Many developers have seen the same pattern: things work at first, then drift. Context grows, assumptions accumulate, intermediate decisions disappear, and results become harder to trust."

The framework is explicitly built around Codex's failure modes at scale. Each pipeline stage isolates one concern so decisions can be checked before they carry into later stages. Context isolation between subagents is a first-class design goal, not an optimization.

Background Manifesto (verbatim)

"The recipes, subagents, and quality checks in this repo were not designed top-down. Each piece was added in response to a concrete failure mode encountered during delivery work. That is why the workflow separates requirements, design, verification, implementation, and quality checks instead of treating them as one long session."

Not Designed For (explicit antipatterns)

  • One-shot scripts or exploratory sessions where speed matters more than traceability
  • Repositories without tests, lint, builds, or reviewable commits
  • Teams that would rather skip design docs and quality checks entirely
02

Architecture

shinpr/codex-workflows — Architecture

Distribution

  • Type: npm package (CLI installer)
  • Binary: codex-workflows
  • Version: 0.6.4
  • License: MIT

Install

cd your-project
npx codex-workflows install

This copies into the project:

  • .agents/skills/ — Codex skills (foundational + recipes)
  • .codex/agents/ — Subagent TOML definitions
  • .codex-workflows-manifest.json — hash-based manifest for update tracking

Later maintenance uses the same binary:

npx codex-workflows update [--dry-run]   # update, preserving locally-modified files
npx codex-workflows status               # check installed version
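
The manifest is described as hash-based, which suggests that update compares each managed file against a hash recorded at install time. The following is a minimal Python sketch of that idea, not the package's code: the manifest layout (relative path mapped to a SHA-256 digest) and the function names `sha256_of` and `classify_files` are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def classify_files(root: Path, manifest: dict[str, str]) -> dict[str, str]:
    """Label each managed file as 'unchanged', 'modified', or 'missing'.

    `manifest` maps a relative path to the hash recorded at install time.
    This layout is hypothetical, not the real manifest schema.
    """
    status: dict[str, str] = {}
    for rel_path, recorded_hash in manifest.items():
        target = root / rel_path
        if not target.exists():
            status[rel_path] = "missing"
        elif sha256_of(target) == recorded_hash:
            status[rel_path] = "unchanged"  # identical to the installed copy
        else:
            status[rel_path] = "modified"   # local edits that an update should preserve
    return status
```

Under this scheme an updater would overwrite only the files labeled `unchanged` and leave `modified` files alone, which matches the "preserving locally-modified files" behavior described above.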

Required Runtime

  • Node.js >= 22
  • Codex CLI (latest)

Directory Tree (installed)

<project>/
├── .codex-workflows-manifest.json
├── .agents/
│   └── skills/
│       ├── ai-development-guide/SKILL.md
│       ├── coding-rules/SKILL.md
│       ├── documentation-criteria/SKILL.md
│       ├── external-resource-context/SKILL.md
│       ├── implementation-approach/SKILL.md
│       ├── integration-e2e-testing/SKILL.md
│       ├── subagents-orchestration-guide/SKILL.md
│       ├── task-analyzer/SKILL.md
│       ├── testing/SKILL.md
│       ├── recipe-add-integration-tests/SKILL.md
│       ├── recipe-build/SKILL.md
│       ├── recipe-design/SKILL.md
│       ├── recipe-diagnose/SKILL.md
│       ├── recipe-front-adjust/SKILL.md
│       ├── recipe-front-build/SKILL.md
│       ├── recipe-front-design/SKILL.md
│       ├── recipe-front-plan/SKILL.md
│       ├── recipe-front-review/SKILL.md
│       ├── recipe-fullstack-build/SKILL.md
│       ├── recipe-fullstack-implement/SKILL.md
│       ├── recipe-implement/SKILL.md     # main entry recipe
│       ├── recipe-plan/SKILL.md
│       ├── recipe-prepare-implementation/SKILL.md
│       ├── recipe-reverse-engineer/SKILL.md
│       ├── recipe-review/SKILL.md
│       ├── recipe-task/SKILL.md
│       ├── recipe-update-doc/SKILL.md
│       └── ...
└── .codex/
    └── agents/
        ├── acceptance-test-generator.toml
        ├── code-reviewer.toml
        ├── code-verifier.toml
        ├── codebase-analyzer.toml
        ├── design-sync.toml
        ├── document-reviewer.toml
        ├── integration-test-reviewer.toml
        ├── investigator.toml
        ├── prd-creator.toml
        ├── quality-fixer-frontend.toml
        ├── quality-fixer.toml
        ├── requirement-analyzer.toml
        ├── rule-advisor.toml
        ├── scope-discoverer.toml
        ├── security-reviewer.toml
        ├── solver.toml
        ├── task-decomposer.toml
        ├── task-executor-frontend.toml
        ├── task-executor.toml
        ├── technical-designer-frontend.toml
        ├── technical-designer.toml
        ├── ui-analyzer.toml
        ├── ui-spec-designer.toml
        ├── verifier.toml
        └── work-planner.toml

Target AI Tools

  • Primary: OpenAI Codex CLI (TOML subagents are Codex-native)
  • Compatible: Any tool that reads .agents/skills/ SKILL.md files
03

Components

shinpr/codex-workflows — Components

CLI Binary

| Command | Purpose |
|---|---|
| npx codex-workflows install | Copy skills + TOML agents into project |
| npx codex-workflows update [--dry-run] | Update managed files, preserving locally modified ones |
| npx codex-workflows status | Show installed version |

Subagents / TOML Agents (26 reported; 25 distinct names listed, in .codex/agents/)

| Name | Role |
|---|---|
| requirement-analyzer | Scale determination, affected layers, scope |
| prd-creator | Product requirements document |
| codebase-analyzer | Existing codebase facts + focus areas |
| technical-designer | ADR + Design Doc with acceptance criteria |
| technical-designer-frontend | Frontend-specific design doc |
| code-verifier | Design Doc vs existing code verification |
| document-reviewer | Quality gate with verification evidence |
| acceptance-test-generator | Test skeletons from acceptance criteria |
| work-planner | Phased execution plan |
| task-decomposer | Atomic tasks (1 task = 1 commit) |
| task-executor | TDD implementation per task |
| task-executor-frontend | Frontend TDD implementation |
| quality-fixer | Lint, test, build; no failing checks |
| quality-fixer-frontend | Frontend quality fixing |
| code-reviewer | Post-implementation code review |
| security-reviewer | Security audit |
| integration-test-reviewer | Integration/E2E test review |
| investigator | Bug investigation and failure point mapping |
| verifier | Independent failure-point evaluation |
| solver | Actionable solutions from investigation |
| scope-discoverer | Reverse engineering scope analysis: discovered units + PRD units |
| design-sync | Sync design docs with implementation |
| rule-advisor | Rule selection for task type |
| ui-analyzer | UI component analysis |
| ui-spec-designer | UI specification |

Recipe Skills (18, in .agents/skills/recipe-*/)

| Name | Layer | Purpose |
|---|---|---|
| recipe-implement | backend/universal | Full lifecycle entry point with layer routing |
| recipe-task | universal | Single task with rule selection |
| recipe-design | universal | Requirements → ADR/Design Doc |
| recipe-plan | universal | Design Doc → test skeletons → work plan |
| recipe-prepare-implementation | universal | Verify work plan readiness |
| recipe-build | backend | Execute backend tasks with validation |
| recipe-review | universal | Design Doc compliance and security |
| recipe-diagnose | universal | Problem investigation → solution |
| recipe-reverse-engineer | universal | Generate PRD + Design Docs from existing code |
| recipe-add-integration-tests | universal | Add integration/E2E tests |
| recipe-update-doc | universal | Update existing Design Doc/PRD/ADR |
| recipe-front-design | frontend | Frontend architecture planning |
| recipe-front-adjust | frontend | UI adjustment with external context |
| recipe-front-plan | frontend | Frontend Design Doc → tests → plan |
| recipe-front-build | frontend | Frontend TDD execution |
| recipe-front-review | frontend | Frontend compliance review |
| recipe-fullstack-implement | fullstack | Cross-layer features |
| recipe-fullstack-build | fullstack | Resume cross-layer implementation |

Foundational Skills (9)

ai-development-guide, coding-rules, documentation-criteria, external-resource-context, implementation-approach, integration-e2e-testing, subagents-orchestration-guide, task-analyzer, testing

Working State

docs/plans/ — ephemeral: work plans, decomposed tasks, prep tasks, review-fix tasks, intermediate analysis (gitignored by convention)

05

Prompts

shinpr/codex-workflows — Prompts

Prompt 1: requirement-analyzer.toml — Phase Entry Gate (verbatim)

Source: .codex/agents/requirement-analyzer.toml

developer_instructions = """
You are a specialized AI assistant for requirements analysis and work scale determination.

## Phase Entry Gate [BLOCKING — HALT IF ANY UNCHECKED]

☐ [VERIFIED] This agent definition has been READ and is active
☐ [VERIFIED] All required skills from [[skills.config]] are LOADED
☐ [VERIFIED] Input parameters received and validated
☐ [VERIFIED] Task scope understood
☐ [VERIFIED] User request description available for analysis

**ENFORCEMENT**: HALT and return to caller if any gate unchecked

Prompting technique: TOML-embedded blocking-gate pattern. The developer_instructions field in the TOML file serves as the system prompt for the Codex subagent. By embedding the gate logic directly in the TOML definition, the gate is enforced at the subagent level, not the orchestrator level — each subagent is self-enforcing rather than requiring the caller to verify gate compliance.
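
In the framework the gate is a natural-language checklist that the model evaluates, not executable code. The all-or-nothing logic it asks for can still be sketched in Python; the `Gate` type and the `enter_phase` function are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    """One checklist item from a Phase Entry Gate (hypothetical model)."""
    description: str
    verified: bool


def unverified_gates(gates: list[Gate]) -> list[str]:
    """Return the descriptions of gates that are not yet verified."""
    return [gate.description for gate in gates if not gate.verified]


def enter_phase(gates: list[Gate]) -> str:
    """Proceed only if every gate is verified; otherwise halt and report why."""
    blocked = unverified_gates(gates)
    if blocked:
        # Mirrors "HALT and return to caller if any gate unchecked".
        return "HALT: " + "; ".join(blocked)
    return "PROCEED"
```

A single unverified item is enough to stop the phase, which is the property the BLOCKING label is meant to convey.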


Prompt 2: recipe-implement — Orchestrator Identity Constraint (verbatim)

Source: .agents/skills/recipe-implement/SKILL.md

## Orchestrator Definition

**Core Identity**: "I am not a worker. I am an orchestrator." (see subagents-orchestration-guide skill)

**CRITICAL**: MUST execute all steps, sub-agents, and stopping points defined in subagents-orchestration-guide skill flows.
ENFORCEMENT: Skipping any step or stopping point invalidates the entire workflow output.

## CRITICAL Sub-agent Invocation Constraints

**MANDATORY suffix for ALL sub-agent prompts**:
[SYSTEM CONSTRAINT]
This agent operates within implement skill scope. Use orchestrator-provided rules only.

ENFORCEMENT: Sub-agent prompts missing the constraint suffix MUST be re-issued with the constraint appended.

Prompting technique: Role identity injection with constraint propagation. The orchestrator is explicitly told it is "not a worker" (role identity), and every subagent prompt it spawns must carry a system constraint suffix. Because each child agent starts without the parent's conversation history, the suffix is the mechanism that carries the recipe's scope constraint into the child.
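
The re-issue rule can be illustrated with a small Python helper. The suffix text is taken from the excerpt above; the function name `with_constraint` and the choice to check only the end of the prompt are assumptions of this sketch, since the real enforcement is done by the orchestrating model following the skill text.

```python
CONSTRAINT_SUFFIX = (
    "[SYSTEM CONSTRAINT]\n"
    "This agent operates within implement skill scope. "
    "Use orchestrator-provided rules only."
)


def with_constraint(prompt: str) -> str:
    """Append the constraint suffix unless the prompt already ends with it."""
    stripped = prompt.rstrip()
    if stripped.endswith(CONSTRAINT_SUFFIX):
        return stripped  # already compliant, nothing to re-issue
    return f"{stripped}\n\n{CONSTRAINT_SUFFIX}"
```

Applying the helper twice yields the same prompt as applying it once, so it is safe to run on every sub-agent prompt before it is sent.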


Prompt 3: recipe-implement — Scale-Based Layer Routing (verbatim)

Source: .agents/skills/recipe-implement/SKILL.md

## Step 2: Layer-Based Workflow Routing

Based on requirement-analyzer output `affectedLayers`, route to the appropriate workflow:

| affectedLayers | Workflow | Reference |
|---|---|---|
| `["backend"]` only | Backend Flow | subagents-orchestration-guide skill (Large/Medium/Small scale) |
| `["frontend"]` only | Frontend Flow | See Frontend Flow below |
| `["backend", "frontend"]` | Fullstack Flow | subagents-orchestration-guide `references/monorepo-flow.md` |

Prompting technique: Structured dispatch table as prompt content. The recipe uses a markdown table to encode the routing logic, making the decision tree explicit and verifiable by inspecting the skill file. This contrasts with implicit routing in natural language instructions.
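
The same dispatch table can be written as a lookup in Python. This is an illustrative sketch: the function name `route_workflow` and the decision to raise an error on unlisted combinations are assumptions, as the excerpt does not say how the recipe handles other values.

```python
def route_workflow(affected_layers: list[str]) -> str:
    """Map requirement-analyzer's affectedLayers to a workflow name."""
    routes = {
        frozenset({"backend"}): "Backend Flow",
        frozenset({"frontend"}): "Frontend Flow",
        frozenset({"backend", "frontend"}): "Fullstack Flow",
    }
    layers = frozenset(affected_layers)  # order and duplicates do not matter
    if layers not in routes:
        raise ValueError(f"no workflow defined for layers: {sorted(layers)}")
    return routes[layers]
```

For example, `route_workflow(["frontend", "backend"])` selects the Fullstack Flow regardless of the order in which the layers are listed.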

09

Uniqueness

shinpr/codex-workflows — Uniqueness

differs_from_seeds

The closest seed is taskmaster-ai; both use task-decomposition pipelines in which one stage's output becomes the next stage's input. The difference is in the runtime: taskmaster-ai runs through an MCP server with 37 tools and stores state in tasks.json, whereas codex-workflows runs through Codex CLI's native subagent system (26 TOML-defined agents, fork_context=false isolation, docs/plans/ artifact trail). The second closest seed is BMAD-METHOD (persona-based subagents), but BMAD personas are loaded into the same context window, whereas codex-workflows starts each subagent in a fresh context. The pipeline depth (13 stages for large-scale features) exceeds any seed framework's workflow complexity.

Relationship to shinpr/agentic-code

Direct extension. The foundational skills (ai-development-guide, coding-rules, testing, metacognition pattern) are shared. The Codex-specific layer adds: TOML subagent definitions, recipe-based invocation ($recipe-*), context isolation mandate, and the Implementation Readiness Preflight procedure.

Most Unusual Feature

The "Orchestrator Identity" constraint in recipe-implement: "I am not a worker. I am an orchestrator." This explicit role identity injection, combined with the mandatory system constraint suffix propagated to every spawned subagent, creates a self-reinforcing role boundary. The orchestrator can only shape the prompt — it cannot do the work itself.

Observable Failure Modes

  1. Gate inflation: 6+ user-approval gates for a large feature make the workflow slow and friction-heavy. Users will be tempted to approve gates without reading the artifacts.
  2. Context isolation gap: fork_context=false prevents history carry-over but also means each subagent lacks information from previous stages unless it's explicitly passed in the spawn prompt. Orchestrators must manually forward relevant artifact paths.
  3. docs/plans/ drift: If docs/plans/ is not gitignored and is committed, intermediate planning artifacts can pollute the repo history and mislead future developers.
  4. Codex-only: The TOML subagent definitions are not portable to Cursor or Gemini CLI. Teams using multiple tools cannot share the same configuration.
04

Workflow

shinpr/codex-workflows — Workflow

Scale-Based Routing

| Scale | File Count | Path |
|---|---|---|
| Small | 1-2 | Simplified plan → direct implementation |
| Medium | 3-5 | Design Doc → work plan → task execution |
| Large | 6+ | PRD → ADR → Design Doc → test skeletons → work plan → guided autonomous execution |
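
The file-count thresholds in the table reduce to a three-way comparison. The Python sketch below encodes only those thresholds; requirement-analyzer is an LLM subagent and may weigh other factors, and the function name `classify_scale` is hypothetical.

```python
def classify_scale(file_count: int) -> str:
    """Return 'Small', 'Medium', or 'Large' from the number of affected files."""
    if file_count < 1:
        raise ValueError("file_count must be at least 1")
    if file_count <= 2:
        return "Small"   # simplified plan, direct implementation
    if file_count <= 5:
        return "Medium"  # Design Doc, work plan, task execution
    return "Large"       # full PRD-to-execution pipeline
```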

Main Pipeline (Large Scale, via $recipe-implement)

User Request
    ↓
requirement-analyzer  →  Scale + affected layers
    ↓
prd-creator           →  Product requirements (Large only)
    ↓
codebase-analyzer     →  Codebase facts + focus areas
    ↓
technical-designer    →  ADR + Design Doc + acceptance criteria
    ↓
code-verifier         →  Design Doc vs existing code
    ↓
document-reviewer     →  Quality gate with verification evidence
    ↓
acceptance-test-gen   →  Test skeletons from ACs
    ↓
work-planner          →  Phased execution plan
    ↓
task-decomposer       →  Atomic tasks (1 task = 1 commit)
    ↓
task-executor         →  TDD implementation per task
    ↓
quality-fixer         →  Lint, test, build; no failing checks
    ↓
Ready to commit

Diagnosis Pipeline

Problem → investigator (path map + failure points) → verifier (independent evaluation) → solver → Actionable solutions

Reverse Engineering Pipeline

Existing code → scope-discoverer → prd-creator → code-verifier → document-reviewer → Design Docs

Phases + Artifacts

| Phase | Subagent | Artifact |
|---|---|---|
| Requirements | requirement-analyzer | Scale + affectedLayers + scope |
| PRD | prd-creator | PRD document |
| Codebase Analysis | codebase-analyzer | Facts + focus areas |
| Technical Design | technical-designer | ADR + Design Doc |
| Verification | code-verifier | Verification report |
| Review Gate | document-reviewer | Quality gate evidence |
| Test Skeletons | acceptance-test-generator | Failing test files |
| Work Plan | work-planner | Phased plan with atomic tasks |
| Task Decomposition | task-decomposer | Atomic task files |
| Implementation | task-executor | Code + passing tests |
| Quality | quality-fixer | All checks passing |

Approval Gates (BLOCKING stops)

| Gate | Type |
|---|---|
| Requirement-analyzer output confirmation | choice-list |
| PRD approval (Large scale) | file-review |
| Frontend UI Spec | file-review |
| Design Doc | file-review |
| Work Plan | file-review |
| Implementation Readiness preflight | typed-confirm |

All gates are marked [STOP — BLOCKING] in the recipe skills; autonomous execution begins only after the last gate passes.

06

Memory Context

shinpr/codex-workflows — Memory and Context

State Storage

  • docs/plans/: Ephemeral working state — work plans, decomposed task files, prep tasks, review-fix tasks, intermediate analysis. Gitignored by convention (teams can opt in to reviewing these artifacts).
  • Design docs, PRD, ADR: Persisted as project-level markdown documents (location determined by project structure, typically docs/).
  • TOML subagent definitions: Persistent configuration in .codex/agents/ — each subagent's developer_instructions encode the skill loading and gate-checking logic.

Context Isolation (Key Feature)

Each pipeline stage runs in its own Codex subagent session with fork_turns="none" or fork_context=false. This prevents context carry-over between stages — a design decision explicitly motivated by the observation that "context grows, assumptions accumulate, intermediate decisions disappear, and results become harder to trust" in long single-agent sessions.

The recipe-implement skill mandates: "Spawn rule: every spawn_agent call MUST pass fork_turns="none" or fork_context=false for context isolation."
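
Because an isolated subagent sees none of the parent's history, anything it needs from earlier stages has to be named in the spawn prompt. The Python sketch below shows one way to assemble such a prompt. It does not model spawn_agent itself, and the function name `build_spawn_prompt`, the wording of the prompt, and the example paths are assumptions.

```python
def build_spawn_prompt(task: str, artifact_paths: list[str]) -> str:
    """Build a self-contained prompt that names every artifact the child must read."""
    lines = [task.strip()]
    if artifact_paths:
        lines.append("")
        lines.append("Read these artifacts before starting:")
        lines.extend(f"- {path}" for path in artifact_paths)
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical artifact paths, used only to show the output shape.
    print(build_spawn_prompt(
        "Create the work plan for the approved design.",
        ["docs/design/auth-design.md", "docs/plans/analysis.md"],
    ))
```

The prompt is the only channel between stages in this sketch, so a forgotten path means the child simply never learns that the artifact exists.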

Persistence

  • Project-scoped: Design docs, PRD, task files in project directory
  • Ephemeral: docs/plans/ for intermediate artifacts
  • No global state: No cross-project persistence

Cross-Session Handoff

Supported via artifacts. If a session is interrupted, the work plan and task files in docs/plans/ allow resumption from the last completed task. Recipes like recipe-build and recipe-front-build are explicitly designed for "resume backend/frontend implementation."

Compaction

Not explicitly handled. Context isolation between subagents is the primary mechanism for managing context size.

07

Orchestration

shinpr/codex-workflows — Orchestration

Multi-Agent

Yes — the primary differentiator from shinpr/agentic-code. 26 named Codex subagents, each with its own TOML definition and isolated context.

Orchestration Pattern

Task-decomposition-tree. The recipe orchestrator (recipe-implement) spawns subagents sequentially, each producing an artifact that the next stage consumes. For fullstack features, backend and frontend subagent chains run separately.
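
The sequential hand-off can be sketched in Python as a loop over stages that share nothing except named artifacts. This illustrates the pattern only: the `Stage` type, the `run_pipeline` function, and the use of an in-memory dictionary in place of files on disk are assumptions.

```python
from typing import Callable

# A stage receives the artifacts produced so far and returns new ones.
Stage = Callable[[dict[str, str]], dict[str, str]]


def run_pipeline(request: str, stages: list[tuple[str, Stage]]) -> dict[str, str]:
    """Run stages in order; each stage sees only artifacts, not other stages' state."""
    artifacts: dict[str, str] = {"request": request}
    for _name, stage in stages:
        # Pass a copy so a stage cannot mutate what earlier stages produced.
        produced = stage(dict(artifacts))
        artifacts.update(produced)
    return artifacts
```

Handing each stage a copy of the artifact map stands in for context isolation: a stage can read earlier outputs but cannot reach into another stage's working state.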

Isolation Mechanism

Process / context: each spawn_agent call uses fork_context=false to give the subagent a clean context without the parent's conversation history. This is Codex CLI's native subagent isolation mechanism.

Codex Role

All roles — every pipeline stage is a Codex subagent:

  • Planner: requirement-analyzer, prd-creator, work-planner, task-decomposer
  • Worker: task-executor, task-executor-frontend, quality-fixer
  • Reviewer: code-reviewer, security-reviewer, document-reviewer
  • Analyst: codebase-analyzer, code-verifier, investigator, verifier

The orchestrator recipe (recipe-implement) is invoked in the main Codex session, then spawns all workers as subagents.

Multi-Model

No. Everything runs in the Codex CLI using whatever model Codex is configured to use.

Execution Mode

Interactive-loop with user approval gates, then autonomous execution after the last gate. The recipe-implement skill enters "guided autonomous execution" after work plan approval.

Prompt Chaining

Yes — requirement-analyzer output (scale, affectedLayers, scope) is the input to the routing decision, which determines which subagent pipeline to spawn. Each stage's output artifact is the next stage's input context.

Consensus Mechanism

None. Linear pipeline; no parallel reviewer agreement required.

Parallel Execution

Not used in the main pipeline. Subagents run sequentially. Backend and frontend flows for fullstack features are separate sequential pipelines.

08

UI CLI Surface

shinpr/codex-workflows — UI and CLI Surface

Dedicated CLI Binary

Yes — codex-workflows binary via npm.

| Subcommand | Purpose |
|---|---|
| install | Copy .agents/skills/ and .codex/agents/ into project, write manifest |
| update [--dry-run] | Update managed files, preserving locally modified ones (hash-based detection) |
| status | Show installed version |

The CLI is an installer/updater, not a runtime. The runtime is Codex CLI itself.

Invocation in Codex

Recipes are invoked using Codex CLI's $skill-name syntax:

$recipe-implement Add user authentication with JWT
$recipe-diagnose investigate flaky test in user service
$recipe-design Plan the database migration

Tab completion on $recipe- shows all available recipes.

Local UI

None.

IDE Integration

Codex CLI only. The installer writes subagents to .codex/agents/ (project-scoped) and recipes and foundational skills to .agents/skills/.

Observability

  • Progress through explicit BLOCKING gate evidence visible in Codex output
  • Artifact trail: PRD → Design Doc → test skeletons → work plan → task files → commits
  • docs/plans/ acts as a traceable intermediate artifact store (if not gitignored)

Cross-Tool Portability

Low/medium. TOML subagent definitions are Codex-native. The recipe SKILL.md files could be used by any AGENTS.md-compatible tool, but the subagent coordination (spawn_agent, fork_context=false) requires Codex CLI's subagent API.
