doodledood/codex-workflow

doodledood-codex-workflow · doodledood/codex-workflow · ★ 2 · last commit 2026-01-20

Full-lifecycle Codex CLI skill pack built around documented LLM limitations, with 8-type specialized review orchestration and dynamic requirements discovery.

Best when: Workflows must be designed around how LLMs actually work (documented failure modes), not how we wish they worked; quality over speed means investing in spec+…
Skip if: Chasing the latest game-changing prompt, Suppressing errors with @ts-ignore
vs seeds: spec-driver (comprehensive skill pack with explicit workflow phases), but targets Codex CLI, stores all artifacts in /tmp/ (session-…
Primitive shape: 18 total
Skills: 18
00

Summary

doodledood/codex-workflow — Summary

Elevator pitch: A Codex CLI skill pack that its author describes as "ported from claude-code-plugins"; the same author maintains a Claude Code plugin variant. It ships 18 skills, invoked as $skill-name, covering the full development lifecycle: interactive requirements spec ($spec), implementation planning ($plan), in-place execution with auto-fix loops ($implement), and a specialized review suite with 8 orthogonal review types ($review-bugs, $review-simplicity, $review-testability, etc.). The core philosophy is "first-principles workflows designed around how LLMs actually work, not how we wish they worked." Codex plays all roles (planner, worker, reviewer). Each role lives in its own skill to limit context contamination, although all skills share one Codex session and working tree. Unlike shinpr/codex-workflows, there are no TOML subagent definitions; all skills invoke the Codex agent directly via $skill-name. Compared to seeds: most similar to spec-driver (comprehensive skill pack with explicit workflow phases and TDD enforcement), but it targets Codex CLI instead of Claude Code and adds a multi-type review orchestration pattern that no seed framework replicates.

01

Overview

doodledood/codex-workflow — Overview

Origin

By GitHub user doodledood. 2 stars, 1 fork. MIT license. Last commit: 2026-01-20. Explicitly ported from the author's Claude Code plugin (doodledood/claude-code-plugins), with the note: "Ported from claude-code-plugins vibe-workflow".

Target Audience (from CUSTOMER.md)

"Experienced developers frustrated by hype-driven AI coding tools. If you're tired of chasing the latest 'game-changing' prompt that produces code you spend hours debugging, these skills offer a grounded alternative."

Philosophy (verbatim from README)

"Our approach:

  • Workflows designed around how LLMs actually work, not how we wish they worked
  • Quality over speed — invest upfront, ship with confidence
  • Simple to use, sophisticated under the hood"

Key Design Insight (from LLM_CODING_CAPABILITIES.md)

The framework includes a first-principles analysis of LLM coding capabilities and limitations in docs/LLM_CODING_CAPABILITIES.md. This is unusual — most frameworks do not document the failure modes of their underlying model. The workflows are explicitly designed around documented LLM weaknesses.

Connection to Claude Code

The README states this is "Ported from claude-code-plugins" — meaning all skills were originally Claude Code commands/skills. The Codex port preserves the same workflow structure but adapts invocation to Codex CLI's $skill-name convention.

Skill Installation

Individual skills can be installed via the $skill-installer skill:

$skill-installer install https://github.com/doodledood/codex-workflow/tree/main/skills/spec
$skill-installer install https://github.com/doodledood/codex-workflow/tree/main/skills/implement
02

Architecture

doodledood/codex-workflow — Architecture

Distribution

  • Type: Codex CLI skill pack (standalone-repo, clone-and-configure)
  • Language: Markdown (SKILL.md files)
  • License: MIT

Install (All Skills)

git clone https://github.com/doodledood/codex-workflow.git
mkdir -p ~/.codex/skills  # cp fails if the target directory does not exist
cp -r codex-workflow/skills/* ~/.codex/skills/
# restart Codex
codex --enable skills  # if not already enabled

Install (Individual)

# Via $skill-installer (built-in Codex skill)
$skill-installer install https://github.com/doodledood/codex-workflow/tree/main/skills/spec

Required Runtime

  • Codex CLI with skills enabled (codex --enable skills)

Directory Tree

codex-workflow/
├── skills/                 # Codex skills (SKILL.md files)
│   ├── spec/SKILL.md       # Requirements interview
│   ├── plan/SKILL.md       # Implementation planning
│   ├── implement/SKILL.md  # Execution with auto-fix loops
│   ├── review/SKILL.md     # Review orchestrator
│   ├── review-bugs/SKILL.md
│   ├── review-type-safety/SKILL.md
│   ├── review-maintainability/SKILL.md
│   ├── review-simplicity/SKILL.md
│   ├── review-testability/SKILL.md
│   ├── review-coverage/SKILL.md
│   ├── review-docs/SKILL.md
│   ├── review-agents-md-adherence/SKILL.md
│   ├── bugfix/SKILL.md     # Systematic bug investigation
│   ├── explore-codebase/SKILL.md
│   ├── fix-review-issues/SKILL.md
│   ├── research-web/SKILL.md
│   └── web-research/SKILL.md
├── docs/
│   ├── CUSTOMER.md
│   └── LLM_CODING_CAPABILITIES.md
├── CLAUDE.md               # Development guidelines
├── AGENTS.md               # AGENTS.md for Codex
└── README.md

Target AI Tools

  • Primary: OpenAI Codex CLI (skills in ~/.codex/skills/)
  • Compatible: Any tool supporting the Codex skills specification
03

Components

doodledood/codex-workflow — Components

Skills (18, in skills/*/SKILL.md)

Core Workflow Skills (3)

| Skill | Trigger | Purpose |
|-------|---------|---------|
| spec | $spec | Interactive requirements builder through structured discovery interview |
| plan | $plan | Create implementation plans with codebase research |
| implement | $implement | Execute plans in-place with auto-fix loops and optional review |

Review Skills (9)

| Skill | Trigger | Purpose |
|-------|---------|---------|
| review | $review | Orchestrator: runs all applicable reviews, consolidates findings |
| review-bugs | $review-bugs | Logical bugs, race conditions, edge cases |
| review-type-safety | $review-type-safety | TypeScript/typed language type safety audit |
| review-maintainability | $review-maintainability | DRY violations, dead code, complexity |
| review-simplicity | $review-simplicity | Over-engineering and complexity audit |
| review-testability | $review-testability | Testability design patterns |
| review-coverage | $review-coverage | Test coverage verification |
| review-docs | $review-docs | Documentation accuracy |
| review-agents-md-adherence | $review-agents-md-adherence | AGENTS.md compliance |

Debugging + Research Skills (4)

| Skill | Trigger | Purpose |
|-------|---------|---------|
| bugfix | $bugfix | Systematic bug investigation and fix workflow |
| explore-codebase | $explore-codebase | Comprehensive codebase exploration |
| research-web | $research-web | Multi-wave web research with strategic source selection |
| web-research | $web-research | Structured web research with hypothesis tracking |

Utility Skills (2)

| Skill | Trigger | Purpose |
|-------|---------|---------|
| fix-review-issues | $fix-review-issues | Orchestrate fixing issues found by $review |
| skill-installer | $skill-installer | Install individual skills from URL |

Subagents

None — no TOML subagent definitions. Skills are invoked directly in Codex sessions.

Hooks

None.

Key Pattern: Auto-Fix Gates in $implement

The implement skill runs quality gates (typecheck → tests → lint) in order. On failure it iterates (analyze → fix → re-run), up to 3 distinct fix strategies before escalating to the user. After implementation, it can invoke $review automatically unless --no-review flag is set.
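
The gate loop described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not the skill's implementation (the skill is a Markdown prompt, not code). The names `run_gates`, `GateResult`, `check`, and `fixes` are hypothetical; `check` stands in for running a gate command and `fixes` for the distinct fix strategies the agent would try.

```python
from dataclasses import dataclass
from typing import Callable, List

MAX_STRATEGIES = 3  # escalation threshold stated in the skill


@dataclass
class GateResult:
    gate: str
    passed: bool
    attempts: int  # fix strategies tried on this gate


def run_gates(
    gates: List[str],
    check: Callable[[str], bool],
    fixes: List[Callable[[str], None]],
) -> List[GateResult]:
    """Run gates in order. Stay on a failing gate until it passes or the
    strategy budget is spent, then stop so the caller can escalate."""
    results: List[GateResult] = []
    budget = min(MAX_STRATEGIES, len(fixes))
    for gate in gates:
        attempts = 0
        passed = check(gate)
        while not passed and attempts < budget:
            fixes[attempts](gate)  # a different strategy on each attempt
            attempts += 1
            passed = check(gate)  # re-run the failing gate
        results.append(GateResult(gate, passed, attempts))
        if not passed:
            break  # escalate to the user; later gates are not run
    return results


# Simulated project state (hypothetical): "tests" needs two fixes to pass.
remaining_failures = {"tests": 2}
results = run_gates(
    ["typecheck", "tests", "lint"],
    check=lambda g: remaining_failures.get(g, 0) == 0,
    fixes=[lambda g: remaining_failures.update({g: remaining_failures[g] - 1})] * 3,
)
```

The early `break` reflects the "stop at first failure" rule: a gate that still fails after the budget is spent blocks the gates that follow it.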

Review Detection Logic

The $review orchestrator auto-detects which specialized review types to run:

  • TypeScript: tsconfig.json exists
  • Typed Python: pyproject.toml with mypy config, or a py.typed marker
  • Test coverage review: test files (*.test.*, *.spec.*, __tests__/, tests/) exist
  • AGENTS.md review: AGENTS.md file exists in project
05

Prompts

doodledood/codex-workflow — Prompts

Prompt 1: $spec — Discovery Loop Structure (verbatim)

Source: skills/spec/SKILL.md

**Loop**: Research → Expand todos → Ask questions → Write findings → Repeat until complete

**Role**: Senior Product Manager - questions that uncover hidden requirements, edge cases, and assumptions the user hasn't considered. Reduce ambiguity through concrete options.

**Todo Evolution Example**

After user says "needs to work across mobile and web":
- [x] Initial context research → found existing notification system for admin alerts
- [ ] Scope & target users
- [ ] Mobile notification delivery (push vs in-app)
- [ ] Web notification delivery (browser vs in-app)
- [ ] Cross-platform sync behavior

After user mentions "also needs email digest option":
- [ ] Email digest frequency options
- [ ] Email vs real-time preferences

**Key**: Todos grow as user reveals complexity. Never prune prematurely.

Prompting technique: Dynamic todo expansion pattern (Memento pattern). The agent is instructed to continuously add to its todo list as user answers reveal new requirement areas — the opposite of most frameworks that predefine a fixed workflow. This matches the open-ended nature of requirements discovery.


Prompt 2: $implement — Auto-Fix Gate and Commit Protocol (verbatim)

Source: skills/implement/SKILL.md

**Run gates in order**: typecheck, then tests, then lint. Stop at first failure and iterate on that gate until it passes before proceeding to the next gate.

**On failure—iterate**:
1. Analyze: parse errors, identify files/lines, understand root cause
2. Fix by addressing root cause (not by suppressing errors, skipping tests, or adding `// @ts-ignore`)
3. Re-run the failing gate
4. Track attempts per issue by error message and file:line; if same error persists after 3 distinct fix strategies, escalate per "Pause ONLY when" rules

Commit chunk: `git add [files created/modified] && git commit -m "feat(plan): implement chunk N - [Name]"` (do NOT push)

Prompting technique: Explicit anti-suppression rule (not by suppressing errors, skipping tests, or adding // @ts-ignore) combined with escalation trigger (3 distinct fix strategies). The "track attempts per issue by error message and file:line" instruction forces the agent to maintain a structured error registry rather than applying random fixes.
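
A structured error registry of the kind this instruction implies might look like the sketch below. It is an assumption about how such tracking could be modeled, not something the skill ships: `AttemptTracker` and its methods are hypothetical names, and issues are keyed by (error message, file:line) as the prompt describes.

```python
from collections import defaultdict
from typing import DefaultDict, Set, Tuple

IssueKey = Tuple[str, str]  # (error message, "file:line")


class AttemptTracker:
    """Counts distinct fix strategies tried per issue."""

    def __init__(self, max_strategies: int = 3) -> None:
        self.max_strategies = max_strategies
        self._tried: DefaultDict[IssueKey, Set[str]] = defaultdict(set)

    def record(self, message: str, location: str, strategy: str) -> None:
        # A set ignores repeats, so retrying the same strategy does not
        # count as a new attempt.
        self._tried[(message, location)].add(strategy)

    def should_escalate(self, message: str, location: str) -> bool:
        tried = self._tried.get((message, location), set())
        return len(tried) >= self.max_strategies
```

Keying on both message and location keeps the same error in two files from sharing one budget.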


Prompt 3: $review — Detection-Based Orchestration (verbatim)

Source: skills/review/SKILL.md

| Review Type | Skill | When to Include |
|-------------|-------|-----------------|
| Bugs | $review-bugs | Always (unless skipped) |
| Type Safety | $review-type-safety | TypeScript/typed Python detected |
| AGENTS.md | $review-agents-md-adherence | AGENTS.md file exists |

Detection logic:
- TypeScript: tsconfig.json exists
- Typed Python: pyproject.toml with mypy config OR py.typed marker
- Test files: *.test.*, *.spec.*, __tests__/, tests/ exist

Prompting technique: Context-sensitive dispatch table. The orchestrator uses filesystem inspection to decide which specialized reviews to run, making the review suite adaptive to the project's tech stack without requiring user configuration.
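
The dispatch table can be approximated with a small filesystem probe. The sketch below is a hedged illustration only: `detect_reviews` is a hypothetical function, the `[tool.mypy]` substring check is an assumed way to detect "mypy config", and mapping test files to review-coverage follows the Components section rather than the table excerpt above.

```python
from pathlib import Path
from typing import List


def _exists_anywhere(root: Path, pattern: str) -> bool:
    # True if any path under root matches the glob pattern.
    return next(root.rglob(pattern), None) is not None


def detect_reviews(root: Path) -> List[str]:
    reviews = ["review-bugs"]  # always included unless skipped

    has_ts = (root / "tsconfig.json").exists()
    pyproject = root / "pyproject.toml"
    has_mypy = pyproject.exists() and "[tool.mypy]" in pyproject.read_text(
        encoding="utf-8"
    )  # assumption: mypy config lives in a [tool.mypy] table
    if has_ts or has_mypy or _exists_anywhere(root, "py.typed"):
        reviews.append("review-type-safety")

    test_patterns = ("*.test.*", "*.spec.*", "__tests__", "tests")
    if any(_exists_anywhere(root, p) for p in test_patterns):
        reviews.append("review-coverage")

    if (root / "AGENTS.md").exists():
        reviews.append("review-agents-md-adherence")
    return reviews
```

A recursive glob is the simplest choice here; a real probe would likely exclude vendored directories such as node_modules.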

09

Uniqueness

doodledood/codex-workflow — Uniqueness

differs_from_seeds

Most similar to spec-driver (comprehensive skill pack with explicit workflow phases). However, doodledood/codex-workflow targets Codex CLI instead of Claude Code, adds a multi-type review orchestration layer ($review dispatching 8 specialized sub-reviews) that no seed framework replicates, and stores all working artifacts in /tmp/ (session-scoped) rather than the project directory. The review specialization pattern — running orthogonal analyses (bugs vs. type safety vs. simplicity vs. maintainability) and consolidating — is the most review-focused architecture in the entire batch. The $spec skill's dynamic todo expansion (Memento pattern) is more sophisticated than any seed's requirements-gathering mechanism. Also notable: the documentation of LLM failure modes in docs/LLM_CODING_CAPABILITIES.md as first-class design artifacts — the workflows are explicitly built around documented limitations.

Positioning

A practitioner-built "quality over speed" skill set for Codex CLI, distinguished by its deep review specialization and its philosophical grounding in LLM limitations rather than LLM hype.

Connection to Claude Code

Direct port from doodledood/claude-code-plugins. This is the only framework in the batch that is explicitly a cross-tool port from Claude Code to Codex CLI. The same workflow design works for both tools; only the invocation syntax differs.

Observable Failure Modes

  1. Tmp file loss: Spec and plan files live only in /tmp/. They are deleted when the OS clears that directory (typically on reboot), and a new session does not reload them automatically. No recovery mechanism.
  2. Review inflation: Running 8 review types sequentially in $review is slow for large codebases. The --autonomous flag helps, but there is no parallel execution.
  3. Gate detection errors: Review type selection depends on filesystem detection (e.g., tsconfig.json existence). A missing config file causes an expected review to be skipped; a stray one can trigger a review that does not apply.
  4. Low adoption: 2 stars suggest minimal community validation, so the framework may have edge cases that only emerge at scale.
04

Workflow

doodledood/codex-workflow — Workflow

Standard Development Flow

$spec <task>     →  Spec file: /tmp/spec-YYYYMMDD-HHMMSS-name.md
    ↓
$plan <spec>     →  Plan file: /tmp/plan-YYYYMMDD-HHMMSS-name.md
    ↓
$implement <plan> →  Code (auto-fix gates: typecheck → tests → lint)
    ↓
$review          →  Consolidated review (8 specialized reviews)
    ↓
$fix-review-issues →  Fix flagged issues
    ↓
Commit

Phases + Artifacts

| Phase | Skill | Artifact |
|-------|-------|----------|
| Requirements | $spec | /tmp/spec-{timestamp}-{name}.md + interview log |
| Planning | $plan | /tmp/plan-{timestamp}-{name}.md |
| Implementation | $implement | Code + passing quality gates |
| Review | $review | Consolidated review report |
| Fix Review Issues | $fix-review-issues | Fixed issues |

$spec Discovery Loop

The $spec skill runs a continuous discovery loop:

  1. Research (read existing code for context)
  2. Expand todos (add newly-discovered requirement areas)
  3. Ask questions (Senior PM framing: uncover hidden requirements, edge cases)
  4. Write findings to interview log
  5. Repeat until complete

Todos expand dynamically as user answers reveal complexity. "Finalize spec" is the fixed anchor.
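
As a rough model of this behavior, the todo list can be treated as an append-only queue with one item pinned to the end. The class below is a hypothetical sketch (`DiscoveryTodos` is not part of the skill); the item strings come from the todo example quoted in the Prompts section.

```python
from typing import Iterable, List


class DiscoveryTodos:
    ANCHOR = "Finalize spec"  # fixed last item, per the description above

    def __init__(self, initial: Iterable[str]) -> None:
        self.open: List[str] = list(initial)
        self.done: List[str] = []

    def expand(self, new_items: Iterable[str]) -> None:
        # Todos only grow; items already open or done are not re-added.
        for item in new_items:
            if item != self.ANCHOR and item not in self.open and item not in self.done:
                self.open.append(item)

    def complete(self, item: str) -> None:
        self.open.remove(item)
        self.done.append(item)

    def pending(self) -> List[str]:
        # The anchor always comes after every discovered item.
        return self.open + [self.ANCHOR]


todos = DiscoveryTodos(["Initial context research", "Scope & target users"])
todos.complete("Initial context research")
todos.expand(["Mobile notification delivery (push vs in-app)"])
todos.expand(["Email digest frequency options"])
```

There is deliberately no method for removing an open item without completing it, mirroring the "never prune prematurely" rule.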

$implement Auto-Fix Gates

| Gate | Order | On Failure |
|------|-------|------------|
| Typecheck (tsc, mypy, etc.) | 1st | Iterate: analyze → fix → re-run (max 3 distinct strategies) |
| Tests (jest, pytest, etc.) | 2nd | Iterate: analyze → fix → re-run |
| Lint (eslint, ruff, etc.) | 3rd | Iterate: analyze → fix → re-run |

Gate commands are detected from AGENTS.md or, failing that, by sniffing config files (tsconfig.json, pyproject.toml, etc.).
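
One way to picture the config sniffing is a lookup from marker file to default commands, with anything declared in AGENTS.md taking priority. The mapping below is an assumption made for illustration: the source names the tools (tsc, mypy, jest, pytest, eslint, ruff) but not the exact commands, and `detect_gate_commands` and `DEFAULTS` are hypothetical names. How the skill reads commands out of AGENTS.md is not specified, so they are passed in as a plain dict here.

```python
from pathlib import Path
from typing import Dict, Optional

GATE_ORDER = ("typecheck", "tests", "lint")

# Hypothetical defaults keyed by marker file; not taken from the skill.
DEFAULTS: Dict[str, Dict[str, str]] = {
    "tsconfig.json": {
        "typecheck": "npx tsc --noEmit",
        "tests": "npx jest",
        "lint": "npx eslint .",
    },
    "pyproject.toml": {
        "typecheck": "mypy .",
        "tests": "pytest",
        "lint": "ruff check .",
    },
}


def detect_gate_commands(
    root: Path, overrides: Optional[Dict[str, str]] = None
) -> Dict[str, str]:
    commands: Dict[str, str] = {}
    for marker, defaults in DEFAULTS.items():
        if (root / marker).exists():
            commands = dict(defaults)
            break  # first matching marker wins
    commands.update(overrides or {})  # commands declared in AGENTS.md win
    # Return gates in the fixed typecheck -> tests -> lint order.
    return {gate: commands[gate] for gate in GATE_ORDER if gate in commands}
```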

Approval Gates

| Gate | Type |
|------|------|
| $spec: initial context missing | freetext-clarify |
| $plan: scope unclear | freetext-clarify |
| $implement: stuck after 3 fix strategies | freetext-clarify (escalate) |
| Size-based routing in $implement | freetext-clarify (announce routing decision) |

Review Orchestration

$review runs multiple specialized skills sequentially and consolidates findings. Supports:

  • --autonomous: no user prompts, return report immediately
  • --skip <types>: skip specific review types
  • --only <types>: run only specific types
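
The effect of the two filter flags can be sketched as a pure function over the detected review list. This is an assumed model, not the skill's logic: `select_reviews` is a hypothetical name, the short type names ("bugs", "simplicity") are guesses at what the flags accept, and the rule that --skip still applies after --only is an assumption.

```python
from typing import Iterable, List


def select_reviews(
    detected: List[str],
    skip: Iterable[str] = (),
    only: Iterable[str] = (),
) -> List[str]:
    skip_set, only_set = set(skip), set(only)
    # --only narrows the detected list; without it, everything detected runs.
    chosen = [r for r in detected if r in only_set] if only_set else list(detected)
    # --skip removes types from whatever remains.
    return [r for r in chosen if r not in skip_set]


detected = ["bugs", "type-safety", "simplicity"]
without_simplicity = select_reviews(detected, skip=["simplicity"])
bugs_only = select_reviews(detected, only=["bugs"])
```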
06

Memory Context

doodledood/codex-workflow — Memory and Context

State Storage

  • Spec files: /tmp/spec-{YYYYMMDD-HHMMSS}-{name}.md — spec document, created by $spec
  • Interview log: /tmp/spec-interview-{YYYYMMDD-HHMMSS}-{name}.md — external memory for the discovery loop
  • Plan files: /tmp/plan-{YYYYMMDD-HHMMSS}-{name}.md — implementation plan
  • Progress tracking: /tmp/implement-progress.md — checkbox-based progress file (when update_plan tool unavailable)

The Timestamp Pattern

File names carry a timestamp (YYYYMMDD-HHMMSS) that is generated once at skill start and reused for both the spec and interview log files. Running $spec again creates new files: no overwrite, no resume of prior sessions.
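
The naming scheme is easy to reproduce. In the sketch below, `spec_paths` is a hypothetical helper and the slug rule (lowercase, non-alphanumerics collapsed to hyphens) is an assumption inferred from the example file names elsewhere in this document; the timestamp format and the two file name prefixes come from the State Storage list.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import Optional, Tuple


def spec_paths(
    name: str, now: Optional[datetime] = None, tmp: Path = Path("/tmp")
) -> Tuple[Path, Path]:
    # One timestamp, generated once, shared by both files.
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")  # assumed slug rule
    spec = tmp / f"spec-{stamp}-{slug}.md"
    interview_log = tmp / f"spec-interview-{stamp}-{slug}.md"
    return spec, interview_log


spec, log = spec_paths("User Notifications", datetime(2026, 5, 26, 14, 30, 52))
```

Because the timestamp has one-second resolution, a second run started at least a second later gets a new pair of files rather than overwriting the first.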

External Memory for Discovery Loop

The $spec skill uses the interview log as explicit external memory: "Read full interview log (context refresh before output)" is a fixed anchor in the todo list, ensuring the agent re-reads all prior answers before writing the final spec. This is a deliberate mechanism to overcome context window limits in long discovery sessions.
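
The external-memory idea reduces to two operations: append every answer to a file as it arrives, and re-read the whole file before producing output. The sketch below assumes a simple Markdown layout for the log; the actual log format used by the skill is not documented here, and `append_finding` and `refresh_context` are hypothetical names.

```python
from pathlib import Path


def append_finding(log: Path, question: str, answer: str) -> None:
    # Append-only: earlier answers are never rewritten.
    with log.open("a", encoding="utf-8") as f:
        f.write(f"## {question}\n{answer}\n\n")


def refresh_context(log: Path) -> str:
    # Re-read everything before writing the final spec, so the output does
    # not depend on what is still in the model's context window.
    return log.read_text(encoding="utf-8")
```

Writing on every answer rather than at the end means an interrupted session still leaves a usable record on disk.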

Persistence

  • Session-level (/tmp/): spec files, plan files, progress tracker
  • No project-level persistence: the framework writes to /tmp/, not the project directory
  • No cross-session resume: new $spec run creates new files; prior spec files are not automatically loaded

Compaction

Not explicitly handled. The interview log as external memory is the primary mechanism for maintaining context across a long spec session.

Memory Type

File-based, session-scoped (tmp directory).

07

Orchestration

doodledood/codex-workflow — Orchestration

Multi-Agent

No. All skills run in the same Codex session. No subagent spawning, no TOML agent definitions.

Orchestration Pattern

Sequential within each skill. The $review skill runs multiple specialized review sub-skills in sequence and consolidates, but this is within one agent context (not spawned subagents).

Isolation Mechanism

None. All skills operate in the same Codex session and on the same working tree files.

Codex Role

All roles — Codex is the sole agent:

  • Requirements Analyst: $spec
  • Planner: $plan
  • Worker: $implement
  • Reviewer: $review + 8 specialized review skills
  • Debugger: $bugfix
  • Researcher: $research-web, $web-research

Multi-Model

No. Single Codex CLI session.

Execution Mode

Interactive-loop for $spec and $plan (approval gates at key milestones). Autonomous for $implement (no pauses except for 3-strategy escalation). Autonomous with optional flags for $review (--autonomous removes all pauses).

Prompt Chaining

Yes — $spec output (spec file) is input to $plan; $plan output (plan file) is input to $implement. Each stage's /tmp/*.md file is passed as the argument to the next stage.

Cross-Tool

The skill files use Codex's $skill-name convention. Not directly compatible with Claude Code's /command or Cursor's @rule conventions. The original Claude Code version is in doodledood/claude-code-plugins.

08

UI CLI Surface

doodledood/codex-workflow — UI and CLI Surface

Dedicated CLI Binary

No standalone binary. Skills are invoked via Codex CLI's $skill-name syntax within a Codex session.

Invocation

$spec Add user notifications
$plan /tmp/spec-20260526-143052-user-notifications.md
$implement /tmp/plan-20260526-143052-user-notifications.md
$review
$fix-review-issues

Individual Skill Installation

$skill-installer install https://github.com/doodledood/codex-workflow/tree/main/skills/spec
$skill-installer install https://github.com/doodledood/codex-workflow/tree/main/skills/implement

Local UI

None.

IDE Integration

Codex CLI only. No Cursor, Claude Code, or Gemini CLI integration (though a parallel Claude Code version exists in doodledood/claude-code-plugins).

Observability

  • Progress tracking in $implement via checkbox-based /tmp/implement-progress.md
  • Quality gate output visible in Codex session output
  • No structured logging or dashboards

Cross-Tool Portability

Low — Codex CLI-specific skill convention. The parallel Claude Code version in doodledood/claude-code-plugins suggests the author maintains two tool-specific versions rather than a shared portable format.
