doodledood/codex-workflow

doodledood-codex-workflow · doodledood/codex-workflow · ★ 2 · last commit 2026-01-20

Full-lifecycle Codex CLI skill pack built around documented LLM limitations, with 8-type specialized review orchestration and dynamic requirements discovery.

Best when: Workflows must be designed around how LLMs actually work (documented failure modes), not how we wish they worked; quality over speed means investing in spec+…

Skip if: Chasing the latest game-changing prompt; suppressing errors with @ts-ignore

vs seeds

spec-driver (comprehensive skill pack with explicit workflow phases), but targets Codex CLI, stores all artifacts in /tmp/ (session-…

Primitive shape 18 total

Skills 18

Summary

doodledood/codex-workflow — Summary

Elevator pitch: A Codex CLI skill pack that its author describes as "ported from claude-code-plugins"; the same author maintains a Claude Code plugin variant. It ships 18 skills, invoked as $skill-name, that cover the full development lifecycle: interactive requirements spec ($spec), implementation planning ($plan), in-place execution with auto-fix loops ($implement), and a review suite with 8 orthogonal review types ($review-bugs, $review-simplicity, $review-testability, etc.). The core philosophy is "first-principles workflows designed around how LLMs actually work, not how we wish they worked." Codex plays all roles (planner, worker, reviewer), and each role is kept in its own skill to limit context contamination, although all skills still run in one Codex session with no separate isolation mechanism. Unlike shinpr/codex-workflows, there are no TOML subagent definitions; every skill is invoked directly via $skill-name. Compared to seeds: most similar to spec-driver (comprehensive skill pack with explicit workflow phases and TDD enforcement), but it targets Codex CLI instead of Claude Code and adds a multi-type review orchestration pattern that no seed framework replicates.

Overview

doodledood/codex-workflow — Overview

Origin

By GitHub user doodledood. 2 stars, 1 fork. MIT license. Last commit: 2026-01-20. Explicitly ported from the author's Claude Code plugin (doodledood/claude-code-plugins) with note: "Ported from claude-code-plugins vibe-workflow".

Target Audience (from CUSTOMER.md)

"Experienced developers frustrated by hype-driven AI coding tools. If you're tired of chasing the latest 'game-changing' prompt that produces code you spend hours debugging, these skills offer a grounded alternative."

Philosophy (verbatim from README)

"Our approach:

Workflows designed around how LLMs actually work, not how we wish they worked

Quality over speed — invest upfront, ship with confidence

Simple to use, sophisticated under the hood"

Key Design Insight (from LLM_CODING_CAPABILITIES.md)

The framework includes a first-principles analysis of LLM coding capabilities and limitations in docs/LLM_CODING_CAPABILITIES.md. This is unusual — most frameworks do not document the failure modes of their underlying model. The workflows are explicitly designed around documented LLM weaknesses.

Connection to Claude Code

The README states this is "Ported from claude-code-plugins", which implies the skills originated as Claude Code commands/skills. The Codex port preserves the same workflow structure but adapts invocation to Codex CLI's $skill-name convention.

Skill Installation

Individual skills can be installed via the $skill-installer skill:

$skill-installer install https://github.com/doodledood/codex-workflow/tree/main/skills/spec
$skill-installer install https://github.com/doodledood/codex-workflow/tree/main/skills/implement

Architecture

doodledood/codex-workflow — Architecture

Distribution

Type: Codex CLI skill pack (standalone-repo, clone-and-configure)
Language: Markdown (SKILL.md files)
License: MIT

Install (All Skills)

git clone https://github.com/doodledood/codex-workflow.git
cp -r codex-workflow/skills/* ~/.codex/skills/
# restart Codex
codex --enable skills  # if not already enabled

Install (Individual)

# Via $skill-installer (built-in Codex skill)
$skill-installer install https://github.com/doodledood/codex-workflow/tree/main/skills/spec

Required Runtime

Codex CLI with skills enabled (codex --enable skills)

Directory Tree

codex-workflow/
├── skills/                 # Codex skills (SKILL.md files)
│   ├── spec/SKILL.md       # Requirements interview
│   ├── plan/SKILL.md       # Implementation planning
│   ├── implement/SKILL.md  # Execution with auto-fix loops
│   ├── review/SKILL.md     # Review orchestrator
│   ├── review-bugs/SKILL.md
│   ├── review-type-safety/SKILL.md
│   ├── review-maintainability/SKILL.md
│   ├── review-simplicity/SKILL.md
│   ├── review-testability/SKILL.md
│   ├── review-coverage/SKILL.md
│   ├── review-docs/SKILL.md
│   ├── review-agents-md-adherence/SKILL.md
│   ├── bugfix/SKILL.md     # Systematic bug investigation
│   ├── explore-codebase/SKILL.md
│   ├── fix-review-issues/SKILL.md
│   ├── research-web/SKILL.md
│   └── web-research/SKILL.md
├── docs/
│   ├── CUSTOMER.md
│   └── LLM_CODING_CAPABILITIES.md
├── CLAUDE.md               # Development guidelines
├── AGENTS.md               # AGENTS.md for Codex
└── README.md

Target AI Tools

Primary: OpenAI Codex CLI (skills in ~/.codex/skills/)
Compatible: Any tool supporting the Codex skills specification

Components

doodledood/codex-workflow — Components

Skills (18, in `skills/*/SKILL.md`)

Core Workflow Skills (3)

Skill	Trigger	Purpose
`spec`	`$spec`	Interactive requirements builder through structured discovery interview
`plan`	`$plan`	Create implementation plans with codebase research
`implement`	`$implement`	Execute plans in-place with auto-fix loops and optional review

Review Skills (9)

Skill	Trigger	Purpose
`review`	`$review`	Orchestrator: runs all applicable reviews, consolidates findings
`review-bugs`	`$review-bugs`	Logical bugs, race conditions, edge cases
`review-type-safety`	`$review-type-safety`	TypeScript/typed language type safety audit
`review-maintainability`	`$review-maintainability`	DRY violations, dead code, complexity
`review-simplicity`	`$review-simplicity`	Over-engineering and complexity audit
`review-testability`	`$review-testability`	Testability design patterns
`review-coverage`	`$review-coverage`	Test coverage verification
`review-docs`	`$review-docs`	Documentation accuracy
`review-agents-md-adherence`	`$review-agents-md-adherence`	AGENTS.md compliance

Debugging + Research Skills (4)

Skill	Trigger	Purpose
`bugfix`	`$bugfix`	Systematic bug investigation and fix workflow
`explore-codebase`	`$explore-codebase`	Comprehensive codebase exploration
`research-web`	`$research-web`	Multi-wave web research with strategic source selection
`web-research`	`$web-research`	Structured web research with hypothesis tracking

Utility Skills (2)

Skill	Trigger	Purpose
`fix-review-issues`	`$fix-review-issues`	Orchestrate fixing issues found by `$review`
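`skill-installer`	`$skill-installer`	Install individual skills from URL (described in the install instructions as a built-in Codex skill; it does not appear in the repository's directory tree, which lists 17 SKILL.md files)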

Subagents

None — no TOML subagent definitions. Skills are invoked directly in Codex sessions.

Hooks

None.

Key Pattern: Auto-Fix Gates in `$implement`

The implement skill runs quality gates (typecheck → tests → lint) in order. On failure it iterates (analyze → fix → re-run) and tries up to 3 distinct fix strategies before escalating to the user. After implementation, it can invoke $review automatically unless the --no-review flag is set.
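The gate ordering and escalation rule described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not the skill's implementation (the skill is a Markdown prompt, not code); `run_gates`, `run_gate`, and `try_fix` are hypothetical names, and the fix limit is simplified to a per-gate counter.

```python
from typing import Callable, Optional

# Gate order as described in the text; the names are illustrative labels.
GATE_ORDER = ["typecheck", "tests", "lint"]
MAX_STRATEGIES = 3


def run_gates(
    run_gate: Callable[[str], bool],
    try_fix: Callable[[str, int], None],
) -> Optional[str]:
    """Run gates in order; return the gate that needs escalation, or None if all pass."""
    for gate in GATE_ORDER:
        strategies_used = 0
        # Stay on the failing gate until it passes; later gates do not run yet.
        while not run_gate(gate):
            if strategies_used == MAX_STRATEGIES:
                return gate  # still failing after 3 strategies: escalate to the user
            strategies_used += 1
            try_fix(gate, strategies_used)
    return None
```

A caller would pass in functions that actually invoke the project's typecheck, test, and lint commands; a return value of `None` means every gate passed.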

Review Detection Logic

The $review orchestrator auto-detects which specialized review types to run:

TypeScript: tsconfig.json exists
Typed Python: pyproject.toml with mypy config, or a py.typed marker
Test coverage review: test files (*.test.*, *.spec.*, __tests__/, tests/) exist
AGENTS.md review: AGENTS.md file exists in project

Prompts

doodledood/codex-workflow — Prompts

Prompt 1: `$spec` — Discovery Loop Structure (verbatim)

Source: skills/spec/SKILL.md

**Loop**: Research → Expand todos → Ask questions → Write findings → Repeat until complete

**Role**: Senior Product Manager - questions that uncover hidden requirements, edge cases, and assumptions the user hasn't considered. Reduce ambiguity through concrete options.

**Todo Evolution Example**

After user says "needs to work across mobile and web":
- [x] Initial context research → found existing notification system for admin alerts
- [ ] Scope & target users
- [ ] Mobile notification delivery (push vs in-app)
- [ ] Web notification delivery (browser vs in-app)
- [ ] Cross-platform sync behavior

After user mentions "also needs email digest option":
- [ ] Email digest frequency options
- [ ] Email vs real-time preferences

**Key**: Todos grow as user reveals complexity. Never prune prematurely.

Prompting technique: Dynamic todo expansion pattern (Memento pattern). The agent is instructed to continuously add to its todo list as user answers reveal new requirement areas — the opposite of most frameworks that predefine a fixed workflow. This matches the open-ended nature of requirements discovery.
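As a rough illustration of the "todos grow, never prune prematurely" idea, the following Python sketch models a todo list that can only be expanded or checked off. The `TodoList` class and its method names are hypothetical; in the actual skill the list lives in the agent's prompt-driven workflow, not in code.

```python
from dataclasses import dataclass, field


@dataclass
class TodoList:
    """Todo list that only grows; finished items are checked off, never removed."""

    items: dict[str, bool] = field(default_factory=dict)  # title -> done flag

    def expand(self, new_titles: list[str]) -> None:
        # Add newly discovered requirement areas without touching existing ones.
        for title in new_titles:
            self.items.setdefault(title, False)

    def complete(self, title: str) -> None:
        self.items[title] = True

    def render(self) -> str:
        # Same checkbox notation as the excerpt above.
        return "\n".join(
            f"- [{'x' if done else ' '}] {title}" for title, done in self.items.items()
        )
```

There is deliberately no `remove` method: dropping an item would be the premature pruning the prompt warns against.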

Prompt 2: `$implement` — Auto-Fix Gate and Commit Protocol (verbatim)

Source: skills/implement/SKILL.md

**Run gates in order**: typecheck, then tests, then lint. Stop at first failure and iterate on that gate until it passes before proceeding to the next gate.

**On failure—iterate**:
1. Analyze: parse errors, identify files/lines, understand root cause
2. Fix by addressing root cause (not by suppressing errors, skipping tests, or adding `// @ts-ignore`)
3. Re-run the failing gate
4. Track attempts per issue by error message and file:line; if same error persists after 3 distinct fix strategies, escalate per "Pause ONLY when" rules

Commit chunk: `git add [files created/modified] && git commit -m "feat(plan): implement chunk N - [Name]"` (do NOT push)

Prompting technique: Explicit anti-suppression rule (not by suppressing errors, skipping tests, or adding // @ts-ignore) combined with escalation trigger (3 distinct fix strategies). The "track attempts per issue by error message and file:line" instruction forces the agent to maintain a structured error registry rather than applying random fixes.
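One way to picture the "structured error registry" is a small tracker keyed by error message and file:line, as in the hedged Python sketch below. `AttemptTracker` and its methods are hypothetical illustrations; the skill itself only instructs the agent to keep this bookkeeping.

```python
from collections import defaultdict

MAX_STRATEGIES = 3  # escalation threshold quoted in the prompt excerpt


class AttemptTracker:
    """Tracks which distinct fix strategies were tried for each error."""

    def __init__(self) -> None:
        # (error message, "file:line") -> set of strategy descriptions tried
        self._tried: dict[tuple[str, str], set[str]] = defaultdict(set)

    def record(self, message: str, location: str, strategy: str) -> None:
        # Using a set means repeating the same strategy does not count twice.
        self._tried[(message, location)].add(strategy)

    def should_escalate(self, message: str, location: str) -> bool:
        tried = self._tried.get((message, location), set())
        return len(tried) >= MAX_STRATEGIES
```

Keying on both the message and the location keeps two different errors in the same file from sharing an attempt budget.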

Prompt 3: `$review` — Detection-Based Orchestration (verbatim)

Source: skills/review/SKILL.md

| Review Type | Skill | When to Include |
|-------------|-------|-----------------|
| Bugs | $review-bugs | Always (unless skipped) |
| Type Safety | $review-type-safety | TypeScript/typed Python detected |
| AGENTS.md | $review-agents-md-adherence | AGENTS.md file exists |

Detection logic:
- TypeScript: tsconfig.json exists
- Typed Python: pyproject.toml with mypy config OR py.typed marker
- Test files: *.test.*, *.spec.*, __tests__/, tests/ exist

Prompting technique: Context-sensitive dispatch table. The orchestrator uses filesystem inspection to decide which specialized reviews to run, making the review suite adaptive to the project's tech stack without requiring user configuration.
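The dispatch rules quoted above could be approximated with plain filesystem checks, as in this Python sketch. It is an assumption-laden illustration: `detect_reviews` is a hypothetical function, the mypy check simply looks for a `[tool.mypy` table in pyproject.toml, and the mapping of test files to `review-coverage` is taken from the Review Detection Logic list rather than from the excerpted table.

```python
from pathlib import Path


def detect_reviews(root: Path) -> list[str]:
    """Return the review skills that apply to the project at `root`."""
    reviews = ["review-bugs"]  # "Always (unless skipped)"

    # Type safety: TypeScript config, mypy config, or a py.typed marker.
    pyproject = root / "pyproject.toml"
    has_mypy = pyproject.exists() and "[tool.mypy" in pyproject.read_text(encoding="utf-8")
    if (root / "tsconfig.json").exists() or has_mypy or any(root.rglob("py.typed")):
        reviews.append("review-type-safety")

    # Test files: *.test.*, *.spec.*, __tests__/, tests/
    has_tests = (
        any(root.rglob("*.test.*"))
        or any(root.rglob("*.spec.*"))
        or any(p.is_dir() for p in root.rglob("__tests__"))
        or any(p.is_dir() for p in root.rglob("tests"))
    )
    if has_tests:
        reviews.append("review-coverage")

    if (root / "AGENTS.md").exists():
        reviews.append("review-agents-md-adherence")
    return reviews
```

A recursive glob like this would also walk dependency directories in a real project, so a practical version would need exclusions; the sketch only shows how the presence of a few files drives the selection.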

Uniqueness

doodledood/codex-workflow — Uniqueness

differs_from_seeds

Most similar to spec-driver (comprehensive skill pack with explicit workflow phases). However, doodledood/codex-workflow targets Codex CLI instead of Claude Code, adds a multi-type review orchestration layer ($review dispatching 8 specialized sub-reviews) that no seed framework replicates, and stores all working artifacts in /tmp/ (session-scoped) rather than the project directory. The review specialization pattern — running orthogonal analyses (bugs vs. type safety vs. simplicity vs. maintainability) and consolidating — is the most review-focused architecture in the entire batch. The $spec skill's dynamic todo expansion (Memento pattern) is more sophisticated than any seed's requirements-gathering mechanism. Also notable: the documentation of LLM failure modes in docs/LLM_CODING_CAPABILITIES.md as first-class design artifacts — the workflows are explicitly built around documented limitations.

Positioning

A practitioner-built "quality over speed" skill set for Codex CLI, distinguished by its deep review specialization and its philosophical grounding in LLM limitations rather than LLM hype.

Connection to Claude Code

Direct port from doodledood/claude-code-plugins. This is the only framework in the batch that is explicitly a cross-tool port from Claude Code to Codex CLI. The same workflow design works for both tools; only the invocation syntax differs.

Observable Failure Modes

Tmp file loss: All spec and plan files live in /tmp/, so they can be lost on system restart or temp-directory cleanup. No recovery mechanism.
Review inflation: Running 8 review types sequentially in $review is slow for large codebases. The --autonomous flag removes user prompts, but there is no parallel execution.
Gate detection errors: Review type selection depends on filesystem detection (e.g., tsconfig.json existence). A missing or misplaced config file can cause the wrong reviews to run or expected reviews to be skipped.
Low adoption: A star count of 2 suggests minimal community validation; the framework may have edge cases that only emerge at scale.

Workflow

doodledood/codex-workflow — Workflow

Standard Development Flow

$spec <task>     →  Spec file: /tmp/spec-YYYYMMDD-HHMMSS-name.md
    ↓
$plan <spec>     →  Plan file: /tmp/plan-YYYYMMDD-HHMMSS-name.md
    ↓
$implement <plan> →  Code (auto-fix gates: typecheck → tests → lint)
    ↓
$review          →  Consolidated review (8 specialized reviews)
    ↓
$fix-review-issues →  Fix flagged issues
    ↓
Commit

Phases + Artifacts

Phase	Skill	Artifact
Requirements	`$spec`	`/tmp/spec-{timestamp}-{name}.md` + interview log
Planning	`$plan`	`/tmp/plan-{timestamp}-{name}.md`
Implementation	`$implement`	Code + passing quality gates
Review	`$review`	Consolidated review report
Fix Review Issues	`$fix-review-issues`	Fixed issues

$spec Discovery Loop

The $spec skill runs a continuous discovery loop:

Research (read existing code for context)
Expand todos (add newly-discovered requirement areas)
Ask questions (Senior PM framing: uncover hidden requirements, edge cases)
Write findings to interview log
Repeat until complete

Todos expand dynamically as user answers reveal complexity. "Finalize spec" is the fixed anchor.

$implement Auto-Fix Gates

Gate	Order	On Failure
Typecheck (tsc, mypy, etc.)	1st	Iterate: analyze → fix → re-run (max 3 distinct strategies)
Tests (jest, pytest, etc.)	2nd	Iterate: analyze → fix → re-run (same limit of 3 distinct strategies)
Lint (eslint, ruff, etc.)	3rd	Iterate: analyze → fix → re-run (same limit of 3 distinct strategies)

Gate commands are detected from AGENTS.md or by config file sniffing (tsconfig.json, pyproject.toml, etc.).

Approval Gates

Gate	Type
$spec initial context missing	`freetext-clarify`
$plan: scope unclear	`freetext-clarify`
$implement: stuck after 3 fix strategies	`freetext-clarify` (escalate)
Size-based routing in $implement	`freetext-clarify` (announce routing decision)

Review Orchestration

$review runs multiple specialized skills sequentially and consolidates findings. Supports:

--autonomous: no user prompts, return report immediately
--skip <types>: skip specific review types
--only <types>: run only specific types
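How the two filter flags might narrow the detected review set can be sketched as follows. The source does not say how --skip and --only interact or what the type names look like, so this Python example assumes that --only is applied first and --skip afterwards, and it uses made-up short names such as "bugs"; `select_reviews` is a hypothetical helper.

```python
from typing import Iterable


def select_reviews(
    detected: Iterable[str],
    skip: Iterable[str] = (),
    only: Iterable[str] = (),
) -> list[str]:
    """Filter detected review types, preserving their original order."""
    skip_set, only_set = set(skip), set(only)
    # Assumption: an empty --only means "no restriction".
    wanted = [r for r in detected if not only_set or r in only_set]
    return [r for r in wanted if r not in skip_set]
```

For example, `select_reviews(["bugs", "type-safety", "docs"], skip=["docs"])` keeps the first two types in their detected order.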

Memory Context

doodledood/codex-workflow — Memory and Context

State Storage

Spec files: /tmp/spec-{YYYYMMDD-HHMMSS}-{name}.md — spec document, created by $spec
Interview log: /tmp/spec-interview-{YYYYMMDD-HHMMSS}-{name}.md — external memory for the discovery loop
Plan files: /tmp/plan-{YYYYMMDD-HHMMSS}-{name}.md — implementation plan
Progress tracking: /tmp/implement-progress.md — checkbox-based progress file (used when the update_plan tool is unavailable)

The Timestamp Pattern

Files are written with a timestamp at skill start (YYYYMMDD-HHMMSS, generated once and reused for both spec + interview log files). Running $spec again creates new files — no overwrite, no resume of prior sessions.
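The naming scheme is easy to reproduce: generate the timestamp once, then derive both file names from it. The Python sketch below assumes the path templates listed under State Storage; `spec_paths` is a hypothetical helper and is not part of the skill pack.

```python
from datetime import datetime
from pathlib import PurePosixPath


def spec_paths(
    name: str,
    now: datetime,
    base: PurePosixPath = PurePosixPath("/tmp"),
) -> tuple[PurePosixPath, PurePosixPath]:
    """Build the spec and interview-log paths from a single shared timestamp."""
    stamp = now.strftime("%Y%m%d-%H%M%S")  # generated once, reused for both files
    spec = base / f"spec-{stamp}-{name}.md"
    interview_log = base / f"spec-interview-{stamp}-{name}.md"
    return spec, interview_log
```

Because the timestamp is part of the name, a second run with the same task name produces a new pair of files instead of overwriting the earlier ones.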

External Memory for Discovery Loop

The $spec skill uses the interview log as explicit external memory: "Read full interview log (context refresh before output)" is a fixed anchor in the todo list, ensuring the agent re-reads all prior answers before writing the final spec. This is a deliberate mechanism to overcome context window limits in long discovery sessions.

Persistence

Session-level (/tmp/): spec files, plan files, progress tracker
No project-level persistence: the framework writes to /tmp/, not the project directory
No cross-session resume: new $spec run creates new files; prior spec files are not automatically loaded

Compaction

Not explicitly handled. The interview log as external memory is the primary mechanism for maintaining context across a long spec session.

Memory Type

File-based, session-scoped (tmp directory).

Orchestration

doodledood/codex-workflow — Orchestration

Multi-Agent

No. All skills run in the same Codex session. No subagent spawning, no TOML agent definitions.

Orchestration Pattern

Sequential within each skill. The $review skill runs multiple specialized review sub-skills in sequence and consolidates, but this is within one agent context (not spawned subagents).

Isolation Mechanism

None. All skills operate in the same Codex session and on the same working tree files.

Codex Role

All roles — Codex is the sole agent:

Requirements Analyst: $spec
Planner: $plan
Worker: $implement
Reviewer: $review + 8 specialized review skills
Debugger: $bugfix
Researcher: $research-web, $web-research

Multi-Model

No. Single Codex CLI session.

Execution Mode

Interactive-loop for $spec and $plan (approval gates at key milestones). Autonomous for $implement (no pauses except for 3-strategy escalation). Autonomous with optional flags for $review (--autonomous removes all pauses).

Prompt Chaining

Yes — $spec output (spec file) is input to $plan; $plan output (plan file) is input to $implement. Each stage's /tmp/*.md file is passed as the argument to the next stage.

Cross-Tool

The skill files use Codex's $skill-name convention. Not directly compatible with Claude Code's /command or Cursor's @rule conventions. The original Claude Code version is in doodledood/claude-code-plugins.

UI CLI Surface

doodledood/codex-workflow — UI and CLI Surface

Dedicated CLI Binary

No standalone binary. Skills are invoked via Codex CLI's $skill-name syntax within a Codex session.

Invocation

$spec Add user notifications
$plan /tmp/spec-20260526-143052-user-notifications.md
$implement /tmp/plan-20260526-143052-user-notifications.md
$review
$fix-review-issues

Individual Skill Installation

$skill-installer install https://github.com/doodledood/codex-workflow/tree/main/skills/spec
$skill-installer install https://github.com/doodledood/codex-workflow/tree/main/skills/implement

Local UI

None.

IDE Integration

Codex CLI only. No Cursor, Claude Code, or Gemini CLI integration (though a parallel Claude Code version exists in doodledood/claude-code-plugins).

Observability

Progress tracking in $implement via checkbox-based /tmp/implement-progress.md
Quality gate output visible in Codex session output
No structured logging or dashboards

Cross-Tool Portability

Low — Codex CLI-specific skill convention. The parallel Claude Code version in doodledood/claude-code-plugins suggests the author maintains two tool-specific versions rather than a shared portable format.

Related frameworks

same archetype · same primary tool · same memory type

BMAD-METHOD ★ 48k

A4 Markdown scaffold

Provides a full agile delivery lifecycle with named expert-persona AI collaborators that elicit the human's best thinking rather…

Agent OS ★ 4.6k

A4 Markdown scaffold

Extracts implicit codebase conventions into token-efficient markdown standards files and injects them selectively into AI agent…

Claude Conductor ★ 367

A4 Markdown scaffold

Gives Claude Code a persistent, cross-linked, auto-analyzed documentation system so it retains codebase context across sessions.

Spec-Driver (Greenfield Spec-Driven Development) ★ 25

A4 Markdown scaffold

Prevents spec rot in AI-assisted development by making implementation changes flow back into evergreen, authoritative specs via…

Anthropic Knowledge Work Plugins ★ 16k

A4 Markdown scaffold

Role-specialized plugin bundles with live MCP connectors that turn Claude into a domain expert for enterprise knowledge workers.

Codex Integration for Claude Code (skill-codex) ★ 1.3k

A4 Markdown scaffold

Single Claude Code skill that handles Codex CLI invocation correctly (stdin blocking, thinking token suppression, session resume)…

Distribution

Type: standalone-repo
License: MIT
Install: clone-and-configure
Version: unknown (no semver tag; last commit 2026-01-20)

Surfaces

CLI binary: No
CLI subcmds: 0
Local UI: No
Tech stack: none

Components

Commands: 0
Skills: 18
Subagents: 0
Hooks: 0
MCP servers: 0
MCP tools: 0
Scripts: 0
Templates: 0

Workflow

Phases: 5
Approval gates: 3
Spec format: markdown
Spec storage: flat-files
Delta or full: whole-file

Orchestration

Multi-agent: No
Pattern: sequential
Max concurrent: 1
Isolation: none
Consensus: none
Prompt chaining: Yes

Multi-model

Multi-model: No
BYOK: Yes
Modal: text

Execution

Mode: interactive-loop
Crash recovery: No
Compaction: No
Session handoff: No
Streaming: No

Memory

Type: file-based
Persistence: session
Search: none
State files: 4 files

Quality

TDD: Optional
TDD mechanism: none
Validators: 3
Self-review: adversarial-subagent

Git / Observability

Auto commit: Yes
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: No
Audit format: none
Replay: No

Tools

Primary: openai-codex
Targets: 1
Portability: low

Signals

Stars: 2
Last commit: 2026-01-20
Contributors: 2
Maintainer: dormant
Quality score: 2/10
