codex-spec (shenli)

codex-spec · shenli/codex-spec · ★ 45 · last commit 2025-09-11

Automates the spec-to-code pipeline for OpenAI Codex by generating specifications, requirements, plans, and tasks from natural language feature descriptions.

Best when: Specifications should be generated by the AI itself (not written by the human) and stored as files that guide subsequent AI execution; spec and implementati…

Skip if: Running execute without context-setup and create first; skipping requirements before plan

vs seeds

Unlike spec-kit (Python, hooks, TDD enforcement), codex-spec has zero hooks, no multi-AI integration, and no TDD discipline. It is the o…

Primitive shape

No installable primitives

Summary

codex-spec (shenli) — Summary

codex-spec is a Node.js CLI tool that automates spec-driven development workflows for OpenAI Codex. It wraps the OpenAI API to turn high-level feature intent into a structured artifact chain: product context → specification → requirements → implementation plan → task list → per-task execution. Unlike methodology documents, it invokes the OpenAI API at each stage and persists the resulting artifacts to a .codex-specs/ directory tree. Its primary runtime is the Codex CLI ecosystem (OpenAI Codex), although the CLI itself is a standalone npm package with no hard dependency on the codex binary. The framework is small (1 contributor, 45 stars) and purpose-built as a Codex-era counterpart to Kiro/OpenSpec patterns, shipping 11 CLI subcommands that mirror the spec pipeline stages.

Compared to the seeds, codex-spec differs from openspec (which is multi-tool and ships slash-commands and skills) in that it is a pure CLI tool that calls the OpenAI API directly. It differs from spec-kit (which ships a Python CLI with hooks and AI-tool integrations) in that it has no hooks, no multi-agent orchestration, and no TDD enforcement. It is a deliberately lean CLI wrapper around a sequential spec-to-code pipeline.

Overview

codex-spec — Origin & Philosophy

Origin

Created by shenli (single maintainer), last commit September 2025. The repo description is "Automated workflows for OpenAI Codex. Features spec-driven development for new features: Product Requirement → Design → Development Plan → Implementation." It was built when OpenAI's Codex CLI was gaining traction as an autonomous coding agent and the author wanted a specification scaffold to feed it structured context before it executed code.

Core Philosophy

Make specifications the source of truth so AI agents work from shared, documented intent rather than free-form prompts. The README states the goal as: "Align teams on intent before coding · Preserve evolving project context · Generate detailed requirements and plans automatically · Execute tasks with dependency awareness and progress tracking."

Manifesto-style quotes (verbatim from README)

"AI is great at generating code, but results can be inconsistent without clear intent and shared context. codex-spec makes specifications the source of truth so you can: Align teams on intent before coding."

"This reduces rework, accelerates delivery, and keeps documentation in lockstep with the codebase."

Context

The framework is explicitly framed around the Codex CLI ecosystem and requires OPENAI_API_KEY. It optionally integrates with the codex binary on PATH and falls back to the OpenAI API for generation when the codex CLI is unavailable. It is a minimal, proof-of-concept-level implementation: 1 contributor, 45 stars, 6 forks.

Architecture

codex-spec — Architecture

Distribution

npm package (codex-spec), installed globally via npm install -g codex-spec.

Install

npm install -g codex-spec
export OPENAI_API_KEY=your_api_key_here
codex-spec context-setup --force

Required Runtime

Node.js >= 16
OPENAI_API_KEY environment variable
Optional: codex CLI binary on PATH (falls back to API)

Directory Structure Created

.codex-specs/
├── context/
│   ├── product.md       # product context (what the project does)
│   ├── tech.md          # technology stack context
│   └── structure.md     # codebase structure context
├── current/             # alias to most recent spec
│   ├── specification.md
│   └── AGENTS.md
└── <YYYY-MM-DD_feature_name>/
    ├── specification.md  # feature spec
    ├── requirements.md   # EARS-style requirements
    ├── plan.md           # implementation plan
    └── tasks.json        # extracted tasks with IDs, phases, status

Source Structure

src/
├── cli.js               # CLI entry point (commander.js)
├── commands/            # command handlers (9 modules plus index.js)
│   ├── spec-context-setup.js
│   ├── spec-context-update.js
│   ├── spec-context-refresh.js
│   ├── spec-create.js
│   ├── spec-requirements.js
│   ├── spec-plan.js
│   ├── spec-tasks.js
│   ├── spec-execute.js
│   ├── spec-status.js
│   └── index.js
└── utils/
    ├── codex-client.js  # OpenAI API wrapper
    └── prompt-builder.js # prompt templates

Target AI Tools

Primary: OpenAI Codex CLI / OpenAI API
The execute command can invoke the local codex binary if available on PATH

Components

codex-spec — Components

CLI Commands (11 total)

All commands are subcommands of the codex-spec binary.

Command	Purpose
`context-setup`	Initialize `.codex-specs/context/product.md`, `tech.md`, `structure.md` via OpenAI API
`context-update [component]`	Update one or all context files; `--auto` uses `git diff` to detect changes
`context-refresh`	Regenerate all context files from scratch
`create <feature-name> [description]`	Create a comprehensive feature specification in `.codex-specs/<slug>/specification.md`
`requirements [spec-name]`	Generate EARS-format requirements from the current spec
`plan [spec-name]`	Create implementation plan and extract tasks to `tasks.json`
`plan-summary`	View plan overview (also runs automatically after `plan`)
`tasks`	List tasks with IDs, titles, phase, and status
`execute <task-id>`	Execute a single task with context; writes enabled by default (`--read-only` disables)
`execute-phase <phase>`	Execute all tasks in a given phase
`status`	View progress and plan overview (alias to plan-summary)

No Skills / Hooks / MCP Servers

codex-spec ships zero Claude Code skills, hooks, or MCP servers. It is a pure standalone CLI.

Utility Modules

CodexClient — wraps OpenAI API with system prompt injection
PromptBuilder — builds prompts for each stage (spec creation, requirements, plan, execute)

Prompts

codex-spec — Prompt Files

Excerpt 1: Feature Specification Creation (from `src/commands/spec-create.js`)

The create command invokes the OpenAI API with this system prompt:

const spec = await codexClient.generateWithAPI(
  prompt,
  'You are a senior product engineer writing precise, actionable specifications.'
);

The promptBuilder.buildSpecCreationPrompt(featureName, description) call constructs the user prompt. The system prompt positions the model as a "senior product engineer", an example of the persona assignment technique.

The command also generates a per-spec AGENTS.md file for AI agent guidance:

const agents = promptBuilder.buildAgentsFile(featureName, description);
await fs.writeFile(path.join(baseDir, 'AGENTS.md'), agents);

Technique: Single-shot generation with role persona + artifact persistence pattern.
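To make the persona pattern concrete, here is a minimal sketch of how a generateWithAPI-style helper might pair the system prompt with the user prompt. The function name buildMessages is hypothetical, and the use of the OpenAI chat message format (role/content pairs) is an assumption; the source does not show how codex-client.js builds its request.

```javascript
// Hypothetical sketch: pair a persona system prompt with a user prompt.
// The role/content message shape follows the OpenAI chat format; codex-spec's
// real request-building code is not shown in the source.
function buildMessages(userPrompt, systemPrompt) {
  if (typeof userPrompt !== 'string' || userPrompt.trim() === '') {
    throw new Error('userPrompt must be a non-empty string');
  }
  const messages = [];
  // The persona goes first so it frames everything the model reads afterwards.
  if (systemPrompt) {
    messages.push({ role: 'system', content: systemPrompt });
  }
  messages.push({ role: 'user', content: userPrompt });
  return messages;
}

const messages = buildMessages(
  'Write a specification for: user login',
  'You are a senior product engineer writing precise, actionable specifications.'
);
console.log(messages.map((m) => m.role).join(' -> ')); // prints: system -> user
```

Keeping the persona in the system slot and the feature description in the user slot lets one helper serve every stage: only the two strings change between create, requirements, and plan.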

Excerpt 2: Spec Slug Generation (from `src/commands/spec-create.js`)

const suggestion = await codexClient.generateWithAPI(
  `Generate a concise, clear snake_case slug (3-6 words) for this feature. Letters, numbers, and underscores only. No prefix/suffix.\n\nFeature: ${featureName}`,
  'Return ONLY the snake_case slug, nothing else.'
);

Technique: Constrained output extraction. The system prompt restricts the model to a single bare value (one snake_case slug) with no surrounding prose. It is a zero-shot extraction pattern: no example outputs are supplied.
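Even with a strict "return ONLY the slug" instruction, a model can add quotes, whitespace, or stray punctuation, so a caller may want to normalise the reply before using it as a directory name. The sketch below shows one way to do that. normalizeSlug is a hypothetical helper, and whether codex-spec post-processes the reply at all is not stated in the source.

```javascript
// Hypothetical helper: normalise a model reply into a snake_case slug.
// Assumption: codex-spec may or may not post-process the reply this way.
function normalizeSlug(reply, maxWords = 6) {
  const words = String(reply)
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, ' ') // anything outside letters/digits becomes a separator
    .trim()
    .split(' ')
    .filter(Boolean)
    .slice(0, maxWords); // the prompt asks for 3-6 words, so cap at 6 by default
  if (words.length === 0) {
    throw new Error('model reply contained no usable slug characters');
  }
  return words.join('_');
}

console.log(normalizeSlug('  "User_Login-Flow"\n')); // prints: user_login_flow
```

Throwing on an empty result, rather than returning an empty string, avoids silently creating a spec directory with no name.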

Excerpt 3: Context Update (from README)

The context-update --auto command uses git diff to detect recent changes:

codex-spec context-update --auto

This is an example of context-from-delta prompting: feeding the git diff as evidence to the LLM so it updates the context files with only what changed, rather than regenerating from scratch.
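A context-from-delta prompt can be sketched as follows. Both function names, the prompt wording, and the 8000-character cap are illustrative assumptions; the source only states that --auto uses git diff, not which git arguments or prompt text codex-spec uses.

```javascript
const { execFileSync } = require('child_process');

// Assumption: the delta is read with plain `git diff`; the exact invocation
// used by codex-spec is not documented in this profile.
function readGitDiff(cwd = process.cwd()) {
  return execFileSync('git', ['diff'], { cwd, encoding: 'utf8' });
}

// Hypothetical prompt builder: combine the current context file with the delta.
function buildContextUpdatePrompt(currentContext, diff, maxDiffChars = 8000) {
  if (diff.trim() === '') {
    return null; // nothing changed, so no model call is needed
  }
  // Clip very large diffs so the prompt stays within a predictable size.
  const clipped = diff.length > maxDiffChars
    ? diff.slice(0, maxDiffChars) + '\n[diff truncated]'
    : diff;
  return [
    'Update the context document below so it reflects the changes in the diff.',
    'Keep every section that the diff does not affect.',
    '',
    '--- CURRENT CONTEXT ---',
    currentContext,
    '',
    '--- GIT DIFF ---',
    clipped,
  ].join('\n');
}

// Example with an inline diff; in a real repository, pass readGitDiff() instead.
console.log(buildContextUpdatePrompt('Stack: Node.js', '+import sqlite3'));
```

Returning null for an empty diff lets the caller skip the API call entirely when the working tree is unchanged.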

Prompting Techniques Used

Persona assignment — "You are a senior product engineer..."
Constrained output extraction — "Return ONLY the snake_case slug, nothing else"
Context-from-delta — feed git diff to update context files incrementally
Sequential artifact chaining — each stage's output is the next stage's input (spec → requirements → plan → tasks)

Uniqueness

codex-spec — Uniqueness & Positioning

Differs from seeds

The closest seed is openspec (both are npm CLIs implementing a spec pipeline), but codex-spec differs fundamentally: it calls the OpenAI API itself at each stage (context setup, spec creation, requirements, plan, execution) rather than generating prompt files for a human-driven AI tool session. Where openspec ships slash-commands and skills that a human pastes into Claude Code, codex-spec needs no AI tool UI: run a command, get an artifact. It also differs from spec-kit (Python CLI, hooks, multi-AI integrations) by having zero hooks and no TDD enforcement. Against agent-os (a bash scaffold that writes CLAUDE.md files), codex-spec is lower-ceremony but narrower: it writes to .codex-specs/, not to Claude-specific config files.

Unique Positioning

codex-spec is the only framework in this batch that targets OpenAI Codex as its primary runtime and calls the OpenAI API directly rather than running inside Claude Code. This makes it a non-Claude-first spec framework: it was built when OpenAI launched the Codex CLI and the author wanted structured context to feed it.

Observable Failure Modes

No human gates: the pipeline can run from context-setup to execute with no built-in review checkpoints. The only gate is the human deciding whether to run the next command, so spec quality depends largely on the initial description.
No TDD: the execute command writes code with no failing test first.
Context drift: product.md, tech.md, and structure.md can go stale; context-update --auto relies on git diff and must be invoked manually.
Requires OpenAI: hard dependency on OPENAI_API_KEY; no Claude, no Gemini support.
Single maintainer: 1 contributor, last commit Sep 2025, risk of abandonment.

Explicit Antipatterns (inferred from design)

Writing code without first running context-setup and create
Skipping requirements before plan
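Both antipatterns amount to running a stage before its input artifacts exist. A wrapper script could guard against that with a check like the one below. The REQUIRED_BEFORE map is inferred from the artifact chain described in this profile, missingArtifacts is a hypothetical helper, and nothing in the source says codex-spec enforces these prerequisites itself.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Assumption: prerequisites are inferred from the artifact chain in this
// profile; codex-spec itself may or may not enforce them.
const REQUIRED_BEFORE = {
  requirements: ['specification.md'],
  plan: ['specification.md', 'requirements.md'],
  execute: ['plan.md', 'tasks.json'],
};

// Return the artifacts that are still missing before `stage` can sensibly run.
function missingArtifacts(specDir, stage) {
  const required = REQUIRED_BEFORE[stage] || [];
  return required.filter((file) => !fs.existsSync(path.join(specDir, file)));
}

// Demo against a throwaway directory that contains only a specification.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'codex-spec-demo-'));
fs.writeFileSync(path.join(demoDir, 'specification.md'), '# Spec\n');
console.log(missingArtifacts(demoDir, 'plan')); // prints: [ 'requirements.md' ]
```

An empty result means the stage has everything it reads; a non-empty one names exactly which earlier command was skipped.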

Workflow

codex-spec — Workflow

Phases & Artifacts

Phase	Command	Artifact
1. Context Setup	`codex-spec context-setup`	`.codex-specs/context/product.md`, `tech.md`, `structure.md`
2. Feature Specification	`codex-spec create <name>`	`.codex-specs/<slug>/specification.md`
3. Requirements	`codex-spec requirements`	`.codex-specs/<slug>/requirements.md`
4. Plan	`codex-spec plan`	`.codex-specs/<slug>/plan.md`
5. Task Extraction	(automatic after plan)	`.codex-specs/<slug>/tasks.json`
6. Execution	`codex-spec execute <task-id>`	Code changes in workspace
7. Status Check	`codex-spec status`	Progress report (stdout)
8. Maintenance	`codex-spec context-update --auto`	Updated context files

Approval Gates

codex-spec has no explicit human approval gates. It runs each command to completion without pausing. The human triggers each stage manually by running the next command. No "wait for approval" prompts are issued.

Flow Diagram (from README)

Setup → Context Creation → Feature Specification → Requirements
→ Implementation Plan → Execute Tasks → Progress & Status
→ Maintenance: Context Update/Refresh

Task Execution Notes

execute <task-id> enables writes by default; --read-only prevents writes
execute-phase <phase> runs all tasks in a given phase sequentially
Phase names with spaces must be quoted: codex-spec execute-phase "Core Features"
Spec directory naming: YYYY-MM-DD_name_of_the_spec (auto-generated slug), overridable with --title
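The write-by-default behaviour of execute can be illustrated with a small argument parser. This is a sketch only: codex-spec uses commander.js, and Node's built-in util.parseArgs (added in Node.js 18.3 and 16.17) is swapped in here to keep the example dependency-free. parseExecuteArgs, the allowWrites field, and the task ID "T1" are hypothetical.

```javascript
const { parseArgs } = require('util');

// Hypothetical parser for `execute <task-id> [--read-only]`.
// codex-spec itself uses commander.js; util.parseArgs stands in for it here.
function parseExecuteArgs(argv) {
  const { values, positionals } = parseArgs({
    args: argv,
    options: { 'read-only': { type: 'boolean' } },
    allowPositionals: true,
  });
  if (positionals.length !== 1) {
    throw new Error('usage: execute <task-id> [--read-only]');
  }
  // Writes are on unless the flag is present, mirroring the documented default.
  return { taskId: positionals[0], allowWrites: !values['read-only'] };
}

console.log(parseExecuteArgs(['T1']));
console.log(parseExecuteArgs(['T1', '--read-only']));
```

The same reasoning explains the quoting rule for execute-phase: the shell passes "Core Features" as a single argument only when it is quoted, so an unquoted phase name would arrive as two positionals.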

Memory Context

codex-spec — Memory & Context

State Storage

File-based, stored under .codex-specs/ in the project root.

Persistence Scope

Project-scoped — all artifacts live in the .codex-specs/ directory within the current repository.

Handoff Mechanism

A .codex-specs/current/ directory, which acts as an alias for the most recent spec, holds that spec's specification.md and AGENTS.md. This lets follow-up commands (requirements, plan) operate without specifying the spec name.
Each spec directory is named YYYY-MM-DD_<slug> for temporal ordering.
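Because the date prefix is zero-padded ISO style, a plain string sort of the directory names is also a chronological sort. The sketch below shows both halves of that idea. specDirName and latestSpecDir are hypothetical helpers, and taking the date in UTC is an assumption, since the source does not say which time zone codex-spec uses.

```javascript
// Hypothetical helper: build a `YYYY-MM-DD_<slug>` directory name.
// Assumption: the date is taken in UTC.
function specDirName(slug, date = new Date()) {
  const day = date.toISOString().slice(0, 10); // e.g. "2025-09-11"
  return `${day}_${slug}`;
}

// Hypothetical helper: pick the most recent spec directory by name alone.
// Non-spec entries such as `context` and `current` are filtered out first.
function latestSpecDir(names) {
  const dated = names.filter((name) => /^\d{4}-\d{2}-\d{2}_/.test(name)).sort();
  return dated.length > 0 ? dated[dated.length - 1] : null;
}

console.log(specDirName('user_login_flow', new Date(Date.UTC(2025, 8, 11))));
// prints: 2025-09-11_user_login_flow
console.log(latestSpecDir(['context', 'current', '2025-09-01_a', '2025-09-11_b']));
// prints: 2025-09-11_b
```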

Context Files

product.md — product description and goals
tech.md — technology stack
structure.md — codebase structure overview

These three files are injected as context when executing tasks (execute command reads them to give the LLM project context before generating code).
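A minimal sketch of that injection step, assuming the files are simply concatenated under per-file headings before being placed in the prompt. loadProjectContext and the "## name" heading format are hypothetical; the source does not show how the execute command assembles its context.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const CONTEXT_FILES = ['product.md', 'tech.md', 'structure.md'];

// Hypothetical loader: concatenate whichever context files exist into one block.
function loadProjectContext(contextDir) {
  return CONTEXT_FILES
    .filter((name) => fs.existsSync(path.join(contextDir, name)))
    .map((name) => {
      const body = fs.readFileSync(path.join(contextDir, name), 'utf8').trim();
      return `## ${name}\n${body}`;
    })
    .join('\n\n');
}

// Demo with a temporary directory holding two of the three files.
const ctxDir = fs.mkdtempSync(path.join(os.tmpdir(), 'codex-context-'));
fs.writeFileSync(path.join(ctxDir, 'product.md'), 'A todo app.\n');
fs.writeFileSync(path.join(ctxDir, 'tech.md'), 'Node.js and SQLite.\n');
console.log(loadProjectContext(ctxDir));
```

Skipping missing files instead of failing keeps the sketch usable on a project where context-setup has only partly run, at the cost of silently thinner context.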

Cross-session Handoff

Yes — all artifacts are files on disk. Any new session can pick up where the previous one left off by running codex-spec tasks to see current status, then codex-spec execute <next-task-id>.
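The resume step can be sketched as a pure function over the parsed contents of tasks.json. The field names (id, title, phase, status) follow the description of tasks.json in this profile, but the exact schema and the status value 'completed' are assumptions, and summarizeTasks is a hypothetical helper.

```javascript
// Assumed tasks.json shape (hypothetical; the real schema is not shown in the source):
// [{ "id": "1", "title": "...", "phase": "...", "status": "pending" | "completed" }]
function summarizeTasks(tasks) {
  const done = tasks.filter((task) => task.status === 'completed').length;
  // The first task that is not completed is the one a new session would pick up.
  const next = tasks.find((task) => task.status !== 'completed') || null;
  return { done, total: tasks.length, nextTaskId: next ? next.id : null };
}

const sampleTasks = [
  { id: '1', title: 'Create schema', phase: 'Setup', status: 'completed' },
  { id: '2', title: 'Add login route', phase: 'Core Features', status: 'pending' },
];
console.log(summarizeTasks(sampleTasks));
```

In a real project the array would come from JSON.parse on the spec directory's tasks.json, and nextTaskId would be the argument passed to codex-spec execute.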

Compaction

None. No context compaction or summarization built in. Context files can be refreshed via context-refresh.

Memory Type

File-based markdown + JSON (tasks.json for task state).

Orchestration

codex-spec — Orchestration

Multi-Agent

No. codex-spec is single-agent — the CLI calls the OpenAI API sequentially. No subagent spawning.

Orchestration Pattern

Sequential. Each command in the pipeline (create → requirements → plan → execute) runs one at a time, triggered by the human.

Execution Mode

One-shot per command. Human triggers each stage by running the next CLI command. No background daemon, no continuous loop.

Isolation Mechanism

None. All file writes happen in-place in the project working directory. No git branching or worktrees.

Multi-Model

No. All generation goes through the OpenAI API, authenticated via OPENAI_API_KEY. The key authenticates requests but does not select a model, and how the model is chosen is not documented here. No model routing per role.

Consensus Mechanism

None.

Auto-Validators

None. No linting, testing, or self-review runs automatically. The --read-only flag prevents writes during execute, but no automated quality gates exist.

TDD Enforcement

None.

Git Automation

None. No automatic commits, PRs, or merges.

Prompt Chaining

Yes — explicit. The plan command reads specification.md and requirements.md, the execute command reads plan.md and tasks.json. One stage's output is the next stage's prompt input.

UI & CLI Surface

codex-spec — UI & CLI Surface

CLI Binary

Name: codex-spec
Type: Dedicated CLI (npm global binary, not a thin wrapper over Claude/Codex)
Package: codex-spec on npm
Subcommands: 11 (context-setup, context-update, context-refresh, create, requirements, plan, plan-summary, tasks, execute, execute-phase, status)

Local UI

None. No web dashboard, no TUI, no desktop app.

IDE Integration

None. No VS Code extension, no Cursor integration, no Claude Code plugin.

Observability

codex-spec status / codex-spec plan-summary — prints current plan and task progress to stdout
codex-spec tasks — lists all tasks with IDs, titles, phases, and status
No structured logging, no audit log

Installation Check

codex-spec --help          # verify installation
codex-spec tasks           # see task progress
codex-spec status          # current plan summary

Cross-Tool Portability

Low-medium. The CLI works with any OpenAI API key and optionally integrates with the codex binary. It does not integrate with Claude Code, Cursor, or other AI tools beyond that optional codex CLI integration.

Distribution

Type: cli-tool
License: Apache-2.0
Install: npm-install

Surfaces

CLI binary: codex-spec
CLI subcmds: 11
Local UI: No
Tech stack: none

Components

Commands: 0
Skills: 0
Subagents: 0
Hooks: 0
MCP servers: 0
MCP tools: 0
Scripts: 0
Templates: 3

Workflow

Phases: 7
Approval gates: 0
Spec format: markdown
Spec storage: per-feature-folder
Delta or full: whole-file

Orchestration

Multi-agent: No
Pattern: sequential
Max concurrent: 1
Isolation: none
Consensus: none
Prompt chaining: Yes

Multi-model

Multi-model: No
BYOK: Yes
Modal: text

Execution

Mode: one-shot
Crash recovery: No
Compaction: No
Session handoff: Yes
Streaming: No

Memory

Type: file-based
Persistence: project
Search: none
State files: 7 files

Quality

TDD: No
TDD mechanism: none
Self-review: none

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: No
Audit format: none
Replay: No

Tools

Primary: codex-cli
Targets: 2
Portability: low

Signals

Stars: 45
Last commit: 2025-09-11
Contributors: 1
Maintainer: dormant
Quality score: 0.5/10
