codex-spec (shenli)

codex-spec · shenli/codex-spec · ★ 45 · last commit 2025-09-11

Automates the spec-to-code pipeline for OpenAI Codex by generating specifications, requirements, plans, and tasks from natural language feature descriptions.

Best when: Specifications should be generated by the AI itself (not written by the human) and stored as files that guide subsequent AI execution, spec and implementati…
Skip if: Running execute without context-setup and create first; skipping requirements before plan
vs seeds: Unlike spec-kit (Python, hooks, TDD enforcement), codex-spec has zero hooks, no multi-AI integration, and no TDD discipline. It is the o…
Primitive shape: No installable primitives

00

Summary

codex-spec (shenli) — Summary

codex-spec is a Node.js CLI tool that automates spec-driven development workflows for OpenAI Codex. It wraps the OpenAI API to turn high-level feature intent into a structured artifact chain: product context → specification → requirements → implementation plan → task list → per-task execution. Unlike methodology documents, it invokes the OpenAI API at each stage and persists the artifacts to a .codex-specs/ directory tree. Its primary runtime is the OpenAI Codex CLI ecosystem, though the CLI itself is a standalone npm package with no hard dependency on the codex binary. The framework is small (1 contributor, 45 stars) and purpose-built as a Codex-era counterpart to Kiro/OpenSpec patterns, shipping 10 CLI subcommands that mirror the spec pipeline stages.

Compared to the seeds, codex-spec differs from openspec (which is multi-tool and ships slash-commands + skills) in that it is a pure CLI tool that calls the OpenAI API directly. It differs from spec-kit (which ships a Python CLI with hooks and AI-tool integrations) in that it has no hooks, no multi-agent orchestration, and no TDD enforcement. It is the leanest possible CLI wrapper around a sequential spec-to-code pipeline.

01

Overview

codex-spec — Origin & Philosophy

Origin

Created by shenli (single maintainer), last commit September 2025. The repo description is "Automated workflows for OpenAI Codex. Features spec-driven development for new features: Product Requirement → Design → Development Plan → Implementation." It was built when OpenAI's Codex CLI was gaining traction as an autonomous coding agent and the author wanted a specification scaffold to feed it structured context before it executed code.

Core Philosophy

Make specifications the source of truth so AI agents work from shared, documented intent rather than free-form prompts. The README states the goal as: "Align teams on intent before coding · Preserve evolving project context · Generate detailed requirements and plans automatically · Execute tasks with dependency awareness and progress tracking."

Manifesto-style quotes (verbatim from README)

"AI is great at generating code, but results can be inconsistent without clear intent and shared context. codex-spec makes specifications the source of truth so you can: Align teams on intent before coding."

"This reduces rework, accelerates delivery, and keeps documentation in lockstep with the codebase."

Context

The framework is explicitly framed around the Codex CLI ecosystem and requires OPENAI_API_KEY. It optionally integrates with the codex binary on PATH and falls back to the OpenAI API for generation when the codex CLI is unavailable. It is a minimal, proof-of-concept-level implementation: 1 contributor, 45 stars, 6 forks.

02

Architecture

codex-spec — Architecture

Distribution

npm package (codex-spec), installed globally via npm install -g codex-spec.

Install

npm install -g codex-spec
export OPENAI_API_KEY=your_api_key_here
codex-spec context-setup --force

Required Runtime

  • Node.js >= 16
  • OPENAI_API_KEY environment variable
  • Optional: codex CLI binary on PATH (falls back to API)
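The runtime requirements above could be verified before any command runs. The following is a minimal sketch of such a preflight check, not code from the repository; checkRuntime and its return shape are hypothetical names chosen for illustration.

```javascript
// Hypothetical preflight check (not part of codex-spec itself).
// The Node version string and environment are passed in so the check is easy to test.
function checkRuntime(nodeVersion, env) {
  const problems = [];

  // process.version looks like "v18.17.0": strip the leading "v" and read the major part.
  const major = Number.parseInt(nodeVersion.replace(/^v/, '').split('.')[0], 10);
  if (Number.isNaN(major) || major < 16) {
    problems.push(`Node.js >= 16 is required, found ${nodeVersion}`);
  }

  // The API key must be present and non-empty.
  if (!env.OPENAI_API_KEY || env.OPENAI_API_KEY.trim() === '') {
    problems.push('OPENAI_API_KEY is not set');
  }

  return { ok: problems.length === 0, problems };
}

// Usage: inspect the current process.
const result = checkRuntime(process.version, process.env);
if (!result.ok) {
  console.error(result.problems.join('\n'));
}
```

The optional codex binary is deliberately left out of the check, since the CLI is described as falling back to the API when it is missing.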

Directory Structure Created

.codex-specs/
├── context/
│   ├── product.md       # product context (what the project does)
│   ├── tech.md          # technology stack context
│   └── structure.md     # codebase structure context
├── current/             # alias to most recent spec
│   ├── specification.md
│   └── AGENTS.md
└── <YYYY-MM-DD_feature_name>/
    ├── specification.md  # feature spec
    ├── requirements.md   # EARS-style requirements
    ├── plan.md           # implementation plan
    └── tasks.json        # extracted tasks with IDs, phases, status

Source Structure

src/
├── cli.js               # CLI entry point (commander.js)
├── commands/            # 10 command handlers
│   ├── spec-context-setup.js
│   ├── spec-context-update.js
│   ├── spec-context-refresh.js
│   ├── spec-create.js
│   ├── spec-requirements.js
│   ├── spec-plan.js
│   ├── spec-tasks.js
│   ├── spec-execute.js
│   ├── spec-status.js
│   └── index.js
└── utils/
    ├── codex-client.js  # OpenAI API wrapper
    └── prompt-builder.js # prompt templates

Target AI Tools

  • Primary: OpenAI Codex CLI / OpenAI API
  • The execute command can invoke the local codex binary if available on PATH

03

Components

codex-spec — Components

CLI Commands (10 total, plus the status alias)

All commands are subcommands of the codex-spec binary.

Command                              Purpose
context-setup                        Initialize .codex-specs/context/product.md, tech.md, structure.md via OpenAI API
context-update [component]           Update one or all context files; --auto uses git diff to detect changes
context-refresh                      Regenerate all context files from scratch
create <feature-name> [description]  Create a comprehensive feature specification in .codex-specs/<slug>/specification.md
requirements [spec-name]             Generate EARS-format requirements from the current spec
plan [spec-name]                     Create implementation plan and extract tasks to tasks.json
plan-summary                         View plan overview (also runs automatically after plan)
tasks                                List tasks with IDs, titles, phase, and status
execute <task-id>                    Execute a single task with context; writes enabled by default (--read-only disables)
execute-phase <phase>                Execute all tasks in a given phase
status                               View progress and plan overview (alias to plan-summary)

No Skills / Hooks / MCP Servers

codex-spec ships zero Claude Code skills, hooks, or MCP servers. It is a pure standalone CLI.

Utility Modules

  • CodexClient — wraps OpenAI API with system prompt injection
  • PromptBuilder — builds prompts for each stage (spec creation, requirements, plan, execute)

05

Prompts

codex-spec — Prompt Files

Excerpt 1: Feature Specification Creation (from src/commands/spec-create.js)

The create command invokes the OpenAI API with this system prompt:

const spec = await codexClient.generateWithAPI(
  prompt,
  'You are a senior product engineer writing precise, actionable specifications.'
);

The promptBuilder.buildSpecCreationPrompt(featureName, description) call constructs the user prompt. The system prompt positions the model as a "senior product engineer," an example of the persona-assignment technique.
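To make the persona pattern concrete, here is a minimal sketch of how a user prompt and a system prompt might be paired in the role/content message shape used by chat-style LLM APIs. The internals of codexClient.generateWithAPI are not shown in the source, so buildMessages is a hypothetical helper and the sample user prompt is invented for illustration.

```javascript
// Hypothetical helper: pairs a persona (system prompt) with the task (user prompt).
// This is an illustration of the pattern, not the repository's implementation.
function buildMessages(userPrompt, systemPrompt) {
  const messages = [];
  if (systemPrompt) {
    // The persona goes first so it frames everything that follows.
    messages.push({ role: 'system', content: systemPrompt });
  }
  messages.push({ role: 'user', content: userPrompt });
  return messages;
}

const messages = buildMessages(
  'Write a specification for: user login', // illustrative user prompt
  'You are a senior product engineer writing precise, actionable specifications.'
);
console.log(messages.map((m) => m.role).join(' -> ')); // system -> user
```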

The command also generates a per-spec AGENTS.md file for AI agent guidance:

const agents = promptBuilder.buildAgentsFile(featureName, description);
await fs.writeFile(path.join(baseDir, 'AGENTS.md'), agents);

Technique: Single-shot generation with role persona + artifact persistence pattern.

Excerpt 2: Spec Slug Generation (from src/commands/spec-create.js)

const suggestion = await codexClient.generateWithAPI(
  `Generate a concise, clear snake_case slug (3-6 words) for this feature. Letters, numbers, and underscores only. No prefix/suffix.\n\nFeature: ${featureName}`,
  'Return ONLY the snake_case slug, nothing else.'
);

Technique: Constrained output extraction. The system prompt restricts the model to returning a single snake_case identifier and nothing else, a classic zero-shot extraction pattern.
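Even with a constrained prompt, a model can return stray quotes, whitespace, or capitals, so callers of this pattern often normalize the result before using it as a directory name. The sketch below shows one way to do that; normalizeSlug is a hypothetical function, and whether codex-spec performs any such cleanup is not stated in the source.

```javascript
// Hypothetical post-processing for a model-generated slug.
// Returns a snake_case string, or null when nothing usable remains.
function normalizeSlug(raw) {
  const slug = String(raw)
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '_') // any run of other characters becomes one "_"
    .replace(/^_+|_+$/g, '');    // drop leading and trailing underscores
  return slug.length > 0 ? slug : null;
}

console.log(normalizeSlug('  "User_Login Flow"\n')); // user_login_flow
console.log(normalizeSlug('!!!'));                   // null
```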

Excerpt 3: Context Update (from README)

The context-update --auto command uses git diff to detect recent changes:

codex-spec context-update --auto

This is an example of context-from-delta prompting: feeding the git diff as evidence to the LLM so it updates the context files with only what changed, rather than regenerating from scratch.
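A minimal sketch of context-from-delta prompting follows. It assumes the diff text has already been captured (for example from git diff) and is passed in as a string; buildContextUpdatePrompt, its parameters, and the prompt wording are all hypothetical, since the source does not show the prompt codex-spec actually sends.

```javascript
// Hypothetical prompt builder: current file content plus a diff as evidence.
// Long diffs are truncated so the prompt stays within a predictable size.
function buildContextUpdatePrompt(fileName, currentContent, diffText, maxDiffChars = 8000) {
  const diff = diffText.length > maxDiffChars
    ? diffText.slice(0, maxDiffChars) + '\n[diff truncated]'
    : diffText;

  return [
    `Update ${fileName} to reflect the changes below.`,
    'Change only what the diff makes outdated; keep everything else as is.',
    '',
    '--- current file ---',
    currentContent,
    '',
    '--- git diff ---',
    diff,
  ].join('\n');
}

const prompt = buildContextUpdatePrompt(
  'tech.md',
  'Stack: Node.js, Express',
  '+  "fastify": "^4.0.0"\n-  "express": "^4.18.0"'
);
console.log(prompt);
```

Passing the existing file alongside the diff is what distinguishes this from a full regeneration: the model is asked to edit, not to rewrite.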

Prompting Techniques Used

  1. Persona assignment — "You are a senior product engineer..."
  2. Constrained output extraction — "Return ONLY the snake_case slug, nothing else"
  3. Context-from-delta — feed git diff to update context files incrementally
  4. Sequential artifact chaining — each stage's output is the next stage's input (spec → requirements → plan → tasks)
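The fourth technique, sequential artifact chaining, can be sketched in a few lines. This is a simplified illustration under stated assumptions: runChain and the injected generate function are hypothetical, each stage here receives only the previous stage's output, and a stub stands in for the real API call.

```javascript
// Hypothetical chain runner: each stage's output becomes the next stage's input.
// `generate(stage, input)` stands in for an LLM call and must return a Promise.
async function runChain(description, generate) {
  const artifacts = {};
  let input = description;
  for (const stage of ['specification', 'requirements', 'plan']) {
    artifacts[stage] = await generate(stage, input);
    input = artifacts[stage];
  }
  return artifacts;
}

// Stub generator that makes the nesting of inputs visible.
const fakeGenerate = async (stage, input) => `${stage}(${input})`;

runChain('login feature', fakeGenerate).then((artifacts) => {
  console.log(artifacts.plan);
  // plan(requirements(specification(login feature)))
});
```

Injecting the generator keeps the chain logic testable without network access or an API key.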

09

Uniqueness

codex-spec — Uniqueness & Positioning

differs_from_seeds

The closest seed is openspec (both are npm CLIs implementing a spec pipeline), but codex-spec differs fundamentally: it calls the OpenAI API itself at each stage (context setup, spec creation, requirements, plan, execution) rather than generating prompt files for a human-driven AI tool session. Where openspec ships slash-commands and skills that a human pastes into Claude Code, codex-spec needs no AI tool UI: run a command, get an artifact. It also differs from spec-kit (Python CLI, hooks, multi-AI integrations) by having zero hooks and no TDD enforcement. Against agent-os (a bash scaffold that writes CLAUDE.md files), codex-spec is lower-ceremony but narrower: it writes to .codex-specs/, not to Claude-specific config files.

Unique Positioning

codex-spec is the only framework in this batch that targets OpenAI Codex as its primary runtime and calls the OpenAI API directly (not Claude Code). This makes it a non-Claude-first spec framework: it was built specifically when OpenAI launched the Codex CLI and its author wanted structured context to feed it.

Observable Failure Modes

  1. No human gates: the pipeline runs from context-setup to execute with no review checkpoints — spec quality depends entirely on the initial description.
  2. No TDD: the execute command writes code with no failing test first.
  3. Context drift: product.md/tech.md/structure.md can go stale; context-update --auto relies on git diff and must be invoked manually.
  4. Requires OpenAI: hard dependency on OPENAI_API_KEY; no Claude, no Gemini support.
  5. Single maintainer: 1 contributor, last commit Sep 2025, risk of abandonment.

Explicit Antipatterns (inferred from design)

  • Writing code without first running context-setup and create
  • Skipping requirements before plan

04

Workflow

codex-spec — Workflow

Phases & Artifacts

Phase                     Command                           Artifact
1. Context Setup          codex-spec context-setup          .codex-specs/context/product.md, tech.md, structure.md
2. Feature Specification  codex-spec create <name>          .codex-specs/<slug>/specification.md
3. Requirements           codex-spec requirements           .codex-specs/<slug>/requirements.md
4. Plan                   codex-spec plan                   .codex-specs/<slug>/plan.md
5. Task Extraction        (automatic after plan)            .codex-specs/<slug>/tasks.json
6. Execution              codex-spec execute <task-id>      Code changes in workspace
7. Status Check           codex-spec status                 Progress report (stdout)
8. Maintenance            codex-spec context-update --auto  Updated context files

Approval Gates

codex-spec has no explicit human approval gates. It runs each command to completion without pausing. The human triggers each stage manually by running the next command. No "wait for approval" prompts are issued.

Flow Diagram (from README)

Setup → Context Creation → Feature Specification → Requirements
→ Implementation Plan → Execute Tasks → Progress & Status
→ Maintenance: Context Update/Refresh

Task Execution Notes

  • execute <task-id> enables writes by default; --read-only prevents writes
  • execute-phase <phase> runs all tasks in a given phase sequentially
  • Phase names with spaces must be quoted: codex-spec execute-phase "Core Features"
  • Spec directory naming: YYYY-MM-DD_name_of_the_spec (auto-generated slug), overridable with --title
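The phase-level execution described in these notes amounts to selecting tasks by phase and running them in order. The sketch below illustrates the selection step only. The tasks.json schema is not documented in the source, so the field names (id, title, phase, status), the sample entries, and tasksForPhase are assumptions.

```javascript
// Assumed shape of tasks.json entries; field names and values are illustrative.
const tasks = [
  { id: 'T1', title: 'Create user model', phase: 'Foundation', status: 'completed' },
  { id: 'T2', title: 'Add login endpoint', phase: 'Core Features', status: 'pending' },
  { id: 'T3', title: 'Add logout endpoint', phase: 'Core Features', status: 'pending' },
];

// Select the tasks of one phase, preserving file order so they can run sequentially.
function tasksForPhase(allTasks, phase) {
  return allTasks.filter((task) => task.phase === phase);
}

for (const task of tasksForPhase(tasks, 'Core Features')) {
  console.log(`${task.id}: ${task.title}`);
}
```

Because the phase name is compared as a whole string, a name containing spaces has to reach the program as a single argument, which is why it must be quoted on the command line.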

06

Memory Context

codex-spec — Memory & Context

State Storage

File-based, stored under .codex-specs/ in the project root.

Persistence Scope

Project-scoped — all artifacts live in the .codex-specs/ directory within the current repository.

Handoff Mechanism

  • A .codex-specs/current/ alias directory holds copies of the most recent spec's specification.md and AGENTS.md, so follow-up commands (requirements, plan) can run without naming the spec.
  • Each spec directory is named YYYY-MM-DD_<slug> for temporal ordering.
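The naming scheme above can be sketched as follows. specDirName is a hypothetical helper, and the use of the local calendar date is an assumption; the source does not say whether the tool uses local time or UTC.

```javascript
// Hypothetical helper that builds a YYYY-MM-DD_<slug> directory name.
// Uses the local calendar date of the given Date object.
function specDirName(slug, date = new Date()) {
  const yyyy = date.getFullYear();
  const mm = String(date.getMonth() + 1).padStart(2, '0'); // months are 0-based
  const dd = String(date.getDate()).padStart(2, '0');
  return `${yyyy}-${mm}-${dd}_${slug}`;
}

console.log(specDirName('user_login_flow', new Date(2025, 8, 11)));
// 2025-09-11_user_login_flow

// Zero-padded dates sort chronologically under a plain string sort.
const names = [
  specDirName('b_feature', new Date(2025, 9, 2)),
  specDirName('a_feature', new Date(2025, 8, 11)),
];
console.log(names.sort()[0]); // 2025-09-11_a_feature
```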

Context Files

  • product.md — product description and goals
  • tech.md — technology stack
  • structure.md — codebase structure overview

These three files are injected as context when executing tasks (execute command reads them to give the LLM project context before generating code).
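As an illustration of context injection, the sketch below assembles an execution prompt from the three context files and one task. It takes file contents as strings rather than reading from disk; buildExecutePrompt, the section layout, and the task fields are hypothetical, since the actual prompt template is not shown in the source.

```javascript
// Hypothetical prompt assembly: project context first, then the task.
// `context` maps 'product' | 'tech' | 'structure' to the file's text.
function buildExecutePrompt(context, task) {
  const sections = ['product', 'tech', 'structure']
    .filter((name) => context[name]) // skip files that are missing or empty
    .map((name) => `## ${name}.md\n${context[name]}`);

  sections.push(`## Task ${task.id}\n${task.title}`);
  return sections.join('\n\n');
}

const prompt = buildExecutePrompt(
  { product: 'A todo app.', tech: 'Node.js, SQLite.', structure: 'src/, test/' },
  { id: 'T2', title: 'Add login endpoint' } // assumed task fields
);
console.log(prompt);
```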

Cross-session Handoff

Yes — all artifacts are files on disk. Any new session can pick up where the previous one left off by running codex-spec tasks to see current status, then codex-spec execute <next-task-id>.
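Resuming in a new session comes down to reading task state from disk and finding the next unfinished task. The sketch below shows that lookup under assumptions: the tasks.json schema is not documented in the source, so the status values ('pending', 'completed') and the helpers nextTask and progress are hypothetical.

```javascript
// Hypothetical helpers over an assumed tasks.json array.
// Returns the first task that is not completed, or null when all are done.
function nextTask(tasks) {
  return tasks.find((task) => task.status !== 'completed') || null;
}

// Counts completed tasks against the total.
function progress(tasks) {
  const done = tasks.filter((task) => task.status === 'completed').length;
  return { done, total: tasks.length };
}

const sampleTasks = [
  { id: 'T1', status: 'completed' },
  { id: 'T2', status: 'pending' },
  { id: 'T3', status: 'pending' },
];

const { done, total } = progress(sampleTasks);
console.log(`${done}/${total} done, next: ${nextTask(sampleTasks).id}`);
// 1/3 done, next: T2
```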

Compaction

None. No context compaction or summarization built in. Context files can be refreshed via context-refresh.

Memory Type

File-based markdown + JSON (tasks.json for task state).

07

Orchestration

codex-spec — Orchestration

Multi-Agent

No. codex-spec is single-agent — the CLI calls the OpenAI API sequentially. No subagent spawning.

Orchestration Pattern

Sequential. Each command in the pipeline (create → requirements → plan → execute) runs one at a time, triggered by the human.

Execution Mode

One-shot per command. Human triggers each stage by running the next CLI command. No background daemon, no continuous loop.

Isolation Mechanism

None. All file writes happen in-place in the project working directory. No git branching or worktrees.

Multi-Model

No. A single provider, the OpenAI API, authenticated via OPENAI_API_KEY. The key itself does not select a model, and the material reviewed here does not document how the model is chosen. No model routing per role.

Consensus Mechanism

None.

Auto-Validators

None. No linting, testing, or self-review runs automatically. The --read-only flag prevents writes during execute, but no automated quality gates exist.

TDD Enforcement

None.

Git Automation

None. No automatic commits, PRs, or merges.

Prompt Chaining

Yes — explicit. The plan command reads specification.md and requirements.md, the execute command reads plan.md and tasks.json. One stage's output is the next stage's prompt input.
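Because each stage reads files written by an earlier one, the chain implies a prerequisite check. The sketch below encodes the inputs named in this document and reports which are missing. STAGE_INPUTS and missingInputs are hypothetical; whether codex-spec itself performs such a check is not stated in the source.

```javascript
// Inputs each stage reads, as described in this document.
const STAGE_INPUTS = {
  requirements: ['specification.md'],
  plan: ['specification.md', 'requirements.md'],
  execute: ['plan.md', 'tasks.json'],
};

// Hypothetical check: which required artifacts are absent from the spec directory?
// `existingFiles` is a list of file names already present there.
function missingInputs(stage, existingFiles) {
  const have = new Set(existingFiles);
  return (STAGE_INPUTS[stage] || []).filter((file) => !have.has(file));
}

console.log(missingInputs('plan', ['specification.md']));
// [ 'requirements.md' ]
console.log(missingInputs('execute', ['plan.md', 'tasks.json']));
// []
```

A guard like this would catch the antipattern of skipping requirements before plan, since the plan stage would report requirements.md as missing.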

08

UI CLI Surface

codex-spec — UI & CLI Surface

CLI Binary

  • Name: codex-spec
  • Type: Dedicated CLI (npm global binary, not a thin wrapper over Claude/Codex)
  • Package: codex-spec on npm
  • Subcommands: 10, plus the status alias (context-setup, context-update, context-refresh, create, requirements, plan, plan-summary, tasks, execute, execute-phase; status is an alias for plan-summary)

Local UI

None. No web dashboard, no TUI, no desktop app.

IDE Integration

None. No VS Code extension, no Cursor integration, no Claude Code plugin.

Observability

  • codex-spec status / codex-spec plan-summary — prints current plan and task progress to stdout
  • codex-spec tasks — lists all tasks with IDs, titles, phases, and status
  • No structured logging, no audit log

Installation Check

codex-spec --help          # verify installation
codex-spec tasks           # see task progress
codex-spec status          # current plan summary

Cross-Tool Portability

Low-medium. The CLI works with any OpenAI API key and optionally integrates with the codex binary. It does not integrate with Claude Code, Cursor, or other AI tools beyond that optional codex CLI integration.
