Codex Integration for Claude Code (skill-codex)

codex-integration-skill · skills-directory/skill-codex · ★ 1.3k · last commit 2026-05-25

Single Claude Code skill that handles Codex CLI invocation correctly (stdin blocking, thinking token suppression, session resume) with a peer-AI critical evaluation protocol.

Best when: Treat Codex as a colleague, not an authority. Claude Code must critically evaluate Codex output and dispute incorrect findings using its actual model identity.

Skip if: Blind deference to Codex output; forgetting </dev/null for non-TTY stdin (causes a hung process).

vs seeds

ccmemory (single-skill Claude Code plugin for external tool integration). Differs by: explicit peer-AI disagreement protocol (Cla…

Primitive shape 1 total

Skills 1

Summary

Elevator pitch: A single-skill Claude Code plugin that lets Claude Code invoke the Codex CLI (codex exec and session resume) for automated code analysis, refactoring, and editing. It is installed via Claude Code's plugin marketplace (/plugin marketplace add skills-directory/skill-codex). The skill handles model selection, sandbox mode configuration, stdin blocking (a subtle Codex CLI gotcha), thinking-token suppression, and session resume. At 1,269 stars, it is the highest-starred single-skill plugin in this batch. Codex's role is strictly worker: Claude Code remains the orchestrator and critical evaluator, explicitly instructed to dispute incorrect Codex findings. Compared to seeds: most similar to ccmemory (single-skill Claude Code plugin for an external tool), but it targets Codex CLI execution rather than graph memory and adds a documented "critical evaluation" protocol under which Claude Code evaluates Codex output instead of deferring to it.

Overview

Origin

By skills-directory. Version unknown (no semver tag). MIT license. 1,269 stars, 94 forks. Active: last commit 2026-05-25. 11 contributors.

README cross-references klaudworks/ralph-meets-rex for more autonomous setups:

"Note: If you want a more autonomous setup for agentic workflows, check out klaudworks/ralph-meets-rex."

Philosophy

The skill is functionally minimal but operationally careful. Its three design concerns:

stdin blocking prevention: codex exec reads stdin even when a positional prompt is provided. If stdin is not closed in a non-TTY context, Codex blocks forever. The skill documents this explicitly and mandates </dev/null in all non-TTY invocations.
Thinking token suppression: By default, Codex emits thinking tokens to stderr. The skill appends 2>/dev/null by default to prevent context bloat. Users can opt in to seeing them.
Critical evaluation: Claude Code is explicitly instructed not to defer to Codex. If Codex is wrong, Claude Code should say so:

"Trust your own knowledge when confident. If Codex claims something you know is incorrect, push back directly."

Manifesto Quote (verbatim)

"Codex is powered by OpenAI models with their own knowledge cutoffs and limitations. Treat Codex as a colleague, not an authority."

"When Codex is Wrong: Identify yourself as Claude so Codex knows it's a peer AI discussion. Use your actual model name..."

echo "This is Claude (<your current model name>) following up. I disagree with [X] because [evidence]. What's your take on this?" | codex exec --skip-git-repo-check resume --last 2>/dev/null

Architecture

Distribution

Type: Claude Code plugin (marketplace)
Marketplace slug: skills-directory/skill-codex
Plugin name: skill-codex@skill-codex
License: MIT

Install

# Plugin installation (recommended):
/plugin marketplace add skills-directory/skill-codex
/plugin install skill-codex@skill-codex

# Standalone skill install:
git clone --depth 1 git@github.com:skills-directory/skill-codex.git /tmp/skills-temp && \
mkdir -p ~/.claude/skills && \
cp -r /tmp/skills-temp/plugins/skill-codex/skills/codex ~/.claude/skills/codex && \
rm -rf /tmp/skills-temp

Required Runtime

Claude Code
codex CLI on PATH
Codex configured with valid credentials
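
A quick preflight check for the requirements above could look like the following. This is a minimal sketch and not part of the skill: the have_cmd and preflight names are hypothetical, and the check only confirms that an executable is on PATH. It says nothing about whether Codex credentials are valid.

```shell
# Hypothetical helper: succeed if the named command is found on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Hypothetical preflight: report whether a CLI (default: codex) is installed.
# Credentials are not checked here; that still requires running Codex itself.
preflight() {
  if have_cmd "${1:-codex}"; then
    printf 'ok: %s found on PATH\n' "${1:-codex}"
  else
    printf 'missing: %s not found on PATH\n' "${1:-codex}"
    return 1
  fi
}
```

Calling preflight with no argument checks for codex; passing another name (for example, preflight sh) exercises the same logic without Codex installed.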

Directory Tree

.claude-plugin/          # Claude Code plugin manifest + marketplace.json
plugins/
└── skill-codex/
    ├── skills/
    │   └── codex/
    │       └── SKILL.md    # The single skill
    └── marketplace.json

Core Invocation Pattern

# Standard execution (suppress thinking tokens):
codex exec -m <model> \
  --config model_reasoning_effort="<level>" \
  --sandbox <mode> \
  --full-auto \
  --skip-git-repo-check \
  "prompt here" 2>/dev/null

# Non-TTY / background (close stdin to prevent blocking):
codex exec ... "prompt here" </dev/null 2>/dev/null

# Session resume:
echo "new prompt" | codex exec --skip-git-repo-check resume --last 2>/dev/null
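
The pattern above can be wrapped in a small function so the stdin and stderr redirections are never forgotten. The sketch below is illustrative only: run_codex and the CODEX_BIN override are hypothetical names (CODEX_BIN is not a Codex feature) and exist so the wrapper can be dry-run without Codex installed.

```shell
# Sketch of a non-interactive wrapper around the invocation pattern above.
# CODEX_BIN is a hypothetical override so the wrapper can be dry-run with
# another program, e.g. CODEX_BIN=echo prints the assembled arguments.
run_codex() {
  # $1 = model, $2 = reasoning effort, $3 = sandbox mode, $4 = prompt
  "${CODEX_BIN:-codex}" exec -m "$1" \
    --config "model_reasoning_effort=$2" \
    --sandbox "$3" \
    --full-auto \
    --skip-git-repo-check \
    "$4" </dev/null 2>/dev/null   # close stdin; drop stderr (thinking tokens)
}
```

With Codex installed, a call would take the form run_codex <model> <level> <mode> "prompt here". Setting CODEX_BIN=echo first prints the argument list instead of executing anything, which is a cheap way to review the flags before a real run.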

Target AI Tools

Primary: Claude Code (as host/orchestrator)
Worker: Codex CLI (invoked via Bash)

Components

Skills (1)

Name	Description	Trigger
`codex`	Use when the user asks to run Codex CLI (codex exec, codex resume) or references OpenAI Codex for code analysis, refactoring, or automated editing	Skill activation phrase or explicit Codex request

Commands

None — the skill is the sole interface.

Hooks

None.

Scripts

None.

The Single SKILL.md

The entire framework is one file: plugins/skill-codex/skills/codex/SKILL.md. It covers:

Model selection — ask user for model + reasoning effort in one prompt (gpt-5.5, gpt-5.4, gpt-5.4-mini, gpt-5.3-codex-spark, or gpt-5.3-codex) + effort level (xhigh/high/medium/low)
Sandbox mode selection — default read-only, workspace-write for edits, danger-full-access for network
Command assembly — with all required flags (--full-auto, --skip-git-repo-check, 2>/dev/null, </dev/null for non-TTY)
Session resume — echo "prompt" | codex exec --skip-git-repo-check resume --last 2>/dev/null
AskUserQuestion follow-up — mandatory after every codex command
Critical evaluation protocol — how Claude should handle Codex disagreements
Error handling — non-zero exit codes, permission confirmations before high-impact flags

Quick Reference Table (from SKILL.md)

Use case	Sandbox mode	Key flags
Read-only review	`read-only`	`--sandbox read-only 2>/dev/null`
Apply local edits	`workspace-write`	`--sandbox workspace-write --full-auto 2>/dev/null`
Permit network	`danger-full-access`	`--sandbox danger-full-access --full-auto 2>/dev/null`
Resume session	Inherited	`echo "prompt" | codex exec --skip-git-repo-check resume --last 2>/dev/null`
Run from directory	Match needs	`-C <DIR>` plus other flags
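
The first three rows of the table can be read as a lookup from use case to flags. The sketch below encodes that lookup; the function name sandbox_flags and the keywords review, edit, and network are hypothetical labels chosen for illustration, while the flag strings are copied from the table.

```shell
# Hypothetical mapping from a use-case keyword to the sandbox flags listed
# in the quick reference table. Keywords are illustrative, not skill syntax.
sandbox_flags() {
  case "$1" in
    review)  printf '%s\n' "--sandbox read-only" ;;
    edit)    printf '%s\n' "--sandbox workspace-write --full-auto" ;;
    network) printf '%s\n' "--sandbox danger-full-access --full-auto" ;;
    *)       printf 'unknown use case: %s\n' "$1" >&2; return 1 ;;
  esac
}
```

Defaulting callers to review keeps the least-privileged mode as the baseline, which matches the skill's stated default of read-only.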

Prompts

Prompt 1: SKILL.md — stdin Blocking Warning (verbatim)

Source: plugins/skill-codex/skills/codex/SKILL.md

**IMPORTANT (stdin)**: `codex exec` always reads stdin and concatenates it with the positional prompt -- even when the prompt is fully supplied as a positional argument. If stdin is not closed, codex blocks forever. When invoking from a harness (background tasks, hooks, scripts where stdin is not a TTY but also not closed), explicitly redirect stdin: append `</dev/null` to the command, e.g. `codex exec ... "prompt" </dev/null 2>/dev/null`. Symptom of getting this wrong: zero bytes of stdout, zero CPU accumulated, process appears hung indefinitely.

Prompting technique: Failure-mode documentation with diagnostic signature. The skill documents the exact observable symptom of the bug ("zero bytes of stdout, zero CPU accumulated, process appears hung indefinitely"), making it self-diagnosable. This level of operational detail is unusual in skill files.
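
The blocking behaviour can be reproduced without Codex using a stand-in that reads stdin to EOF, as the warning says codex exec does. The reads_stdin function below is a hypothetical illustration of that behaviour, not Codex itself.

```shell
# Stand-in for a program that, like codex exec as described above, reads
# stdin to EOF and appends it to its positional prompt. Not Codex itself.
reads_stdin() {
  extra=$(cat)                     # blocks until stdin reaches EOF
  printf '%s%s\n' "$1" "$extra"
}
```

Running reads_stdin "prompt" </dev/null returns immediately with only the prompt, because /dev/null yields EOF at once. If stdin is instead an open pipe that nobody writes to or closes, cat never sees EOF and the call hangs with no output, which matches the symptom quoted above.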

Prompt 2: SKILL.md — Peer AI Disagreement Protocol (verbatim)

Source: plugins/skill-codex/skills/codex/SKILL.md

### When Codex is Wrong
1. State your disagreement clearly to the user
2. Provide evidence (your own knowledge, web search, docs)
3. Optionally resume the Codex session to discuss the disagreement. **Identify yourself as Claude** so Codex knows it's a peer AI discussion. Use your actual model name (e.g., the model you are currently running as) instead of a hardcoded name:
   ```bash
   echo "This is Claude (<your current model name>) following up. I disagree with [X] because [evidence]. What's your take on this?" | codex exec --skip-git-repo-check resume --last 2>/dev/null
   ```
4. Frame disagreements as discussions, not corrections - either AI could be wrong
5. Let the user decide how to proceed if there's genuine ambiguity


Prompting technique: Identity-injection for peer AI communication. Claude Code is instructed to identify itself by name when resuming a Codex session to dispute findings, creating a named peer-AI dialogue. This is the most sophisticated AI-to-AI communication protocol in the corpus.

Prompt 3: SKILL.md — Thinking Token Suppression (verbatim)

Source: plugins/skill-codex/skills/codex/SKILL.md

IMPORTANT: By default, append 2>/dev/null to all codex exec commands to suppress thinking tokens (stderr). Only show stderr if the user explicitly requests to see thinking tokens or if debugging is needed.


Prompting technique: Default-behavior override with opt-in exception. By making 2>/dev/null the default and requiring explicit user request to show thinking tokens, the skill prevents the most common source of context bloat in Codex integrations.

Uniqueness

differs_from_seeds

The closest seed is ccmemory (single-skill Claude Code plugin for external tool integration). However, codex-integration-skill differs by: (1) targeting Codex CLI execution rather than graph memory; (2) the explicit critical evaluation protocol — Claude Code is instructed to dispute Codex if wrong, not just relay its output; (3) the stdin-blocking documentation (a genuine operational gotcha undocumented anywhere in the seeds). The codex-plugin-cc (not a seed) is the closest analog — both are Claude Code plugins that invoke Codex — but codex-integration-skill uses direct codex exec Bash invocation rather than the codex-companion.mjs script abstraction, making it simpler but less feature-complete (no background job tracking, no status/cancel commands).

Most Unusual Feature

The identity-injection peer-AI disagreement protocol: Claude Code is instructed to identify itself by model name when resuming a Codex session to dispute findings ("This is Claude (<your current model name>) following up..."). This creates a documented AI-to-AI peer discussion protocol where neither model is definitively the authority — unique in the corpus.

Why 1,269 Stars?

The skill addresses three practical Codex CLI integration problems that developers commonly hit when running Codex as a subprocess:

stdin blocking (a hung process when stdin is left open in a non-TTY context)
thinking token context bloat
session resume syntax

The skill is essentially a curated operational guide for Codex-in-Claude integration, which likely explains its outsized popularity relative to its simplicity.

Observable Failure Modes

--skip-git-repo-check permission gate: The skill requires user confirmation before using this flag, but once confirmed, subsequent uses may not prompt again, potentially running Codex outside git repos unintentionally.
Model version staleness: The model list (gpt-5.5, gpt-5.4, etc.) in SKILL.md may become stale as OpenAI releases new models — the skill doesn't have a mechanism to auto-update this list.
Resume flag restriction: codex exec resume --last cannot accept model/effort flags, meaning resumed sessions inherit original settings even if the user wants to change them.

Workflow

Standard Flow

1. User asks Claude Code to use Codex (or skill activates automatically)
2. Skill asks: which model? which reasoning effort? (single AskUserQuestion with 2 questions)
3. Skill determines appropriate sandbox mode
4. Skill assembles codex exec command with all flags
5. Execute: codex exec ... "prompt" </dev/null 2>/dev/null
6. Capture stdout, summarize for user
7. Ask: next steps? (AskUserQuestion mandatory)
8. If resume: echo "new prompt" | codex exec --skip-git-repo-check resume --last 2>/dev/null
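
Steps 5 and 6, together with the skill's non-zero exit handling, can be sketched as a small helper. The run_step name and its diagnostic message are hypothetical; the helper runs any command with stdin closed and stderr dropped, so it can be tried with ordinary commands before pointing it at codex exec.

```shell
# Hypothetical helper: run a command with stdin closed and stderr dropped,
# print its stdout on success, or a short diagnostic on failure.
run_step() {
  if out=$("$@" </dev/null 2>/dev/null); then
    printf '%s\n' "$out"
  else
    status=$?                      # exit status of the failed command
    printf 'command failed with exit status %s\n' "$status"
    return "$status"
  fi
}
```

Because the helper returns the original exit status, a caller can branch on it to decide whether to summarize the output or ask the user how to proceed.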

Phases + Artifacts

Phase	Artifact
Configuration	Model + effort + sandbox selection
Execution	codex exec stdout
Evaluation	Claude's critical review of Codex output
Resume (optional)	Continued Codex session

Approval Gates

Gate	Type
Model selection	`choice-list` (AskUserQuestion)
High-impact flag confirmation (`--full-auto`, `danger-full-access`, `--skip-git-repo-check`)	`yes-no`
Non-zero exit / warnings	`freetext-clarify`
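
The yes-no gate in the second row amounts to scanning the assembled arguments for any of the three high-impact values. The sketch below shows one way to express that check; needs_confirmation is a hypothetical name, and in the skill itself the gate is enforced by instructions to Claude rather than by a script.

```shell
# Hypothetical check: succeed (exit 0) if any argument is one of the
# high-impact values that the skill says require user confirmation.
needs_confirmation() {
  for arg in "$@"; do
    case "$arg" in
      --full-auto|danger-full-access|--skip-git-repo-check) return 0 ;;
    esac
  done
  return 1
}
```

For example, needs_confirmation --sandbox read-only fails (no gate needed), while needs_confirmation --sandbox danger-full-access succeeds and would trigger the confirmation prompt.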

Critical Evaluation Protocol

After every Codex run, Claude Code is instructed to:

Trust its own knowledge — if Codex says something incorrect, push back
Research disagreements via WebSearch before accepting
If disputing: identify as "Claude (<your current model name>)" in a resume call
Frame as peer discussion, not correction
Let user decide if there's genuine ambiguity

This makes every Codex run a two-step process: execution + Claude's own evaluation.
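
The disagreement message follows a fixed template, so it can be generated from three inputs. The sketch below fills in the template quoted from SKILL.md; disagreement_prompt is a hypothetical helper, and the model name passed to it is whatever model Claude is actually running as (the value used in any example call is a placeholder).

```shell
# Hypothetical helper that fills in the SKILL.md disagreement template.
# $1 = current model name, $2 = disputed claim, $3 = evidence
disagreement_prompt() {
  printf "This is Claude (%s) following up. I disagree with %s because %s. What's your take on this?\n" "$1" "$2" "$3"
}

# Intended use (not executed here): pipe the message into a resumed session.
#   disagreement_prompt "<model>" "<claim>" "<evidence>" \
#     | codex exec --skip-git-repo-check resume --last 2>/dev/null
```

Keeping the model name as a parameter reflects the skill's instruction to avoid hardcoding it.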

Session Resume

The skill explicitly encourages resume-based follow-up:

"After every codex command, immediately use AskUserQuestion to confirm next steps... When resuming, pipe the new prompt via stdin."

Memory and Context

State Storage

Codex session state: Codex CLI maintains its own internal session state. The skill accesses this via codex exec --skip-git-repo-check resume --last (stdin-piped prompt to the most recent session).
No Claude-side persistence: The skill does not write any files or maintain state outside Codex CLI's native session management.

Session Resume

The skill explicitly documents the resume pattern:

echo "new prompt" | codex exec --skip-git-repo-check resume --last 2>/dev/null

Key constraint: --last only works without other config flags (-m, --config, etc.) — the resumed session inherits the original settings.
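
A caller that assembles resume commands programmatically could guard against this constraint before invoking Codex. The sketch below is an assumption-laden illustration: resume_args_ok is a hypothetical name, and it only rejects the two flags named above (-m and --config), not every option the "etc." may cover.

```shell
# Hypothetical guard: fail if a resume invocation carries -m or --config,
# the flags the document says cannot be combined with resume --last.
resume_args_ok() {
  for arg in "$@"; do
    case "$arg" in
      -m|--config) return 1 ;;
    esac
  done
  return 0
}
```

If the guard fails, the practical options are to drop the flags and accept the inherited settings, or to start a new session with the desired model and effort.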

Context Bloat Prevention

Two mechanisms:

2>/dev/null suppresses Codex thinking tokens (stderr) by default
The skill instructs Claude to summarize Codex output rather than returning it raw, though the SKILL.md notes users can opt in to seeing raw output
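
The first mechanism, suppression by default with an explicit opt-in, can be sketched with a stand-in command. Everything here is hypothetical: fake_codex merely imitates a program that writes reasoning to stderr and its answer to stdout, and SHOW_THINKING is an illustrative switch, not a Codex or skill setting.

```shell
# Stand-in for Codex: writes "thinking" output to stderr, the answer to stdout.
fake_codex() {
  echo "thinking..." >&2
  echo "answer"
}

# Run a command with stderr dropped unless SHOW_THINKING=1 is set.
# SHOW_THINKING is a hypothetical opt-in switch used only in this sketch.
run_quiet() {
  if [ "${SHOW_THINKING:-0}" = "1" ]; then
    "$@"
  else
    "$@" 2>/dev/null
  fi
}
```

With the switch unset, run_quiet fake_codex emits only the answer line; with SHOW_THINKING=1 the stderr line is passed through as well, mirroring the opt-in described in SKILL.md.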

Compaction

Not handled. The skill defers to Codex CLI's native session management.

Memory Type

None (skill-level). Codex CLI maintains session state internally.

Orchestration

Multi-Agent

Two agents: Claude Code (orchestrator) and Codex (worker). Simple two-agent delegation — no subagent spawning.

Orchestration Pattern

Sequential (Claude asks → Codex executes → Claude evaluates → user decides next step).

Isolation Mechanism

Process — Codex runs as a separate OS process via Bash invocation.

Multi-Model

Yes — Claude (user's active CC model) orchestrates; Codex (user-selected model: gpt-5.5, gpt-5.4, gpt-5.4-mini, gpt-5.3-codex-spark, or gpt-5.3-codex) executes.

Codex Role

Worker only — Codex executes analysis, refactoring, or editing tasks under Claude Code's direction. Claude Code is explicitly the authority; Codex is the "colleague."

Execution Mode

Interactive-loop (single task per skill activation, with optional resume).

Consensus Mechanism

None. If Claude and Codex disagree, Claude has authority; user makes the final call on genuine ambiguities.

Prompt Chaining

Minimal — the skill can resume a Codex session and pipe a new prompt, but there is no multi-stage pipeline.

UI and CLI Surface

Dedicated CLI Binary

None. Installed as a Claude Code plugin; invoked via skill activation in Claude Code.

Skill Activation

The skill triggers on phrases like:

"Use codex to analyze..."
"Run Codex on..."
"Ask Codex to refactor..."
Any mention of codex exec or Codex CLI

Local UI

None.

IDE Integration

Claude Code only. The skill is structured as a Claude Code plugin with a standard plugin marketplace manifest.

Observability

Model and effort selection surfaced to user before every run
Codex stdout summarized in Claude session
AskUserQuestion mandatory after every Codex command — keeps user in the loop
Sandbox mode selection made explicit (read-only / workspace-write / danger-full-access)

Cross-Tool Portability

Low — Claude Code only (plugin system). However, the single SKILL.md could be extracted and used as a standalone Codex skill file in other contexts.

Distribution

Type: claude-plugin
License: MIT
Install: multi-step
Version: unknown (no semver; last commit 2026-05-25)

Surfaces

CLI binary: No
CLI subcmds: 0
Local UI: No
Tech stack: none

Components

Commands: 0
Skills: 1
Subagents: 0
Hooks: 0
MCP servers: 0
MCP tools: 0
Scripts: 0
Templates: 0

Workflow

Phases: 5
Approval gates: 3
Spec format: none
Spec storage: none
Delta or full: none

Orchestration

Multi-agent: Yes
Pattern: sequential
Max concurrent: 2
Isolation: process
Consensus: none
Prompt chaining: No

Multi-model

Multi-model: Yes
BYOK: Yes
Modal: text

Execution

Mode: interactive-loop
Crash recovery: No
Compaction: No
Session handoff: Yes
Streaming: No

Memory

Type: none
Persistence: session
Search: none

Quality

TDD: No
TDD mechanism: none
Self-review: inline-self

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: No
Audit format: none
Replay: No

Tools

Primary: claude-code
Targets: 2
Portability: low

Signals

Stars: 1.3k
Last commit: 2026-05-25
Contributors: 11
Maintainer: active
Quality score: 3/10