agentic-python-coder

szeider-python-coder · szeider/agentic-python-coder · ★ 3 · last commit 2026-02-15

Primitive shape 4 total

MCP tools 4

Summary

szeider-python-coder — Summary

agentic-python-coder is a dual-mode Python execution framework: a coder CLI (a ReAct agent built with LangGraph that calls models through OpenRouter) and an ipython_mcp MCP server that exposes a persistent IPython kernel to any MCP-compatible client. The coder agent takes a task (inline, from a file, or interactively), executes Python iteratively with python_exec, verifies the solution against every constraint before calling save_code, and then terminates. The MCP server supports multiple kernels so that parallel agents can work in isolation. Both modes rely on a persistent IPython kernel for stateful execution. The package is published on PyPI as agentic-python-coder. With 3 stars and a last commit in February 2026, the project is notable mainly for its multi-kernel MCP design and its academic research context (it is associated with an arXiv paper).

Differs from seeds: No seed ships a standalone Python execution agent. The closest is claude-flow's MCP-anchored toolserver, but here the MCP server is minimal (4 tools: execute, reset, status, interrupt) and the primary value is state persistence across executions (IPython kernel state). Unlike all seeds, this is a LangGraph-based agent, not a Claude Code plugin.

Overview

szeider-python-coder — Overview

Origin

Created by Stefan Szeider (szeider), a computer science professor at TU Wien. Associated with an arXiv paper on constraint modelling applications [Szeider 2025, arxiv-2508.07468]. Licensed under Apache-2.0.

Philosophy

The system prompt is explicit:

"You are a Python coding assistant designed to solve focused problems efficiently."

"Verify Before Saving: Before calling save_code, you MUST verify your solution: Execute the full script via python_exec and confirm it produces correct output. For constraint/logic problems: write a verification function that checks the output against EVERY constraint in the problem statement using plain Python asserts, independent of your solver model."

The agent is deliberately minimalist:

One task at a time
One tool to execute code (python_exec)
One tool to save code (save_code)
Mandatory verification before saving
Stop when done — no extra features

Research Context

The coder agent is designed for computational constraint problems (CPMpy, Clingo, regex); the examples show it solving the 8-queens problem and CSP instances. The arXiv paper describes using this agent for automated constraint model generation and verification.

Dual-Mode Design

Mode A (coder CLI): Autonomous ReAct agent, runs to completion, saves one Python file
Mode B (ipython_mcp): Stateful IPython kernel exposed as MCP tools — usable from Claude Desktop, Claude Code, or any MCP client for interactive Python sessions

Architecture

szeider-python-coder — Architecture

Distribution

PyPI package.

# Install as CLI tool
uv tool install agentic-python-coder

# Or as MCP server (no install needed)
uvx --from agentic-python-coder ipython_mcp

Directory Tree

.
├── coder/
│   ├── src/
│   │   └── (LangGraph ReAct agent implementation)
│   ├── prompts/
│   │   ├── system.md          # System prompt for fileless mode
│   │   └── system_todo.md     # System prompt for file-with-todo mode
│   └── tests/
├── pyproject.toml              # Package: agentic-python-coder, Apache-2.0
├── README.md
└── LICENSE

MCP Server Configuration

{
  "mcpServers": {
    "ipython": {
      "command": "uvx",
      "args": ["--from", "agentic-python-coder", "ipython_mcp"]
    }
  }
}
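The server entry above usually has to be merged into a client configuration file that may already list other servers. The following is a minimal sketch of such a merge, not part of the project: the helper names `add_ipython_server` and `update_config_file` are hypothetical, and the location of the config file depends on the MCP client in use.

```python
import json
from pathlib import Path

# Server entry copied from the configuration shown above.
IPYTHON_SERVER = {
    "command": "uvx",
    "args": ["--from", "agentic-python-coder", "ipython_mcp"],
}


def add_ipython_server(config: dict) -> dict:
    """Return a copy of `config` with the ipython server registered (hypothetical helper)."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["ipython"] = dict(IPYTHON_SERVER)
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Merge the entry into a JSON config file on disk, creating it if missing."""
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    updated = add_ipython_server(existing)
    path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")


if __name__ == "__main__":
    # Print the merged result for an empty starting config.
    print(json.dumps(add_ipython_server({}), indent=2))
```

Because the merge copies the existing `mcpServers` mapping before adding the entry, other registered servers are left untouched.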

Required Runtime

Python 3.13
UV package manager
OpenRouter API key (for coder CLI)
MCP client (for ipython_mcp server)

Supported Models (via OpenRouter)

deepseek31, gemini25, gemini3flash, gemini3pro, gpt52, grok41, opus45, qwen3, sonnet45

Components

szeider-python-coder — Components

CLI Tools (coder agent)

| Tool | Description |
| --- | --- |
| `python_exec` | Execute Python code in persistent IPython kernel; returns JSON: success, stdout, result, stderr, error |
| `save_code` | Save final verified solution to `{basename}_code.py`; call ONCE after verification passes |

MCP Server Tools (ipython_mcp)

| Tool | Description |
| --- | --- |
| `python_exec` | Execute Python code; auto-starts session if needed; default 30s timeout |
| `python_reset` | Create new kernel (no kernel_id) OR reset existing kernel (with kernel_id); optionally install packages |
| `python_status` | Check session state: active flag, kernel IDs, Python version, packages, variables |
| `python_interrupt` | Send interrupt signal to stop long-running code; session state preserved |

CLI Options (coder)

| Flag | Description |
| --- | --- |
| `--version`, `-V` | Show version |
| `--init [TEMPLATE]` | Initialize example templates (cpmpy, clingo, regex, or all) |
| `--task FILE` | Task from file |
| `--dir DIR` | Working directory |
| `--with PACKAGES` | Extra Python packages to install |
| `--project FILE` | Project template context |
| `-i` | Interactive mode |
| `--api-key KEY` | OpenRouter API key |

System Prompts (2)

coder/prompts/system.md — Fileless mode (inline task)
coder/prompts/system_todo.md — File+todo mode (task from file)

Prompts

szeider-python-coder — Prompt Excerpts

Excerpt 1: `system.md` — Core verification mandate

Technique: Mandatory pre-save verification with constraint-independence requirement; anti-trust-solver warning.

2. **Verify Before Saving**: Before calling save_code, you MUST verify your solution:
   - Execute the full script via python_exec and confirm it produces correct output
   - For constraint/logic problems: write a verification function that checks the output against EVERY constraint in the problem statement using plain Python asserts, independent of your solver model
   - For problems with a specific output format: assert that JSON keys, array shapes, and value ranges match the spec exactly
   - For optimization: confirm optimality (e.g., re-solve with a stricter bound and confirm infeasibility)
   - Do NOT trust that solver.solve()==True means your model is correct — your constraints may be wrong
3. **Save Once**: Call save_code only after verification passes
4. **Stop When Done**: Don't add features not requested
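The optimization bullet in this excerpt (re-solve with a stricter bound and confirm infeasibility) can be illustrated on a toy problem. This is a sketch under stated assumptions: the knapsack instance is made up, and `find_feasible` is a brute-force stand-in for a real solver call with an added bound constraint, not the project's code.

```python
from itertools import combinations

# Hypothetical toy instance: (weight, value) pairs and a capacity.
ITEMS = [(3, 4), (4, 5), (2, 3), (5, 8)]
CAPACITY = 9


def weight(subset):
    return sum(ITEMS[i][0] for i in subset)


def value(subset):
    return sum(ITEMS[i][1] for i in subset)


def find_feasible(min_value):
    """Stand-in solver: return any subset within capacity with value >= min_value, else None."""
    for size in range(len(ITEMS) + 1):
        for subset in combinations(range(len(ITEMS)), size):
            if weight(subset) <= CAPACITY and value(subset) >= min_value:
                return subset
    return None


def confirm_optimal(claimed):
    """Check feasibility, then demand one more unit of value and expect infeasibility."""
    assert weight(claimed) <= CAPACITY, "claimed solution exceeds capacity"
    assert find_feasible(value(claimed) + 1) is None, "a strictly better solution exists"
    return True


if __name__ == "__main__":
    # Items 1 and 3 have total weight 9 and total value 13.
    print(confirm_optimal((1, 3)))
```

A claimed solution such as `(0, 3)` (value 12) fails the second assertion, because the stricter bound of 13 is still satisfiable.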

Excerpt 2: `system.md` — Error recovery protocol

Technique: Fallback hierarchy for common errors; module-not-found prefers built-in solution first.

## Error Recovery

- **ModuleNotFoundError**: Try to solve with built-in modules first
- **Syntax/Logic Errors**: Debug iteratively with python_exec
- **Unclear Requirements**: Document assumptions and proceed
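The ModuleNotFoundError rule can be pictured as an import fallback. The sketch below is only an illustration of that idea: the choice of numpy as the optional package and the `mean` helper are assumptions, not taken from the project.

```python
try:
    import numpy as np  # optional third-party dependency (illustrative choice)

    def mean(values):
        """Average using numpy when it is installed."""
        return float(np.mean(values))

except ModuleNotFoundError:
    from statistics import fmean  # built-in fallback

    def mean(values):
        """Average using only the standard library."""
        return fmean(values)


if __name__ == "__main__":
    print(mean([1, 2, 3]))
```

Either branch exposes the same `mean` function, so the rest of a script does not need to know which one was taken.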

Excerpt 3: `system.md` — Code cleaning requirements

Technique: Pre-save checklist enforcing production-grade output over development scaffolding.

## Code Cleaning Requirements

Before saving any code with save_code, your script MUST pass this checklist:
- Remove ALL print() statements except final output (JSON or required output)
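A checklist item like this one could be checked mechanically before saving. The following sketch, which is not part of the project, uses the standard `ast` module to list every `print(...)` call so that anything other than the final output line can be reviewed; the function name `find_print_calls` and the sample script are hypothetical.

```python
import ast


def find_print_calls(source: str) -> list[int]:
    """Return the line numbers of every print(...) call in `source`."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "print"
    )


# Hypothetical script: line 3 is debug output, line 4 is the required final output.
SCRIPT = '''\
import json
x = 2 + 2
print("debug:", x)
print(json.dumps({"answer": x}))
'''

if __name__ == "__main__":
    print(find_print_calls(SCRIPT))  # [3, 4]
```

The helper only reports locations; deciding which call is the legitimate final output is left to the reviewer.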

Excerpt 4: `system.md` — Available tools description

Technique: JSON return format documented so the agent knows what to parse.

1. **python_exec**: Execute Python code in a persistent IPython kernel
   - The kernel maintains state between executions
   - Variables, functions, and imports persist across calls
   - Use print() for output, or the last expression will be returned
   - Returns JSON with: success, stdout, result, stderr, error
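The behaviour described in this excerpt (state that persists between calls, the last expression returned, and a JSON reply with five keys) can be mimicked in a few lines of standard-library Python. This is a toy model and not the project's implementation: a plain dictionary stands in for the IPython kernel, the name `toy_python_exec` is hypothetical, and the value types of the five fields (for example `result` as a `repr` string) are assumptions.

```python
import ast
import contextlib
import io
import json
import traceback

_NAMESPACE: dict = {}  # shared across calls, standing in for kernel state


def toy_python_exec(code: str) -> str:
    """Run `code` in the shared namespace and return a JSON string with the five keys."""
    stdout, stderr = io.StringIO(), io.StringIO()
    result, error = None, None
    try:
        tree = ast.parse(code)
        last_expr = None
        # Split off a trailing expression so its value can be reported as `result`.
        if tree.body and isinstance(tree.body[-1], ast.Expr):
            last_expr = ast.Expression(tree.body.pop().value)
        with contextlib.redirect_stdout(stdout), contextlib.redirect_stderr(stderr):
            exec(compile(tree, "<cell>", "exec"), _NAMESPACE)
            if last_expr is not None:
                value = eval(compile(last_expr, "<cell>", "eval"), _NAMESPACE)
                result = None if value is None else repr(value)
    except Exception:
        error = traceback.format_exc()
    return json.dumps({
        "success": error is None,
        "stdout": stdout.getvalue(),
        "result": result,
        "stderr": stderr.getvalue(),
        "error": error,
    })


if __name__ == "__main__":
    toy_python_exec("x = 21")                   # defines x in the shared namespace
    print(json.loads(toy_python_exec("x * 2")))  # a later call still sees x
```

A caller would typically check `success` first and read `error` only when it is false.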

Uniqueness

szeider-python-coder — Uniqueness

Differs from Seeds

No seed is a standalone Python execution agent with its own LLM runtime. The closest is claude-flow's MCP-anchored toolserver (own runtime, own tools), but this project is purpose-built for Python code execution rather than general Claude Code orchestration. The multi-kernel MCP design, in which each agent gets a kernel_id for isolated state, has no counterpart among the seeds: none implements kernel-level isolation for parallel agents. The mandatory pre-save verification with constraint-independent assertions (the "do NOT trust solver.solve()==True" instruction) is the most explicit self-verification requirement in the corpus. OpenRouter support for 9 models makes it the most flexible entry in terms of LLM choice, and it is not tied to Claude.

Positioning

Niche but technically sophisticated. The primary use case is automated constraint problem solving (CPMpy, Clingo), as described in the associated arXiv paper. The secondary use case is providing any MCP client with a persistent Python execution environment.

Observable Failure Modes

OpenRouter dependency: Requires OpenRouter API key; cannot use Anthropic API directly without code changes.
Python 3.13 requirement: Restricts deployment environments.
No streaming output: Results are returned at tool completion, not streamed.
Kernel resource management: Parallel multi-kernel use without explicit cleanup can exhaust system resources.

Explicit Antipatterns

Trusting solver.solve()==True without independent constraint verification
Calling save_code before verification passes
Adding features not explicitly requested
Not removing debug print() statements before saving

Workflow

szeider-python-coder — Workflow

Coder Agent Workflow (one-shot)

| Phase | Activity | Tool |
| --- | --- | --- |
| 1 | Understand task | Read task/file |
| 2 | Plan approach | Reasoning (no tool) |
| 3 | Iterative development | `python_exec` (repeated) |
| 4 | Verification | `python_exec` (full script run + constraint assertions) |
| 5 | Save final code | `save_code` (ONCE, after verification passes) |
| 6 | Stop | Agent terminates |

Verification Gate (mandatory)

Before calling save_code:

Execute full script and confirm correct output
For constraint/logic problems: write a verification function with plain Python asserts, independent of the solver model
For problems with specific output format: assert JSON keys, array shapes, value ranges
For optimization: confirm optimality (re-solve with stricter bound, confirm infeasibility)
"Do NOT trust that solver.solve()==True means your model is correct"

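For the 8-queens example mentioned earlier, a solver-independent check in the spirit of the verification gate might look like the sketch below. It is an illustration only: the function name `verify_queens` and the row-to-column list encoding are assumptions, and the point is that the asserts restate the problem's constraints in plain Python without reusing any solver model.

```python
def verify_queens(cols: list[int], n: int = 8) -> None:
    """Check an n-queens placement; cols[r] is the column of the queen in row r."""
    assert len(cols) == n, "need exactly one queen per row"
    assert all(0 <= c < n for c in cols), "column out of range"
    assert len(set(cols)) == n, "two queens share a column"
    assert len({r + c for r, c in enumerate(cols)}) == n, "two queens share an anti-diagonal"
    assert len({r - c for r, c in enumerate(cols)}) == n, "two queens share a diagonal"


if __name__ == "__main__":
    candidate = [0, 4, 7, 5, 2, 6, 1, 3]  # output a solver might produce
    verify_queens(candidate)              # raises AssertionError if any constraint fails
    print("all constraints hold")
```

If the solver model had, say, omitted the diagonal constraints, it could still report success, while this check would fail on the placement it returned.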
MCP Server Workflow (multi-kernel)

Agent A                              Agent B
────────                             ────────
python_reset() → kernel_id="aaa"     python_reset() → kernel_id="bbb"
python_exec(kernel_id="aaa", ...)    python_exec(kernel_id="bbb", ...)
python_exec(kernel_id="aaa", ...)    python_exec(kernel_id="bbb", ...)

Artifacts

| Phase | Artifact |
| --- | --- |
| Development | Code in IPython kernel (stateful) |
| Save | `{basename}_code.py` |
| Logging | `{basename}.jsonl` (execution log) |

Memory & Context

szeider-python-coder — Memory & Context

State Storage

IPython kernel maintains state across python_exec calls within a session:

Variables persist
Imports persist
Function definitions persist
Package installations persist

Cross-Session Handoff

No automatic handoff. JSONL execution log ({basename}.jsonl) provides a record of the session.

Multi-Kernel State

Each python_reset() with no kernel_id creates a new isolated kernel. State is scoped to kernel_id. Multiple agents can run isolated parallel sessions.

Context from Project Templates

--project FILE passes a project template markdown file as additional context — used for domain-specific constraints (e.g., CPMpy usage patterns, Clingo syntax).

Compaction Handling

None. The agent is designed for focused, short tasks where context overload is unlikely.

Orchestration

szeider-python-coder — Orchestration

Multi-Agent Pattern

Parallel (via MCP multi-kernel) — each agent gets its own kernel_id for isolated execution.

Execution Mode

coder CLI: one-shot — takes a task, executes to completion, saves, terminates
ipython_mcp: event-driven — responds to MCP tool calls

Isolation Mechanism

Process-level isolation per kernel via IPython kernel management.

Multi-Model

Yes — supports 9 models via OpenRouter: deepseek31, gemini25, gemini3flash, gemini3pro, gpt52, grok41, opus45, qwen3, sonnet45. User-configured.

Auto-Validators

Mandatory self-verification: Before save_code, the agent executes the full script and writes independent constraint-checking assertions. This is a prompt-enforced validation, not a hook-triggered one.

Prompt Chaining

No explicit chaining. Single-task agent.

Consensus Mechanism

None.

LangGraph ReAct

The coder CLI uses LangGraph ReAct pattern — the agent loop is: Think → Act (python_exec) → Observe → repeat until save_code is called.

UI / CLI Surface

szeider-python-coder — UI / CLI Surface

CLI Binary

Yes — coder (installed via uv tool install agentic-python-coder)

Subcommands / Flags

coder "task description"           # Inline task
coder --task problem.md            # Task from file
coder --dir results/test1 "task"   # Specify working directory
coder -i                           # Interactive mode
coder --init cpmpy                 # Initialize example templates
coder --with cpmpy --project ...   # Extra packages + project context

MCP Server

ipython_mcp — exposed via uvx --from agentic-python-coder ipython_mcp

Local UI

None. Terminal output only.

IDE Integration

Compatible with any MCP client:

Claude Desktop (primary MCP target)
Claude Code (via .mcp.json)
Any MCP-compatible agent

Observability

JSONL execution log ({basename}.jsonl)
No session replay beyond the log
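Since the log is JSONL (one JSON object per line), it can be inspected with the standard library. The sketch below assumes nothing about the log's schema: the record fields written by `demo` are made up for illustration, and `read_jsonl` is a hypothetical helper.

```python
import json
import tempfile
from pathlib import Path


def read_jsonl(path: Path) -> list[dict]:
    """Parse a JSONL file into a list of records, skipping blank lines."""
    records = []
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def demo() -> list[dict]:
    """Write a small made-up log to a temporary directory and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "task.jsonl"
        log.write_text(
            '{"event": "exec", "ok": true}\n\n{"event": "save", "ok": true}\n',
            encoding="utf-8",
        )
        return read_jsonl(log)


if __name__ == "__main__":
    for record in demo():
        print(record)
```

Reading line by line keeps memory use flat for long sessions and tolerates a partially written final line being handled separately if needed.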


Distribution

Type: cli-tool
License: Apache-2.0
Install: one-liner
Version: main (commit 2026-02-15)

Surfaces

CLI binary: coder
CLI subcmds: 6
Local UI: No

Components

Commands: 0
Skills: 0
Subagents: 0
Hooks: 0
MCP servers: 1
MCP tools: 4
Scripts: 0
Templates: 2

Workflow

Phases: 6
Approval gates: 1
Spec format: none
Spec storage: none
Delta or full: none

Orchestration

Multi-agent: Yes
Pattern: parallel-fan-out
Isolation: process
Consensus: none
Prompt chaining: No

Multi-model

Multi-model: Yes
BYOK: Yes
Modal: text

Execution

Mode: one-shot
Crash recovery: No
Compaction: No
Session handoff: No
Streaming: No

Memory

Type: none
Persistence: session
Search: none
State files: 1 file

Quality

TDD: Yes
TDD mechanism: pre-impl-test-write
Self-review: inline-self

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: Yes
Audit format: jsonl
Replay: No

Tools

Primary: standalone
Targets: 3
Portability: high

Signals

Stars: 3
Last commit: 2026-02-15
Contributors: 1
Maintainer: dormant
Quality score: 5.2/10
