
agentic-python-coder

szeider-python-coder · szeider/agentic-python-coder · ★ 3 · last commit 2026-02-15

Primitive shape 4 total
MCP tools 4
00

Summary

szeider-python-coder — Summary

agentic-python-coder is a dual-mode Python execution framework: a coder CLI (a ReAct agent built on LangGraph that calls models through OpenRouter) and an ipython_mcp MCP server that exposes a persistent IPython kernel to any MCP-compatible client. The coder agent takes a task (inline, from a file, or interactively), executes Python iteratively with python_exec, verifies the solution against every constraint before calling save_code, and then terminates. The MCP server supports multiple kernels so that parallel agents can each hold isolated state. Both modes rely on a persistent IPython kernel for stateful execution. The package is published on PyPI as agentic-python-coder. With only 3 stars and a last commit in February 2026, the project is notable mainly for its multi-kernel MCP design and its academic research context (it is associated with a published arXiv paper).

Differs from seeds: No seed ships a standalone Python execution agent. The closest is claude-flow's MCP-anchored toolserver, but here the MCP server is minimal (4 tools: python_exec, python_reset, python_status, python_interrupt) and the primary value is state persistence across executions (IPython kernel state). Unlike all seeds, this is a LangGraph-based agent, not a Claude Code plugin.

01

Overview

szeider-python-coder — Overview

Origin

Created by Stefan Szeider (szeider), a computer science professor at TU Wien. Associated with a published arXiv paper on constraint modelling applications [Szeider 2025, arxiv-2508.07468]. Licensed under Apache-2.0.

Philosophy

The system prompt is explicit:

"You are a Python coding assistant designed to solve focused problems efficiently."

"Verify Before Saving: Before calling save_code, you MUST verify your solution: Execute the full script via python_exec and confirm it produces correct output. For constraint/logic problems: write a verification function that checks the output against EVERY constraint in the problem statement using plain Python asserts, independent of your solver model."

The agent is deliberately minimalist:

  • One task at a time
  • One tool to execute code (python_exec)
  • One tool to save code (save_code)
  • Mandatory verification before saving
  • Stop when done — no extra features

Research Context

The coder agent is designed for computational constraint problems (CPMpy, Clingo, regex); its examples show it solving the 8-queens problem and CSP instances. The arXiv paper describes using this agent for automated constraint model generation and verification.
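
To make the "verify independently of the solver" idea concrete, here is a minimal sketch of a plain-assert checker for the 8-queens problem. It is not taken from the project. The function name verify_queens and the encoding (cols[r] is the column of the queen in row r) are assumptions chosen for illustration.

```python
from itertools import combinations

def verify_queens(cols, n=8):
    """Check a candidate placement with plain asserts, independent of any solver.

    Assumed encoding: cols[r] is the column of the queen placed in row r.
    """
    assert len(cols) == n, "need exactly one queen per row"
    assert all(0 <= c < n for c in cols), "column out of range"
    assert len(set(cols)) == n, "two queens share a column"
    # Two queens attack diagonally when row distance equals column distance.
    for (r1, c1), (r2, c2) in combinations(enumerate(cols), 2):
        assert abs(r1 - r2) != abs(c1 - c2), f"rows {r1} and {r2} share a diagonal"
    return True

# A candidate that a solver model might have produced.
candidate = [0, 4, 7, 5, 2, 6, 1, 3]
verify_queens(candidate)
```

Because the checker restates the problem's rules directly, a mistake in the solver model's constraints would surface here as a failed assert instead of passing silently.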

Dual-Mode Design

  • Mode A (coder CLI): Autonomous ReAct agent, runs to completion, saves one Python file
  • Mode B (ipython_mcp): Stateful IPython kernel exposed as MCP tools — usable from Claude Desktop, Claude Code, or any MCP client for interactive Python sessions
02

Architecture

szeider-python-coder — Architecture

Distribution

PyPI package.

# Install as CLI tool
uv tool install agentic-python-coder

# Or as MCP server (no install needed)
uvx --from agentic-python-coder ipython_mcp

Directory Tree

.
├── coder/
│   ├── src/
│   │   └── (LangGraph ReAct agent implementation)
│   ├── prompts/
│   │   ├── system.md          # System prompt for fileless mode
│   │   └── system_todo.md     # System prompt for file-with-todo mode
│   └── tests/
├── pyproject.toml              # Package: agentic-python-coder, Apache-2.0
├── README.md
└── LICENSE

MCP Server Configuration

{
  "mcpServers": {
    "ipython": {
      "command": "uvx",
      "args": ["--from", "agentic-python-coder", "ipython_mcp"]
    }
  }
}

Required Runtime

  • Python 3.13
  • UV package manager
  • OpenRouter API key (for coder CLI)
  • MCP client (for ipython_mcp server)

Supported Models (via OpenRouter)

deepseek31, gemini25, gemini3flash, gemini3pro, gpt52, grok41, opus45, qwen3, sonnet45

03

Components

szeider-python-coder — Components

CLI Tools (coder agent)

Tool         Description
python_exec  Execute Python code in persistent IPython kernel; returns JSON: success, stdout, result, stderr, error
save_code    Save final verified solution to {basename}_code.py; call ONCE after verification passes

MCP Server Tools (ipython_mcp)

Tool              Description
python_exec       Execute Python code; auto-starts session if needed; default 30s timeout
python_reset      Create new kernel (no kernel_id) OR reset existing kernel (with kernel_id); optionally install packages
python_status     Check session state: active flag, kernel IDs, Python version, packages, variables
python_interrupt  Send interrupt signal to stop long-running code; session state preserved

CLI Options (coder)

Flag               Description
--version, -V      Show version
--init [TEMPLATE]  Initialize example templates (cpmpy, clingo, regex, or all)
--task FILE        Task from file
--dir DIR          Working directory
--with PACKAGES    Extra Python packages to install
--project FILE     Project template context
-i                 Interactive mode
--api-key KEY      OpenRouter API key

System Prompts (2)

  • coder/prompts/system.md — Fileless mode (inline task)
  • coder/prompts/system_todo.md — File+todo mode (task from file)
05

Prompts

szeider-python-coder — Prompt Excerpts

Excerpt 1: system.md — Core verification mandate

Technique: Mandatory pre-save verification with constraint-independence requirement; anti-trust-solver warning.

2. **Verify Before Saving**: Before calling save_code, you MUST verify your solution:
   - Execute the full script via python_exec and confirm it produces correct output
   - For constraint/logic problems: write a verification function that checks the output against EVERY constraint in the problem statement using plain Python asserts, independent of your solver model
   - For problems with a specific output format: assert that JSON keys, array shapes, and value ranges match the spec exactly
   - For optimization: confirm optimality (e.g., re-solve with a stricter bound and confirm infeasibility)
   - Do NOT trust that solver.solve()==True means your model is correct — your constraints may be wrong
3. **Save Once**: Call save_code only after verification passes
4. **Stop When Done**: Don't add features not requested
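
The optimality check in the excerpt (re-solve with a stricter bound and confirm infeasibility) can be illustrated with a tiny knapsack instance. This is only a sketch: the instance data, the claimed optimum, and the brute-force search standing in for a real solver are all assumptions made for the example.

```python
from itertools import product

# Hypothetical instance: three items and a weight limit.
WEIGHTS = [3, 4, 5]
VALUES = [4, 5, 6]
CAPACITY = 8

def feasible_with_value_at_least(bound):
    """Return a 0/1 selection within capacity whose value is >= bound, or None."""
    for pick in product([0, 1], repeat=len(WEIGHTS)):
        weight = sum(w * p for w, p in zip(WEIGHTS, pick))
        value = sum(v * p for v, p in zip(VALUES, pick))
        if weight <= CAPACITY and value >= bound:
            return pick
    return None

claimed_optimum = 10  # value reported by the (hypothetical) solver model

# The claimed value must be reachable ...
assert feasible_with_value_at_least(claimed_optimum) is not None
# ... and the stricter bound must be infeasible, which confirms optimality.
assert feasible_with_value_at_least(claimed_optimum + 1) is None
```

With a real constraint solver the second check would be a fresh solve call with the tightened objective bound added as a constraint; the exhaustive loop above only plays that role for a three-item instance.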

Excerpt 2: system.md — Error recovery protocol

Technique: Fallback hierarchy for common errors; module-not-found prefers built-in solution first.

## Error Recovery

- **ModuleNotFoundError**: Try to solve with built-in modules first
- **Syntax/Logic Errors**: Debug iteratively with python_exec
- **Unclear Requirements**: Document assumptions and proceed

Excerpt 3: system.md — Code cleaning requirements

Technique: Pre-save checklist enforcing production-grade output over development scaffolding.

## Code Cleaning Requirements

Before saving any code with save_code, your script MUST pass this checklist:
- Remove ALL print() statements except final output (JSON or required output)

Excerpt 4: system.md — Available tools description

Technique: JSON return format documented so the agent knows what to parse.

1. **python_exec**: Execute Python code in a persistent IPython kernel
   - The kernel maintains state between executions
   - Variables, functions, and imports persist across calls
   - Use print() for output, or the last expression will be returned
   - Returns JSON with: success, stdout, result, stderr, error
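
As an illustration of what parsing that reply might look like on the consuming side, here is a small sketch. Only the five key names come from the prompt excerpt. The value types, the sample replies, and the helper name unpack_exec_result are assumptions.

```python
import json

EXPECTED_KEYS = {"success", "stdout", "result", "stderr", "error"}

def unpack_exec_result(raw):
    """Parse a python_exec reply and return its useful output, or raise on failure."""
    payload = json.loads(raw)
    missing = EXPECTED_KEYS - payload.keys()
    if missing:
        raise ValueError(f"reply is missing keys: {sorted(missing)}")
    if not payload["success"]:
        raise RuntimeError(payload["error"] or payload["stderr"] or "execution failed")
    # Prefer the value of the last expression; fall back to printed output.
    if payload["result"] is not None:
        return payload["result"]
    return payload["stdout"]

# Hypothetical replies, shaped only by the five documented keys.
ok_reply = json.dumps(
    {"success": True, "stdout": "", "result": "42", "stderr": "", "error": None}
)
bad_reply = json.dumps(
    {"success": False, "stdout": "", "result": None, "stderr": "",
     "error": "NameError: name 'x' is not defined"}
)

print(unpack_exec_result(ok_reply))
```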
09

Uniqueness

szeider-python-coder — Uniqueness

Differs from Seeds

No seed is a standalone Python execution agent with its own LLM runtime. Closest to claude-flow's MCP-anchored toolserver (own runtime, own tools) but purpose-built for Python code execution rather than general Claude Code orchestration. The multi-kernel MCP design (each agent gets a kernel_id for isolated state) is novel — no seed implements kernel-level isolation for parallel agents. The mandatory pre-save verification with constraint-independent assertions (the "do NOT trust solver.solve()==True" instruction) is the most explicit self-verification requirement in the corpus. The OpenRouter multi-model support (9 models) makes it the most flexible in terms of LLM choice, though it is not Claude-specific.

Positioning

Niche but technically sophisticated. Primary use case is automated constraint problem solving (CPMpy, Clingo) as described in the associated arxiv paper. Secondary use case is providing any MCP client with a persistent Python execution environment.

Observable Failure Modes

  1. OpenRouter dependency: Requires OpenRouter API key; cannot use Anthropic API directly without code changes.
  2. Python 3.13 requirement: Restricts deployment environments.
  3. No streaming output: Results are returned at tool completion, not streamed.
  4. Kernel resource management: Parallel multi-kernel use without explicit cleanup can exhaust system resources.

Explicit Antipatterns

  • Trusting solver.solve()==True without independent constraint verification
  • Calling save_code before verification passes
  • Adding features not explicitly requested
  • Not removing debug print() statements before saving
04

Workflow

szeider-python-coder — Workflow

Coder Agent Workflow (one-shot)

Phase  Activity               Tool
1      Understand task        Read task/file
2      Plan approach          Reasoning (no tool)
3      Iterative development  python_exec (repeated)
4      Verification           python_exec (full script run + constraint assertions)
5      Save final code        save_code (ONCE, after verification passes)
6      Stop                   Agent terminates

Verification Gate (mandatory)

Before calling save_code:

  • Execute full script and confirm correct output
  • For constraint/logic problems: write a verification function with plain Python asserts, independent of the solver model
  • For problems with specific output format: assert JSON keys, array shapes, value ranges
  • For optimization: confirm optimality (re-solve with stricter bound, confirm infeasibility)
  • "Do NOT trust that solver.solve()==True means your model is correct"
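
The output-format item in this gate could look roughly like the sketch below. The spec it checks (a JSON object with keys "size" and "board", where the board is an n-by-n grid of 0/1 cells) is invented for the example and does not come from the project.

```python
import json

def check_output_format(raw, n):
    """Assert that raw JSON output matches a hypothetical spec exactly."""
    data = json.loads(raw)
    # Keys must match the spec exactly: nothing missing, nothing extra.
    assert set(data) == {"size", "board"}, f"unexpected keys: {sorted(data)}"
    assert data["size"] == n, "size field does not match the instance"
    board = data["board"]
    # Array shape: n rows of n cells each.
    assert len(board) == n, "wrong number of rows"
    assert all(len(row) == n for row in board), "ragged or mis-sized row"
    # Value range: every cell is 0 or 1.
    assert all(cell in (0, 1) for row in board for cell in row), "cell out of range"
    return data

sample_output = json.dumps({"size": 2, "board": [[0, 1], [1, 0]]})
check_output_format(sample_output, n=2)
```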

MCP Server Workflow (multi-kernel)

Agent A                              Agent B
────────                             ────────
python_reset() → kernel_id="aaa"     python_reset() → kernel_id="bbb"
python_exec(kernel_id="aaa", ...)    python_exec(kernel_id="bbb", ...)
python_exec(kernel_id="aaa", ...)    python_exec(kernel_id="bbb", ...)

Artifacts

Phase        Artifact
Development  Code in IPython kernel (stateful)
Save         {basename}_code.py
Logging      {basename}.jsonl (execution log)
06

Memory Context

szeider-python-coder — Memory & Context

State Storage

IPython kernel maintains state across python_exec calls within a session:

  • Variables persist
  • Imports persist
  • Function definitions persist
  • Package installations persist

Cross-Session Handoff

No automatic handoff. JSONL execution log ({basename}.jsonl) provides a record of the session.
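
Since the log is JSONL (one JSON object per line), a later session could at least replay it as data. The sketch below shows one way to read such a file. The record schema is not documented here, so the "event" and "success" fields in the sample are purely hypothetical, and an in-memory stream stands in for the real file.

```python
import io
import json
from collections import Counter

def read_jsonl(stream):
    """Parse one JSON object per non-empty line."""
    records = []
    for line in stream:
        line = line.strip()
        if line:
            records.append(json.loads(line))
    return records

# Hypothetical log content; the real field names may differ.
sample_log = io.StringIO(
    '{"event": "python_exec", "success": true}\n'
    '{"event": "python_exec", "success": false}\n'
    '\n'
    '{"event": "save_code", "success": true}\n'
)

records = read_jsonl(sample_log)
counts = Counter(record.get("event", "unknown") for record in records)
print(len(records), dict(counts))
```

For a real log, the same function would be called on an open file handle in place of the StringIO object.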

Multi-Kernel State

Each python_reset() with no kernel_id creates a new isolated kernel. State is scoped to kernel_id. Multiple agents can run isolated parallel sessions.
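
The scoping rule can be pictured with a toy model in which each kernel_id maps to its own namespace. This is not how the server is implemented (it manages real IPython kernels). The class ToyKernelPool and its methods are hypothetical and only mimic the observable behaviour described above.

```python
import uuid

class ToyKernelPool:
    """Toy stand-in: one namespace dict per kernel_id, not real IPython kernels."""

    def __init__(self):
        self._namespaces = {}

    def reset(self, kernel_id=None):
        """Without kernel_id create a fresh kernel; with one, wipe that kernel's state."""
        if kernel_id is None:
            kernel_id = uuid.uuid4().hex[:8]
        self._namespaces[kernel_id] = {}
        return kernel_id

    def execute(self, kernel_id, code):
        """Run code against the namespace that belongs to kernel_id only."""
        exec(code, self._namespaces[kernel_id])

    def get(self, kernel_id, name):
        return self._namespaces[kernel_id].get(name)

pool = ToyKernelPool()
kernel_a = pool.reset()           # agent A
kernel_b = pool.reset()           # agent B
pool.execute(kernel_a, "x = 1")
pool.execute(kernel_b, "x = 2")
pool.execute(kernel_a, "x += 10")  # state persists within kernel A

print(pool.get(kernel_a, "x"), pool.get(kernel_b, "x"))
```

Resetting one kernel_id in this model clears that namespace and leaves the other untouched, which is the isolation property that parallel agents rely on.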

Context from Project Templates

--project FILE passes a project template markdown file as additional context — used for domain-specific constraints (e.g., CPMpy usage patterns, Clingo syntax).

Compaction Handling

None. The agent is designed for focused, short tasks where context overload is unlikely.

07

Orchestration

szeider-python-coder — Orchestration

Multi-Agent Pattern

Parallel (via MCP multi-kernel) — each agent gets its own kernel_id for isolated execution.

Execution Mode

  • coder CLI: one-shot — takes a task, executes to completion, saves, terminates
  • ipython_mcp: event-driven — responds to MCP tool calls

Isolation Mechanism

Process-level isolation per kernel via IPython kernel management.

Multi-Model

Yes — supports 9 models via OpenRouter: deepseek31, gemini25, gemini3flash, gemini3pro, gpt52, grok41, opus45, qwen3, sonnet45. User-configured.

Auto-Validators

  • Mandatory self-verification: Before save_code, the agent executes the full script and writes independent constraint-checking assertions. This is a prompt-enforced validation, not a hook-triggered one.

Prompt Chaining

No explicit chaining. Single-task agent.

Consensus Mechanism

None.

LangGraph ReAct

The coder CLI uses the LangGraph ReAct pattern. The agent loop is: Think → Act (python_exec) → Observe, repeated until save_code is called.
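
The shape of that loop can be sketched without LangGraph or a model. In the toy below a scripted policy plays the role of the LLM, and the tools are simple stand-ins. The names run_react_loop and scripted_policy, and the max_steps budget, are assumptions for illustration and not part of the project's code.

```python
def run_react_loop(policy, tools, max_steps=10):
    """Think -> Act -> Observe until the policy calls save_code."""
    observation = None
    trace = []
    for _ in range(max_steps):
        tool_name, argument = policy(observation)   # Think: choose the next action
        observation = tools[tool_name](argument)    # Act: run the chosen tool
        trace.append((tool_name, observation))      # Observe: record the result
        if tool_name == "save_code":
            return trace
    raise RuntimeError("step budget exhausted before save_code")

def scripted_policy(observation):
    """Stand-in for the model: compute once, then save the result."""
    if observation is None:
        return "python_exec", "6 * 7"
    return "save_code", f"print({observation})"

saved = []
tools = {
    "python_exec": lambda code: eval(code),                   # toy executor
    "save_code": lambda code: saved.append(code) or "saved",  # toy file writer
}

trace = run_react_loop(scripted_policy, tools)
print(trace, saved)
```

The only termination condition in this sketch is the save_code call, mirroring the description above; the step budget is an added safeguard so the toy cannot loop forever.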

08

UI CLI Surface

szeider-python-coder — UI / CLI Surface

CLI Binary

Yes: coder (installed via uv tool install agentic-python-coder)

Subcommands / Flags

coder "task description"           # Inline task
coder --task problem.md            # Task from file
coder --dir results/test1 "task"   # Specify working directory
coder -i                           # Interactive mode
coder --init cpmpy                 # Initialize example templates
coder --with cpmpy --project ...   # Extra packages + project context

MCP Server

ipython_mcp — exposed via uvx --from agentic-python-coder ipython_mcp

Local UI

None. Terminal output only.

IDE Integration

Compatible with any MCP client:

  • Claude Desktop (primary MCP target)
  • Claude Code (via .mcp.json)
  • Any MCP-compatible agent

Observability

  • JSONL execution log ({basename}.jsonl)
  • No session replay beyond the log
