mini-coding-agent

mini-coding-agent · rasbt/mini-coding-agent · ★ 882 · last commit 2026-04-07

00

Summary

mini-coding-agent — Summary

mini-coding-agent is a minimal, zero-dependency Python coding agent by Sebastian Raschka (rasbt) designed explicitly as an educational reference implementation for understanding the six core components of coding agents. The entire agent lives in a single 400-line Python file (mini_coding_agent.py) with no external Python runtime dependencies beyond the standard library — it talks to Ollama's /api/generate endpoint directly. The six components are: (1) live repo context collection, (2) stable prompt shape with cache reuse, (3) structured tools with validation and approval gates, (4) context reduction and output management, (5) transcript/memory persistence with session resumption, and (6) delegation to bounded subagents. The agent ships a mini-coding-agent CLI entry point (via uv run), an interactive REPL with slash commands, approval modes (ask/auto/never), and session persistence under .mini-coding-agent/sessions/. The companion Substack tutorial "Components of a Coding Agent" is the primary documentation.

differs_from_seeds: mini-coding-agent is closest to agent-os in its minimalism (zero dependencies, markdown-first documentation) but is a fully runnable Python agent rather than a bash install script. Unlike every seed that targets Claude Code or another commercial AI tool, mini-coding-agent targets Ollama local models exclusively. The zero-dependency design (standard library only), single-file implementation, and explicit educational framing make it unique in the corpus — it is the only framework designed to be read cover-to-cover as a learning artifact, not just used as a tool.

01

Overview

mini-coding-agent — Overview

Origin

Authored by Sebastian Raschka (rasbt), a machine learning researcher and educator known for his book "Build a Large Language Model (From Scratch)." The companion tutorial, published in his Substack newsletter "Ahead of AI," explains the six components in depth.

Philosophy

The project's explicit purpose is educational — every design decision prioritizes readability over features:

"A minimal local agent loop with: workspace snapshot collection, stable prompt plus turn state, structured tools, approval handling for risky tools, transcript and memory persistence, bounded delegation."

From the README:

"This coding harness is organized around six practical building blocks"

The six components are labeled explicitly in the source code comments:

##############################
#### Six Agent Components ####
##############################
# 1) Live Repo Context -> WorkspaceContext
# 2) Prompt Shape And Cache Reuse -> build_prefix, memory_text, prompt
# 3) Structured Tools, Validation, And Permissions -> build_tools, run_tool, validate_tool, approve, parse, path, tool_*
# 4) Context Reduction And Output Management -> clip, history_text
# 5) Transcripts, Memory, And Resumption -> SessionStore, record, note_tool, ask, reset
# 6) Delegation And Bounded Subagents -> tool_delegate

Target Audience

The README is written for someone who wants to understand coding agent internals — "what makes a coding agent work" — not for production deployment.

Model Backend

Ollama only (/api/generate endpoint). Default model: qwen3.5:4b. Larger models supported but not required.

Design Constraint: Zero Python Dependencies

No dependencies beyond the Python standard library: no LangChain, no anthropic SDK, no requests. Just urllib.request and subprocess.
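To make the standard-library-only constraint concrete, here is a minimal sketch of calling Ollama's /api/generate with urllib.request. The function names build_request and generate, and the timeout value, are hypothetical and not taken from mini_coding_agent.py; the payload fields (model, prompt, stream) and the "response" key follow Ollama's documented API.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # endpoint named in this profile


def build_request(prompt: str, model: str = "qwen3.5:4b") -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "qwen3.5:4b", timeout: float = 120.0) -> str:
    """Send the request and return the generated text (needs a running Ollama server)."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]
```

build_request can be inspected without a server; generate only works when Ollama is running locally and the model has been pulled.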

02

Architecture

mini-coding-agent — Architecture

Distribution

  • Type: Single Python file (mini_coding_agent.py) + pyproject.toml for CLI entry point
  • License: Apache-2.0
  • Runtime: Python 3.10+
  • Dependencies: None (Python standard library only)
  • Model backend: Ollama (local, http://localhost:11434/api/generate)

File Structure

mini-coding-agent/
├── mini_coding_agent.py  # Entire agent in one file
├── pyproject.toml        # CLI entry point (mini-coding-agent)
├── README.md
├── EXAMPLE.md            # Concrete usage walkthrough
└── tests/                # Test suite

Six Components (labeled in source)

| Component | Code | Description |
| --- | --- | --- |
| 1 | WorkspaceContext | Collects repo layout, git state, branch, recent commits, project docs (AGENTS.md, README.md, pyproject.toml, package.json) |
| 2 | build_prefix, memory_text, prompt | Stable system prompt prefix (static, cacheable) + turn state (dynamic) |
| 3 | build_tools, run_tool, validate_tool, approve | 6 tools with TypedDict schemas, workspace path validation, approval gate |
| 4 | clip, history_text | MAX_TOOL_OUTPUT = 4000 chars; MAX_HISTORY = 12000 chars; deduplication |
| 5 | SessionStore, record, note_tool | JSONL session + distilled working memory |
| 6 | tool_delegate | Bounded subagents with inherited context |

Approval Modes

  • --approval ask (default): prompts before risky tool calls (run_command, write_file, delegate)
  • --approval auto: all risky actions allowed automatically
  • --approval never: risky actions denied (read-only mode)

Session Storage

.mini-coding-agent/sessions/ — JSONL files per session.

Config Files

  • AGENTS.md, README.md, pyproject.toml, package.json — read at startup as workspace context
  • No configuration file for the agent itself
03

Components

mini-coding-agent — Components

Tools (6)

| Tool | Type | Description |
| --- | --- | --- |
| read_file | Safe | Read file contents |
| list_dir | Safe | List directory contents |
| search_files | Safe | Search for patterns in files |
| run_command | Risky (approval) | Execute shell commands |
| write_file | Risky (approval) | Create or write files |
| delegate | Risky (approval) | Spawn bounded subagent |

Risky tools require user approval when --approval ask mode is active.

Slash Commands (REPL)

| Command | Description |
| --- | --- |
| /help | Show available commands |
| /memory | Print distilled session memory (current task, tracked files, notes) |
| /session | Print path to current saved session JSON file |
| /reset | Clear session history + distilled memory (keeps REPL open) |
| /exit or /quit | Exit interactive session |

WorkspaceContext (Component 1)

Collects at startup:

  • cwd — current working directory
  • repo_root — git repository root
  • branch — current branch
  • default_branch — default branch
  • status — git status output
  • recent_commits — recent commit messages
  • project_docs — content of AGENTS.md, README.md, pyproject.toml, package.json
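The fields above could be gathered with a few git subprocess calls plus plain file reads. The following is a sketch under stated assumptions, not the project's WorkspaceContext: the class name WorkspaceSnapshot, the helper git, the five-commit window, and the 1200-character doc cap are all hypothetical, and default_branch is left out because its lookup depends on how the remote is configured.

```python
import subprocess
from dataclasses import dataclass, field
from pathlib import Path

DOC_NAMES = ("AGENTS.md", "README.md", "pyproject.toml", "package.json")


def git(args: list[str], cwd: Path) -> str:
    """Run a git command; return stripped stdout, or "" if git fails or is missing."""
    try:
        result = subprocess.run(
            ["git", *args], cwd=cwd, capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return ""
    return result.stdout.strip() if result.returncode == 0 else ""


@dataclass
class WorkspaceSnapshot:  # hypothetical stand-in for WorkspaceContext
    cwd: str
    repo_root: str
    branch: str
    status: str
    recent_commits: list[str]
    project_docs: dict[str, str] = field(default_factory=dict)


def collect(cwd: Path, max_doc_chars: int = 1200) -> WorkspaceSnapshot:
    """Gather git state and project docs once, at startup."""
    root = git(["rev-parse", "--show-toplevel"], cwd) or str(cwd)
    docs = {}
    for name in DOC_NAMES:  # DOC_NAMES order decides which doc comes first
        path = Path(root) / name
        if path.is_file():
            docs[name] = path.read_text(encoding="utf-8", errors="replace")[:max_doc_chars]
    return WorkspaceSnapshot(
        cwd=str(cwd),
        repo_root=root,
        branch=git(["branch", "--show-current"], cwd),
        status=git(["status", "--short"], cwd),
        recent_commits=git(["log", "--oneline", "-5"], cwd).splitlines(),
        project_docs=docs,
    )
```

Outside a git repository every git field comes back empty instead of raising, so the snapshot can still be built for a plain directory.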

SessionStore (Component 5)

Maintains:

  • Full durable transcript (JSONL)
  • Distilled working memory (current task, tracked files, notes)

Working memory is separate from the full transcript — it's the smaller, curated context the agent uses actively.

Subagent (Component 6: tool_delegate)

Bounded delegation:

  • Inherits enough context to complete the subtask
  • Limited scope — cannot recursively delegate (bounded delegation)
  • Returns a summary result to the parent agent

Ignored Paths

IGNORED_PATH_NAMES = {".git", ".mini-coding-agent", "__pycache__", ".pytest_cache", ".ruff_cache", ".venv", "venv"}
05

Prompts

mini-coding-agent — Prompts

Prompt Architecture

The prompt is split into:

  1. Stable prefix (build_prefix): workspace context (git status, branch, project docs) — static per session, designed for model prompt caching
  2. Dynamic turn state: conversation history, working memory, current user message

Verbatim Excerpt 1 — Component 2 (Prompt Shape) from source

##############################
#### 2) Prompt Shape And Cache Reuse ####
##############################
# build_prefix, memory_text, prompt

The stable prefix approach is a performance optimization: the workspace context (cwd, git status, branch, project docs) doesn't change during a session, so it can be placed at the start of the prompt where token cache hits are most likely.
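A small sketch can show why this ordering helps. It assumes a hypothetical signature for build_prefix and a hypothetical build_prompt helper (the real functions in mini_coding_agent.py may take different arguments and emit different text); the point is only that every turn's prompt begins with the same bytes.

```python
def build_prefix(workspace: dict[str, str], tool_names: list[str]) -> str:
    """Stable part: built once per session so every turn starts with identical text."""
    lines = ["You are a coding agent working in a local repository.", "", "Workspace:"]
    lines += [f"- {key}: {value}" for key, value in workspace.items()]
    lines += ["", "Tools: " + ", ".join(tool_names)]
    return "\n".join(lines)


def build_prompt(prefix: str, memory: str, history: str, user_message: str) -> str:
    """Dynamic part goes last, so only the tail of the prompt changes between turns."""
    return "\n\n".join(
        [prefix, "Memory:\n" + memory, "History:\n" + history, "User:\n" + user_message]
    )


prefix = build_prefix({"cwd": "/repo", "branch": "main"}, ["read_file", "write_file"])
turn_1 = build_prompt(prefix, "task: none", "", "List the files")
turn_2 = build_prompt(prefix, "task: list files", "user: List the files", "Now open README.md")
```

Both turn_1 and turn_2 start with prefix, which is the property a prefix cache in the model server can exploit.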

Verbatim Excerpt 2 — Context Management Constants

MAX_TOOL_OUTPUT = 4000
MAX_HISTORY = 12000
IGNORED_PATH_NAMES = {".git", ".mini-coding-agent", "__pycache__", ".pytest_cache", ".ruff_cache", ".venv", "venv"}

Technique: Explicit numeric budgets for context management. The MAX_TOOL_OUTPUT prevents a single tool call from flooding the context window; MAX_HISTORY limits total conversation history. These are production-grade constraints even in an educational reference.
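One simple way to enforce a history budget is to keep only the newest entries that fit. This sketch is an illustration, not the project's history_text: the function name fit_history is hypothetical, and it drops older entries outright rather than compressing them.

```python
MAX_HISTORY = 12000  # character budget quoted in this profile


def fit_history(entries: list[str], budget: int = MAX_HISTORY) -> list[str]:
    """Keep the newest entries whose combined length fits in the budget."""
    kept: list[str] = []
    used = 0
    for entry in reversed(entries):  # walk from newest to oldest
        cost = len(entry) + 1  # +1 for the newline that joins entries
        if used + cost > budget:
            break
        kept.append(entry)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

With three entries of 6000, 6000, and 5000 characters, the two newest fit (11002 characters including separators) and the oldest is dropped.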

Verbatim Excerpt 3 — Project Docs as Prompt Context

DOC_NAMES = ("AGENTS.md", "README.md", "pyproject.toml", "package.json")

The build_prefix function reads these files and includes their content in the system prompt's workspace context section. AGENTS.md (the vendor-neutral standard) is prioritized first — the agent always starts with project-level instructions if they exist.

Component 3: Tool Schema Pattern

Tools are defined with TypedDict schemas. Each tool includes:

  • Name
  • Description
  • Parameter types (validated before execution)
  • Risk level (safe vs risky)

The validation runs before the approval gate — invalid tool calls are rejected before asking the user to approve them.
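The ordering described above (validate first, ask second) can be sketched as follows. The ToolSpec layout, the two toy tools, and the error strings are hypothetical; the real build_tools, validate_tool, and run_tool may be shaped differently.

```python
from typing import Any, Callable, Optional, TypedDict


class ToolSpec(TypedDict):
    description: str
    params: dict[str, type]  # parameter name -> expected Python type
    risky: bool
    run: Callable[..., str]


TOOLS: dict[str, ToolSpec] = {
    "read_file": {
        "description": "Read file contents",
        "params": {"path": str},
        "risky": False,
        "run": lambda path: f"<contents of {path}>",
    },
    "write_file": {
        "description": "Create or write files",
        "params": {"path": str, "content": str},
        "risky": True,
        "run": lambda path, content: f"wrote {len(content)} chars to {path}",
    },
}


def validate(name: str, args: dict[str, Any]) -> Optional[str]:
    """Return an error message, or None if the call matches the tool's schema."""
    spec = TOOLS.get(name)
    if spec is None:
        return f"unknown tool: {name}"
    if set(args) != set(spec["params"]):
        return f"{name} expects arguments {sorted(spec['params'])}"
    for key, expected in spec["params"].items():
        if not isinstance(args[key], expected):
            return f"{name}.{key} must be {expected.__name__}"
    return None


def run_tool(name: str, args: dict[str, Any], approve: Callable[[str, dict], bool]) -> str:
    error = validate(name, args)
    if error is not None:  # rejected before the user is ever asked
        return f"error: {error}"
    if TOOLS[name]["risky"] and not approve(name, args):
        return "error: denied by user"
    return TOOLS[name]["run"](**args)
```

Because validation comes first, a malformed write_file call returns an error without ever invoking the approve callback.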

Working Memory as Prompt Technique

The /memory output (current task, tracked files, notes) is a distilled summary, continuously maintained as a smaller "active context." It keeps the prompt from being dominated by the full history, which can reach 12,000 characters: the distilled memory acts as a persistent summary the agent can reference.

09

Uniqueness

mini-coding-agent — Uniqueness

differs_from_seeds

mini-coding-agent has no close seed equivalent. Its closest seed in spirit is agent-os (minimalist, educational, markdown-first) but agent-os is a bash install script that sets up template files, while mini-coding-agent is a fully runnable Python agent. Unlike every other seed and framework in this batch, mini-coding-agent is explicitly educational — the source code has component labels, the README explains the six components, and the companion Substack article provides the full tutorial. The Ollama-only model backend makes it the only framework in the corpus that requires no API keys and no cloud service, running entirely on a local CPU/GPU. The single-file zero-dependency design is maximally portable: python mini_coding_agent.py on any Python 3.10+ machine with Ollama installed. Unlike superpowers (TDD-enforced, spec-first), mini-coding-agent imposes zero workflow requirements — it's a blank canvas that demonstrates the mechanics.

Positioning

mini-coding-agent is the "learn by reading the code" entry point in the coding agent ecosystem. Sebastian Raschka's audience (ML researchers, students, educators) wants to understand how agents work, not just use one. The companion Substack tutorial is the primary product; the code is the illustration.

Observable Failure Modes

  • Ollama-only: Not usable without Ollama. Cloud/API models not supported.
  • No MCP: No extensibility via MCP servers.
  • Single-user REPL: No background mode, no scheduling, no IM integration.
  • Last commit April 2026: Development appears to have slowed after tutorial publication.
  • No TDD: No test-first enforcement — purely reactive to user requests.
  • Context limits conservative: MAX_HISTORY=12000 chars is modest; complex tasks may exceed this frequently.
  • Bounded delegation only: Subagents cannot recurse, limiting complex task decomposition.
04

Workflow

mini-coding-agent — Workflow

Standard Session Workflow

| Phase | Description | Artifact |
| --- | --- | --- |
| 1. Start | uv run mini-coding-agent | REPL opens |
| 2. Workspace context | Git status, branch, recent commits, project docs collected | WorkspaceContext |
| 3. Prompt build | Stable prefix built from workspace context | System prompt |
| 4. Task entry | User types task in REPL | Task message |
| 5. Tool calls | Agent calls tools (read/list/search/bash/write/delegate) | Tool results |
| 6. Approval gate | Risky tools require user confirmation (ask mode) | Approved/denied |
| 7. Context management | Long outputs clipped at 4K chars; history managed at 12K | Pruned history |
| 8. Session save | JSONL written to .mini-coding-agent/sessions/ | Session file |
| 9. Memory distillation | Working memory updated (current task, tracked files, notes) | Working memory |

Session Resume Workflow

uv run mini-coding-agent --resume latest
uv run mini-coding-agent --resume 20260401-144025-2dd0aa

Session ID format: YYYYMMDD-HHMMSS-<hash>
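IDs in this format sort chronologically by name, which makes --resume latest easy to resolve. The sketch below assumes the <hash> part is six random hex characters (the example ID 20260401-144025-2dd0aa fits that, but the source does not say how it is derived) and that sessions are stored as <id>.jsonl; the function names are hypothetical.

```python
import re
import secrets
from datetime import datetime
from pathlib import Path
from typing import Optional

SESSION_ID = re.compile(r"^\d{8}-\d{6}-[0-9a-f]{6}$")


def new_session_id(now: Optional[datetime] = None) -> str:
    """Timestamp plus a short random hex suffix (suffix scheme is an assumption)."""
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    return f"{stamp}-{secrets.token_hex(3)}"


def resolve_session(sessions_dir: Path, resume: str) -> Optional[Path]:
    """Map --resume latest (or an explicit ID) to a session file, if one exists."""
    if resume == "latest":
        # IDs start with a zero-padded timestamp, so name order is time order.
        files = sorted(sessions_dir.glob("*.jsonl"))
        return files[-1] if files else None
    path = sessions_dir / f"{resume}.jsonl"
    return path if path.is_file() else None
```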

Approval Gate Detail

For risky tools (run_command, write_file, delegate):

  • ask mode: Shows tool name + args, prompts [y/N]
  • auto mode: All approved automatically
  • never mode: All denied
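The three modes reduce to a small decision function. This is a sketch, not the project's approve: the signature, the injectable ask callable (used here so the prompt can be tested without a terminal), and the prompt wording are assumptions; the risky tool names come from the tools table in this profile.

```python
from typing import Callable

RISKY_TOOLS = {"run_command", "write_file", "delegate"}  # per the tools table


def approve(
    tool: str,
    args: dict,
    mode: str = "ask",
    ask: Callable[[str], str] = input,
) -> bool:
    """Decide whether a tool call may run under the given approval mode."""
    if tool not in RISKY_TOOLS:
        return True  # safe tools never hit the gate
    if mode == "auto":
        return True
    if mode == "never":
        return False
    if mode != "ask":
        raise ValueError(f"unknown approval mode: {mode}")
    answer = ask(f"Run {tool} with {args}? [y/N] ")
    return answer.strip().lower() in {"y", "yes"}  # anything else means "no"
```

Defaulting to "no" on an empty answer matches the [y/N] convention, where the capital letter marks the default.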

Phase-to-Artifact Map

| Phase | Artifact |
| --- | --- |
| Workspace context | WorkspaceContext object injected into prompt prefix |
| Tool execution | Tool result (clipped at MAX_TOOL_OUTPUT=4000) |
| Session save | .mini-coding-agent/sessions/<id>.jsonl |
| Memory update | Distilled working memory (task + files + notes) |
06

Memory Context

mini-coding-agent — Memory & Context

Memory Architecture

Full Transcript

  • Path: .mini-coding-agent/sessions/<YYYYMMDD-HHMMSS-hash>.jsonl
  • Content: Complete conversation history (all turns, tool calls, tool results)
  • Purpose: Durable record + session resumption

Working Memory (Distilled)

  • Format: In-memory Python dict (persisted in session JSON)
  • Content: Current task, tracked files, notes
  • Purpose: Smaller, curated active context for the model

The distinction is important: the full transcript is the audit trail; the working memory is what the agent "actively knows" and references.
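The split between an append-only transcript and a small working memory might look like the sketch below. Everything here is hypothetical: the class name TranscriptStore, the event layout with a "kind" field, and the choice to store memory snapshots as lines in the same file are assumptions, not a description of how SessionStore actually serializes its data.

```python
import json
from pathlib import Path


class TranscriptStore:  # hypothetical stand-in for SessionStore
    def __init__(self, path: Path):
        self.path = path
        self.memory = {"task": "", "files": [], "notes": []}

    def record(self, kind: str, **data) -> None:
        """Append one event as a single JSON line; earlier lines are never rewritten."""
        self.path.parent.mkdir(parents=True, exist_ok=True)
        with self.path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps({"kind": kind, **data}) + "\n")

    def save_memory(self) -> None:
        """Snapshot the small working memory into the same file."""
        self.record("memory", memory=self.memory)

    @classmethod
    def load(cls, path: Path) -> tuple["TranscriptStore", list[dict]]:
        """Rebuild the event list and the latest memory snapshot from disk."""
        store, events = cls(path), []
        for line in path.read_text(encoding="utf-8").splitlines():
            event = json.loads(line)
            if event["kind"] == "memory":
                store.memory = event["memory"]  # last snapshot wins
            else:
                events.append(event)
        return store, events
```

Loading replays the file top to bottom, so resuming a session yields both the full event list and whatever memory snapshot was written last.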

Context Reduction (Component 4)

MAX_TOOL_OUTPUT = 4000  # Max chars for any single tool result
MAX_HISTORY = 12000     # Max chars for total conversation history

  • Long tool outputs clipped with clip() function (truncation with byte count)
  • Repeated reads deduplicated
  • Older transcript entries compressed to keep under budget
  • middle() function for symmetric truncation (head + ... + tail)
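The two truncation styles named above can be sketched as follows. These are illustrative versions only: the real clip and middle may use different signatures and markers, and this sketch counts characters rather than bytes.

```python
MAX_TOOL_OUTPUT = 4000


def clip(text: str, limit: int = MAX_TOOL_OUTPUT) -> str:
    """Keep the first `limit` characters and say how many were dropped."""
    if len(text) <= limit:
        return text
    return text[:limit] + f"\n...[truncated {len(text) - limit} chars]"


def middle(text: str, limit: int) -> str:
    """Keep the head and tail of `text`, replacing the middle with '...'."""
    if len(text) <= limit:
        return text
    if limit <= 3:
        return text[:limit]
    keep = limit - 3  # characters left after the "..." marker
    head = (keep + 1) // 2
    tail = keep // 2
    return text[:head] + "..." + (text[-tail:] if tail else "")
```

Head truncation suits logs where the start matters; symmetric truncation suits values such as long paths or command lines, where both ends carry information.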

Session Resumption

uv run mini-coding-agent --resume latest

The agent restores:

  • Full conversation history from JSONL
  • Distilled working memory (task + tracked files + notes)
  • Session ID

Cross-Session Memory

No long-term memory beyond the session JSONL file. Each new session starts fresh unless explicitly resumed.

State Files

  • .mini-coding-agent/sessions/<id>.jsonl — session transcript + working memory

No External Memory

No vector DB, no SQLite, no external service. Just files.

07

Orchestration

mini-coding-agent — Orchestration

Subagent Support

Yes — bounded delegation via tool_delegate (Component 6).

Bounded delegation constraints:

  • Subagent inherits "enough context to help" but operates within limits
  • No recursive delegation (subagents cannot spawn sub-subagents)
  • Subagent returns a summary result to the parent

Orchestration Pattern

sequential — single-threaded Python loop. No parallel agents. The tool_delegate call is synchronous (blocking).

Execution Mode

interactive-loop — user types a task, agent responds, repeat.

Isolation Mechanism

None beyond path validation. All file tools validate paths against the workspace (no access outside the target repo), but there is no VM isolation, no container, and no git worktree.
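Workspace path validation of this kind is typically a resolve-then-compare check. The sketch below shows one way to do it with pathlib; the function name resolve_in_workspace is hypothetical, and the project's own path helper may differ in detail.

```python
from pathlib import Path


def resolve_in_workspace(root: Path, raw: str) -> Path:
    """Resolve `raw` against `root` and refuse anything that lands outside it."""
    root = root.resolve()
    candidate = (root / raw).resolve()  # collapses ".." and follows symlinks
    if not candidate.is_relative_to(root):
        raise ValueError(f"path escapes workspace: {raw}")
    return candidate
```

This only guards the file tools. A shell command run through subprocess is not constrained by it, which is why the approval gate matters for command execution.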

Multi-Model

No. Single model (Ollama), configured via CLI argument or default (qwen3.5:4b).

Approval Gate

Three modes: ask (default), auto, never. Risk is classified per tool, not per operation: "risky" is a property of the tool type, not of the specific action.

No Daemon Mode

mini-coding-agent runs in the foreground as a REPL. No background process, no daemon, no scheduled tasks.

Agent Loop

Standard think → act → observe:

  1. Build prompt (prefix + history + working memory + user message)
  2. Call Ollama /api/generate
  3. Parse tool calls from response
  4. Validate tool arguments
  5. Approval gate for risky tools
  6. Execute tool
  7. Append to history
  8. Update working memory if applicable
  9. Repeat until no more tool calls
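The steps above can be sketched as a short loop driven by a scripted stand-in for the model. The wire format (a JSON object with "tool" and "args" keys), the toy tool registry, and the max_steps guard are hypothetical; the approval gate (step 5) and the working-memory update (step 8) are left out to keep the example small.

```python
import json
from typing import Callable

TOOLS = {"read_file": lambda path: f"<contents of {path}>"}  # toy registry


def parse_tool_call(reply: str):
    """Hypothetical wire format: a reply that is a JSON object requests a tool."""
    try:
        call = json.loads(reply)
    except json.JSONDecodeError:
        return None  # plain text means the model is done
    if isinstance(call, dict) and "tool" in call:
        return call["tool"], call.get("args", {})
    return None


def agent_loop(model: Callable[[str], str], prefix: str, task: str, max_steps: int = 8):
    history = [f"user: {task}"]
    for _ in range(max_steps):
        # Steps 1-2: build the prompt and call the model.
        reply = model(prefix + "\n\n" + "\n".join(history))
        # Step 3: parse a tool call out of the reply.
        call = parse_tool_call(reply)
        if call is None:
            # Step 9: no tool call, so this reply is the final answer.
            history.append(f"assistant: {reply}")
            return reply, history
        name, args = call
        # Step 4: minimal validation; step 6: execute the (safe) tool.
        if name not in TOOLS or not isinstance(args, dict):
            result = f"error: invalid call to {name}"
        else:
            result = TOOLS[name](**args)
        # Step 7: append the observation so the next prompt includes it.
        history.append(f"tool {name}: {result}")
    return "stopped: step limit reached", history
```

Passing the model in as a plain callable keeps the loop testable without Ollama: a lambda that yields canned replies is enough to exercise it.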
08

UI / CLI Surface

mini-coding-agent — UI / CLI Surface

CLI Binary

Yes: a mini-coding-agent CLI entry point, run via uv run mini-coding-agent or python mini_coding_agent.py.

# Start fresh
uv run mini-coding-agent

# With options
uv run mini-coding-agent --model qwen3.5:9b --approval auto

# Resume latest session
uv run mini-coding-agent --resume latest

# Resume specific session
uv run mini-coding-agent --resume 20260401-144025-2dd0aa

CLI Options

| Option | Default | Description |
| --- | --- | --- |
| --model | qwen3.5:4b | Ollama model name |
| --approval | ask | Approval mode: ask, auto, never |
| --resume | (none) | Resume a session: latest or session ID |
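These three options map directly onto argparse from the standard library. The sketch below mirrors the flags and defaults listed in the table; the function name build_parser, the help strings, and the metavar are hypothetical, and the real CLI may define more options than shown here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="mini-coding-agent")
    parser.add_argument("--model", default="qwen3.5:4b", help="Ollama model name")
    parser.add_argument(
        "--approval",
        choices=("ask", "auto", "never"),
        default="ask",
        help="how to handle risky tool calls",
    )
    parser.add_argument(
        "--resume",
        default=None,
        metavar="latest|SESSION_ID",
        help="resume the latest session or a specific session ID",
    )
    return parser
```

Using choices for --approval means an unsupported mode is rejected at parse time, before any agent code runs.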

REPL Interface

Interactive text REPL with slash commands:

  • /help — show commands
  • /memory — print distilled working memory
  • /session — print session file path
  • /reset — clear history + memory
  • /exit / /quit — exit

No Web UI

No web dashboard, no GUI, no Electron app.

No Daemon / Background Mode

Runs as a foreground process only.

ASCII Art Banner

The agent has a cat-face ASCII art welcome banner defined in WELCOME_ART — a humanizing touch in an educational reference.

Install (via uv)

git clone https://github.com/rasbt/mini-coding-agent.git
cd mini-coding-agent
uv run mini-coding-agent

Without uv:

python mini_coding_agent.py

No pip install needed (zero dependencies beyond stdlib).
