Micro Agent (Builder.io)

micro-agent · BuilderIO/micro-agent · ★ 4.3k · last commit 2024-11-14

Primitive shape
No installable primitives
00

Summary

Micro Agent — Summary

Micro Agent (by Builder.io) is a TypeScript CLI, distributed via npm, that implements a deliberately minimal test-driven coding agent: write a test, then iterate on the code until all tests pass. It is explicitly scoped to avoid the "Roomba under the table" problem of general coding agents: it does exactly one thing (generate code that passes tests) and nothing else. It supports multiple LLM backends (OpenAI, Claude, Ollama, Groq) through a unified abstraction, can run in interactive mode (asking for feedback on the generated test), and supports both unit-test matching and visual (screenshot) matching. The system prompt is an 8-line TypeScript string, and the whole codebase is a few hundred lines.

Compared to seeds: closest to mini-swe-agent in intentional scope restriction, but micro-agent is narrower still — it is not a general issue-fixer but specifically a test-driven code generator. Unlike any seed, it makes test generation (not spec generation) the first-class artifact. The closest seed pattern is agent-os (Archetype 4 — minimal by design), but micro-agent has actual code execution.

01

Overview

Micro Agent — Overview

Origin

Created by Builder.io (Steve Sewell et al.). Repository: BuilderIO/micro-agent. MIT license. ~4.3k GitHub stars. Last push: 2024-11-14 (likely in maintenance mode).

Philosophy

From README:

"LLMs are great at giving you broken code, and it can take repeat iteration to get that code to work as expected. So why do this manually when AI can handle not just the generation but also the iteration and fixing?"

"AI agents are cool, but general-purpose coding agents rarely work as hoped or promised. They tend to go haywire with compounding errors. Think of your Roomba getting stuck under a table, x1000."

"The idea of a micro agent is to: (1) Create a definitive test case that can give clear feedback if the code works as intended or not, and (2) Iterate on code until all test cases pass."

"This project is not trying to be an end-to-end developer. AI agents are not capable enough to reliably try to be that yet. This project won't install modules, read and write multiple files, or do anything else that is highly likely to cause havoc when it inevitably fails."

What it explicitly is NOT

  • Not an end-to-end developer
  • Won't install modules
  • Won't read/write multiple files
  • Won't do anything with high failure risk

Distribution

  • npm install -g @builder.io/micro-agent
  • Binary: micro-agent
  • Requires Node.js v18+

02

Architecture

Micro Agent — Architecture

Source Structure

src/
├── cli.ts                    # CLI entry point
├── commands/                 # CLI subcommands
├── helpers/
│   ├── systemPrompt.ts       # THE system prompt (~8 lines)
│   ├── generate.ts           # Code generation loop
│   ├── iterate-on-test.ts    # Test iteration logic
│   ├── iterate-on-test-command.ts  # Test command runner
│   ├── llm.ts                # Multi-provider LLM abstraction
│   ├── config.ts             # Config management
│   ├── interactive-mode.ts   # Interactive feedback loop
│   ├── visual-generate.ts    # Screenshot-based testing
│   ├── visual-test.ts        # Visual test execution
│   ├── output-file.ts        # File output handling
│   ├── apply-unified-diff.ts # Diff application
│   └── ...
└── tests/

LLM Provider Support

| Provider | How | Model selection |
| --- | --- | --- |
| OpenAI | openai npm package + optional Azure | micro-agent config set MODEL=gpt-4o |
| Anthropic Claude | @anthropic-ai/sdk | micro-agent config set MODEL=claude |
| Ollama | ollama npm package | model includes "llama" or "phi" |
| Groq / custom endpoint | OpenAI-compatible | OPENAI_API_ENDPOINT override |
| Azure OpenAI | AzureOpenAI client | Azure endpoint URL format |
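
The selection rules in the table can be read as a small routing function. The sketch below is an illustration only, not the project's llm.ts: the names Provider and pickProvider are hypothetical, and the order in which the rules are checked is an assumption.

```typescript
// Hypothetical names; the matching rules mirror the provider table above.
type Provider = "openai" | "anthropic" | "ollama" | "openai-compatible";

function pickProvider(model: string, endpoint?: string): Provider {
  const name = model.toLowerCase();
  if (name.includes("claude")) return "anthropic";
  // The table says Ollama is used when the model name includes "llama" or "phi".
  if (name.includes("llama") || name.includes("phi")) return "ollama";
  // Assumption: a custom OPENAI_API_ENDPOINT means an OpenAI-compatible server (e.g. Groq).
  if (endpoint) return "openai-compatible";
  return "openai";
}
```

Substring matching keeps the rule simple but is loose: any model name that happens to contain "phi" would be routed to Ollama under this sketch.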

Operation Modes

  1. Interactive mode: micro-agent — asks user for prompt, generates test, asks for feedback on test, iterates on implementation
  2. Unit test matching: micro-agent ./file.ts -t "npm test" — provide a test command, iterate until tests pass
  3. Visual matching: micro-agent ./component.tsx --visual "screenshot.png" — iterate until visual diff matches screenshot
  4. Config management: micro-agent config set KEY=VALUE

03

Components

Micro Agent — Components

Core Files

| File | Purpose | LOC (approx) |
| --- | --- | --- |
| helpers/systemPrompt.ts | The entire system prompt (8-line string) | ~10 |
| helpers/generate.ts | Main code generation + test-run loop | ~100 |
| helpers/iterate-on-test.ts | Update test based on user feedback | ~50 |
| helpers/iterate-on-test-command.ts | Run test command, capture output | ~30 |
| helpers/llm.ts | Multi-provider LLM abstraction (OpenAI, Claude, Ollama) | ~200 |
| helpers/interactive-mode.ts | Interactive feedback collection | ~50 |
| helpers/visual-generate.ts | Screenshot-based visual test generation | ~50 |
| helpers/visual-test.ts | Visual comparison test runner | ~50 |
| helpers/config.ts | Configuration storage and retrieval | ~50 |
| helpers/apply-unified-diff.ts | Apply unified diffs to files | ~50 |

CLI Subcommands

| Command | Purpose |
| --- | --- |
| micro-agent | Interactive mode: prompt → generate test → iterate |
| micro-agent <file> -t <test-command> | Unit test matching mode |
| micro-agent <file> --visual <screenshot> | Visual matching mode |
| micro-agent config set KEY=VALUE | Set configuration (API keys, model, endpoint) |
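
To make the four invocation shapes concrete, here is a hand-rolled sketch that maps an argument list to a mode. It is not how the real cli.ts parses arguments (that is not shown in this document); the Mode type and parseMode function are hypothetical and only cover the forms listed in the table.

```typescript
// Hypothetical types and function; covers only the invocations in the table above.
type Mode =
  | { kind: "interactive" }
  | { kind: "unit-test"; file: string; testCommand: string }
  | { kind: "visual"; file: string; screenshot: string }
  | { kind: "config-set"; assignment: string };

function parseMode(args: string[]): Mode {
  const [first, second, third] = args;
  // No arguments at all: interactive mode.
  if (first === undefined) return { kind: "interactive" };
  // micro-agent config set KEY=VALUE
  if (first === "config" && second === "set" && third !== undefined) {
    return { kind: "config-set", assignment: third };
  }
  // micro-agent <file> -t <test-command>
  if (second === "-t" && third !== undefined) {
    return { kind: "unit-test", file: first, testCommand: third };
  }
  // micro-agent <file> --visual <screenshot>
  if (second === "--visual" && third !== undefined) {
    return { kind: "visual", file: first, screenshot: third };
  }
  throw new Error(`unrecognized arguments: ${args.join(" ")}`);
}
```

A discriminated union lets the caller switch on `kind` and get the right fields for each mode without optional properties.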

Config Keys

| Key | Purpose |
| --- | --- |
| OPENAI_KEY | OpenAI API key |
| ANTHROPIC_KEY | Anthropic API key |
| MODEL | Model name (gpt-4o, claude, llama, etc.) |
| OPENAI_API_ENDPOINT | Custom endpoint (Ollama, Groq, Azure) |
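
A KEY=VALUE argument such as the one passed to micro-agent config set can be split as sketched below. This is an assumption-laden illustration: parseAssignment and KNOWN_KEYS are hypothetical names, and rejecting keys outside the table is a choice made here, not documented behavior of the tool.

```typescript
// Hypothetical helper; the key list is copied from the table above.
const KNOWN_KEYS = ["OPENAI_KEY", "ANTHROPIC_KEY", "MODEL", "OPENAI_API_ENDPOINT"] as const;
type ConfigKey = (typeof KNOWN_KEYS)[number];

function parseAssignment(input: string): { key: ConfigKey; value: string } {
  // Split on the first "=" only, so values such as URLs may contain "=" themselves.
  const eq = input.indexOf("=");
  if (eq < 1) throw new Error(`expected KEY=VALUE, got "${input}"`);
  const key = input.slice(0, eq);
  const value = input.slice(eq + 1);
  // Assumption: unknown keys are rejected rather than stored.
  if ((KNOWN_KEYS as readonly string[]).indexOf(key) === -1) {
    throw new Error(`unknown config key: ${key}`);
  }
  return { key: key as ConfigKey, value };
}
```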

The Agent Loop (generate.ts)

  1. Generate test from prompt
  2. Run test command
  3. If tests pass → done
  4. Feed test output back to LLM
  5. LLM generates new code
  6. Apply unified diff to file
  7. Goto 2
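
Steps 2 to 7 above can be sketched as a loop over injected dependencies. This is not the code in generate.ts: the names LoopDeps and iterateUntilPass are hypothetical, the functions are synchronous stand-ins for what would really be a test command and an LLM call, and the maxRuns cap is an assumption added so the sketch always terminates.

```typescript
interface TestResult {
  passed: boolean;
  output: string;
}

// Hypothetical stand-ins for the real side effects.
interface LoopDeps {
  runTests: () => TestResult;                   // run the test command
  generateCode: (testOutput: string) => string; // ask the LLM for new code
  writeCode: (code: string) => void;            // apply the change to the file
}

function iterateUntilPass(
  deps: LoopDeps,
  maxRuns = 10, // assumption: an upper bound on test runs
): { passed: boolean; runs: number } {
  for (let run = 1; ; run++) {
    const result = deps.runTests();
    if (result.passed) return { passed: true, runs: run };
    if (run >= maxRuns) return { passed: false, runs: run };
    // Feed the failing output back to the model and apply its new code.
    deps.writeCode(deps.generateCode(result.output));
  }
}
```

Checking the cap before generating again means every piece of generated code gets tested at least once before the loop gives up.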

05

Prompts

Micro Agent — Prompts

The System Prompt (verbatim — 8 lines)

File: src/helpers/systemPrompt.ts

export const systemPrompt = `You take a prompt and existing unit tests and generate the function implementation accordingly.

1. Think step by step about the algorithm, reasoning about the problem and the solution, similar algorithm, the state, data structures and strategy you will use. Explain all that without emitting any code in this step.

2. Emit a markdown code block with production-ready generated code (function that satisfies all the tests and the prompt).
 - Be sure your code exports function that can be called by an external test file.
 - Make sure your code is reusable and not overly hardcoded to match the prompt.
 - Use two spaces for indents. Add logs if helpful for debugging, you will get the log output on your next try to help you debug.
 - Always return a complete code snippet that can execute, nothing partial and never say "rest of your code" or similar, I will copy and paste your code into my file without modification, so it cannot have gaps or parts where you say to put the "rest of the code" back in.
 - Do not emit tests, just the function implementation.

Stop emitting after the code block`;

Prompting technique:

  1. Think-then-code: mandatory chain-of-thought (step 1) before emitting code (step 2)
  2. Concrete constraints: "I will copy and paste your code into my file without modification" — frames the model as a code block generator, not a conversationalist
  3. Explicit prohibition: "never say 'rest of your code' or similar" — addresses a common LLM failure mode
  4. Self-contained output: the complete function, nothing partial
  5. Debug affordance: "Add logs if helpful for debugging, you will get the log output on your next try" — explicitly tells the model that its output feeds back as input
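
Because the prompt asks for reasoning first and then a single markdown code block, the caller has to pull the code out of the reply before writing it to disk. The sketch below shows one way to do that; extractCodeBlock is a hypothetical name and the regular expression is an assumption, not the project's parsing logic.

```typescript
// Built with repeat() so this example does not itself contain a literal fence.
const FENCE = "`".repeat(3);
// First fenced block; the language tag after the opening fence is optional.
const CODE_BLOCK = new RegExp(FENCE + "[^\\n]*\\n([\\s\\S]*?)" + FENCE);

// Hypothetical helper: returns the body of the first fenced block, or null.
function extractCodeBlock(response: string): string | null {
  const match = CODE_BLOCK.exec(response);
  return match?.[1] ?? null;
}
```

Returning null when no block is found lets the caller treat a reply without code as a failed attempt rather than writing the reasoning text into the file.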

Test Generation Prompt

When updating a generated test from user feedback (from iterate-on-test.ts):

{
  role: 'system',
  content: 'You return code for a unit test only. No other words, just the code',
},
{
  role: 'user',
  content: dedent`
    Here is a unit test file generated from the following prompt
    <prompt>${options.prompt}</prompt>
    ...
    The user has given you this feedback on the test. Please update (or completely rewrite, if needed) the test based on the feedback.
    <feedback>${feedback}</feedback>
  `,
}

Prompting technique: Minimal system prompt ("just the code") + XML-tagged user content with clear feedback slot. Short and unambiguous.

09

Uniqueness

Micro Agent — Uniqueness & Positioning

Differs from Seeds

Micro Agent is closest in philosophy to mini-swe-agent (intentional scope restriction) but is narrower still: it is explicitly NOT a general coding agent but a test-driven code generator. No seed framework makes the test file the primary first-class artifact — seeds focus on specs, requirements, or tasks. Micro Agent inverts this: the test IS the spec, and the agent's only job is to make it pass. The closest seed pattern is kiro (which also uses tests as acceptance criteria), but kiro is a full IDE while micro-agent is an 8-line system prompt. Unlike all seeds, micro-agent explicitly lists what it will NOT do (no multi-file, no installs) as part of its architecture — negative capabilities are design decisions.

Positioning

The minimal viable TDD agent. "Give me a test, I'll write the code that passes it." For teams who want to enforce test-first development at the generation level without a full coding agent harness.

Key Differentiators

  1. 8-line system prompt — the shortest system prompt in the batch
  2. Test = spec: the generated (or user-provided) test file is the specification
  3. Explicit non-goals as architecture: won't do multi-file, won't install packages, won't make broad changes
  4. Visual test mode: can iterate against a screenshot target — unique in the batch
  5. Interactive test review: user can give feedback on the generated test before implementation begins

Observable Failure Modes

  • Single file limitation: many real tasks require multi-file changes; micro-agent can't help
  • No test quality enforcement: if the user's test is bad, micro-agent will generate code that passes a bad test
  • No package installation: if the implementation needs a new dependency, the user must install it manually
  • Maintenance status: last push November 2024; likely in maintenance mode
  • No test-command discovery: the user must know and provide the test runner command

04

Workflow

Micro Agent — Workflow

Interactive Mode Workflow

micro-agent
  → prompt user: "What would you like to build?"
  → generate test file from prompt
  → show test to user, ask for feedback
  → if feedback: update test → continue
  → run test command
  → if pass: done ✓
  → if fail: LLM generates code fix → apply diff → run test → loop

Unit Test Matching Mode Workflow

micro-agent ./file.ts -t "npm test"
  → read existing test file (if it exists)
  → run test command to get current state
  → LLM generates/fixes implementation
  → apply unified diff to file
  → run test command again
  → if pass: done ✓
  → if fail: feed output back to LLM → loop

Visual Matching Mode Workflow

micro-agent ./component.tsx --visual target.png
  → take screenshot of current component
  → compare with target.png
  → LLM generates code adjustments
  → apply diff
  → re-screenshot
  → if match: done ✓
  → if not: feed diff back to LLM → loop

Phases & Artifacts

| Phase | Artifact |
| --- | --- |
| Test generation | Test file (written to testFile path) |
| Implementation generation | Unified diff → applied to outputFile |
| Test execution | Test output (stdout/stderr) |
| Final output | Passing implementation + passing test |

Approval Gates

In interactive mode: user reviews the generated test and can give feedback before iteration begins. No other approval gates — implementation iteration is fully autonomous.
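
The single approval gate can be sketched as a review loop that ends once the user has nothing more to say. The names ReviewDeps and reviewTest are hypothetical, the callbacks are synchronous stand-ins for a terminal prompt and an LLM call, and asking for feedback again after each update is an assumption about how the gate behaves.

```typescript
// Hypothetical stand-ins for the interactive prompt and the LLM calls.
interface ReviewDeps {
  generateTest: (prompt: string) => string;
  askFeedback: (testCode: string) => string; // empty answer means "approved"
  updateTest: (testCode: string, feedback: string) => string;
}

function reviewTest(prompt: string, deps: ReviewDeps): string {
  let testCode = deps.generateTest(prompt);
  for (;;) {
    const feedback = deps.askFeedback(testCode).trim();
    // No feedback: the test is accepted and autonomous iteration can begin.
    if (feedback === "") return testCode;
    testCode = deps.updateTest(testCode, feedback);
  }
}
```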

Explicit Non-Goals (from README)

  • Will NOT install npm/pip packages
  • Will NOT read multiple files
  • Will NOT write to multiple files
  • Will NOT do web searches
  • Will NOT manage git state

06

Memory Context

Micro Agent — Memory & Context

Memory: Single-File Scope

Micro Agent has no cross-session memory and no database. Its only state is:

  1. The target file (outputFile) — the code being generated/fixed
  2. The test file (testFile) — the test being satisfied
  3. The conversation history — in-process message list for the current generation session

Context per Iteration

Each LLM call receives:

  • The system prompt
  • The original user prompt
  • The current test code
  • The test execution output from the last run (if iterating)
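
The four inputs listed above could be assembled into a chat request along these lines. This is a sketch under stated assumptions: buildMessages, IterationContext, and the section labels in the user message are all hypothetical, and the actual message layout used by the tool is not shown in this document.

```typescript
interface Message {
  role: "system" | "user";
  content: string;
}

// Hypothetical container for the per-iteration context listed above.
interface IterationContext {
  systemPrompt: string;
  userPrompt: string;
  testCode: string;
  lastTestOutput?: string; // absent on the first attempt
}

function buildMessages(ctx: IterationContext): Message[] {
  const parts = [`Prompt:\n${ctx.userPrompt}`, `Unit tests:\n${ctx.testCode}`];
  // Only include test output when iterating on a previous failure.
  if (ctx.lastTestOutput !== undefined) {
    parts.push(`Last test output:\n${ctx.lastTestOutput}`);
  }
  return [
    { role: "system", content: ctx.systemPrompt },
    { role: "user", content: parts.join("\n\n") },
  ];
}
```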

No Cross-Session State

Config values (API keys, model) are persisted to a config file (micro-agent config set), but task state does not persist.

Explicit Memory Scope Constraint

The project philosophy explicitly limits scope: "won't read and write multiple files." This means the memory model is single-file by design — broader context injection is an antipattern for this tool.

07

Orchestration

Micro Agent — Orchestration

Multi-Agent: No

Single agent, single file, single loop.

Orchestration Pattern: None

Sequential iteration loop — generate code → run tests → fix code → repeat.

Execution Mode

Interactive loop — runs until tests pass or user cancels.

Isolation: None

File writes happen directly to disk. No sandboxing, no containerization.

Prompt Chaining

Yes — test output from step N becomes part of the prompt for step N+1 (the model sees its own failure output and uses it to generate a fix). This is the fundamental loop.

Multi-Model

Supported but not coordinated — user selects one model per session. No multi-model routing within a session.

08

UI & CLI Surface

Micro Agent — UI & CLI Surface

Dedicated CLI Binary

  • Binary name: micro-agent
  • Install: npm install -g @builder.io/micro-agent
  • Not a thin wrapper: TypeScript agent runtime

CLI Modes

Invocation Mode
micro-agent Interactive: ask prompt, generate test, collect feedback, iterate
micro-agent ./file.ts -t "npm test" Unit test matching: iterate until test passes
micro-agent ./component.tsx --visual img.png Visual matching: iterate until screenshot matches
micro-agent config set KEY=VALUE Configure API keys and model selection

Terminal Output

  • Uses @clack/prompts for styled interactive prompts
  • Shows iteration progress, test output, and generated code diffs
  • Streaming output from LLM (via chunk callbacks in llm.ts)
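
The chunk-callback pattern mentioned in the last bullet can be illustrated with a small helper that forwards each piece of text as it arrives and also returns the full reply. This is a simplified, synchronous stand-in: consumeChunks is a hypothetical name, and a real provider stream would be asynchronous rather than a plain iterable.

```typescript
// Hypothetical helper: call onChunk for every piece and return the joined text.
function consumeChunks(
  chunks: Iterable<string>,
  onChunk: (chunk: string) => void,
): string {
  let full = "";
  for (const chunk of chunks) {
    onChunk(chunk); // e.g. write the partial text to the terminal
    full += chunk;
  }
  return full;
}
```

Keeping the callback separate from the accumulated return value lets the terminal show progress while the caller still receives one complete string to parse.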

No Local Dashboard

No web UI, no TUI browser. CLI output only.
