Micro Agent (Builder.io)

micro-agent · BuilderIO/micro-agent · ★ 4.3k · last commit 2024-11-14

Primitive shape

No installable primitives

Summary

Micro Agent — Summary

Micro Agent (by Builder.io) is a TypeScript CLI, distributed on npm, that implements a deliberately minimal test-driven coding agent: write a test, then iterate on code until all tests pass. It is explicitly scoped to avoid the "Roomba under the table" problem of general coding agents: it does one thing (generate code that passes tests) and nothing else. It supports multiple LLM backends (OpenAI, Claude, Ollama, Groq) via a unified abstraction, can run in interactive mode (asking for feedback on the generated test), and supports both text-based and visual (screenshot) test matching. The system prompt is a single short TypeScript string of roughly eight lines, and the whole codebase is a few hundred lines.

Compared to seeds: closest to mini-swe-agent in intentional scope restriction, but micro-agent is narrower still — it is not a general issue-fixer but specifically a test-driven code generator. Unlike any seed, it makes test generation (not spec generation) the first-class artifact. The closest seed pattern is agent-os (Archetype 4 — minimal by design), but micro-agent has actual code execution.

Overview

Micro Agent — Overview

Origin

Created by Builder.io (Steve Sewell et al.). Repository: BuilderIO/micro-agent. MIT license. ~4307 GitHub stars. Last push: 2024-11-14 (likely in maintenance mode).

Philosophy

From README:

"LLMs are great at giving you broken code, and it can take repeat iteration to get that code to work as expected. So why do this manually when AI can handle not just the generation but also the iteration and fixing?"

"AI agents are cool, but general-purpose coding agents rarely work as hoped or promised. They tend to go haywire with compounding errors. Think of your Roomba getting stuck under a table, x1000."

"The idea of a micro agent is to: (1) Create a definitive test case that can give clear feedback if the code works as intended or not, and (2) Iterate on code until all test cases pass."

"This project is not trying to be an end-to-end developer. AI agents are not capable enough to reliably try to be that yet. This project won't install modules, read and write multiple files, or do anything else that is highly likely to cause havoc when it inevitably fails."

What it explicitly is NOT

Not an end-to-end developer
Won't install modules
Won't read/write multiple files
Won't do anything with high failure risk

Distribution

npm install -g @builder.io/micro-agent
Binary: micro-agent
Requires Node.js v18+

Architecture

Micro Agent — Architecture

Source Structure

src/
├── cli.ts                    # CLI entry point
├── commands/                 # CLI subcommands
├── helpers/
│   ├── systemPrompt.ts       # THE system prompt (~8 lines)
│   ├── generate.ts           # Code generation loop
│   ├── iterate-on-test.ts    # Test iteration logic
│   ├── iterate-on-test-command.ts  # Test command runner
│   ├── llm.ts                # Multi-provider LLM abstraction
│   ├── config.ts             # Config management
│   ├── interactive-mode.ts   # Interactive feedback loop
│   ├── visual-generate.ts    # Screenshot-based testing
│   ├── visual-test.ts        # Visual test execution
│   ├── output-file.ts        # File output handling
│   ├── apply-unified-diff.ts # Diff application
│   └── ...
└── tests/

LLM Provider Support

Provider	How	Model selection
OpenAI	`openai` npm package + optional Azure	`micro-agent config set MODEL=gpt-4o`
Anthropic Claude	`@anthropic-ai/sdk`	`micro-agent config set MODEL=claude`
Ollama	`ollama` npm package	model includes "llama" or "phi"
Groq / custom endpoint	OpenAI-compatible	`OPENAI_API_ENDPOINT` override
Azure OpenAI	AzureOpenAI client	Azure endpoint URL format
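
The selection rules in the table can be read as a small routing function. The sketch below is an illustration only, not the project's `llm.ts`: the names `chooseProvider` and `ProviderChoice` are hypothetical, and the precedence (endpoint override first, then model-name matching) is an assumption that the table does not confirm.

```typescript
// Hypothetical names throughout; this is not the project's llm.ts.
type Provider = "openai" | "anthropic" | "ollama";

interface ProviderChoice {
  provider: Provider;
  // Set only when an OpenAI-compatible endpoint override is supplied.
  baseUrl?: string;
}

function chooseProvider(model: string, endpoint?: string): ProviderChoice {
  // Assumption: an explicit endpoint override wins over name matching,
  // so an OpenAI-compatible host can serve models with any name.
  if (endpoint) {
    return { provider: "openai", baseUrl: endpoint };
  }
  const name = model.toLowerCase();
  if (name.includes("claude")) {
    return { provider: "anthropic" };
  }
  // Per the table, Ollama is used when the model name includes "llama" or "phi".
  if (name.includes("llama") || name.includes("phi")) {
    return { provider: "ollama" };
  }
  // Everything else goes through the OpenAI client.
  return { provider: "openai" };
}
```

Keeping the routing in one pure function makes the rest of the agent independent of which SDK ends up handling the request.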

Operation Modes

Interactive mode: micro-agent — asks user for prompt, generates test, asks for feedback on test, iterates on implementation
Unit test matching: micro-agent ./file.ts -t "npm test" — provide a test command, iterate until tests pass
Visual matching: micro-agent ./component.tsx --visual "screenshot.png" — iterate until visual diff matches screenshot
Config management: micro-agent config set KEY=VALUE

Components

Micro Agent — Components

Core Files

File	Purpose	LOC (approx)
`helpers/systemPrompt.ts`	The entire system prompt — 8-line string	~10
`helpers/generate.ts`	Main code generation + test-run loop	~100
`helpers/iterate-on-test.ts`	Update test based on user feedback	~50
`helpers/iterate-on-test-command.ts`	Run test command, capture output	~30
`helpers/llm.ts`	Multi-provider LLM abstraction (OpenAI, Claude, Ollama)	~200
`helpers/interactive-mode.ts`	Interactive feedback collection	~50
`helpers/visual-generate.ts`	Screenshot-based visual test generation	~50
`helpers/visual-test.ts`	Visual comparison test runner	~50
`helpers/config.ts`	Configuration storage and retrieval	~50
`helpers/apply-unified-diff.ts`	Apply unified diffs to files	~50

CLI Subcommands

Command	Purpose
`micro-agent`	Interactive mode: prompt → generate test → iterate
`micro-agent <file> -t <test-command>`	Unit test matching mode
`micro-agent <file> --visual <screenshot>`	Visual matching mode
`micro-agent config set KEY=VALUE`	Set configuration (API keys, model, endpoint)

Config Keys

Key	Purpose
`OPENAI_KEY`	OpenAI API key
`ANTHROPIC_KEY`	Anthropic API key
`MODEL`	Model name (gpt-4o, claude, llama, etc.)
`OPENAI_API_ENDPOINT`	Custom endpoint (Ollama, Groq, Azure)
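
As a rough illustration of how a `config set KEY=VALUE` argument could be validated against the keys in the table, here is a minimal sketch. The function name `parseConfigAssignment` is hypothetical, and restricting input to these four keys is an assumption; the real CLI may accept more keys and may parse its arguments differently.

```typescript
// The four keys listed in the table above; the real CLI may know others.
const CONFIG_KEYS = ["OPENAI_KEY", "ANTHROPIC_KEY", "MODEL", "OPENAI_API_ENDPOINT"] as const;
type ConfigKey = (typeof CONFIG_KEYS)[number];

// Hypothetical helper: splits "KEY=VALUE" and rejects unknown keys.
function parseConfigAssignment(arg: string): [ConfigKey, string] {
  const eq = arg.indexOf("=");
  if (eq <= 0) {
    throw new Error(`Expected KEY=VALUE, got "${arg}"`);
  }
  const key = arg.slice(0, eq);
  // Split on the first "=" only, so values such as URLs may contain "=".
  const value = arg.slice(eq + 1);
  if (!(CONFIG_KEYS as readonly string[]).includes(key)) {
    throw new Error(`Unknown config key: ${key}`);
  }
  return [key as ConfigKey, value];
}
```

For example, `parseConfigAssignment("MODEL=gpt-4o")` yields the pair `["MODEL", "gpt-4o"]`, while an unlisted key throws.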

The Agent Loop (generate.ts)

1. Generate test from prompt
2. Run test command
3. If tests pass → done
4. Feed test output back to LLM
5. LLM generates new code
6. Apply unified diff to file
7. Go to step 2
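
The run/feedback/regenerate cycle listed above can be sketched with the side effects injected as callbacks. This is an illustrative outline rather than the contents of `generate.ts`: `LoopDeps`, `iterateUntilPassing`, and the `maxRuns` cap are all hypothetical, and the cap is an addition that the listed steps do not mention.

```typescript
interface TestResult {
  passed: boolean;
  output: string;
}

// Hypothetical dependency bundle so the loop itself stays free of I/O.
interface LoopDeps {
  runTests: () => Promise<TestResult>;
  // Receives the failing test output and returns replacement code.
  generateCode: (testOutput: string) => Promise<string>;
  writeCode: (code: string) => Promise<void>;
}

async function iterateUntilPassing(deps: LoopDeps, maxRuns = 10): Promise<boolean> {
  for (let run = 0; run < maxRuns; run++) {
    const result = await deps.runTests();
    if (result.passed) {
      return true; // tests pass: done
    }
    // Feed the test output back to the model and write what it returns.
    const code = await deps.generateCode(result.output);
    await deps.writeCode(code);
  }
  return false; // budget exhausted without a passing run
}
```

Injecting `runTests`, `generateCode`, and `writeCode` keeps the control flow easy to exercise with stubs, without an API key or a real test runner.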

Prompts

Micro Agent — Prompts

The System Prompt (verbatim — 8 lines)

File: src/helpers/systemPrompt.ts

export const systemPrompt = `You take a prompt and existing unit tests and generate the function implementation accordingly.

1. Think step by step about the algorithm, reasoning about the problem and the solution, similar algorithm, the state, data structures and strategy you will use. Explain all that without emitting any code in this step.

2. Emit a markdown code block with production-ready generated code (function that satisfies all the tests and the prompt).
 - Be sure your code exports function that can be called by an external test file.
 - Make sure your code is reusable and not overly hardcoded to match the prompt.
 - Use two spaces for indents. Add logs if helpful for debugging, you will get the log output on your next try to help you debug.
 - Always return a complete code snippet that can execute, nothing partial and never say "rest of your code" or similar, I will copy and paste your code into my file without modification, so it cannot have gaps or parts where you say to put the "rest of the code" back in.
 - Do not emit tests, just the function implementation.

Stop emitting after the code block`;

Prompting technique:

Think-then-code: mandatory chain-of-thought (step 1) before emitting code (step 2)
Concrete constraints: "I will copy and paste your code into my file without modification" — frames the model as a code block generator, not a conversationalist
Explicit prohibition: "never say 'rest of your code' or similar" — addresses a common LLM failure mode
Self-contained output: the complete function, nothing partial
Debug affordance: "Add logs if helpful for debugging, you will get the log output on your next try" — explicitly tells the model that its output feeds back as input
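
Because the prompt asks for reasoning first and a single markdown code block second, the caller has to pull the code out of the reply before writing it to disk. The sketch below shows one way this might be done; `extractCodeBlock` is a hypothetical helper, and taking the last fenced block (in case the reasoning step contains a snippet despite the instruction) is an assumption, not something the source describes.

```typescript
// Built at runtime so this example does not contain a literal fence.
const FENCE = "`".repeat(3);

// Hypothetical helper: returns the body of the last fenced block, or null.
function extractCodeBlock(response: string): string | null {
  // Opening fence with an optional language tag, then a lazy body up to the
  // next fence.
  const pattern = new RegExp(`${FENCE}[^\\n]*\\n([\\s\\S]*?)${FENCE}`, "g");
  let last: string | null = null;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(response)) !== null) {
    last = match[1];
  }
  return last === null ? null : last.trimEnd();
}
```

A reply with no fenced block yields `null`, which a caller could treat as a failed attempt rather than writing prose into the target file.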

Test Generation Prompt

When generating tests (from iterate-on-test.ts):

{
  role: 'system',
  content: 'You return code for a unit test only. No other words, just the code',
},
{
  role: 'user',
  content: dedent`
    Here is a unit test file generated from the following prompt
    <prompt>${options.prompt}</prompt>
    ...
    The user has given you this feedback on the test. Please update (or completely rewrite, if needed) the test based on the feedback.
    <feedback>${feedback}</feedback>
  `,
}

Prompting technique: Minimal system prompt ("just the code") + XML-tagged user content with clear feedback slot. Short and unambiguous.

Uniqueness

Micro Agent — Uniqueness & Positioning

Differs from Seeds

Micro Agent is closest in philosophy to mini-swe-agent (intentional scope restriction) but is narrower still: it is explicitly NOT a general coding agent but a test-driven code generator. No seed framework makes the test file the primary first-class artifact — seeds focus on specs, requirements, or tasks. Micro Agent inverts this: the test IS the spec, and the agent's only job is to make it pass. The closest seed pattern is kiro (which also uses tests as acceptance criteria), but kiro is a full IDE while micro-agent is an 8-line system prompt. Unlike all seeds, micro-agent explicitly lists what it will NOT do (no multi-file, no installs) as part of its architecture — negative capabilities are design decisions.

Positioning

The minimal viable TDD agent. "Give me a test, I'll write the code that passes it." For teams who want to enforce test-first development at the generation level without a full coding agent harness.

Key Differentiators

8-line system prompt — the shortest system prompt in the batch
Test = spec: the generated (or user-provided) test file is the specification
Explicit non-goals as architecture: won't do multi-file, won't install packages, won't make broad changes
Visual test mode: can iterate against a screenshot target — unique in the batch
Interactive test review: user can give feedback on the generated test before implementation begins

Observable Failure Modes

Single file limitation: many real tasks require multi-file changes; micro-agent can't help
No test quality enforcement: if the user's test is bad, micro-agent will generate code that passes a bad test
No package installation: if the implementation needs a new dependency, the user must install it manually
Maintenance status: last push November 2024; likely in maintenance mode
Test command is user-supplied: the user must know and provide the test runner command

Workflow

Micro Agent — Workflow

Interactive Mode Workflow

micro-agent
  → prompt user: "What would you like to build?"
  → generate test file from prompt
  → show test to user, ask for feedback
  → if feedback: update test → continue
  → run test command
  → if pass: done ✓
  → if fail: LLM generates code fix → apply diff → run test → loop

Unit Test Matching Mode Workflow

micro-agent ./file.ts -t "npm test"
  → read existing test file (if exists)
  → run test command to get current state
  → LLM generates/fixes implementation
  → apply unified diff to file
  → run test command again
  → if pass: done ✓
  → if fail: feed output back to LLM → loop
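
The "run test command" steps in this workflow amount to executing a shell command and capturing its exit status and output. A minimal sketch using Node's built-in `child_process` module follows; `runTestCommand` and `TestRun` are hypothetical names, and this is not a copy of `iterate-on-test-command.ts`.

```typescript
import { spawnSync } from "node:child_process";

interface TestRun {
  passed: boolean;
  output: string;
}

// Hypothetical helper: runs a test command and reports pass/fail plus output.
function runTestCommand(command: string, cwd: string = process.cwd()): TestRun {
  // shell: true lets the caller pass a full command line such as "npm test".
  const result = spawnSync(command, { shell: true, cwd, encoding: "utf8" });
  // Concatenate stdout and stderr so the model sees failures and logs together.
  const output = `${result.stdout ?? ""}${result.stderr ?? ""}`;
  // status is null when the process was killed by a signal; treat that as a failure.
  return { passed: result.status === 0, output };
}
```

The captured `output` is what gets fed back to the LLM on a failing run, which is why stderr is included alongside stdout.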

Visual Matching Mode Workflow

micro-agent ./component.tsx --visual target.png
  → take screenshot of current component
  → compare with target.png
  → LLM generates code adjustments
  → apply diff
  → re-screenshot
  → if match: done ✓
  → if not: feed diff back to LLM → loop

Phases & Artifacts

Phase	Artifact
Test generation	Test file (written to `testFile` path)
Implementation generation	Unified diff → applied to `outputFile`
Test execution	Test output (stdout/stderr)
Final output	Passing implementation + passing test
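
The "unified diff → applied to `outputFile`" phase can be illustrated with a deliberately simplified hunk applier. This is a sketch under stated assumptions, not the project's `apply-unified-diff.ts`: it handles a single-file diff, does no fuzzy matching, and throws when a hunk does not match the source exactly. The function name is hypothetical.

```typescript
// Applies a single-file unified diff to `original` and returns the new text.
function applyUnifiedDiff(original: string, diff: string): string {
  const source = original.split("\n");
  const diffLines = diff.split("\n");
  const output: string[] = [];
  let cursor = 0; // next unread index in `source`
  let i = 0;

  while (i < diffLines.length) {
    const header = /^@@ -(\d+)(?:,(\d+))? \+\d+(?:,(\d+))? @@/.exec(diffLines[i]);
    i++;
    if (!header) continue; // file headers ("---", "+++") and stray lines

    // Omitted counts default to 1 in the unified format.
    let oldLeft = header[2] === undefined ? 1 : Number(header[2]);
    let newLeft = header[3] === undefined ? 1 : Number(header[3]);
    // For a pure insertion (old count 0) the header names the line before the hunk.
    const hunkStart = oldLeft === 0 ? Number(header[1]) : Number(header[1]) - 1;

    // Copy untouched lines that precede this hunk.
    while (cursor < hunkStart) output.push(source[cursor++]);

    while ((oldLeft > 0 || newLeft > 0) && i < diffLines.length) {
      const line = diffLines[i++];
      const marker = line[0];
      const text = line.slice(1);
      if (marker === "\\") continue; // "\ No newline at end of file"
      if (marker === "+") {
        output.push(text);
        newLeft--;
      } else {
        // "-" removes a line; " " (or an empty line) is context.
        if (source[cursor] !== text) {
          throw new Error(`Hunk does not match source at line ${cursor + 1}`);
        }
        if (marker !== "-") {
          output.push(text);
          newLeft--;
        }
        cursor++;
        oldLeft--;
      }
    }
  }

  // Copy whatever follows the last hunk.
  while (cursor < source.length) output.push(source[cursor++]);
  return output.join("\n");
}
```

Failing loudly on a mismatch is a design choice for the sketch: a diff that silently applies to the wrong lines would corrupt the only file the agent is allowed to touch.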

Approval Gates

In interactive mode: user reviews the generated test and can give feedback before iteration begins. No other approval gates — implementation iteration is fully autonomous.

Explicit Non-Goals (from README)

Will NOT install npm/pip packages
Will NOT read multiple files
Will NOT write to multiple files
Will NOT do web searches
Will NOT manage git state

Memory Context

Micro Agent — Memory & Context

Memory: Single-File Scope

Micro Agent has no cross-session memory and no database. Its only state is:

The target file (outputFile) — the code being generated/fixed
The test file (testFile) — the test being satisfied
The conversation history — in-process message list for the current generation session

Context per Iteration

Each LLM call receives:

The system prompt
The original user prompt
The current test code
The test execution output from the last run (if iterating)
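
The four inputs listed above can be assembled into a chat-style message list along the following lines. This is a sketch only: the interface names, the function name, and the tag names `<prompt>`, `<unit-tests>`, and `<test-output>` are assumptions made for illustration, not the layout the tool actually sends.

```typescript
interface Message {
  role: "system" | "user";
  content: string;
}

// Hypothetical container for the per-iteration inputs listed above.
interface IterationContext {
  systemPrompt: string;
  userPrompt: string;
  testCode: string;
  lastTestOutput?: string; // absent on the first attempt
}

function buildIterationMessages(ctx: IterationContext): Message[] {
  const parts = [
    `<prompt>\n${ctx.userPrompt}\n</prompt>`,
    `<unit-tests>\n${ctx.testCode}\n</unit-tests>`,
  ];
  // Only include test output once there has been a failing run.
  if (ctx.lastTestOutput !== undefined) {
    parts.push(`<test-output>\n${ctx.lastTestOutput}\n</test-output>`);
  }
  return [
    { role: "system", content: ctx.systemPrompt },
    { role: "user", content: parts.join("\n\n") },
  ];
}
```

Nothing in this structure refers to other files or earlier sessions, which mirrors the single-file scope described in this section.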

No Cross-Session State

Config values (API keys, model) are persisted to a config file (micro-agent config set), but task state does not persist.

Explicit Memory Scope Constraint

The project philosophy explicitly limits scope: "won't read and write multiple files." This means the memory model is single-file by design — broader context injection is an antipattern for this tool.

Orchestration

Micro Agent — Orchestration

Multi-Agent: No

Single agent, single file, single loop.

Orchestration Pattern: None

Sequential iteration loop — generate code → run tests → fix code → repeat.

Execution Mode

Interactive loop — runs until tests pass or user cancels.

Isolation: None

File writes happen directly to disk. No sandboxing, no containerization.

Prompt Chaining

Yes — test output from step N becomes part of the prompt for step N+1 (the model sees its own failure output and uses it to generate a fix). This is the fundamental loop.

Multi-Model

Supported but not coordinated — user selects one model per session. No multi-model routing within a session.

UI CLI Surface

Micro Agent — UI & CLI Surface

Dedicated CLI Binary

Binary name: micro-agent
Install: npm install -g @builder.io/micro-agent
Not a thin wrapper: TypeScript agent runtime

CLI Modes

Invocation	Mode
`micro-agent`	Interactive: ask prompt, generate test, collect feedback, iterate
`micro-agent ./file.ts -t "npm test"`	Unit test matching: iterate until test passes
`micro-agent ./component.tsx --visual img.png`	Visual matching: iterate until screenshot matches
`micro-agent config set KEY=VALUE`	Configure API keys and model selection

Terminal Output

Uses @clack/prompts for styled interactive prompts
Shows iteration progress, test output, and generated code diffs
Streaming output from LLM (via chunk callbacks in llm.ts)

No Local Dashboard

No web UI, no TUI browser. CLI output only.

Distribution

Type: npm-package
License: MIT
Install: npm-install
Version: main (2024-11-14)

Surfaces

CLI binary: micro-agent
CLI subcmds: 3
Local UI: No
Tech stack: TypeScript, Node.js

Components

Commands: 0
Skills: 0
Subagents: 0
Hooks: 0
MCP servers: 0
MCP tools: 0
Scripts: 0
Templates: 0

Workflow

Phases: 5
Approval gates: 1
Spec format: none
Spec storage: none
Delta or full: delta-diff

Orchestration

Multi-agent: No
Pattern: none
Max concurrent: 1
Isolation: none
Consensus: none
Prompt chaining: Yes

Multi-model

Multi-model: No
BYOK: Yes
Locked to: none (OpenAI, Anthropic, Ollama, Groq via config)
Modal: text+vision

Execution

Mode: interactive-loop
Crash recovery: No
Compaction: No
Session handoff: No
Streaming: Yes

Memory

Type: none
Persistence: none
Search: none
State files: 2 files

Quality

TDD: Yes
TDD mechanism: post-hook-test-runner
Validators: 1
Self-review: none

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: No
Audit format: none
Replay: No

Tools

Primary: micro-agent-cli
Targets: 2
Portability: high

Signals

Stars: 4.3k
Last commit: 2024-11-14
Maintainer: dormant
Quality score: 2.4/10