Brood Box

brood-box · stacklok/brood-box · ★ 36 · last commit 2026-05-25

Primitive shape
No installable primitives
00

Summary

Brood Box — Summary

Brood Box (bbox) is a CLI tool from Stacklok that runs any coding agent (Claude Code, Codex, OpenCode, Gemini, Hermes) inside a hardware-isolated microVM, providing hypervisor-backed isolation (KVM on Linux, Hypervisor.framework on macOS Apple Silicon) rather than container-based isolation. The workspace is snapshotted copy-on-write before the agent runs, so the agent never touches real files; when the session ends, a SHA-256 diff is computed and changes are flushed back, after interactive per-file review when --review is passed. Security layers include an egress firewall with three profiles (permissive/standard/locked), ephemeral per-session SSH keys, automatic exclusion of sensitive file patterns, and Cedar-based MCP authorization. It also proxies ToolHive MCP servers into the VM, forwarding credentials and SSH agents without exposing them to the guest.

Brood Box is not a coding agent framework itself: it sits one layer below the agent loop as an execution substrate, which makes it fundamentally different from all 11 seed frameworks. While seeds like superpowers, BMAD, or claude-flow augment how an agent behaves (via skills, hooks, personas), Brood Box augments where the agent runs and what it can affect, giving every agent (regardless of internal methodology) hardware-level isolation and workspace protection.

01

Overview

Brood Box — Overview

Origin

Brood Box was created by Stacklok, Inc. (a company focused on supply-chain and AI security) as a response to the security risks inherent in giving coding agents direct workspace access. The first public commit was in 2025, and as of May 2026 it is marked "EXPERIMENTAL" with active development.

Philosophy

The project's README states the problem directly: "Coding agents are powerful, but they need access to your workspace, your API keys, and the ability to run arbitrary code. That's a lot of trust to hand over. Containers help, but they share the host kernel. One escape and you're done."

Brood Box's answer is hardware virtualization: each agent session gets a real KVM microVM (via libkrun), not a container with a shared kernel.

Key design principles from the CLAUDE.md and ARCHITECTURE.md:

  • Zero persistent state: "Each session is fully ephemeral; nothing lingers after cleanup"
  • Review-first: COW workspace, diff on exit, interactive approve/reject per file
  • Non-overridable security patterns: .env*, *.pem, .ssh/, .aws/ are always excluded even if .broodboxignore negates them
  • Tighten-only merge: workspace-local config can only restrict, never widen, global security settings
  • DDD layered architecture: strict domain/application/infrastructure separation with no I/O in domain layer
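The non-overridable exclusion rule can be illustrated with a minimal Python sketch. It is not Brood Box's Go implementation: the matcher is a simplified stand-in for gitignore semantics, the function names are hypothetical, and the pattern list contains only the four patterns the source names (the real list is longer).

```python
from fnmatch import fnmatch

# Only the patterns named in the source; the real list continues ("etc.").
SECURITY_PATTERNS = [".env*", "*.pem", ".ssh/", ".aws/"]


def _matches(path: str, pattern: str) -> bool:
    """Match a relative path against one simplified gitignore-style pattern."""
    parts = path.split("/")
    if pattern.endswith("/"):
        # Directory pattern: compare against directory components only.
        return any(fnmatch(part, pattern[:-1]) for part in parts[:-1])
    # Name pattern: may match a component at any depth.
    return any(fnmatch(part, pattern) for part in parts)


def is_excluded(path: str, user_patterns: list) -> bool:
    """Decide exclusion; security patterns win before user patterns are read."""
    if any(_matches(path, p) for p in SECURITY_PATTERNS):
        return True  # a later "!pattern" cannot undo this
    excluded = False
    for pattern in user_patterns:  # later patterns override earlier ones
        negate = pattern.startswith("!")
        body = pattern[1:] if negate else pattern
        if _matches(path, body):
            excluded = not negate
    return excluded
```

In this sketch a .broodboxignore line such as `!.env*` has no effect, because the security check returns before user patterns are consulted; ordinary patterns such as `*.log` followed by `!keep.log` still negate as expected.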

Architecture Philosophy

"Hardware isolation with the feel of a local terminal"

The workflow is designed to feel like a thin wrapper: run bbox claude-code and you get a full interactive agent session. The VM boots from an OCI image, mounts a COW workspace via virtio-fs, and runs a custom Go PID-1 init binary that handles networking, SSH, and the agent startup command.

02

Architecture

Brood Box — Architecture

Distribution

  • Type: Standalone binary CLI (bbox) written in Go
  • Install: Download pre-built release tarball from GitHub Releases and move to /usr/local/bin; or task build from source
  • Build: Pure Go, CGO_ENABLED=0; the embedded go-microvm runtime was pre-compiled with CGO elsewhere. Self-contained binary with bbox-init (guest PID 1) embedded.
  • Platform: Linux with /dev/kvm (KVM) or macOS Apple Silicon (Hypervisor.framework)
  • Runtime deps: None except libkrunfw firmware (downloaded and cached at ~/.cache/broodbox/firmware/ on first run); no system libkrun-devel needed

Directory Tree

brood-box/
├── cmd/
│   ├── bbox/main.go        — composition root, Cobra CLI
│   └── bbox-init/          — guest VM PID 1 init binary (Linux only)
├── pkg/
│   ├── domain/             — pure types + interfaces (no I/O)
│   │   ├── agent/          — Agent value object, env forwarding
│   │   ├── config/         — Config types, merge logic
│   │   ├── vm/             — VMRunner/VM interfaces
│   │   ├── snapshot/       — FileChange, Differ, Reviewer, Flusher interfaces
│   │   ├── egress/         — DNS-aware egress policy
│   │   └── settings/       — host-to-guest settings injection interface
│   ├── sandbox/            — SandboxRunner application orchestrator
│   └── runtime/            — public factory, wires default infra for SDK consumers
├── internal/infra/
│   ├── vm/                 — go-microvm VMRunner (KVM backend)
│   ├── ssh/                — interactive PTY terminal session
│   ├── workspace/          — COW cloning (FICLONE/clonefile)
│   ├── diff/               — SHA-256 file differ
│   ├── review/             — per-file interactive reviewer + flusher
│   ├── mcp/                — ToolHive vmcp proxy + Cedar authz profiles
│   ├── exclude/            — gitignore-style pattern matching
│   └── settings/           — FSInjector (host → guest file copy)
├── images/                 — Dockerfile definitions for agent OCI images
├── docs/
│   ├── ARCHITECTURE.md
│   ├── USER_GUIDE.md
│   └── DEVELOPMENT.md
└── Taskfile.yaml           — build system (never invoke go directly)

Config Files

  • ~/.config/broodbox/config.yaml — global config (CPUs, memory, egress profiles, MCP settings, agent env forwarding)
  • .broodbox.yaml in workspace root — per-project overrides (tighten-only: cannot widen egress or disable review)
  • .broodboxignore — gitignore-syntax exclude patterns
  • Config precedence: CLI flags > per-workspace > global
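The tighten-only rule for workspace config can be sketched as follows. This is an illustrative Python model, not the project's merge code: the function names are hypothetical, and silently ignoring a widening attempt is an assumption (the real tool may warn or fail instead).

```python
from typing import Optional

# Strictness order implied by the three egress profile names.
STRICTNESS = {"permissive": 0, "standard": 1, "locked": 2}


def merge_egress(global_profile: str, workspace_profile: Optional[str]) -> str:
    """Return the effective egress profile; the workspace may only tighten."""
    if workspace_profile is None:
        return global_profile
    if STRICTNESS[workspace_profile] >= STRICTNESS[global_profile]:
        return workspace_profile
    return global_profile  # assumption: a widening attempt is simply ignored


def effective_review(global_enabled: bool,
                     workspace_enabled: Optional[bool],
                     cli_enabled: Optional[bool] = None) -> bool:
    """review.enabled from the workspace file is ignored; CLI beats global."""
    del workspace_enabled  # deliberately unused: untrusted repos cannot set it
    return global_enabled if cli_enabled is None else cli_enabled
```

With a global profile of standard, a workspace asking for locked gets locked, while a workspace asking for permissive stays on standard.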

Supported Agents

| Agent | Command | Default Resources |
| --- | --- | --- |
| Claude Code | bbox claude-code | 2 vCPUs, 4 GiB |
| Codex | bbox codex | 2 vCPUs, 4 GiB |
| OpenCode | bbox opencode | 2 vCPUs, 4 GiB |
| Gemini CLI | bbox gemini | 2 vCPUs, 4 GiB |
| Hermes | bbox hermes | 2 vCPUs, 4 GiB |

Custom agents can be defined in config.yaml with a custom OCI image.

Target AI Tools

Claude Code, Codex, OpenCode, Gemini CLI, Hermes, plus any custom agent — all wrapped at the execution layer, not the agent-instruction layer.

VM Execution Flow

bbox claude-code
  → Create COW snapshot (FICLONE on Linux, clonefile on macOS)
  → Pull OCI image; extract rootfs; inject bbox-init + SSH keys
  → Boot microVM (libkrun/KVM) with virtio-fs workspace mount
  → Guest: bbox-init as PID 1 → network → SSH → exec agent
  → Interactive SSH PTY session
  → Agent exits → VM stopped explicitly
  → SHA-256 diff → per-file review → flush accepted → cleanup snapshot
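
The first step of the flow, a copy-on-write clone, can be approximated per file on Linux with the FICLONE ioctl. The sketch below is Python rather than the project's Go code; the function name is hypothetical, and the fallback to a full copy on filesystems without reflink support is an assumption about reasonable behaviour, not something the source states.

```python
import fcntl
import shutil

# Linux ioctl request for a reflink clone: _IOW(0x94, 9, int).
FICLONE = 0x40049409


def clone_file(src: str, dst: str) -> str:
    """Reflink src to dst where the filesystem supports it, else copy bytes."""
    try:
        with open(src, "rb") as source, open(dst, "wb") as target:
            # Ask the kernel to share data blocks between the two files.
            fcntl.ioctl(target.fileno(), FICLONE, source.fileno())
        return "reflink"
    except OSError:
        # Not supported here (e.g. ext4, tmpfs, non-Linux): do a regular copy.
        shutil.copyfile(src, dst)
        return "copy"
```

A reflinked file shares blocks with the original until one side is written, which is what keeps snapshotting a large workspace cheap.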
03

Components

Brood Box — Components

CLI Commands (bbox subcommands)

| Command | Purpose |
| --- | --- |
| bbox claude-code | Boot Claude Code in microVM with COW workspace |
| bbox codex | Boot Codex in microVM |
| bbox opencode | Boot OpenCode in microVM |
| bbox gemini | Boot Gemini CLI in microVM |
| bbox hermes | Boot Hermes agent in microVM |
| bbox list | List available agent configurations |

CLI Flags (key, shared across agent commands)

| Flag | Purpose |
| --- | --- |
| --cpus N | Override vCPU count (default: 2) |
| --memory N | Override RAM in MiB (default: 4096) |
| --workspace PATH | Use different workspace directory |
| --review | Enable interactive per-file diff review on exit |
| --exclude PATTERN | Add gitignore-style exclude patterns |
| --workspace-mode=direct | Skip COW; agent writes directly (requires --yes) |
| --egress-profile PROFILE | permissive / standard / locked |
| --allow-host HOST:PORT | Add host to egress allowlist (DNS only) |
| --no-mcp | Disable ToolHive MCP proxy |
| --mcp-group NAME | Use specific ToolHive group for MCP servers |
| --mcp-authz-profile PROFILE | full-access / observe / safe-tools / custom |
| --mcp-config PATH | Custom Cedar MCP authorization policies |
| --timings | Print per-phase timing summary to stderr |
| --trace | Write OTel trace JSON to VM data dir |

Infrastructure Components (internal)

| Component | Type | Purpose |
| --- | --- | --- |
| SandboxRunner | Application service | Orchestrates full lifecycle: snapshot → VM → terminal → stop → diff → review → flush |
| MicroVMRunner | Infrastructure | go-microvm VMRunner impl using KVM/Hypervisor.framework |
| InteractiveSession | Infrastructure | PTY-forwarded SSH session with SIGWINCH handling |
| WorkspaceCloner | Infrastructure | COW snapshot via FICLONE (Linux) or clonefile (macOS) |
| Differ | Infrastructure | SHA-256-based file change detection |
| Reviewer | Infrastructure | Interactive per-file diff display and accept/reject |
| Flusher | Infrastructure | Hash-re-verified file flush from snapshot to workspace |
| VMCPProvider | Infrastructure | ToolHive vmcp proxy aggregating MCP servers |
| FSInjector | Infrastructure | Copies host settings/skills/config into guest rootfs |
| ExcludeConfig | Domain | Non-overridable + overridable pattern matching |
| EgressPolicy | Domain | DNS-aware egress rules per profile |
| bbox-init | Guest binary | PID 1 in VM: mounts, networking, SSH server, reaper |

MCP Authorization Profiles

| Profile | What agent can do |
| --- | --- |
| full-access (default) | Everything, no restriction |
| observe | List + read tools/prompts/resources only |
| safe-tools | Above + non-destructive, non-open-world tools |
| custom | Operator-defined Cedar policies from config |
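
The three built-in profiles can be read as a decision function over MCP requests. The Python sketch below is an analogue only: Brood Box expresses these rules as Cedar policies, and the `ToolCall` type, the operation names, and the mapping of "non-destructive, non-open-world" onto MCP's `destructiveHint` and `openWorldHint` tool annotations are assumptions made for illustration.

```python
from dataclasses import dataclass

READ_OPERATIONS = {"list", "read"}  # hypothetical operation names


@dataclass
class ToolCall:
    operation: str            # "list", "read", or "call"
    destructive: bool = True  # conservative default, as for destructiveHint
    open_world: bool = True   # conservative default, as for openWorldHint


def is_allowed(profile: str, call: ToolCall) -> bool:
    """Approximate the built-in profiles; "custom" is out of scope here."""
    if profile == "full-access":
        return True
    if profile not in ("observe", "safe-tools"):
        raise ValueError(f"profile {profile!r} needs operator-defined policies")
    if call.operation in READ_OPERATIONS:
        return True  # both restricted profiles allow list and read
    if profile == "safe-tools":
        # Tool calls pass only when explicitly marked safe on both axes.
        return (call.operation == "call"
                and not call.destructive
                and not call.open_world)
    return False  # observe denies every tool call
```

Defaulting both flags to True means an unannotated tool is treated as unsafe, so safe-tools only admits tools that opt in.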

OCI Images (GitHub Container Registry: ghcr.io/stacklok/brood-box/)

  • claude-code — Claude Code agent image
  • codex — Codex agent image
  • opencode — OpenCode agent image
  • hermes — Hermes agent image
  • gemini — Gemini CLI image

Egress Profiles

| Profile | Allowed |
| --- | --- |
| permissive | All outbound traffic |
| standard | LLM provider + GitHub, npm, PyPI, Go proxy, Docker Hub, GHCR |
| locked | LLM provider API only (e.g. api.anthropic.com) |
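
A hostname allowlist check for the three profiles might look like the sketch below. Only api.anthropic.com comes from the source; the other hostnames are assumed stand-ins for the services the standard profile names, the function name is hypothetical, and the real policy is DNS-aware and enforced at the VM network level, which this exact-match lookup does not model.

```python
# The one hostname the source gives for the LLM provider.
LLM_HOSTS = {"api.anthropic.com"}

# Assumed hostnames for GitHub, npm, PyPI, Go proxy, Docker Hub, and GHCR.
STANDARD_EXTRA = {
    "github.com",
    "registry.npmjs.org",
    "pypi.org",
    "proxy.golang.org",
    "registry-1.docker.io",
    "ghcr.io",
}


def is_egress_allowed(profile: str, host: str, extra_hosts=()) -> bool:
    """Return True if host may be contacted under the given profile."""
    if profile == "permissive":
        return True
    allowed = set(LLM_HOSTS) | set(extra_hosts)  # extra_hosts models --allow-host
    if profile == "standard":
        allowed |= STANDARD_EXTRA
    elif profile != "locked":
        raise ValueError(f"unknown egress profile: {profile!r}")
    # Normalise case and a trailing dot before the exact-match lookup.
    return host.lower().rstrip(".") in allowed
```

Ports from --allow-host HOST:PORT are left out of the sketch for brevity.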

Scripts / Task Targets

Build system uses task (Taskfile.yaml):

  • task build — self-contained bbox binary
  • task build-init — guest PID 1 binary
  • task test — go test with race detector
  • task lint — golangci-lint
  • task verify — fmt + lint + test
  • task image-all — build all agent OCI images
  • task image-push — push images to GHCR
05

Prompts

Brood Box — Prompts

Brood Box does not ship agent prompt files, skills, or slash-commands. It operates as an execution substrate below the agent loop — it controls the environment the agent runs in, not the instructions the agent receives.

The only "prompt-adjacent" text reaches the guest VM via FSInjector, which copies the host's CLAUDE.md, skill files, and config files into the guest rootfs. This means whatever instructions the host agent has (e.g., a Claude Code configuration with skills) are forwarded into the VM. The content of those instructions is determined by the host configuration, not by Brood Box itself.

Verbatim: CLAUDE.md (Developer Instructions, Not Agent Prompts)

The CLAUDE.md in the repository serves as development instructions for AI coding agents contributing to Brood Box itself, not as runtime agent prompts. Excerpt from the architecture section:

## Architecture — Strict DDD (Domain-Driven Design)

This project follows DDD layered architecture with dependency injection **strictly and 
without exception**. Every new type, interface, and function MUST be placed in the 
correct layer. Violating layer boundaries is a blocking issue — do not merge code 
that breaks these rules.

**Domain** (`pkg/domain/`) — Pure types and interfaces. ZERO I/O, ZERO external 
dependencies, ZERO side effects.

Technique: Imperative rule specification (DDD boundaries as non-negotiable invariants). This is a repo contributor CLAUDE.md, not a framework skill.

Verbatim: Security Exclusion Configuration

From CLAUDE.md, security-sensitive pattern list (injected as domain constants, not prompts):

Security-sensitive patterns (`.env*`, `*.pem`, `.ssh/`, `.aws/`, etc.) are **always 
excluded** and cannot be negated.

The agent never receives instructions about security; rather, the infrastructure itself enforces exclusion before the agent ever sees the workspace.

Assessment

Brood Box has zero agent-facing prompt files. All framework behavior is enforced at the infrastructure layer (VM config, firewall rules, file exclusion patterns, hash verification). This is the defining characteristic of an execution substrate: it provides safety guarantees that cannot be bypassed by agent instruction manipulation.

09

Uniqueness

Brood Box — Uniqueness & Positioning

differs_from_seeds

Brood Box is architecturally unlike all 11 seed frameworks. The seeds (superpowers, openspec, BMAD-METHOD, claude-flow, taskmaster-ai, agent-os, kiro, ccmemory, claude-conductor, spec-driver, spec-kit) all operate within the agent loop — they augment how an agent behaves by injecting skills, commands, hooks, MCP tools, or persona files. Brood Box operates below the agent loop: it wraps any agent in a hardware-isolated microVM, providing a security substrate that is orthogonal to whatever methodology or skill-pack the agent uses. The closest seed analogy is the isolation_mechanism field: all seeds use none, git-worktree, or container; Brood Box uses microvm. It is not a coding methodology but an execution environment — the difference is between "what should the agent do" (seeds) and "where can the agent safely do it" (Brood Box).

Positioning

Brood Box occupies the "security substrate" layer:

[ User ] → [ bbox ] → [ microVM ] → [ Claude Code / Codex / OpenCode / Gemini ]
                                              ↕
                               [ COW workspace snapshot ]

No seed framework sits at this layer. The closest comparable products are E2B and Daytona (cloud sandbox APIs), microsandbox (local microVM SDK), and Arrakis (self-hosted microVM REST API) — all in this same batch rather than in the seeds.

Distinctive Opinion

The workspace should never be modified directly by an agent; every session should be ephemeral and every change should require explicit human review before landing.

Explicit Antipatterns

  • Containers with shared kernel (mentioned explicitly as insufficient)
  • Per-workspace config widening global security settings
  • --workspace-mode=direct as default (only allowed with explicit --yes)
  • Custom Cedar policies coming from workspace config (would allow untrusted repos to define authz)
  • Running raw go build / go test instead of task (builds miss critical flags)

Observable Failure Modes

  1. Platform constraints: Linux requires /dev/kvm; macOS requires Apple Silicon; no Windows support
  2. First-run latency: Firmware download on first use can block if network unavailable
  3. go-microvm tag dependency: Release binaries for go-microvm must exist for task fetch-runtime to succeed
  4. OCI image freshness: Agent images are :latest-only, rebuilt weekly; no version pinning for agent images
  5. macOS entitlements: go-microvm-runner must be code-signed with assets/entitlements.plist; unsigned builds fail
  6. EXPERIMENTAL status: APIs, CLI flags, config format all unstable between releases

Cross-References

  • Uses ToolHive (also from Stacklok) for MCP server discovery and vmcp proxy
  • Uses go-microvm (also from Stacklok) as the libkrun Go wrapper
  • Cedar authorization uses Cedar, the open-source policy language from AWS (cedarpolicy.com)
04

Workflow

Brood Box — Workflow

Execution Phases

| Phase | What Happens | Artifact |
| --- | --- | --- |
| 1. Snapshot | COW clone of workspace (FICLONE / clonefile) | ~/.local/share/broodbox/<session>/snapshot/ |
| 2. Image Prep | Pull OCI image; extract rootfs; inject bbox-init + SSH keys + settings | Temp rootfs directory |
| 3. VM Boot | libkrun starts microVM; bbox-init as PID 1 | Running VM with virtio-fs workspace mount |
| 4. Agent Session | Interactive SSH PTY session with agent | Agent output in terminal |
| 5. VM Stop | Explicit stop before diff (prevents TOCTOU) | Stopped VM |
| 6. Diff | SHA-256 hash index comparison snapshot vs original | List of FileChange records with unified diffs |
| 7. Review | Per-file interactive accept/reject (if --review) or auto-accept | ReviewResult |
| 8. Flush | Hash re-verified copy of accepted changes to real workspace | Modified files in original workspace |
| 9. Cleanup | Remove snapshot, VM data dir (unless BBOX_KEEP_VM_DATA=1) | Clean state |
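
The diff phase (a SHA-256 hash index of the snapshot compared against the original) can be sketched in a few lines of Python. This is an illustration of the technique, not the project's Differ: the function names are hypothetical, exclusion patterns are not applied, and the sketch reports only the change kind rather than producing unified diffs.

```python
import hashlib
from pathlib import Path


def hash_index(root: Path) -> dict:
    """Map each file's relative POSIX path to the SHA-256 of its contents."""
    index = {}
    for path in root.rglob("*"):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            index[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return index


def diff_trees(original: Path, snapshot: Path) -> dict:
    """Classify every path whose hash differs between the two trees."""
    before, after = hash_index(original), hash_index(snapshot)
    changes = {}
    for rel in before.keys() | after.keys():
        if rel not in before:
            changes[rel] = "added"
        elif rel not in after:
            changes[rel] = "deleted"
        elif before[rel] != after[rel]:
            changes[rel] = "modified"
    return changes
```

Comparing content hashes instead of timestamps means a file the agent rewrote with identical bytes does not show up as a change.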

Approval Gates

There is exactly one approval gate: the per-file diff review step (phase 7). It activates only if the --review flag is passed; without --review, all changes are auto-accepted. The gate is interactive: the user sees each file's unified diff and accepts or rejects it.

In --workspace-mode=direct, snapshot isolation is skipped entirely and no review gate exists.

Security Enforcement Points

  • Config-time: review.enabled in per-workspace .broodbox.yaml is ignored (only global/CLI can enable/disable)
  • Snapshot-time: Security-sensitive patterns are always excluded before COW snapshot is created
  • Egress: DNS-based firewall applied at VM network level; per-workspace config cannot widen global profile
  • MCP-time: Cedar-based authz profile enforced at MCP proxy layer; custom profile cannot come from workspace config

Workspace Mode Comparison

| Mode | Isolation | Review | Credential Sanitization |
| --- | --- | --- | --- |
| snapshot (default) | COW snapshot | Optional (--review) | Yes (git config sanitized) |
| direct | None | Not available | Skipped |
06

Memory Context

Brood Box — Memory & Context

State Model: Fully Ephemeral

Brood Box explicitly enforces zero persistent state as a security property:

"Each session is fully ephemeral; nothing lingers after cleanup"

State Files Written During a Session

| Path | Purpose | Lifecycle |
| --- | --- | --- |
| ~/.cache/broodbox/firmware/ | libkrunfw firmware cache | Persistent across sessions (download once) |
| ~/.local/share/broodbox/<session>/snapshot/ | COW workspace snapshot | Deleted on session cleanup |
| ~/.local/share/broodbox/<session>/vm-data/ | VM console.log, vm.log, rootfs-work | Deleted on cleanup (unless BBOX_KEEP_VM_DATA=1) |
| ~/.local/share/broodbox/<session>/trace.json | OTel trace spans (if --trace) | Deleted on cleanup |
| ~/.config/broodbox/config.yaml | Global user config | Permanent |
| .broodbox.yaml | Per-workspace config | Permanent (user-managed) |
| .broodboxignore | Per-workspace excludes | Permanent (user-managed) |

Context Passing Into VM

The FSInjector component copies settings from the host into the guest rootfs before boot. This includes:

  • Host CLAUDE.md / agent configuration files
  • Skill files (for agents that use them)
  • Claude credentials (if configured)
  • Git identity (user.name / user.email from host .git/config)
  • Forwarded environment variables matching patterns from env_forward in config

Context Passing Out of VM

On session exit:

  • SHA-256 diff produces a list of changed files
  • User reviews and accepts/rejects each file (if --review)
  • Accepted changes are flushed to original workspace
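
The final flush step is described elsewhere on this page as hash re-verified. One plausible reading, sketched below in Python, is that each accepted file is copied only if the snapshot copy still hashes to the value that was reviewed. That interpretation, the function names, and the choice to skip (rather than abort on) a mismatch are all assumptions; deletions are not handled.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def flush(snapshot: Path, workspace: Path, accepted: dict) -> list:
    """Copy accepted files to the workspace, re-checking each hash first.

    accepted maps a relative path to the hash the user reviewed.
    """
    flushed = []
    for rel, reviewed_hash in accepted.items():
        src = snapshot / rel
        if not src.is_file() or sha256_of(src) != reviewed_hash:
            continue  # content changed after review: do not land it
        dst = workspace / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        flushed.append(rel)
    return flushed
```

Re-hashing at copy time closes the window between review and write, which complements stopping the VM before the diff is taken.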

There is no compaction, no session memory, no cross-session handoff, and no vector/graph store. Each invocation starts fresh.

Credential Handling

  • Git credentials are forwarded via token injection; .git/config is sanitized in the snapshot (sensitive values stripped) before being injected into the VM
  • API keys are forwarded as environment variables matching patterns in env_forward config
  • Forwarded values are single-quote shell-escaped before injection
  • Per-session ECDSA P-256 SSH keys are generated at session start and destroyed on exit
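
Environment forwarding with shell escaping can be sketched with Python's standard library. The function name and the `export` line format are hypothetical, the glob-style matching is an assumption about how env_forward patterns work, and `shlex.quote` stands in for whatever escaping routine the Go code uses (it single-quotes any value that contains shell-special characters).

```python
import shlex
from fnmatch import fnmatchcase


def forwarded_exports(env: dict, patterns: list) -> list:
    """Build shell export lines for variables whose names match a pattern."""
    lines = []
    for name in sorted(env):
        # Case-sensitive glob match against each env_forward pattern.
        if any(fnmatchcase(name, pattern) for pattern in patterns):
            # Quote the value so spaces, quotes, and $ are not interpreted.
            lines.append(f"export {name}={shlex.quote(env[name])}")
    return lines
```

Variables that match no pattern never reach the guest, and a value containing a quote or `$` arrives byte-for-byte because the shell sees it as a literal string.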
07

Orchestration

Brood Box — Orchestration

Multi-Agent Pattern

Brood Box runs one agent per invocation — it is not a multi-agent orchestrator. Each bbox <agent> call boots one microVM with one agent session. There is no coordination between concurrent bbox invocations; each is fully isolated.

The isolation itself enables safe parallelism (different terminal tabs can run different agents on the same or different workspaces without interference), but bbox provides no orchestration layer for this.

Isolation Mechanism

MicroVM (hardware virtualization via KVM or Hypervisor.framework): the defining feature.

  • Uses libkrun (KVM on Linux, Hypervisor.framework on macOS Apple Silicon)
  • Not a container (which shares the host kernel): a full hardware VM boundary
  • The go-microvm module provides the VMRunner abstraction
  • Workspace mounted via virtio-fs as a COW snapshot
  • Guest uses overlayfs to protect rootfs

Execution Mode

Interactive one-shot: The user invokes bbox <agent>, interacts with the agent session via PTY-forwarded SSH, then reviews/accepts changes on exit. There is no daemon, no background scheduler, no event-driven trigger.

The --workspace-mode=direct variant is also one-shot but without snapshot isolation.

Multi-Model

No. Brood Box does not route or select models. The agent being wrapped determines its own model configuration. Brood Box is model-agnostic.

Consensus Mechanism

None. Single-agent per session.

Prompt Chaining

No. Brood Box does not chain prompts. The agent receives its own prompts via its own mechanism.

Observability

  • --timings: per-phase timing summary to stderr
  • --trace: OTel spans written as JSON to VM data dir
  • Key trace spans: bbox.Prepare → bbox.StartVM → microvm.Run → microvm.RootfsClone / microvm.SSHWaitReady
  • Brood Box application log: broodbox.log in session data dir
  • VM console log: console.log in VM data dir

Git Automation

  • No automatic commits or PRs
  • Git credentials are forwarded into the VM so the agent inside can use git
  • Agent-driven git operations inside the VM are subject to normal agent behavior (not bbox behavior)
  • After session: changes are diffed and flushed; no git operations performed by bbox itself

MCP Integration

Brood Box automatically discovers ToolHive MCP groups and proxies them as an HTTP service on the host that the guest VM can reach. The VMCPProvider uses ToolHive's vmcp library. Cedar-based authorization profiles restrict what MCP operations the agent can perform.

08

UI CLI Surface

Brood Box — UI & CLI Surface

CLI Binary

  • Binary name: bbox
  • Install: Pre-built tarball release or task build from source
  • Is thin wrapper: No — bbox is its own runtime (Go binary embedding go-microvm runtime and bbox-init guest binary)
  • Subcommands: claude-code, codex, opencode, gemini, hermes, list (+ custom agents defined in config)

Key CLI Flags

bbox <agent>
  --cpus N               Override vCPU count
  --memory N             Override RAM in MiB
  --workspace PATH       Use different directory
  --review               Enable per-file review on exit
  --exclude PATTERN      Additional exclude patterns (repeatable)
  --workspace-mode       snapshot (default) or direct
  --yes                  Required with --workspace-mode=direct on first use
  --egress-profile       permissive / standard / locked
  --allow-host HOST:PORT Add to egress allowlist
  --no-mcp               Disable MCP proxy
  --mcp-group NAME       ToolHive group for MCP servers
  --mcp-authz-profile    full-access / observe / safe-tools / custom
  --mcp-config PATH      Custom Cedar authz policies
  --timings              Per-phase timing summary
  --trace                OTel trace JSON
  --no-firmware-download Use system libkrunfw only

bbox list                List available agent configurations

Local UI

No web dashboard, no TUI, no desktop app. The interactive surface is:

  1. The PTY-forwarded SSH terminal session (the agent's native UI)
  2. The per-file diff review prompt at session exit (text-mode, shows unified diffs with accept/reject prompts)

Observability

  • Timing: --timings flag prints per-phase wall-clock summary to stderr
  • Tracing: --trace flag writes OTel JSON to trace.json in VM data dir (viewable with jq or importable into Jaeger)
  • Application log: broodbox.log in session data dir (structured slog output)
  • VM console log: console.log in VM data dir (guest stdout/stderr)
  • Preserve data: BBOX_KEEP_VM_DATA=1 prevents cleanup for debugging

IDE Integration

None. Brood Box is a terminal CLI; there is no VS Code extension, IDE plugin, or GUI integration.
