agentbox

agentbox-mattolson · mattolson/agent-sandbox · ★ 174 · last commit 2026-05-26

Primitive shape

No installable primitives

Summary

agentbox (mattolson) — Summary

agentbox is a Go CLI and Docker image toolkit that runs AI coding agents (Claude Code, Codex, Gemini CLI, OpenCode, Pi, Factory, Copilot) inside locked-down local Docker sandboxes. Network enforcement has two layers: a mitmproxy sidecar that enforces a host allowlist with optional per-route rules (scheme, method, path, query), and an iptables firewall that blocks all direct outbound traffic so that everything must transit the proxy. The proxy injects credentials into HTTP headers from host-side secret files, so the agent container never sees the actual token. The agentbox CLI handles init, exec, switch (change agent while preserving state), policy editing, and compose management. DevContainer mode enables VS Code and JetBrains IDE integration. The README describes the project as early-stage; Claude Code, which the author uses, is the primary supported agent.

Differs from seeds: No seed has two-layer network isolation (mitmproxy allowlist + iptables firewall). The closest seed is agent-os (personal harness for one user) but agentbox is provider-agnostic (7 supported agents) and adds real network-level enforcement. Unlike nanoclaw (per-session container + messaging channels), agentbox is a pure code-execution sandbox for development tasks, not a messaging assistant.

Overview

agentbox (mattolson) — Overview

Origin

agentbox is a personal project by Matt Olson (mattolson), written in Go. It primarily targets macOS on Apple Silicon, with Colima as the recommended VM backend, though any Docker-compatible runtime works.

Philosophy

From README:

"Run AI coding agents in a locked-down local sandbox with:

Minimal filesystem access (read/write access to only your repository directory)

Configurable network egress policy enforced by a sidecar proxy (restrict by hostname, as well as fine-grained attributes, like scheme, method, path, and query string)

Secret injection conducted in the proxy, so the agent container never sees secrets such as API keys

Iptables firewall preventing direct outbound (all traffic must go through the proxy)

Reproducible environments (Debian container with pinned dependencies)"

Key Design Principles

Network enforcement happens at two independent layers — both must be bypassed to exfiltrate data
Credentials stay on the host; never enter the agent container
One binary (agentbox) manages the full sandbox lifecycle
User-owned policy files persist across agent switches; managed files are regenerated

Repo Facts

GitHub: https://github.com/mattolson/agent-sandbox
Stars: 174 (2026-05-26)
Language: Go + Python (proxy enforcement)
License: MIT
Last commit: 2026-05-26
Status: Early-stage development (README warning)

Architecture

agentbox (mattolson) — Architecture

Distribution

Binary: downloaded from GitHub Releases
Quick install: curl -fsSL .../install.sh | sh (installs to ~/.local/bin)
Platform: macOS (Apple Silicon primary), Linux

Security Architecture (Two-Layer)

Agent Container (Debian)
    ↓ all outbound blocked by iptables
Proxy Sidecar (mitmproxy)
    ↓ host policy + route rules
    ↓ credential injection (HTTP header from host secret file)
Internet

Layer 1 — Proxy (mitmproxy sidecar):

Enforces hostname allowlist
Optional per-route rules: scheme, method, path prefix, query string
Credential injection: HTTP header from host-side secret file
Default policy: BLOCKS ALL traffic
Non-matching requests get 403 response
Hot-reload: agentbox reloads proxy policy on save

Layer 2 — Firewall (iptables):

Blocks all direct outbound from agent container
Only Docker bridge network (proxy) reachable
Even if agent ignores HTTP_PROXY/HTTPS_PROXY env vars, direct outbound is blocked
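
A minimal sketch of the firewall idea, assuming a default-drop OUTPUT chain with an exception for the proxy sidecar. The rule set, the proxy address, and the port are illustrative assumptions, not the project's actual rules (the source only states that direct outbound is blocked and the proxy on the Docker bridge stays reachable). The function only builds iptables command lines; it does not run them.

```python
import shlex


def egress_rules(proxy_ip, proxy_port):
    """Build iptables argument lists for a default-drop OUTPUT chain.

    Hypothetical rule set: drop everything by default, then allow
    loopback, replies on established connections, and the proxy.
    """
    return [
        ["iptables", "-P", "OUTPUT", "DROP"],
        ["iptables", "-A", "OUTPUT", "-o", "lo", "-j", "ACCEPT"],
        ["iptables", "-A", "OUTPUT", "-m", "conntrack",
         "--ctstate", "ESTABLISHED,RELATED", "-j", "ACCEPT"],
        ["iptables", "-A", "OUTPUT", "-p", "tcp", "-d", proxy_ip,
         "--dport", str(proxy_port), "-j", "ACCEPT"],
    ]


if __name__ == "__main__":
    # 172.18.0.2:8080 is a made-up sidecar address for illustration.
    for rule in egress_rules("172.18.0.2", 8080):
        print(shlex.join(rule))
```

With rules of this shape, a process that ignores HTTP_PROXY/HTTPS_PROXY and connects to an external address directly matches no ACCEPT rule and falls through to the DROP policy.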

Network Policy Architecture

.agent-sandbox/policy/
├── user.policy.yaml              # Shared user-owned (survives agent switches)
├── user.agent.<agent>.policy.yaml # Per-agent user-owned
├── base.yml                      # Managed base policy (regenerated)
└── policy.devcontainer.yaml      # DevContainer-specific
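
One way the layered files could combine into an effective policy is sketched below. The merge semantics (ordered union, duplicates dropped) are an assumption; the source lists the files but does not say how agentbox merges them. `merge_policies` is a hypothetical helper, and the dictionaries stand in for parsed YAML.

```python
def merge_policies(*layers):
    """Combine policy layers in order; duplicate entries are dropped.

    Assumed semantics: each layer may contribute `services` and
    `domains`, and the effective policy is their ordered union.
    """
    effective = {"services": [], "domains": []}
    for layer in layers:
        for key in effective:
            # `or []` tolerates a key that is present but empty in YAML.
            for entry in layer.get(key) or []:
                if entry not in effective[key]:
                    effective[key].append(entry)
    return effective


# Stand-ins for the managed base, shared user, and per-agent files.
base = {"services": ["github"]}
shared_user = {"domains": ["registry.npmjs.org"]}
per_agent = {
    "services": ["github"],
    "domains": [{"host": "api.example.com",
                 "rules": [{"methods": ["GET"]}]}],
}

effective = merge_policies(base, shared_user, per_agent)
```

In this sketch only the managed base would be regenerated on an agent switch; the two user-owned layers are read as-is, which matches the persistence behavior described for those files.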

Directory Structure

agent-sandbox/
├── cmd/agentbox/     # Go CLI commands
├── internal/
│   ├── cli/          # Cobra commands + user behavior
│   ├── runtime/      # Path discovery, agent state, lifecycle
│   ├── scaffold/     # File generation (init, switch)
│   ├── docker/       # Docker + Compose invocation
│   └── embeddata/    # Embedded templates
├── images/
│   ├── base/         # Debian base image
│   ├── agents/       # Per-agent images (claude, codex, gemini, etc.)
│   ├── proxy/        # mitmproxy + enforcer.py
│   └── build.sh
├── .agent-sandbox/   # Generated runtime files (this repo's local sandbox)
└── docs/             # CLI reference, policy schema, agent setup guides

Supported Agents

Agent	CLI	VS Code	JetBrains
Claude Code	Full	Full	Full
Codex	Full	Preview	Preview
Gemini CLI	Preview	Preview	No
OpenCode	Preview	Preview	No
Pi	Preview	No	No
Factory	Preview	Full	Full
Copilot	Preview	Preview	No

Components

agentbox (mattolson) — Components

CLI Binary (`agentbox`)

Key subcommands:

agentbox init — Interactive: project name, agent, mode, IDE; generates compose + policy files
agentbox exec — Open shell in agent container; start agent CLI inside
agentbox switch --agent <agent> — Switch agent while preserving state volumes and user overrides
agentbox policy config — Output effective network policy
agentbox compose config — Output fully combined Docker compose stack
agentbox edit policy — Edit user policy file in $EDITOR (hot-reload triggers on save)

Docker Images

base/ — Debian bookworm, non-root dev user (uid/gid 500), zsh
agents/<agent>/ — Per-agent images (claude, codex, gemini, opencode, pi, factory, copilot)
proxy/ — mitmproxy + Python enforcer addon (enforcer.py)

proxy/enforcer.py

Python addon for mitmproxy
Checks HTTPS CONNECT tunnels against hostname policy
Checks decrypted HTTP/HTTPS requests against scheme, method, path, query rules
Non-matching: 403 response
Credential injection: reads host secret file, injects as HTTP header into matched requests
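
The request check can be sketched in plain Python as follows. This is not the project's enforcer.py and does not use the mitmproxy API; `Rule`, `HostPolicy`, and `is_allowed` are hypothetical names, and treating an empty attribute as "no constraint" is an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Rule:
    # Empty list or None means "no constraint on this attribute".
    schemes: list = field(default_factory=list)
    methods: list = field(default_factory=list)
    path_prefix: Optional[str] = None


@dataclass
class HostPolicy:
    host: str
    rules: list = field(default_factory=list)  # no rules: whole host allowed


def rule_matches(rule, scheme, method, path):
    if rule.schemes and scheme not in rule.schemes:
        return False
    if rule.methods and method.upper() not in rule.methods:
        return False
    if rule.path_prefix is not None and not path.startswith(rule.path_prefix):
        return False
    return True


def is_allowed(policies, scheme, method, host, path):
    """Default-deny: True only if a host entry (and one of its rules) matches.

    A caller would answer False with a 403 response.
    """
    for policy in policies:
        if policy.host != host.lower():
            continue
        if not policy.rules:
            return True
        if any(rule_matches(r, scheme, method, path) for r in policy.rules):
            return True
    return False


POLICY = [
    HostPolicy("registry.npmjs.org"),
    HostPolicy("api.example.com", [Rule(["https"], ["GET"], "/v1/public/")]),
]
```

Under this policy, `is_allowed(POLICY, "https", "GET", "api.example.com", "/v1/public/items")` passes, while the same path with `POST`, or any request to an unlisted host, is rejected.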

Network Policy (YAML)

services:
  - github                    # pre-defined service allowlist entry
  
domains:
  - registry.npmjs.org        # plain hostname allow
  - host: api.example.com     # hostname with per-route rules
    rules:
      - schemes: [https]
        methods: [GET]
        path:
          prefix: /v1/public/

Generated Runtime Files (`.agent-sandbox/`)

active-target.env — Active agent + runtime metadata
compose/base.yml — Managed shared compose layer
compose/agent.<agent>.yml — Managed per-agent compose
compose/user.override.yml — User-owned shared overrides (preserved across switches)
compose/user.agent.<agent>.override.yml — Per-agent user overrides
policy/user.policy.yaml — Shared user policy (preserved)
policy/user.agent.<agent>.policy.yaml — Per-agent user policy
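
The file names above suggest a naming convention: user-owned files start with `user.`, everything else is managed. A small sketch of that split, assuming the prefix is the only signal (the source does not state how agentbox itself decides); `is_user_owned` and `not_user_owned` are hypothetical helpers, and `codex` stands in for `<agent>`.

```python
from pathlib import PurePosixPath


def is_user_owned(relative_path):
    """Treat any file whose name starts with 'user.' as user-owned."""
    return PurePosixPath(relative_path).name.startswith("user.")


def not_user_owned(paths):
    """Files a switch could rewrite without touching user edits."""
    return [p for p in paths if not is_user_owned(p)]


RUNTIME_FILES = [
    "active-target.env",
    "compose/base.yml",
    "compose/agent.codex.yml",
    "compose/user.override.yml",
    "compose/user.agent.codex.override.yml",
    "policy/user.policy.yaml",
    "policy/user.agent.codex.policy.yaml",
]

rewritable = not_user_owned(RUNTIME_FILES)
```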

DevContainer Support

.devcontainer/devcontainer.json — Generated by agentbox
VS Code: "Dev Containers: Reopen in Container"
JetBrains: Remote Development → Dev Containers

Prompts

agentbox (mattolson) — Prompts

agentbox is an infrastructure tool, not a prompt engineering framework. It does not ship agent prompt files. The AGENTS.md and CLAUDE.md files are developer guidance for contributors, not user-facing prompts.

Verbatim: AGENTS.md (Repo Context Guide)

# Agent Instructions

## Repo Snapshot

Agent Sandbox is a Go CLI plus Docker images and templates for running AI coding 
agents inside locked-down local sandboxes.

## Source Of Truth

Prefer these in order when describing or changing current behavior:
1. Code in `internal/` and `images/`
2. Tests in `internal/**/*_test.go` and `images/proxy/tests/`
3. User-facing docs in `README.md`, `docs/cli.md`, `docs/policy/schema.md`, 
   and `docs/agents/*.md`

Treat `docs/plan/` and `docs/roadmap.md` as planning or historical context, 
not as the source of truth for current implementation.

## Development Environment

This repo is usually developed from inside its own sandbox container. Network 
restrictions are real:
- Outbound traffic from the agent container must go through the proxy sidecar
- Direct outbound is blocked by iptables
- SSH is disabled, and Git remote URLs are rewritten to HTTPS

Technique: Meta-recursive documentation — the project itself is developed inside its own sandbox. The AGENTS.md provides a trust hierarchy (code > tests > docs > plans) and explicitly notes the network restrictions the contributing agent will face.

Policy YAML Pattern (from README)

services:
  - github

domains:
  - registry.npmjs.org
  - host: api.example.com
    rules:
      - schemes: [https]
        methods: [GET]
        path:
          prefix: /v1/public/

Technique: Declarative allowlist policy with progressive specificity. Default-deny, explicit-allow. Fine-grained route rules enable allowing a service's read API while blocking its write API.

Uniqueness

agentbox (mattolson) — Uniqueness & Positioning

differs_from_seeds

agentbox has no close seed equivalent. agent-os is the nearest in its "personal local harness" philosophy, but it provides no isolation (in-place bash scripts). agentbox's two-layer network enforcement (mitmproxy allowlist + iptables firewall) is unique in this corpus: no seed framework attempts to enforce a network egress policy. Unlike nanoclaw (which uses Docker containers for messaging assistant sessions), agentbox is a code-development sandbox with no messaging functionality. Unlike IronClaw (WASM sandbox), agentbox uses Docker + iptables.

Distinctive Positioning

Dual-layer network enforcement: mitmproxy allowlist (application layer) + iptables (kernel layer). Two independent systems must be bypassed to exfiltrate data. No other framework in this corpus implements this.
Credential injection at proxy layer: Agent container has HTTP_PROXY env pointing to sidecar; proxy reads host secret file and injects the credential as an HTTP header on matching outbound requests. The agent never sees the token or the secret file. More secure than NanoClaw's OneCLI (which is host-side, not container-layer).
Multi-agent switch with state preservation: agentbox switch --agent codex preserves per-agent Docker volumes (auth, history) while regenerating managed config files. User policy files survive all switches.
Meta-recursive development: The repo itself is developed inside its own sandbox — the AGENTS.md explicitly describes the network restrictions the contributing agent will encounter. Self-hosting of the tool.
DevContainer-first IDE integration: Not a Claude Code plugin or VS Code extension, but integrates with ANY IDE supporting devcontainer spec.

Observable Failure Modes

Early-stage warning in README — expect breaking changes
174 stars — small community, limited documentation depth
Requires Colima or other Docker VM (non-trivial setup on macOS)
Default policy blocks all traffic — substantial policy configuration required per project
No AI memory or context enhancement — purely a security sandbox
No multi-agent or parallel execution
SSH disabled (git HTTPS-only) — may conflict with some workflows

Workflow

agentbox (mattolson) — Workflow

Setup Phase

Step	Artifact	Gate
Install Colima + Docker	Container runtime	Manual
Install `agentbox` binary	Binary in PATH	Automated
`agentbox init`	docker-compose.yml + policy files + .devcontainer	Interactive prompts
`agentbox exec`	Shell inside agent container
Start agent CLI inside container	Agent running in sandbox	Manual

Per-Session Flow (CLI mode)

Step	Artifact
`agentbox exec`	Shell inside Debian container
`claude --dangerously-skip-permissions` (lower risk because sandboxed)	Agent running
Tool calls from agent → proxy check → internet	Policy-filtered requests
Credential headers injected by proxy	Auth without agent seeing keys
Files written to `/workspace` (bind-mounted repo)	Code changes

Network Policy Customization

Step	Artifact
`agentbox edit policy`	Opens `user.policy.yaml` in $EDITOR
Save file	Hot-reload triggers; proxy picks up changes
`agentbox policy config`	Verify effective policy

Agent Switch Flow

Step	Artifact
`agentbox switch --agent codex`	Regenerates compose + devcontainer
User-owned policy files preserved	No policy loss
Per-agent state volumes preserved (credentials, history)	Agent state intact

Approval Gates

agentbox init prompts: project name, agent, mode (CLI/devcontainer), IDE
Policy file approval: human edits + saves user.policy.yaml
No mid-task approval gates during agent execution

Memory & Context

agentbox (mattolson) — Memory & Context

State Architecture

agentbox manages container state, not agent context:

Per-agent state volumes: Docker named volumes preserve agent credentials, history, and config across container restarts
User-owned policy files: Persist across agent switches (user.policy.yaml, user.agent.<agent>.policy.yaml)
/workspace bind mount: Repository directory mounted into container — file changes persist on host

No Agent Memory System

agentbox does not implement agent memory or context management. It is a sandbox infrastructure tool. Memory is managed by the agent CLI running inside the container (e.g., Claude Code's own context management).

Cross-Session Handoff

Per-agent volumes preserved across:
- Container restarts
- agentbox switch (switch to different agent, switch back — state restored)
Policy files preserved across all switches
Active agent recorded in .agent-sandbox/active-target.env

DevContainer Persistence

VS Code/JetBrains devcontainer: IDE manages container lifecycle
Agent history and credentials persist in Docker volumes between IDE sessions

Container Isolation

Agent container has ONLY /workspace (repo directory) accessible
No access to rest of host filesystem
Network restricted to proxy-approved endpoints
SSH disabled (git remote URLs rewritten to HTTPS)

Orchestration

agentbox (mattolson) — Orchestration

Multi-Agent

No. agentbox runs one agent at a time per project. agentbox switch changes the active agent; agents do not run concurrently.

Isolation Mechanism

Container + network policy (two independent layers):

Layer 1 — Filesystem isolation:

Debian container with minimal filesystem (only /workspace bind-mounted)
Non-root dev user (uid/gid 500)
No host filesystem access beyond repo directory

Layer 2a — Proxy network enforcement (mitmproxy sidecar):

Default-deny all outbound
Hostname allowlist + per-route rules
Credential injection from host secret files
Hot-reloadable YAML policy

Layer 2b — Firewall (iptables in container):

Blocks ALL direct outbound
Forces all traffic through proxy sidecar
Direct outbound that bypasses the proxy is blocked from inside the container

Execution Mode

On-demand interactive (CLI mode):

agentbox exec → shell → start agent CLI
No daemon, no background process

DevContainer mode:

IDE manages container lifecycle
Agent runs within IDE's devcontainer session

Orchestration Pattern

None. Single agent, single session.

Multi-Model

Yes, sequentially: agentbox switch --agent <agent> changes the agent provider. Only one agent, and therefore one model provider, is active at a time.

Consensus

None.

UI & CLI Surface

agentbox (mattolson) — UI & CLI Surface

CLI Binary (`agentbox`)

Go binary, downloaded from GitHub Releases
Install: curl -fsSL .../install.sh | sh

Key subcommands:

agentbox init — Initialize sandbox for project
agentbox exec — Open shell in agent container
agentbox switch --agent <agent> — Switch active agent
agentbox policy config — Show effective network policy
agentbox compose config — Show combined Docker compose config
agentbox edit policy — Edit user policy in $EDITOR

DevContainer Integration

Generates .devcontainer/devcontainer.json
VS Code: "Dev Containers: Reopen in Container"
JetBrains: Remote Development → Dev Containers
Per-agent devcontainer support

No Web Dashboard

agentbox has no web UI. All operations are CLI or IDE-devcontainer.

Observability

agentbox policy config — Inspect combined effective policy
agentbox compose config — Inspect combined Docker compose
mitmproxy proxy logs (accessible in sandbox)
iptables firewall: transparent (no separate UI)

Secret Injection (Proxy Layer)

Secret files mounted into proxy container only
Proxy adds the credential header to matching outbound requests after they leave the agent container
Agent CLI never sees the injected header value or the raw secret file
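
A sketch of proxy-side injection, assuming a per-host mapping from hostname to header name, value format, and secret file path. The mapping shape and `inject_credential` are hypothetical; the source states only that the proxy reads a host secret file and injects it as an HTTP header into matched requests.

```python
from pathlib import Path


def inject_credential(headers, host, injections):
    """Return a copy of `headers` with the configured secret header added.

    `injections` maps a hostname to a dict with `header`, `format`,
    and `secret_file` keys (an assumed shape, for illustration only).
    Requests to hosts without an entry are returned unchanged.
    """
    updated = dict(headers)
    rule = injections.get(host)
    if rule is None:
        return updated
    # The secret is read inside the proxy; the agent never handles it.
    secret = Path(rule["secret_file"]).read_text().strip()
    updated[rule["header"]] = rule["format"].format(secret=secret)
    return updated
```

Because the function runs in the proxy, the outbound request carries the header only after it has left the agent container, so the agent process has no point at which it could read the value.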

Related frameworks

same archetype · same primary tool · same memory type

CodeMachine CLI ★ 2.5k

A16 Cross-vendor router

JavaScript-DSL workflow orchestration engine that captures repeatable AI coding agent workflows with tracks, condition groups,…

Codexia ★ 690

A16 Cross-vendor router

Tauri desktop app providing visual control plane, task scheduler, git worktree manager, and headless REST API for Codex CLI +…

Kagan ★ 88

A16 Cross-vendor router

Kanban TUI for AI coding agents with a structurally enforced human review gate (REVIEW → DONE cannot be automated) — one git…

oh-my-claudecode (Yeachan-Heo) ★ 35k

A16 Cross-vendor router

Zero-learning-curve teams-first multi-agent orchestration for Claude Code with autopilot (6-phase lifecycle), ralph (PRD-driven…

Paseo ★ 6.8k

A16 Cross-vendor router

Multi-provider AI coding agent orchestration daemon with cross-device access (phone/desktop/CLI) and git worktree isolation.

CCG Workflow ★ 5.4k

A16 Cross-vendor router

Routes Claude + Codex + Gemini to task-appropriate collaboration strategies (direct-fix through full-collaborate) with hook-based…

Distribution

Type: cli-tool
License: MIT
Install: multi-step
Version: latest (2026-05-26)

Surfaces

CLI binary: agentbox
CLI subcmds: 6
Local UI: No
Tech stack: null

Components

Commands: 0
Skills: 0
Subagents: 0
Hooks: 0
MCP servers: 0
MCP tools: 0
Scripts: 2
Templates: 7

Workflow

Phases: 4
Approval gates: 2
Spec format: none
Spec storage: none
Delta or full: none

Orchestration

Multi-agent: No
Pattern: none
Max concurrent: 1
Isolation: container
Consensus: none
Prompt chaining: No

Multi-model

Multi-model: Yes
BYOK: Yes

Execution

Mode: one-shot
Crash recovery: No
Compaction: No
Session handoff: Yes
Streaming: No

Memory

Type: file-based
Persistence: session
Search: none
State files: 2 files

Quality

TDD: No
TDD mechanism: none
Self-review: none

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: No
Audit format: none
Replay: No

Tools

Primary: claude-code
Targets: 7
Portability: high

Signals

Stars: 174
Last commit: 2026-05-26
Contributors: 1
Maintainer: active
Quality score: 3/10
