agentbox

agentbox-mattolson · mattolson/agent-sandbox · ★ 174 · last commit 2026-05-26

Primitive shape
No installable primitives
00

Summary

agentbox (mattolson) — Summary

agentbox is a Go CLI + Docker image toolkit that runs AI coding agents (Claude Code, Codex, Gemini CLI, OpenCode, Pi, Factory, Copilot) inside locked-down local Docker sandboxes with two-layer network enforcement. The first layer is a mitmproxy sidecar that enforces a host allowlist with fine-grained per-route rules (scheme, method, path, query); the second is an iptables firewall that blocks all direct outbound traffic, so every request must transit the proxy. Credentials are injected by the proxy into HTTP headers from host-side secret files; the agent container never sees the actual token. The agentbox CLI binary handles init, exec, switch (change agent while preserving state), policy editing, and compose management. DevContainer mode enables VS Code and JetBrains IDE integration. The project is described as early-stage; its author mainly uses Claude Code, which is the primary supported agent.

Differs from seeds: No seed has two-layer network isolation (mitmproxy allowlist + iptables firewall). The closest seed is agent-os (personal harness for one user) but agentbox is provider-agnostic (7 supported agents) and adds real network-level enforcement. Unlike nanoclaw (per-session container + messaging channels), agentbox is a pure code-execution sandbox for development tasks, not a messaging assistant.

01

Overview

agentbox (mattolson) — Overview

Origin

agentbox is a personal project by Matt Olson (mattolson), written in Go. It primarily targets Apple Silicon macOS, with Colima as the recommended VM backend, though any Docker-compatible runtime works.

Philosophy

From README:

"Run AI coding agents in a locked-down local sandbox with:

  • Minimal filesystem access (read/write access to only your repository directory)
  • Configurable network egress policy enforced by a sidecar proxy (restrict by hostname, as well as fine-grained attributes, like scheme, method, path, and query string)
  • Secret injection conducted in the proxy, so the agent container never sees secrets such as API keys
  • Iptables firewall preventing direct outbound (all traffic must go through the proxy)
  • Reproducible environments (Debian container with pinned dependencies)"

Key Design Principles

  1. Network enforcement happens at two independent layers — both must be bypassed to exfiltrate data
  2. Credentials stay on the host; never enter the agent container
  3. One binary (agentbox) manages the full sandbox lifecycle
  4. User-owned policy files persist across agent switches; managed files are regenerated

Repo Facts

02

Architecture

agentbox (mattolson) — Architecture

Distribution

  • Binary: downloaded from GitHub Releases
  • Quick install: curl -fsSL .../install.sh | sh (installs to ~/.local/bin)
  • Platform: macOS (Apple Silicon primary), Linux

Security Architecture (Two-Layer)

Agent Container (Debian)
    ↓ all outbound blocked by iptables
Proxy Sidecar (mitmproxy)
    ↓ host policy + route rules
    ↓ credential injection (HTTP header from host secret file)
Internet

Layer 1 — Proxy (mitmproxy sidecar):

  • Enforces hostname allowlist
  • Optional per-route rules: scheme, method, path prefix, query string
  • Credential injection: HTTP header from host-side secret file
  • Default policy: BLOCKS ALL traffic
  • Non-matching requests get 403 response
  • Hot-reload: agentbox reloads proxy policy on save

Layer 2 — Firewall (iptables):

  • Blocks all direct outbound from agent container
  • Only Docker bridge network (proxy) reachable
  • Even if agent ignores HTTP_PROXY/HTTPS_PROXY env vars, direct outbound is blocked
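
The firewall layer can be modeled as a single membership test: a destination is reachable only if it sits on the Docker bridge network where the proxy lives. The sketch below is an illustrative Python model of that decision, not the project's actual iptables rules; the subnet `172.18.0.0/16`, the function name `egress_verdict`, and the `ACCEPT`/`DROP` labels are assumptions made for the example.

```python
import ipaddress

# Hypothetical subnet for the Docker bridge network shared with the proxy sidecar.
BRIDGE_SUBNET = ipaddress.ip_network("172.18.0.0/16")


def egress_verdict(dest_ip: str, bridge=BRIDGE_SUBNET) -> str:
    """Model the firewall rule: accept bridge-local traffic, drop everything else."""
    if ipaddress.ip_address(dest_ip) in bridge:
        return "ACCEPT"  # the proxy sidecar is reachable
    return "DROP"  # direct outbound never leaves the container


print(egress_verdict("172.18.0.2"))     # an address on the bridge (the proxy)
print(egress_verdict("93.184.216.34"))  # a direct internet destination
```

Because the verdict depends only on the destination address, it holds whether or not the agent honors the proxy environment variables.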

Network Policy Architecture

.agent-sandbox/policy/
├── user.policy.yaml              # Shared user-owned (survives agent switches)
├── user.agent.<agent>.policy.yaml # Per-agent user-owned
├── base.yml                      # Managed base policy (regenerated)
└── policy.devcontainer.yaml      # DevContainer-specific
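
One way to picture how these layered files combine into a single effective policy is an ordered union of their lists. The Python sketch below assumes that merge rule (later layers add entries, duplicates are dropped); the real semantics are defined by the project's policy schema and may differ. The function name and the sample layer contents are hypothetical.

```python
def merge_policies(*layers: dict) -> dict:
    """Union the `services` and `domains` lists of each layer, preserving order."""
    effective = {"services": [], "domains": []}
    for layer in layers:
        for key in effective:
            for entry in layer.get(key, []):
                if entry not in effective[key]:
                    effective[key].append(entry)
    return effective


# Hypothetical contents of the managed, shared-user, and per-agent layers.
base = {"services": ["github"]}
user = {"domains": ["registry.npmjs.org"]}
per_agent = {"services": ["github"], "domains": [{"host": "api.example.com"}]}

print(merge_policies(base, user, per_agent))
```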

Directory Structure

agent-sandbox/
├── cmd/agentbox/     # Go CLI commands
├── internal/
│   ├── cli/          # Cobra commands + user behavior
│   ├── runtime/      # Path discovery, agent state, lifecycle
│   ├── scaffold/     # File generation (init, switch)
│   ├── docker/       # Docker + Compose invocation
│   └── embeddata/    # Embedded templates
├── images/
│   ├── base/         # Debian base image
│   ├── agents/       # Per-agent images (claude, codex, gemini, etc.)
│   ├── proxy/        # mitmproxy + enforcer.py
│   └── build.sh
├── .agent-sandbox/   # Generated runtime files (this repo's local sandbox)
└── docs/             # CLI reference, policy schema, agent setup guides

Supported Agents

Agent        CLI      VS Code  JetBrains
Claude Code  Full     Full     Full
Codex        Full     Preview  Preview
Gemini CLI   Preview  Preview  No
OpenCode     Preview  Preview  No
Pi           Preview  No       No
Factory      Preview  Full     Full
Copilot      Preview  Preview  No
03

Components

agentbox (mattolson) — Components

CLI Binary (agentbox)

Key subcommands:

  • agentbox init — Interactive: project name, agent, mode, IDE; generates compose + policy files
  • agentbox exec — Open shell in agent container; start agent CLI inside
  • agentbox switch --agent <agent> — Switch agent while preserving state volumes and user overrides
  • agentbox policy config — Output effective network policy
  • agentbox compose config — Output fully combined Docker compose stack
  • agentbox edit policy — Edit user policy file in $EDITOR (hot-reload triggers on save)

Docker Images

  • base/ — Debian bookworm, non-root dev user (uid/gid 500), zsh
  • agents/<agent>/ — Per-agent images (claude, codex, gemini, opencode, pi, factory, copilot)
  • proxy/ — mitmproxy + Python enforcer addon (enforcer.py)

proxy/enforcer.py

  • Python addon for mitmproxy
  • Checks HTTPS CONNECT tunnels against hostname policy
  • Checks decrypted HTTP/HTTPS requests against scheme, method, path, query rules
  • Non-matching: 403 response
  • Credential injection: reads host secret file, injects as HTTP header into matched requests

Network Policy (YAML)

services:
  - github                    # pre-defined service allowlist entry
  
domains:
  - registry.npmjs.org        # plain hostname allow
  - host: api.example.com     # hostname with per-route rules
    rules:
      - schemes: [https]
        methods: [GET]
        path:
          prefix: /v1/public/
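
To make the matching behavior concrete, here is a minimal Python sketch of how a default-deny evaluator could apply the `domains` section of the policy above. It is not the project's `enforcer.py`: the function names are hypothetical, the `services` list and query-string rules are omitted, and it assumes that a plain hostname entry allows every request to that host.

```python
from urllib.parse import urlsplit

# The `domains` section above, as the structure a YAML loader would produce.
POLICY = {
    "domains": [
        "registry.npmjs.org",
        {
            "host": "api.example.com",
            "rules": [
                {"schemes": ["https"], "methods": ["GET"], "path": {"prefix": "/v1/public/"}}
            ],
        },
    ]
}


def rule_matches(rule: dict, scheme: str, method: str, path: str) -> bool:
    """A rule matches when every attribute it specifies matches the request."""
    if "schemes" in rule and scheme not in rule["schemes"]:
        return False
    if "methods" in rule and method not in rule["methods"]:
        return False
    prefix = rule.get("path", {}).get("prefix")
    return prefix is None or path.startswith(prefix)


def is_allowed(policy: dict, method: str, url: str) -> bool:
    """Default deny: allow only when a domain entry (and one of its rules) matches."""
    parts = urlsplit(url)
    for entry in policy.get("domains", []):
        host = entry if isinstance(entry, str) else entry["host"]
        if parts.hostname != host:
            continue
        rules = [] if isinstance(entry, str) else entry.get("rules", [])
        # Assumption: a host listed without rules allows all requests to it.
        if not rules or any(rule_matches(r, parts.scheme, method, parts.path) for r in rules):
            return True
    return False  # the proxy would answer such a request with 403


print(is_allowed(POLICY, "GET", "https://api.example.com/v1/public/items"))
print(is_allowed(POLICY, "POST", "https://api.example.com/v1/public/items"))
```

The two calls differ only in method, which is how a rule can admit a service's read API while refusing writes to the same path.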

Generated Runtime Files (.agent-sandbox/)

  • active-target.env — Active agent + runtime metadata
  • compose/base.yml — Managed shared compose layer
  • compose/agent.<agent>.yml — Managed per-agent compose
  • compose/user.override.yml — User-owned shared overrides (preserved across switches)
  • compose/user.agent.<agent>.override.yml — Per-agent user overrides
  • policy/user.policy.yaml — Shared user policy (preserved)
  • policy/user.agent.<agent>.policy.yaml — Per-agent user policy

DevContainer Support

  • .devcontainer/devcontainer.json — Generated by agentbox
  • VS Code: "Dev Containers: Reopen in Container"
  • JetBrains: Remote Development → Dev Containers
05

Prompts

agentbox (mattolson) — Prompts

agentbox is an infrastructure tool, not a prompt engineering framework. It does not ship agent prompt files. The AGENTS.md and CLAUDE.md files are developer guidance for contributors, not user-facing prompts.

Verbatim: AGENTS.md (Repo Context Guide)

# Agent Instructions

## Repo Snapshot

Agent Sandbox is a Go CLI plus Docker images and templates for running AI coding 
agents inside locked-down local sandboxes.

## Source Of Truth

Prefer these in order when describing or changing current behavior:
1. Code in `internal/` and `images/`
2. Tests in `internal/**/*_test.go` and `images/proxy/tests/`
3. User-facing docs in `README.md`, `docs/cli.md`, `docs/policy/schema.md`, 
   and `docs/agents/*.md`

Treat `docs/plan/` and `docs/roadmap.md` as planning or historical context, 
not as the source of truth for current implementation.

## Development Environment

This repo is usually developed from inside its own sandbox container. Network 
restrictions are real:
- Outbound traffic from the agent container must go through the proxy sidecar
- Direct outbound is blocked by iptables
- SSH is disabled, and Git remote URLs are rewritten to HTTPS

Technique: Meta-recursive documentation — the project itself is developed inside its own sandbox. The AGENTS.md provides a trust hierarchy (code > tests > docs > plans) and explicitly notes the network restrictions the contributing agent will face.

Policy YAML Pattern (from README)

services:
  - github

domains:
  - registry.npmjs.org
  - host: api.example.com
    rules:
      - schemes: [https]
        methods: [GET]
        path:
          prefix: /v1/public/

Technique: Declarative allowlist policy with progressive specificity. Default-deny, explicit-allow. Fine-grained route rules enable allowing a service's read API while blocking its write API.

09

Uniqueness

agentbox (mattolson) — Uniqueness & Positioning

differs_from_seeds

agentbox has no close seed equivalent. agent-os is the nearest in its "personal local harness" philosophy, but it provides no isolation (in-place bash scripts). agentbox's two-layer network enforcement (mitmproxy allowlist + iptables firewall) is unique in this corpus; no seed framework attempts to enforce network egress. Unlike nanoclaw (which uses Docker containers for messaging assistant sessions), agentbox is a code-development sandbox with no messaging functionality. Unlike IronClaw (WASM sandbox), agentbox uses Docker + iptables.

Distinctive Positioning

  1. Dual-layer network enforcement: mitmproxy allowlist (application layer) + iptables (kernel layer). Two independent systems must be bypassed to exfiltrate data. No other framework in this corpus implements this.

  2. Credential injection at proxy layer: The agent container's HTTP_PROXY env var points to the sidecar; the proxy reads the host secret file and injects it as an HTTP header on the outbound request. Injection happens after the request has left the agent container, so the agent sees neither the token nor the secret file. More secure than NanoClaw's OneCLI (which is host-side, not container-layer).

  3. Multi-agent switch with state preservation: agentbox switch --agent codex preserves per-agent Docker volumes (auth, history) while regenerating managed config files. User policy files survive all switches.

  4. Meta-recursive development: The repo itself is developed inside its own sandbox — the AGENTS.md explicitly describes the network restrictions the contributing agent will encounter. Self-hosting of the tool.

  5. DevContainer-first IDE integration: Not a Claude Code plugin or VS Code extension, but integrates with ANY IDE supporting devcontainer spec.

Observable Failure Modes

  • Early-stage warning in README — expect breaking changes
  • 174 stars — small community, limited documentation depth
  • Requires Colima or other Docker VM (non-trivial setup on macOS)
  • Default policy blocks all traffic — substantial policy configuration required per project
  • No AI memory or context enhancement — purely a security sandbox
  • No multi-agent or parallel execution
  • SSH disabled (git HTTPS-only) — may conflict with some workflows
04

Workflow

agentbox (mattolson) — Workflow

Setup Phase

Step                              Artifact                                           Gate
Install Colima + Docker           Container runtime                                  Manual
Install agentbox binary           Binary in PATH                                     Automated
agentbox init                     docker-compose.yml + policy files + .devcontainer  Interactive prompts
agentbox exec                     Shell inside agent container                       -
Start agent CLI inside container  Agent running in sandbox                           Manual

Per-Session Flow (CLI mode)

Step                                                            Artifact
agentbox exec                                                   Shell inside Debian container
claude --dangerously-skip-permissions (safe because sandboxed)  Agent running
Tool calls from agent → proxy check → internet                  Policy-filtered requests
Credential headers injected by proxy                            Auth without agent seeing keys
Files written to /workspace (bind-mounted repo)                 Code changes

Network Policy Customization

Step                    Artifact
agentbox edit policy    Opens user.policy.yaml in $EDITOR
Save file               Hot-reload triggers; proxy picks up changes
agentbox policy config  Verify effective policy
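
How the reload is triggered is not described here. One common way to detect a save is to compare the file's modification time between checks; the Python sketch below shows that technique as an assumption, with a hypothetical `PolicyWatcher` class, and simulates a save by bumping the timestamp of a temporary file.

```python
import os
import tempfile


class PolicyWatcher:
    """Detect saves to a policy file by comparing modification times."""

    def __init__(self, path: str):
        self.path = path
        self._last_mtime = os.stat(path).st_mtime_ns

    def changed(self) -> bool:
        """Return True once for each observed modification."""
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self._last_mtime:
            self._last_mtime = mtime
            return True
        return False


# Demo: simulate an editor save by moving the modification time forward.
with tempfile.TemporaryDirectory() as tmp:
    policy_path = os.path.join(tmp, "user.policy.yaml")
    with open(policy_path, "w") as fh:
        fh.write("domains: []\n")
    watcher = PolicyWatcher(policy_path)
    before = watcher.changed()
    stat = os.stat(policy_path)
    os.utime(policy_path, ns=(stat.st_atime_ns, stat.st_mtime_ns + 1_000_000_000))
    after = watcher.changed()
    print(before, after)
```

A caller would poll `changed()` and re-read the policy whenever it returns True.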

Agent Switch Flow

Step                                                      Artifact
agentbox switch --agent codex                             Regenerates compose + devcontainer
User-owned policy files preserved                         No policy loss
Per-agent state volumes preserved (credentials, history)  Agent state intact
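
The split between preserved and regenerated files follows the file names listed for `.agent-sandbox/`: every user-owned file there starts with `user.`. The Python sketch below encodes that observation as a rule; treating the prefix as the deciding criterion is an assumption, and both function names are hypothetical.

```python
from pathlib import PurePosixPath


def is_user_owned(relative_path: str) -> bool:
    """Assumed convention: user-owned files have names starting with `user.`."""
    return PurePosixPath(relative_path).name.startswith("user.")


def files_to_regenerate(paths: list[str]) -> list[str]:
    """Managed files are everything that is not user-owned."""
    return [p for p in paths if not is_user_owned(p)]


generated = [
    "compose/base.yml",
    "compose/agent.codex.yml",
    "compose/user.override.yml",
    "compose/user.agent.codex.override.yml",
    "policy/user.policy.yaml",
    "policy/user.agent.codex.policy.yaml",
]
print(files_to_regenerate(generated))
```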

Approval Gates

  • agentbox init prompts: project name, agent, mode (CLI/devcontainer), IDE
  • Policy file approval: human edits + saves user.policy.yaml
  • No mid-task approval gates during agent execution
06

Memory Context

agentbox (mattolson) — Memory & Context

State Architecture

agentbox manages container state, not agent context:

  • Per-agent state volumes: Docker named volumes preserve agent credentials, history, and config across container restarts
  • User-owned policy files: Persist across agent switches (user.policy.yaml, user.agent.<agent>.policy.yaml)
  • /workspace bind mount: Repository directory mounted into container — file changes persist on host

No Agent Memory System

agentbox does not implement agent memory or context management. It is a sandbox infrastructure tool. Memory is managed by the agent CLI running inside the container (e.g., Claude Code's own context management).

Cross-Session Handoff

  • Per-agent volumes preserved across:
    • Container restarts
    • agentbox switch (switch to different agent, switch back — state restored)
  • Policy files preserved across all switches
  • Active agent recorded in .agent-sandbox/active-target.env
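
A `.env`-style file such as `active-target.env` is typically a list of `KEY=VALUE` lines. The Python sketch below parses that general format; the key names `ACTIVE_AGENT` and `MODE` in the sample are hypothetical, since the file's real keys are not listed here.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and `#` comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


# Hypothetical file contents; the real key names may differ.
sample = """
# written by the CLI
ACTIVE_AGENT=claude
MODE="cli"
"""
print(parse_env(sample))
```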

DevContainer Persistence

  • VS Code/JetBrains devcontainer: IDE manages container lifecycle
  • Agent history and credentials persist in Docker volumes between IDE sessions

Container Isolation

  • Agent container has ONLY /workspace (repo directory) accessible
  • No access to rest of host filesystem
  • Network restricted to proxy-approved endpoints
  • SSH disabled (git remote URLs rewritten to HTTPS)
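
Rewriting a Git remote from SSH to HTTPS amounts to a small string transformation. The Python sketch below illustrates it for scp-style and `ssh://` remotes; the function name `to_https` is hypothetical. Git can perform this kind of rewrite natively through its `url.<base>.insteadOf` setting, and whether agentbox relies on that mechanism is not stated here.

```python
import re

# scp-style remotes (git@host:owner/repo) and ssh:// remote URLs.
_SCP_STYLE = re.compile(r"^(?:[\w.-]+@)?([\w.-]+):(?!//)(.+)$")
_SSH_URL = re.compile(r"^ssh://(?:[\w.-]+@)?([\w.-]+)(?::\d+)?/(.+)$")


def to_https(remote: str) -> str:
    """Rewrite an SSH Git remote to HTTPS; leave other URLs untouched."""
    for pattern in (_SSH_URL, _SCP_STYLE):
        match = pattern.match(remote)
        if match:
            host, path = match.groups()
            return f"https://{host}/{path}"
    return remote


print(to_https("git@github.com:mattolson/agent-sandbox.git"))
print(to_https("https://github.com/mattolson/agent-sandbox.git"))
```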
07

Orchestration

agentbox (mattolson) — Orchestration

Multi-Agent

No. agentbox runs one agent at a time per project. agentbox switch changes which agent is active; agents do not run concurrently.

Isolation Mechanism

Container + network policy (two independent layers):

Layer 1 — Filesystem isolation:

  • Debian container with minimal filesystem (only /workspace bind-mounted)
  • Non-root dev user (uid/gid 500)
  • No host filesystem access beyond repo directory

Layer 2a — Proxy network enforcement (mitmproxy sidecar):

  • Default-deny all outbound
  • Hostname allowlist + per-route rules
  • Credential injection from host secret files
  • Hot-reloadable YAML policy

Layer 2b — Firewall (iptables in container):

  • Blocks ALL direct outbound
  • Forces all traffic through proxy sidecar
  • An agent that ignores HTTP_PROXY/HTTPS_PROXY still cannot reach the internet directly from inside the container

Execution Mode

On-demand interactive (CLI mode):

  • agentbox exec → shell → start agent CLI
  • No daemon, no background process

DevContainer mode:

  • IDE manages container lifecycle
  • Agent runs within IDE's devcontainer session

Orchestration Pattern

None. Single agent, single session.

Multi-Model

Yes, but only sequentially: agentbox switch --agent <agent> changes the agent provider. Multiple models never run at the same time.

Consensus

None.

08

UI CLI Surface

agentbox (mattolson) — UI & CLI Surface

CLI Binary (agentbox)

  • Go binary, downloaded from GitHub Releases
  • Install: curl -fsSL .../install.sh | sh

Key subcommands:

  • agentbox init — Initialize sandbox for project
  • agentbox exec — Open shell in agent container
  • agentbox switch --agent <agent> — Switch active agent
  • agentbox policy config — Show effective network policy
  • agentbox compose config — Show combined Docker compose config
  • agentbox edit policy — Edit user policy in $EDITOR

DevContainer Integration

  • Generates .devcontainer/devcontainer.json
  • VS Code: "Dev Containers: Reopen in Container"
  • JetBrains: Remote Development → Dev Containers
  • Per-agent devcontainer support

No Web Dashboard

agentbox has no web UI. All operations are CLI or IDE-devcontainer.

Observability

  • agentbox policy config — Inspect combined effective policy
  • agentbox compose config — Inspect combined Docker compose
  • mitmproxy proxy logs (accessible in sandbox)
  • iptables firewall: transparent (no separate UI)

Secret Injection (Proxy Layer)

  • Secret files mounted into proxy container only
  • The proxy adds the credential as an HTTP header on the outbound request, after the request has left the agent container
  • The agent CLI sees neither the injected header nor the raw secret file
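
The injection step can be sketched as a pure function that builds the outbound header set from the agent's headers plus a secret read from disk. This is an illustrative Python model, not the proxy's actual code: the function name, the `Authorization` header, and the `Bearer {token}` format are assumptions chosen for the example.

```python
import os
import tempfile
from pathlib import Path


def inject_credential(headers: dict, secret_file: str,
                      header_name: str = "Authorization",
                      template: str = "Bearer {token}") -> dict:
    """Return a copy of `headers` with the secret added; the input dict is untouched."""
    token = Path(secret_file).read_text().strip()
    outbound = dict(headers)
    outbound[header_name] = template.format(token=token)
    return outbound


# Demo with a throwaway secret file standing in for the host-side secret.
with tempfile.TemporaryDirectory() as tmp:
    secret_path = os.path.join(tmp, "api_token")
    Path(secret_path).write_text("s3cret\n")
    agent_headers = {"Accept": "application/json"}
    outbound = inject_credential(agent_headers, secret_path)
    print(sorted(outbound))
```

Returning a copy keeps the agent-side headers free of the credential, which mirrors the property that only the proxy ever holds the token.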
