Batch 33 — Sandbox Runtimes Overflow (microVM / WASM / K8s / CUA)

Roster (8 frameworks)

slug	stars	distribution	cli_binary	local_ui	orchestration	multi_model	tier
agenttier	19	standalone-repo (Helm)	`agenttier` (Go+Python)	web-dashboard (Next.js)	none (infra only)	no	A
cua-sandbox	17,104	standalone-repo (pip + curl)	`lume`/`cua-driver`/`cuabot`/`cb`	native-desktop-H265	parallel-fan-out	no	A
sandboxed-sh	438	standalone-repo (Docker/native)	none	web-dashboard (Next.js, :3000)	hierarchical	yes	A
opensandbox	10,828	standalone-repo (pip/npm/go/etc)	`osb`	none	parallel-fan-out	no	A
cubesandbox	5,940	standalone-repo (one-click sh)	`cubemastercli`	web (??)	parallel-fan-out	no	A
swe-rex	508	pip package	none	none	parallel-fan-out	no	A
capsule	285	pip + npm	`capsule`	none	none	no	A
open-agent-thorgal	438	(duplicate of sandboxed-sh)	—	—	—	—	C (duplicate)

Intra-batch patterns

All 8 entries sit below the agent loop rather than at it: they are execution environments, execution protocols, or infrastructure platforms that run underneath whatever agent framework operates above them. None ships slash-commands, Claude Code hooks, or skill files as its primary artifact (sandboxed.sh comes closest, shipping 2 orchestrator skills, but its core value is the Rust server + workspace isolation). The batch divides cleanly into six sub-categories by isolation primitive: K8s pod (AgentTier), macOS/QEMU VM (CUA), systemd-nspawn/Docker (sandboxed.sh), container + pluggable secure runtime (OpenSandbox, CubeSandbox), persistent bash session abstraction (SWE-ReX), and WebAssembly function sandbox (Capsule). Two entries originate at large Asian technology companies (OpenSandbox/Alibaba, CubeSandbox/Tencent) and target Chinese-cloud-native deployment contexts alongside global markets.

Most interesting finds

CubeSandbox — achieves sub-60ms KVM microVM cold starts via snapshot+CoW with <5MB memory overhead, enabling thousands of sandboxes per node. E2B drop-in compatibility is a clever go-to-market move targeting the entire E2B user base. This is the strongest isolation-at-speed tradeoff in the corpus.
Capsule — the only WASM-native isolation in the corpus. It uses WebAssembly fuel metering for CPU control (instruction counting, not cgroups), which is a novel architecture. Being cross-platform (no Linux/KVM requirement) gives it a deployment envelope no other batch entry has. Its function-level (not session-level) granularity is a unique design choice.
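
The fuel-metering idea noted for Capsule can be illustrated with a toy interpreter. This is a minimal sketch, not Capsule's code or API: the instruction set, `FuelMeter`, `OutOfFuel`, and `run_with_fuel` are all hypothetical names invented for illustration. It shows only the principle that every executed instruction is charged against a budget, so a runaway guest is stopped by counting instructions rather than by an OS-level mechanism such as cgroups.

```python
from dataclasses import dataclass


class OutOfFuel(Exception):
    """Raised when a guest program exhausts its instruction budget."""


@dataclass
class FuelMeter:
    # Hypothetical budget: one unit per executed instruction.
    remaining: int

    def consume(self, units: int = 1) -> None:
        if units > self.remaining:
            self.remaining = 0
            raise OutOfFuel("instruction budget exhausted")
        self.remaining -= units


def run_with_fuel(program, fuel):
    """Run a tiny stack program, charging one fuel unit per instruction.

    `program` is a list of (opcode, argument) pairs. The opcodes are
    illustrative only and are not real WebAssembly instructions.
    Returns the final stack and the unused fuel.
    """
    meter = FuelMeter(fuel)
    stack = []
    pc = 0
    while pc < len(program):
        meter.consume()  # charge before executing, so loops cannot run free
        op, arg = program[pc]
        pc += 1
        if op == "push":
            stack.append(arg)
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "jump":
            pc = arg
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack, meter.remaining


if __name__ == "__main__":
    print(run_with_fuel([("push", 2), ("push", 3), ("add", None), ("halt", None)], fuel=10))
    try:
        run_with_fuel([("jump", 0)], fuel=1000)  # infinite loop
    except OutOfFuel as exc:
        print("stopped:", exc)
```

Because the budget is enforced inside the interpreter loop, the same limit applies on any host operating system, which is consistent with the cross-platform point above.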

Items written as Tier C

open-agent-thorgal: Duplicate of sandboxed-sh (same GitHub repo, Th0rgal/openagent, which self-describes as "Formerly known as Open Agent"). Only the summary and METRICS were written, with canonical: false. The full 11-file analysis lives in sandboxed-sh.

Cross-references discovered

sandboxed-sh and open-agent-thorgal are the same repo (Th0rgal/openagent); open-agent-thorgal is marked non-canonical.
The orchestrator-boss skill in sandboxed.sh explicitly names codex/gpt-5.5, gemini-3.1-pro-preview, claudecode, and opencode in its multi-model guidance. This is real multi-model routing encoded in a persona-md skill file, which is unusual.
CUA ships a skills/ directory that integrates with Claude Code, making it a hybrid: primarily infrastructure, secondarily a skills provider.
OpenSandbox references kubernetes-sigs/agent-sandbox as a related project (batch 33's own agent-sandbox-k8s already analyzed in another batch).
SWE-ReX was extracted from SWE-agent and is used by Mini-SWE-agent — the SWE-agent ecosystem is a family of related projects.
AgentTier references E2B and Daytona as implicit competitors, CubeSandbox explicitly targets E2B migration, and OpenSandbox references the E2B API shape. E2B is therefore the de facto reference API for this sandbox layer.
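
The "persistent bash session abstraction" attributed to SWE-ReX above can be sketched as follows. This is a minimal illustration under stated assumptions, not SWE-ReX's implementation or API: `PersistentBashSession` and its sentinel-marker protocol are hypothetical, and the sketch assumes a local `bash` binary and commands that do not read from stdin. The point it shows is that one long-lived shell process keeps its state (working directory, environment variables) across separate commands.

```python
import subprocess
import uuid


class PersistentBashSession:
    """Toy persistent shell: one long-lived bash process, many commands."""

    def __init__(self):
        # Assumes `bash` is on PATH; stderr is merged into stdout for simplicity.
        self._proc = subprocess.Popen(
            ["bash"],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            bufsize=1,
        )

    def run(self, command):
        """Run one command and return (output, exit_code)."""
        # A random sentinel line marks the end of this command's output.
        marker = f"__DONE_{uuid.uuid4().hex}__"
        # The leading "\n" guarantees the marker starts on its own line even
        # if the command's output lacks a trailing newline; `$?` is expanded
        # before printf runs, so it carries the command's exit status.
        self._proc.stdin.write(f"{command}\nprintf '\\n{marker} %s\\n' $?\n")
        self._proc.stdin.flush()
        lines = []
        for line in self._proc.stdout:
            if line.startswith(marker):
                exit_code = int(line.split()[1])
                # Drop the single newline that printf added before the marker.
                return "".join(lines)[:-1], exit_code
            lines.append(line)
        raise RuntimeError("shell exited before the command finished")

    def close(self):
        self._proc.stdin.close()
        self._proc.wait()


if __name__ == "__main__":
    session = PersistentBashSession()
    session.run("cd / && export FOO=bar")
    print(session.run('echo "$FOO"; pwd'))  # state set by the previous call persists
    session.close()
```

A production runtime would also need timeouts, interrupt handling, and remote transport; this sketch deliberately omits them.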