Batch 18 — Sandboxed/isolated personal harnesses: container/WASM/scoped-FS isolation

Roster (10)

slug	stars	distribution	cli_binary	local_ui	isolation_mechanism	multi_agent	tier
nanoclaw	29,431	npm-package	`ncl`	none	container (Docker, per session)	yes (multi-channel swarm)	A
ironclaw	12,351	cargo crate	`ironclaw`	ratatui TUI	WASM capability sandbox	no	A
paseo	6,754	npm monorepo	`paseo`	Expo mobile + Electron desktop	git-worktree	yes (multi-provider)	A
osaurus	5,509	macOS app	`osaurus`	SwiftUI native macOS app	Apple Container Linux VM	no	A
clawmanager	1,351	k8s-native	`cm`	React 19 dashboard	Kubernetes Pod	yes (K8s swarm)	A
stakpak-agent	1,563	cargo crate	`stp`	ratatui TUI	Docker + Warden network	no	A
agentbox-mattolson	174	go binary	`agentbox`	none	mitmproxy + iptables	no	A
scion-gcp	1,548	go binary	`scion`	React web dashboard (Hub mode)	container + git-worktree	yes (harness-agnostic swarm)	A
terminal-bench-env	82	script	none	none	container (Docker, benchmark isolation)	no	B
code-yeongyu-my-cc-harness	13	config-files	none	none	none	yes (Sonnet+Haiku hierarchy)	A

Intra-batch patterns

The batch theme, isolation mechanism, reveals a spectrum of four isolation philosophies, running from security isolation through execution-environment isolation and evaluation isolation to little or no isolation at all:

True security isolation (stakpak-agent, agentbox-mattolson, ironclaw, osaurus): The framework's primary isolation goal is preventing the agent from performing unsafe operations such as network egress, filesystem access, and credential exfiltration. agentbox-mattolson is the most extreme: two-layer enforcement (mitmproxy sidecar + iptables) built specifically to intercept requests and rewrite their credentials. ironclaw's WASM capability sandbox prevents tool code from accessing anything not explicitly granted. osaurus uses Apple Container Linux VMs (macOS 26+) for OS-level isolation with a privacy filter. stakpak-agent pairs Docker with Warden for network egress control.

Execution environment isolation (nanoclaw, scion-gcp, clawmanager): Containers provide agent isolation for state management and reproducibility, not primarily for security. Each agent/session gets a clean environment. The threat model is "contamination between agents," not "agent doing something dangerous."

Evaluation isolation (terminal-bench-env): Docker containers provide reproducibility for benchmark tasks — same starting state every time. Security is irrelevant; isolation serves measurement validity.

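Minimal or no isolation (paseo via git-worktree, code-yeongyu): paseo relies on file-system separation only (git worktrees), which keeps working copies apart but does not sandbox the agent; code-yeongyu has no isolation at all (a personal harness trusting itself).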
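
The difference between container-per-session isolation and worktree-only separation can be made concrete with a small sketch. The function names, the `agent-` and `agent/` naming schemes, and the `/workspace` mount point below are hypothetical and not taken from nanoclaw, scion-gcp, or paseo; only the `docker run` and `git worktree add` flags themselves are standard. The sketch only builds the command lines and does not execute them.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def container_session_cmd(session_id: str, image: str, workdir: str) -> list[str]:
    """Build a `docker run` argv that gives one session a throwaway container.

    Assumption: the project directory is bind-mounted at /workspace.
    """
    return [
        "docker", "run", "--rm",           # remove the container when the session ends
        "--name", f"agent-{session_id}",   # hypothetical naming scheme
        "-v", f"{workdir}:/workspace",     # bind-mount the project directory
        "-w", "/workspace",                # start inside the mounted directory
        image,
    ]


def worktree_session_cmd(session_id: str, repo: str, base: str = "main") -> list[str]:
    """Build a `git worktree add` argv that gives one session its own checkout.

    The process itself is not sandboxed; only the working copy is separate.
    """
    repo_path = PurePosixPath(repo)
    # Hypothetical layout: a sibling directory named <repo>-<session>.
    target = str(repo_path.parent / f"{repo_path.name}-{session_id}")
    return ["git", "-C", repo, "worktree", "add", "-b", f"agent/{session_id}", target, base]
```

Passing either list to `subprocess.run` would create the environment. The container variant isolates processes and file-system state; the worktree variant only prevents two agents from editing the same checkout.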

A secondary pattern: multi-provider neutrality is more common than expected. scion-gcp explicitly supports Claude/Gemini/Codex/OpenCode equally. paseo supports Claude/Codex/Copilot/OpenCode/Pi. nanoclaw is neutral along a different axis, supporting multiple messaging channels rather than multiple model providers. This batch has a higher provider-neutrality rate than most other batches, possibly because isolated/containerized setups naturally abstract away the agent binary.

Hook-as-QA appears only in code-yeongyu: running ruff, type checking, import analysis, and comment language enforcement on every PostToolUse event. No other isolated harness in this batch uses hooks this aggressively for quality enforcement.
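
A hook of this kind might look roughly like the sketch below. It assumes the hook receives a JSON event on stdin whose `tool_input.file_path` names the written file, and that a non-zero exit code of 2 with output on stderr is how findings are reported back to the agent; both should be checked against the hook contract actually in use. The `CHECKS` list is hypothetical: the harness is described as running ruff and type checking, but the exact commands (and the choice of mypy here) are assumptions.

```python
from __future__ import annotations

import json
import subprocess
import sys

# Hypothetical check commands; the file path is appended to each one.
CHECKS = [
    ["ruff", "check"],
    ["mypy"],  # assumption: the source says "type checking" without naming a tool
]


def python_file_from_event(event: dict) -> str | None:
    """Return the written path if the event touched a .py file, else None."""
    path = event.get("tool_input", {}).get("file_path", "")
    return path if path.endswith(".py") else None


def commands_for(event: dict) -> list[list[str]]:
    """Expand the check list into concrete commands for this event."""
    path = python_file_from_event(event)
    return [cmd + [path] for cmd in CHECKS] if path else []


def main() -> int:
    """Run every check and return 2 if any of them reported a problem."""
    event = json.load(sys.stdin)
    failed = False
    for cmd in commands_for(event):
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            # Forward the tool output on stderr so the agent can see it.
            sys.stderr.write(result.stdout + result.stderr)
            failed = True
    return 2 if failed else 0
```

A real hook script would end with `sys.exit(main())`. Non-Python writes produce an empty command list, so the hook exits cleanly without running anything.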

Most interesting finds

agentbox-mattolson — The mitmproxy + iptables two-layer network isolation architecture is the most sophisticated network security approach in the entire catalog. The design insight is to use a MITM proxy to intercept requests and rewrite their credentials (so the agent never sees real credentials), then use iptables to block any traffic not routed through the proxy. Together these create a credential isolation system in which the agent cannot exfiltrate API keys even if it tries. This pattern is applicable to any container-based agent deployment. The framework is early-stage (174 stars) but the architecture is production-relevant.
code-yeongyu/my-claude-code-harness — The executor.md agent uses profane, aggressive language as intentional "attention weight engineering" to enforce single-task focus. The developer explicitly theorizes that strong emotional language creates stronger attention patterns for critical constraints. Combined with comprehensive PostToolUse Python static analysis hooks (ruff, typing, import style, comment language enforcement) running after every file write, this is the most unusual combination of prompt engineering + automated QA in the batch. The profane executor prompt is likely unique in the entire catalog of 330+ frameworks.
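
The credential-rewriting idea behind agentbox-mattolson's proxy layer can be illustrated with a minimal sketch of the rewrite decision alone. This is not agentbox's code and does not use the mitmproxy addon API; the placeholder token, the host-to-secret table, and the function name are all hypothetical, and the assumption that the agent holds a dummy token that the proxy swaps for the real one is one plausible reading of "rewrite credentials". The iptables layer, which forces all traffic through the proxy, is not shown.

```python
from __future__ import annotations

# Hypothetical dummy value handed to the agent instead of a real key.
PLACEHOLDER = "AGENT_PLACEHOLDER_TOKEN"

# Hypothetical host -> real secret table, held only by the proxy sidecar.
REAL_SECRETS = {"api.example.com": "sk-real-secret"}


def rewrite_request(host: str, headers: dict[str, str]) -> dict[str, str] | None:
    """Return the headers to forward upstream, or None to drop the request.

    Hosts without an entry in REAL_SECRETS are treated as not allowed.
    """
    secret = REAL_SECRETS.get(host)
    if secret is None:
        return None  # unknown destination: block instead of forwarding
    forwarded = dict(headers)  # copy so the caller's headers stay untouched
    auth = forwarded.get("Authorization", "")
    if PLACEHOLDER in auth:
        # Swap the dummy token for the real secret only on the way out.
        forwarded["Authorization"] = auth.replace(PLACEHOLDER, secret)
    return forwarded
```

Because the real secret exists only inside the proxy process, anything the agent logs, echoes, or sends to a non-allowlisted host contains at most the placeholder.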

Items written as Tier B

terminal-bench-env — Not an agent harness. It is evaluation infrastructure: 3,500+ verified Docker task environments for measuring terminal agent capability, accompanied by a minimal ReAct BashAgent. No workflow methodology, no skills, no hooks, no persistent memory. Treated as Tier B because it has research value and a published paper (arXiv 2602.07274) but does not fit the harness archetype. Full 11 files written.

Cross-references discovered

scion-gcp explicitly targets the same harnesses that appear in other batches: claude-code (batch 1/9 seeds), gemini-cli, codex/opencode — scion is the orchestrator layer above all of them
nanoclaw was listed as qwibitai/nanoclaw in the batch manifest, but the canonical repo lives under the nanocoai organization (nanocoai/nanoclaw); qwibitai may be the founder's personal account, from which the repo was later transferred
ironclaw's NEAR AI authentication suggests it may be positioning for decentralized AI agent networks (NEAR Protocol ecosystem) — unusual for a terminal tool
osaurus depends on macOS 26 (required for Apple Container), an OS release not yet widely deployed; the framework cannot run on machines still on earlier macOS versions
paseo daemon on port 6767 is a local relay service enabling mobile (Expo) ↔ desktop (Electron) ↔ CLI coordination — the multi-surface design pattern is unique in this batch

Isolation Mechanism Taxonomy (batch contribution)

This batch clarifies a taxonomy of isolation mechanisms by their primary goal:

Goal	Mechanism	Example
Credential isolation	mitmproxy + iptables	agentbox-mattolson
Tool capability control	WASM capability sandbox	ironclaw
OS-level privacy	Apple Container Linux VM	osaurus
Network egress control	Docker + Warden	stakpak-agent
State separation between agents	Container per session	nanoclaw, scion-gcp
Scalable multi-agent isolation	Kubernetes pods	clawmanager
Benchmark reproducibility	Docker per task	terminal-bench-env
File-system branch separation	Git worktree	paseo
None (personal trust)	—	code-yeongyu
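
For anyone consuming this taxonomy programmatically, the table can be expressed as a small lookup structure. The class and function names below are hypothetical; the rows are copied from the table above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class IsolationEntry:
    goal: str
    mechanism: str | None  # None when the harness uses no isolation mechanism
    examples: tuple[str, ...]


TAXONOMY = (
    IsolationEntry("Credential isolation", "mitmproxy + iptables", ("agentbox-mattolson",)),
    IsolationEntry("Tool capability control", "WASM capability sandbox", ("ironclaw",)),
    IsolationEntry("OS-level privacy", "Apple Container Linux VM", ("osaurus",)),
    IsolationEntry("Network egress control", "Docker + Warden", ("stakpak-agent",)),
    IsolationEntry("State separation between agents", "Container per session", ("nanoclaw", "scion-gcp")),
    IsolationEntry("Scalable multi-agent isolation", "Kubernetes pods", ("clawmanager",)),
    IsolationEntry("Benchmark reproducibility", "Docker per task", ("terminal-bench-env",)),
    IsolationEntry("File-system branch separation", "Git worktree", ("paseo",)),
    IsolationEntry("None (personal trust)", None, ("code-yeongyu",)),
)


def entries_for(slug: str) -> list[IsolationEntry]:
    """Return every taxonomy row that lists the given framework as an example."""
    return [entry for entry in TAXONOMY if slug in entry.examples]
```

For example, `entries_for("scion-gcp")` returns the single "State separation between agents" row, and an unknown slug returns an empty list.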