sandcastle (mattpocock)

sandcastle-mattpocock · mattpocock/sandcastle · ★ 5.1k · last commit 2026-05-16

Primitive shape
No installable primitives
00

Summary

sandcastle (mattpocock) — Summary

Sandcastle by mattpocock is a TypeScript npm library (@ai-hero/sandcastle) for orchestrating AI coding agents in isolated container sandboxes via a programmatic API (sandcastle.run()). With 5,103 stars and 8 contributors, it is the most technically distinct framework in this batch: it is NOT a skill pack or prompt collection but a TypeScript Effect-based library that spawns Claude Code, Codex, or custom agents inside Docker, Podman, or Vercel Firecracker microVM sandboxes with git-worktree branch management and merge-back. The .sandcastle/ configuration directory holds prompt files, environment configuration, coding standards, and a main.ts orchestration script. A sandcastle CLI provides init and interactive mode. Closest seed comparison: like claude-flow in that it is a multi-agent runtime with its own execution infrastructure, but sandcastle uses container isolation (Docker/Podman/Vercel) rather than MCP tools, and operates as a TypeScript library rather than a CLI plugin — making it fundamentally an orchestration SDK rather than a workflow framework.

01

Overview

sandcastle (mattpocock) — Overview

Origin

Created by mattpocock (@mattpocock), well-known TypeScript educator/creator (Total TypeScript). 5,103 stars, 8 contributors. Active to May 2026. Package: @ai-hero/sandcastle (AI Hero company).

Philosophy

From README:

"A TypeScript library for orchestrating AI coding agents in isolated sandboxes: You invoke agents with a single sandcastle.run(). Sandcastle handles sandboxing the agent with a configurable branch strategy. The commits made on the branches get merged back."

"Sandcastle is provider-agnostic — it ships with built-in providers for Docker, Podman, and Vercel, and you can create your own. Great for parallelizing multiple AFK agents, creating review pipelines, or even just orchestrating your own agents."

Key Design Decisions

  1. Programmatic TypeScript API: not a CLI plugin or skill pack — a library you import and call
  2. Container isolation: Docker/Podman bind-mount sandboxes or Vercel Firecracker microVMs
  3. Branch strategy: agent changes go to isolated branches; sandcastle merges them back
  4. maxIterations: configurable loops per run (default: 1)
  5. Lifecycle hooks: host and sandbox hooks for setup (onWorktreeReady, onSandboxReady)
  6. Effect library: uses effect TypeScript library for functional composition

Example Entry File

import { run, claudeCode } from "@ai-hero/sandcastle";
import { docker } from "@ai-hero/sandcastle/sandboxes/docker";

await run({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  promptFile: ".sandcastle/prompt.md",
});

Use Cases

  • Parallelizing multiple AFK agents across independent branches
  • Review pipelines (implement → review → merge)
  • CI-triggered autonomous coding
  • Overnight feature development

02

Architecture

sandcastle (mattpocock) — Architecture

Package Structure

npm package @ai-hero/sandcastle published by AI Hero. TypeScript monorepo using changesets for versioning. CLI binary sandcastle at dist/main.js.

Core Modules

src/
  main.ts          — CLI entry (sandcastle init, sandcastle interactive)
  run.ts           — programmatic run() API
  sandbox/
    docker.ts      — Docker bind-mount provider
    podman.ts      — Podman bind-mount provider
    vercel.ts      — Vercel Firecracker microVM provider
    noSandbox.ts   — passthrough (no isolation)
  branch/
    head.ts        — work on current HEAD (no isolation)
    branch.ts      — isolated branch per run
    merge-to-head.ts — isolated branch + merge back (default for Docker/Vercel)
  session/
    hostSessionStore.ts    — persists agent session on host
    sandboxSessionStore.ts — persists agent session inside container
    transferSession.ts     — copies session across boundary
    codexHostSessionStore.ts — session adapter for Codex
  agents/
    claudeCode.ts  — Claude Code agent definition
    codex.ts       — OpenAI Codex agent definition
    custom.ts      — BYO agent interface
  output/
    structuredOutput.ts — XML tag extraction (<plan>...</plan>)
  worktree/
    createWorktree.ts   — git worktree management
    worktreeVariants.ts — run-in-worktree helpers

Effect Library Foundation

All internal logic uses the effect TypeScript library (functional composition, typed errors, resource management). External API (run(), createSandbox()) is unwrapped to Promise for usability.

Sandbox Providers

Provider     Isolation  Mechanism                            Notes
-----------  ---------  -----------------------------------  ------------------------
docker()     container  bind-mount host repo into container  requires Docker daemon
podman()     container  bind-mount via Podman                rootless option
vercel()     VM         Firecracker microVM                  requires Vercel account
noSandbox()  none       runs in CWD                          development/testing only

Branch Strategies

Strategy       When used              Behavior
-------------  ---------------------  ---------------------------------------
head           noSandbox default      changes land on current branch
branch         explicit opt-in        creates sandcastle/<run-id> branch
merge-to-head  docker/vercel default  branch created, merged after completion

Lifecycle Hooks

{
  host: {
    onWorktreeReady: async (ctx) => { /* called on host after git worktree created */ },
    onSandboxReady:  async (ctx) => { /* called on host after container started */ },
  },
  sandbox: {
    onSandboxReady: async (ctx) => { /* called inside container at startup */ },
  }
}

.sandcastle/ Configuration Directory

File                             Purpose
-------------------------------  ----------------------------------------
.sandcastle/.env.example         environment variable template
.sandcastle/CODING_STANDARDS.md  injected into agent context
.sandcastle/Dockerfile           custom sandbox image
.sandcastle/implement-prompt.md  prompt for implementation phase
.sandcastle/merge-prompt.md      prompt for merge/review phase
.sandcastle/plan-prompt.md       prompt for planning phase
.sandcastle/review-prompt.md     prompt for review agents
.sandcastle/run.ts               orchestration entry point (user-authored)

Structured Output

Agents write XML tags in output:

<plan>
  JSON or markdown plan content
</plan>

Host structuredOutput.ts extracts content between tag pairs, enabling plan→execute pipelines.
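The extraction step described above can be sketched in a few lines. This is a minimal illustration, not the library's actual structuredOutput.ts: the function names extractTag and extractTags are hypothetical, and the sketch assumes tags are not nested and carry no attributes.

```typescript
// Hypothetical helpers, not part of the sandcastle API.

// Returns the trimmed text between the first <tag>...</tag> pair, or undefined if absent.
export function extractTag(output: string, tag: string): string | undefined {
  // Escape regex metacharacters so the tag name is matched literally.
  const safe = tag.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // [\s\S]*? spans newlines and stops at the first closing tag (non-greedy).
  const match = new RegExp(`<${safe}>([\\s\\S]*?)</${safe}>`).exec(output);
  return match?.[1]?.trim();
}

// Collects every requested tag that is present in the output.
export function extractTags(output: string, tags: readonly string[]): Record<string, string> {
  const structured: Record<string, string> = {};
  for (const tag of tags) {
    const content = extractTag(output, tag);
    if (content !== undefined) structured[tag] = content;
  }
  return structured;
}
```

The result is always a string, so a caller that expects JSON inside the tag still has to parse and validate it.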

Multi-Phase Orchestration Pattern (from run.ts example)

// Phase 1: Plan (single orchestrator agent in Docker)
const plan = await run({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  promptFile: ".sandcastle/plan-prompt.md",
});

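// Phase 2: Parallel execute + review, one run() per task. Promise.allSettled starts every
// run at once, so a cap of 4 concurrent runs needs an explicit limiter in user code.
// `tasks` is assumed to be derived from the Phase 1 plan output.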
const results = await Promise.allSettled(
  tasks.map(task => run({
    agent: claudeCode("claude-opus-4-7"),
    sandbox: docker(),
    promptFile: ".sandcastle/implement-prompt.md",
    maxIterations: 10,
  }))
);

Session Management

  • hostSessionStore — persists Claude Code session on host filesystem
  • sandboxSessionStore — persists session inside sandbox, survives container restart
  • transferSession — copies session data across sandbox boundary (host→container)
  • codexHostSessionStore — wraps Codex's session file format

No Hook System

sandcastle does NOT use Claude Code hooks.json or Codex lifecycle hooks; its only hooks are the host/sandbox callbacks passed to run(). All other control flow is JavaScript/TypeScript in the user's run.ts. The lifecycle is: worktree created → sandbox started → agent invoked → output extracted → merge/cleanup.

03

Skills And Commands

sandcastle (mattpocock) — Skills and Commands

No Skills or Slash Commands

sandcastle is a TypeScript library, not a Claude Code plugin. It ships no .claude/commands/, no skills/ directory, and no slash command definitions. Skill-equivalent functionality is expressed as TypeScript API calls.

CLI Commands (sandcastle binary)

The sandcastle CLI has exactly two subcommands:

sandcastle init

Scaffolds the .sandcastle/ configuration directory in the current project:

.sandcastle/
  .env.example
  CODING_STANDARDS.md
  Dockerfile
  implement-prompt.md
  merge-prompt.md
  plan-prompt.md
  review-prompt.md
  run.ts

sandcastle interactive (or sandcastle with no subcommand)

Launches a REPL-style terminal session for running agents. Built with @clack/prompts for the terminal UI.

Programmatic API Surface

The primary interface is the TypeScript API:

Export                   Type            Purpose
-----------------------  --------------  ---------------------------------------------
run(options)             async function  Execute agent in sandbox with branch strategy
interactive(options)     async function  Interactive agent session
claudeCode(model)        factory         Create Claude Code agent definition
codex(model)             factory         Create OpenAI Codex agent definition
docker(options?)         factory         Create Docker sandbox provider
podman(options?)         factory         Create Podman sandbox provider
vercel(options?)         factory         Create Vercel Firecracker provider
noSandbox()              factory         No-isolation passthrough
createSandbox(provider)  async           Low-level sandbox lifecycle control
createWorktree(options)  async           Low-level git worktree management

run() Options

interface RunOptions {
  agent: AgentDefinition;       // claudeCode() or codex() result
  sandbox: SandboxProvider;     // docker(), podman(), vercel(), noSandbox()
  promptFile?: string;          // path to .md prompt file
  prompt?: string;              // inline prompt string
  maxIterations?: number;       // agent loop count (default: 1)
  branchStrategy?: BranchStrategy; // head | branch | merge-to-head
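  hooks?: {                     // assumed wrapper key: a second top-level `sandbox` would clash with the provider above
    host?: {
      onWorktreeReady?: HookFn;
      onSandboxReady?: HookFn;
    };
    sandbox?: {
      onSandboxReady?: HookFn;
    };
  };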
  output?: {
    tags?: string[];            // XML tags to extract from output
  };
}

Agent Definitions

// Claude Code agents (one definition per model)
const opusAgent = claudeCode("claude-opus-4-7");
const sonnetAgent = claudeCode("claude-sonnet-4-5");

// OpenAI Codex agents
const codexAgent = codex("codex-1");
const o4MiniAgent = codex("o4-mini");

Prompt Loading

Prompts are loaded from markdown files at runtime. The promptFile path is relative to the project root. CODING_STANDARDS.md is concatenated into the prompt context automatically when present in .sandcastle/.
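The loading behavior described above might look roughly like the following sketch. The names composePrompt and loadPrompt are hypothetical, and both the standards-first ordering and the blank-line separator are assumptions; the source only says that CODING_STANDARDS.md is concatenated when present.

```typescript
import { readFile } from "node:fs/promises";
import { join } from "node:path";

// Pure step: how the two texts are combined. Standards-first order is an assumption.
export function composePrompt(prompt: string, standards?: string): string {
  if (standards === undefined || standards.trim() === "") return prompt;
  return `${standards.trim()}\n\n${prompt}`;
}

// Reads a file, returning undefined only when the file does not exist.
async function readIfPresent(path: string): Promise<string | undefined> {
  try {
    return await readFile(path, "utf8");
  } catch (error) {
    if ((error as { code?: string }).code === "ENOENT") return undefined;
    throw error; // permission problems and similar errors should still surface
  }
}

// Hypothetical loader: promptFile is resolved relative to the project root.
export async function loadPrompt(projectRoot: string, promptFile: string): Promise<string> {
  const prompt = await readFile(join(projectRoot, promptFile), "utf8");
  const standards = await readIfPresent(join(projectRoot, ".sandcastle", "CODING_STANDARDS.md"));
  return composePrompt(prompt, standards);
}
```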

No MCP Integration

sandcastle does not expose or consume MCP servers. Agent tool access is determined by the agent's own configuration (Claude Code's built-in tools, Codex's built-in tools). The sandbox provides filesystem isolation but does not intercept tool calls.

05

Agents And Subagents

sandcastle (mattpocock) — Agents and Subagents

Agent Model

sandcastle uses the term "agent" to mean a single AI coding session (one Claude Code process or one Codex session) running inside a sandbox. There are no persistent named persona agents or role-based subagents.

Built-in Agent Definitions

claudeCode(model)

Wraps Claude Code as the execution engine:

import { claudeCode } from "@ai-hero/sandcastle";
// Distinct names avoid redeclaring `agent`
const opusAgent = claudeCode("claude-opus-4-7");
const sonnetAgent = claudeCode("claude-sonnet-4-5");
const haikuAgent = claudeCode("claude-haiku-4-5");

Internally spawns claude CLI inside the sandbox with the given model flag.

codex(model)

Wraps OpenAI Codex CLI:

import { codex } from "@ai-hero/sandcastle";
// Distinct names avoid redeclaring `agent`
const codexAgent = codex("codex-1");
const o4MiniAgent = codex("o4-mini");

Internally spawns codex CLI inside the sandbox.

Custom Agents

BYO agent interface — implement the AgentDefinition interface to wrap any CLI or subprocess as an agent.

Multi-Agent Patterns

sandcastle has no built-in multi-agent topology. Multi-agent is achieved by the user calling run() multiple times in JavaScript:

Parallel Workers

// Concurrent implementation agents: one per task (4 agents for 4 tasks)
const workers = await Promise.allSettled(
  tasks.map(task => run({
    agent: claudeCode("claude-opus-4-7"),
    sandbox: docker(),
    prompt: task.description,
  }))
);

Plan-Review Pipeline

// Agent 1: Plan
const planResult = await run({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  promptFile: ".sandcastle/plan-prompt.md",
});

// Agent 2: Implement (per task in plan)
// Agent 3: Review (per implementation)

Reviewer Agent

await run({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  promptFile: ".sandcastle/review-prompt.md",
  // review-prompt.md references the implementation output
});

AGENTS.md

Present at repo root and in .factory/ and .claude/ directories. Documents expected agent behavior for contributors to the sandcastle project itself (not for end users of the library). Contains standard Claude Code contributor guidance.

No Spawning Primitives

sandcastle agents do NOT spawn subagents via:

  • Claude Code Task tool
  • MCP tool calls
  • Agent SDK threads

All parallelism is at the orchestration layer (JavaScript Promise.allSettled), not at the agent-within-agent level.
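Because Promise.allSettled starts every promise immediately, capping the number of simultaneous run() calls (the "up to 4 concurrent" mentioned earlier) is left to user code. Below is a minimal sketch of one way to do that with a small worker pool. The helper name allSettledWithLimit is hypothetical and nothing in it is part of the sandcastle API.

```typescript
// Generic worker-pool helper; an illustration, not library code.
export async function allSettledWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<PromiseSettledResult<R>[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results = new Array<PromiseSettledResult<R>>(items.length);
  let next = 0;

  // Each lane claims the next unprocessed index until the queue is empty.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      try {
        const value = await worker(items[index] as T, index);
        results[index] = { status: "fulfilled", value };
      } catch (reason) {
        results[index] = { status: "rejected", reason };
      }
    }
  }

  // Start at most `limit` lanes; results keep the input order.
  await Promise.all(Array.from({ length: Math.min(limit, items.length) }, () => lane()));
  return results;
}
```

A call such as allSettledWithLimit(tasks, 4, task => run({ ... })) would then keep at most four sandboxes alive at a time while preserving the per-task success or failure that Promise.allSettled reports.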

maxIterations

Each run() call has a maxIterations parameter (default: 1) that controls how many times the agent loops before stopping. This is a per-run loop count, not a global iteration budget:

await run({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  maxIterations: 20,  // agent can loop up to 20 times within this run
  promptFile: ".sandcastle/implement-prompt.md",
});

Isolation Per Agent

Each run() call gets its own:

  • Git worktree (if using branch strategies)
  • Sandbox instance (if using Docker/Podman/Vercel)
  • Session store (if session persistence enabled)

This means N parallel run() calls = N isolated containers + N isolated branches.

09

Uniqueness

sandcastle (mattpocock) — Uniqueness

Differs From Seeds

No seed framework uses container isolation (Docker/Podman) or microVM isolation (Firecracker) as its primary isolation mechanism. All 11 seeds use git worktrees, file system isolation, or no isolation. sandcastle is the only framework in the entire batch that provides container- and VM-level agent isolation as a first-class primitive.

The TypeScript library API (rather than CLI plugin, skill pack, or prompt collection) is also unique in the batch. Every other framework is a Claude Code plugin, a CLI wrapper, a shell script collection, or a markdown system. sandcastle is an npm-importable SDK.

Positioning

  • Only container-isolated framework in batch of 33: Docker/Podman bind-mount and Vercel Firecracker microVM
  • Only TypeScript Effect-based SDK in the batch: functional composition via effect library, typed errors, resource management
  • Provider-agnostic sandbox abstraction: four interchangeable sandbox providers with identical API
  • Branch strategy abstraction: head/branch/merge-to-head as swappable strategies, not hardcoded git patterns
  • mattpocock brand: creator of Total TypeScript (significant TypeScript community reach), giving this library outsized visibility vs its 8-contributor count
  • AFK agent use case explicit: designed for "overnight feature development", "CI-triggered autonomous coding" — not interactive pair programming

Closest Comparisons

  • claude-flow (seed): both are multi-agent runtimes with their own execution infrastructure. claude-flow uses MCP tools for coordination; sandcastle uses container isolation + TypeScript promises. claude-flow is a CLI plugin; sandcastle is a library.
  • sandcastle vs oh-my-codex-yeachan: both orchestrate Claude Code for parallel work. Yeachan uses tmux + session pooling; sandcastle uses Docker + git worktrees. Yeachan has 46 skills and a rich plugin ecosystem; sandcastle has zero skills and pure SDK composition.
  • sandcastle vs vnx-orchestration: both have multi-provider multi-agent execution. VNX is Python with a governance-first philosophy; sandcastle is TypeScript with an infrastructure-first philosophy.

Observable Failure Modes

  • Docker daemon required: docker() needs a running Docker daemon, with no automatic fallback to another provider; noSandbox() is explicitly unsafe for production
  • Effect library learning curve: internal Effect types leak into error messages; TypeScript users unfamiliar with Effect find debugging difficult
  • No crash recovery: failed run() leaves no recoverable state — user must implement their own checkpointing
  • Vercel lock-in for VM isolation: if Docker is unavailable (CI without DinD), Vercel microVM requires a Vercel account and incurs cost
  • maxIterations=1 default: the default single-iteration setting is surprising; most users need higher values
  • No approval gates: no built-in way to pause between phases for human review without custom @clack/prompts code

Distinctive Opinion

"Container isolation is the right primitive for autonomous agents. Git branches are the right handoff mechanism. TypeScript is the right composition language. Everything else is userland."

sandcastle makes a strong bet that the correct level of abstraction for AI coding agents is container sandboxes + TypeScript async/await, rejecting the prompt-engineering and CLI-plugin approaches that dominate the rest of the ecosystem.

04

Hooks And Automation

sandcastle (mattpocock) — Hooks and Automation

No Claude Code / Codex Lifecycle Hooks

sandcastle does NOT use:

  • Claude Code hooks.json (SessionStart / PreToolUse / PostToolUse / Stop)
  • Codex config.json hooks
  • Any event-driven hook system

All automation is implemented in TypeScript inside the user's run.ts orchestration script. The user composes run() calls in sequence or in parallel using standard JavaScript async patterns.

Lifecycle Hook API (host/sandbox callbacks)

sandcastle exposes its own non-event-driven hook API as part of the run() options object:

import { copyFile } from "node:fs/promises";
import { run, claudeCode } from "@ai-hero/sandcastle";
import { docker } from "@ai-hero/sandcastle/sandboxes/docker";

await run({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  // `hooks` is an assumed wrapper key: a second top-level `sandbox` key
  // would silently overwrite the provider above
  hooks: {
    host: {
      onWorktreeReady: async ({ worktreePath }) => {
        // Runs on host after git worktree is created, before sandbox starts
        // Use case: inject extra files, set up fixtures, copy secrets
        await copyFile(".env.local", `${worktreePath}/.env`);
      },
      onSandboxReady: async ({ container }) => {
        // Runs on host after container is started, before agent invoked
        // Use case: warm up dependencies, health check, install packages
      },
    },
    sandbox: {
      onSandboxReady: async () => {
        // Runs inside container at container startup
        // Use case: install packages, run migrations, setup DB
      },
    },
  },
});

Automation Patterns

Sequential Phase Pipeline

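// Each run() is awaited, so phases execute sequentially
const planOutput = await run({
  agent,
  sandbox: docker(),
  promptFile: ".sandcastle/plan-prompt.md",
  output: { tags: ["plan"] },
});
// structured.plan is the text between the <plan> tags, so parse it before use
const planData = JSON.parse(planOutput.structured.plan);

for (const task of planData.tasks) {
  await run({ agent, sandbox: docker(), prompt: `Implement: ${task.description}` });
}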

Parallel Agent Execution

// Promise.allSettled — parallel, partial failure tolerant
const results = await Promise.allSettled(
  branches.map(branch => run({
    agent,
    sandbox: docker(),
    branchStrategy: 'merge-to-head',
    promptFile: '.sandcastle/implement-prompt.md',
  }))
);
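Partial-failure tolerance only helps if the caller inspects the settled results afterwards. A minimal sketch of splitting them into successes and failures follows. The helper name partitionSettled is hypothetical and works on any Promise.allSettled result, not just run() calls.

```typescript
// Splits Promise.allSettled results into fulfilled values and rejection reasons.
export function partitionSettled<T>(
  results: readonly PromiseSettledResult<T>[],
): { succeeded: T[]; failed: unknown[] } {
  const succeeded: T[] = [];
  const failed: unknown[] = [];
  for (const result of results) {
    if (result.status === "fulfilled") {
      succeeded.push(result.value);
    } else {
      failed.push(result.reason);
    }
  }
  return { succeeded, failed };
}
```

An orchestration script could, for example, log every entry in failed and exit non-zero when the list is not empty, so a CI job does not pass silently after some agents crashed.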

CI Integration

// In GitHub Actions / CI scripts
import { run, claudeCode } from "@ai-hero/sandcastle";
import { docker } from "@ai-hero/sandcastle/sandboxes/docker";

await run({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  promptFile: ".sandcastle/prompt.md",
  maxIterations: 20,
  branchStrategy: "merge-to-head",
});

Structured Output Extraction

The only automated data extraction mechanism is XML tag parsing in agent output:

const result = await run({
  agent,
  sandbox: docker(),
  output: { tags: ["plan", "summary"] },
  promptFile: ".sandcastle/plan-prompt.md",
});

// Agent writes: <plan>{"tasks": [...]}</plan> in its output
const plan = result.structured.plan;  // text between the <plan> tags; JSON.parse it if the agent emits JSON

Git Automation

When using merge-to-head branch strategy:

  1. sandcastle creates a git worktree on a new branch, sandcastle/<run-id>
  2. Agent makes commits on the worktree branch
  3. After agent completes, sandcastle merges the branch back to HEAD
  4. Worktree is cleaned up

This is automatic — no user code required for the git operations.
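For readers who want to picture the git operations behind these steps, here is a rough sketch of plain git commands that would have the same effect. It is not taken from sandcastle's source: the function name mergeToHeadSteps, the worktree directory argument, and the --no-edit merge flag are assumptions, and the library may well do this differently.

```typescript
// One git invocation, expressed as an argv array suitable for child_process.execFile.
export interface GitStep {
  description: string;
  argv: string[];
}

// Hypothetical helper: lists git commands equivalent to the merge-to-head flow.
export function mergeToHeadSteps(runId: string, worktreeDir: string): GitStep[] {
  const branch = `sandcastle/${runId}`;
  return [
    {
      description: "create a worktree checked out on a new branch",
      argv: ["git", "worktree", "add", "-b", branch, worktreeDir],
    },
    // The agent commits inside worktreeDir between these two steps.
    {
      description: "merge the agent's commits into the current HEAD",
      argv: ["git", "merge", "--no-edit", branch],
    },
    {
      description: "remove the worktree directory",
      argv: ["git", "worktree", "remove", worktreeDir],
    },
  ];
}
```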

Session Persistence

Sessions are optionally persisted via session stores:

  • hostSessionStore: saves session to host filesystem (survives re-runs)
  • sandboxSessionStore: saves session inside container (survives container restart within a run)
  • transferSession: copies session data from host to sandbox (resume across boundaries)

No automatic session persistence — user must opt in via run() options.

06

Workflow And Phases

sandcastle (mattpocock) — Workflow and Phases

No Prescribed Workflow

sandcastle is an orchestration SDK, not a workflow framework. It ships no built-in phases, no approval gates, and no prescribed feature-development lifecycle. The user designs their own workflow in run.ts.

Default Single-Run Pattern

run() called
  → create git worktree
  → start sandbox (Docker/Podman/Vercel)
  → invoke agent with promptFile
  → agent loops up to maxIterations
  → extract structured output (optional)
  → merge branch to HEAD (if merge-to-head strategy)
  → clean up worktree + container
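The ordering above, including cleanup after a failure, can be modeled with injected step functions. This is a sketch for reasoning about the sequence, not the library's implementation: the LifecycleSteps interface and runLifecycle function are hypothetical, and the early-exit signal from invokeAgent is an assumption about how an agent might report that it is finished.

```typescript
// Hypothetical step functions standing in for the real worktree/sandbox/agent operations.
export interface LifecycleSteps {
  createWorktree(): Promise<void>;
  startSandbox(): Promise<void>;
  // Resolves true when the agent reports it is done (assumed stop signal).
  invokeAgent(iteration: number): Promise<boolean>;
  mergeToHead(): Promise<void>;
  cleanUp(): Promise<void>;
}

// Runs the steps in order and returns how many agent iterations were used.
export async function runLifecycle(steps: LifecycleSteps, maxIterations = 1): Promise<number> {
  await steps.createWorktree();
  try {
    await steps.startSandbox();
    let iterations = 0;
    while (iterations < maxIterations) {
      iterations++;
      if (await steps.invokeAgent(iterations)) break;
    }
    // Merge only happens when every earlier step succeeded.
    await steps.mergeToHead();
    return iterations;
  } finally {
    // Cleanup runs on success and on failure alike.
    await steps.cleanUp();
  }
}
```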

Suggested Multi-Phase Pattern (from examples/docs)

1. Init Phase

sandcastle init
# Creates .sandcastle/ with template files
# User edits implement-prompt.md, plan-prompt.md, etc.

2. Plan Phase

User writes run.ts:

const planResult = await run({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  promptFile: ".sandcastle/plan-prompt.md",
  output: { tags: ["plan"] },
});
const tasks = JSON.parse(planResult.structured.plan).tasks;
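Agent output is free text, so a bare JSON.parse can throw or return an unexpected shape. A minimal validation sketch follows. The Plan and PlanTask types and the parsePlan function are hypothetical, and the { tasks: [{ description }] } shape is inferred from how the surrounding examples use task.description.

```typescript
export interface PlanTask {
  description: string;
}

export interface Plan {
  tasks: PlanTask[];
}

// Parses and validates the text extracted from the <plan> tag.
export function parsePlan(raw: string): Plan {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    throw new Error("plan is not valid JSON");
  }
  const tasks = (value as { tasks?: unknown } | null)?.tasks;
  if (!Array.isArray(tasks)) {
    throw new Error("plan has no tasks array");
  }
  return {
    tasks: tasks.map((task: unknown, index: number) => {
      const description = (task as { description?: unknown } | null)?.description;
      if (typeof description !== "string") {
        throw new Error(`task ${index} has no string description`);
      }
      return { description };
    }),
  };
}
```

Failing fast here keeps a malformed plan from fanning out into several expensive sandboxed runs.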

3. Parallel Implement Phase

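// Promise.allSettled starts every run at once. RunOptions documents no concurrency
// field, so cap parallelism (e.g. at 4) in the orchestration code.
await Promise.allSettled(
  tasks.map(task => run({
    agent: claudeCode("claude-opus-4-7"),
    sandbox: docker(),
    promptFile: ".sandcastle/implement-prompt.md",
    prompt: `Task: ${task.description}`,
    maxIterations: 10,
  }))
);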

4. Review Phase (optional)

await run({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  promptFile: ".sandcastle/review-prompt.md",
});

5. Merge Phase

Automatic when branchStrategy: "merge-to-head" — no explicit phase needed.

No Approval Gates

sandcastle has no built-in human-in-the-loop checkpoints. Any approval gates must be implemented by the user in run.ts (e.g., using @clack/prompts to pause and ask for confirmation between phases).
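A user-implemented gate can be as small as a loop that asks before each phase after the first. The sketch below keeps the prompt injectable so it works in a terminal and in tests. The names Phase, Confirm, and runPhases are hypothetical, and the confirm callback could wrap a terminal prompt library such as @clack/prompts (that wiring is left out here as an assumption).

```typescript
export interface Phase {
  name: string;
  execute: () => Promise<void>;
}

// Resolves true to continue, false to stop the pipeline.
export type Confirm = (message: string) => Promise<boolean>;

// Runs phases in order, asking for approval before every phase except the first.
// Returns the names of the phases that completed.
export async function runPhases(phases: readonly Phase[], confirm: Confirm): Promise<string[]> {
  const completed: string[] = [];
  for (const phase of phases) {
    const needsApproval = completed.length > 0;
    if (needsApproval && !(await confirm(`Run phase "${phase.name}"?`))) {
      break; // the human declined, so stop without running later phases
    }
    await phase.execute();
    completed.push(phase.name);
  }
  return completed;
}
```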

No State Machine

No .sandcastle/state/ directory, no phase tracking, no resumable execution. Each run() is independent: if a run fails, re-executing run.ts starts again from its first phase unless the user implements their own checkpointing.

Sandbox Lifecycle per Phase

Each phase gets its own sandbox (container/VM):

  • Containers are created fresh for each run() call
  • Host git worktree is the handoff mechanism between phases
  • If phase N commits to branch X, phase N+1 can read branch X from within its own sandbox

CI/CD Integration Pattern

The typical production pattern is CI-triggered:

# .github/workflows/sandcastle.yml
- run: npx tsx .sandcastle/run.ts

The user's run.ts orchestrates the full multi-phase pipeline as a single Node.js script.

Interactive Mode

For development/exploration:

sandcastle interactive
# REPL-style agent session with @clack/prompts TUI
# Not part of CI — for human-driven exploration

No Spec-Driven Development

sandcastle has no spec file format, no requirements-first gate, and no design approval step. It is purely an execution and isolation primitive. The user decides what the agent works on via promptFile content.

07

State And Memory

sandcastle (mattpocock) — State and Memory

No Built-in Project Memory

sandcastle ships no CLAUDE.md injection system, no knowledge base, no vector store, and no persistent project memory beyond what the agent accumulates in its own session context.

Session Stores

Conversation state is carried by the agent's own session persistence (Claude Code's, or Codex's via an adapter), wrapped in sandcastle's session store API:

hostSessionStore

Saves the Claude Code session to the host filesystem:

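import { run, claudeCode } from "@ai-hero/sandcastle";
import { docker } from "@ai-hero/sandcastle/sandboxes/docker";
import { hostSessionStore } from "@ai-hero/sandcastle/session";

await run({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  promptFile: ".sandcastle/implement-prompt.md",
  sessionStore: hostSessionStore(".sandcastle/sessions/run-001"),
});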

Session survives across multiple run() calls. Agent "remembers" prior context.

sandboxSessionStore

Saves session inside the container. Useful for multi-iteration runs where the container is restarted:

import { sandboxSessionStore } from "@ai-hero/sandcastle/session";

transferSession

Copies a session from host to sandbox at container startup. Enables warm-starting an agent inside a fresh container with prior session context:

import { transferSession } from "@ai-hero/sandcastle/session";

codexHostSessionStore

Session adapter for Codex CLI's session file format. Allows Codex agents to use the same session persistence pattern.

Git Worktree as State

The primary persistent state mechanism is the git repository:

  • Each run() using the branch or merge-to-head strategy creates a worktree on a branch named sandcastle/<run-id>
  • Agent commits land in the worktree's branch
  • On completion, merge-to-head merges those commits into the current HEAD
  • Merged commits are the durable record of what happened

.sandcastle/ Directory

Serves as lightweight project configuration (not runtime state):

.sandcastle/
  .env.example          — environment template (committed)
  CODING_STANDARDS.md   — injected into agent context
  Dockerfile            — custom sandbox image
  implement-prompt.md   — prompt templates (committed)
  merge-prompt.md
  plan-prompt.md
  review-prompt.md
  run.ts                — orchestration logic

No .sandcastle/state/, no per-run tracking files, no resume support.

Structured Output as Handoff

Apart from git commits, the only mechanism for handing data from one phase to the next is structured output extraction:

const result = await run({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  output: { tags: ["plan"] },
  promptFile: ".sandcastle/plan-prompt.md",
});
// result.structured.plan contains agent output between <plan>...</plan>
// Caller passes this data to next run() call

No Crash Recovery

If run() throws, the container is stopped and the worktree is cleaned up. No partial state is preserved. User must implement their own checkpointing in run.ts if needed.
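User-side checkpointing can be sketched as a small wrapper that records each finished phase and skips it on the next execution. Everything here is hypothetical userland code: the CheckpointStore interface, fileStore, withCheckpoint, and any checkpoint file path are assumptions, and sandcastle itself provides none of them.

```typescript
import { existsSync, readFileSync, writeFileSync } from "node:fs";

export interface CheckpointStore {
  load(): Record<string, unknown>;
  save(state: Record<string, unknown>): void;
}

// File-backed store; the path is the caller's choice (for example a gitignored JSON file).
export function fileStore(path: string): CheckpointStore {
  return {
    load: () =>
      existsSync(path) ? (JSON.parse(readFileSync(path, "utf8")) as Record<string, unknown>) : {},
    save: (state) => writeFileSync(path, JSON.stringify(state, null, 2)),
  };
}

// Runs `execute` once per phase name; later calls return the saved result instead.
// Results must be JSON-serializable and not undefined to survive a restart.
export async function withCheckpoint<T>(
  store: CheckpointStore,
  phase: string,
  execute: () => Promise<T>,
): Promise<T> {
  const state = store.load();
  if (Object.prototype.hasOwnProperty.call(state, phase)) {
    return state[phase] as T;
  }
  const result = await execute(); // a throw here leaves the store untouched
  store.save({ ...state, [phase]: result });
  return result;
}
```

Wrapping the plan phase this way would let a re-executed run.ts reuse the stored plan text and go straight to the phases that have not finished.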

Context Window Management

No automatic context compaction or context rotation. The agent handles its own context window within its session. maxIterations is the only control the user has over agent loop duration.

Environment Variables

.sandcastle/.env.example documents required env vars. The user populates .sandcastle/.env (gitignored). The onSandboxReady hook or Dockerfile ENV instructions inject these into the container.

No Audit Log

No structured execution log beyond what the terminal outputs and what agent commits leave in git history. The git commit history is the de facto audit trail.

08

Ui Cli Surface

sandcastle (mattpocock) — UI & CLI Surface

CLI Binary

sandcastle — ships as part of the @ai-hero/sandcastle npm package.

bin: { "sandcastle": "dist/main.js" }

Install:

npm install @ai-hero/sandcastle
npx sandcastle init

CLI Subcommands

sandcastle init

Scaffolds .sandcastle/ configuration directory in the current project. Creates template files: .env.example, CODING_STANDARDS.md, Dockerfile, implement-prompt.md, merge-prompt.md, plan-prompt.md, review-prompt.md, run.ts.

Run once per project. Non-destructive on re-run (does not overwrite existing files).

sandcastle (interactive mode)

Running sandcastle without a subcommand (or with interactive) launches an interactive TUI session using @clack/prompts. Allows running agents interactively with terminal prompts for configuration.

TUI: @clack/prompts

The interactive mode uses @clack/prompts for terminal UI:

  • Spinner feedback during agent runs
  • Select prompts for sandbox/model choice
  • Text input for custom prompts

No web dashboard. No browser UI.

No Web Dashboard

sandcastle has no web server, no localhost dashboard, and no visual monitoring interface. Observability is terminal output only.

Programmatic API (Primary Surface)

The main interface for production use is the TypeScript API, not the CLI:

import { run, claudeCode } from "@ai-hero/sandcastle";
import { docker } from "@ai-hero/sandcastle/sandboxes/docker";

await run({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  promptFile: ".sandcastle/prompt.md",
});

Package Exports

{
  "exports": {
    ".":                    "@ai-hero/sandcastle (main API)",
    "./sandboxes/docker":   "docker() provider",
    "./sandboxes/podman":   "podman() provider",
    "./sandboxes/vercel":   "vercel() provider",
    "./sandboxes/noSandbox":"noSandbox() provider",
    "./session":            "session store utilities",
    "./output":             "structured output extraction"
  }
}

Terminal Output

During run():

  • Spinner showing current agent action
  • Streaming agent output (configurable)
  • Final structured output summary

No progress bars, no phase tracking UI.

Update Mechanism

Standard npm: npm update @ai-hero/sandcastle. No self-update command. Uses changesets for versioning (v0.5.12 at time of analysis).

Cross-Platform

  • macOS: Docker Desktop or Podman
  • Linux: Docker or Podman (native)
  • Windows: WSL2 required for Docker
  • Vercel sandbox: cross-platform via Vercel API

No Windows-native support for Docker bind-mounts (WSL2 path mapping required).

10

Install And Config

sandcastle (mattpocock) — Install and Config

Installation

npm install @ai-hero/sandcastle
# or
pnpm add @ai-hero/sandcastle
# or
yarn add @ai-hero/sandcastle

Package: @ai-hero/sandcastle (AI Hero organization on npm)
Current version: v0.5.12

Prerequisites

Required

  • Node.js 18+ (JavaScript runtime; TypeScript scripts run through tsx or a build step)
  • One of:
    • Docker Desktop (macOS/Windows) or Docker Engine (Linux)
    • Podman
    • Vercel account (for Firecracker microVM)
  • Claude Code CLI (claude) installed — for claudeCode() agent
  • OR Codex CLI (codex) installed — for codex() agent

Optional

  • tsx for running .sandcastle/run.ts directly without build step

Project Setup

# 1. Install package
npm install @ai-hero/sandcastle

# 2. Scaffold .sandcastle/ directory
npx sandcastle init

# 3. Edit prompts
# .sandcastle/implement-prompt.md
# .sandcastle/plan-prompt.md

# 4. Set environment variables
cp .sandcastle/.env.example .sandcastle/.env
# Edit .env with ANTHROPIC_API_KEY, etc.

# 5. Edit .sandcastle/run.ts to define your orchestration
# (or use the template directly)

# 6. Run
npx tsx .sandcastle/run.ts

Configuration Files

.sandcastle/run.ts (required)

User-authored TypeScript orchestration script. Entry point for all agent runs. No schema — pure TypeScript.

.sandcastle/Dockerfile (optional)

Custom Docker image for sandbox. If absent, sandcastle uses a default Node.js + claude-code image.

.sandcastle/CODING_STANDARDS.md (optional)

Automatically injected into every agent's context. Project-specific coding rules, style guides, constraints.

.sandcastle/.env (required, gitignored)

Runtime secrets:

ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...  # if using codex()

.sandcastle/.env.example (committed)

Template for .env. Committed to repo so contributors know what variables to set.

TypeScript Configuration

sandcastle ships full TypeScript types. No tsconfig changes required. Works with strict: true.

// Type-safe options
import type { RunOptions, AgentDefinition, SandboxProvider } from "@ai-hero/sandcastle";

Sandbox-Specific Prerequisites

docker() Provider

# Verify Docker is running
docker info

# sandcastle uses bind mounts — no special Docker privileges needed
# Host repo is bind-mounted into container at /workspace

podman() Provider

# Rootless podman supported
podman --version

vercel() Provider

# Requires VERCEL_TOKEN in .env
VERCEL_TOKEN=vercel_...

noSandbox() Provider

No prerequisites. Runs agent directly in CWD. Explicitly marked as unsafe for untrusted agent code.

CI/CD Integration

# .github/workflows/sandcastle.yml
name: Sandcastle Agent
on:
  workflow_dispatch:
    inputs:
      task:
        description: "Task description"
        required: true

jobs:
  agent:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '20' }
      - run: npm install
      - run: npx tsx .sandcastle/run.ts
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}

Update

npm update @ai-hero/sandcastle
# Or specific version:
npm install @ai-hero/sandcastle@latest

No self-update mechanism. Versioning via changesets (semantic versioning).

Related frameworks

same archetype · same primary tool · same memory type

Daytona ★ 72k

Provide secure, elastic, sub-90ms sandbox compute infrastructure for running AI-generated code, accessible via multi-language…

CUA ★ 17k

Unified SDK for building, benchmarking, and deploying agents that interact with full OS GUIs via isolated VMs.

E2B ★ 12k

Run AI-generated code safely in cloud-hosted isolated sandboxes via a 3-line SDK integration.

OpenSandbox ★ 11k

Protocol-first general-purpose sandbox platform for AI applications with multi-language SDKs and pluggable isolation backends.

Microsandbox ★ 6.3k

Spawn hardware-isolated microVMs as child processes directly from application code, with no server setup, in under 100ms.

CubeSandbox ★ 5.9k

Sub-60ms KVM microVM sandboxes for AI agents with E2B drop-in compatibility and <5MB memory overhead.