OpenShell

openshell-nvidia · NVIDIA/OpenShell · ★ 6.3k · last commit 2026-05-26

Primitive shape 30 total
Commands 9 Skills 19 Subagents 2
00

Summary

NVIDIA OpenShell — Summary

NVIDIA OpenShell is a sandbox runtime for autonomous AI agents, written in Rust. It provides container and MicroVM isolation with declarative YAML policy enforcement at four layers: filesystem, network, process, and inference. The openshell CLI creates sandboxes pre-loaded with agent tools (Claude Code, Codex, OpenCode, GitHub Copilot), hot-reloads network and inference policies on running sandboxes, and manages credential providers that inject API keys at runtime without exposing them to the sandbox filesystem. OpenShell ships a Ratatui-based TUI (openshell term) for live gateway and sandbox monitoring, plus a .agents/skills/ directory (19 skills) and Claude Code subagent definitions for its own development workflow. The project describes itself as "built agent-first": the agent skills used by contributors are shipped in the repo.

OpenShell is unique among all seeds and batch frameworks in combining: (1) four-layer defense-in-depth policy enforcement (including inference routing); (2) a shipped .agents/skills/ system for its own development; and (3) Claude Code .claude/agents/ subagent personas. It is closest to agent-os (methodology + file conventions for agents) but differs fundamentally — OpenShell ships a full runtime, not just markdown conventions.

01

Overview

NVIDIA OpenShell — Overview

Origin

Built by NVIDIA (initially as "proof-of-life" / alpha software). Written primarily in Rust with Python SDK packaging. Related to NemoClaw (NVIDIA's OpenClaw runtime with managed inference).

Philosophy

From the README:

"OpenShell is the safe, private runtime for autonomous AI agents. It provides sandboxed execution environments that protect your data, credentials, and infrastructure — governed by declarative YAML policies that prevent unauthorized file access, data exfiltration, and uncontrolled network activity."

"OpenShell is built agent-first. The project ships with agent skills for everything from gateway troubleshooting to policy generation, and we expect contributors to use them."

Key beliefs:

  1. Agents need policy enforcement, not just isolation
  2. Credentials should never leak to sandbox filesystems — inject at runtime
  3. Network access should be allowlisted, not blocked generically
  4. Inference routing should be privacy-aware (strip caller credentials, inject backend credentials)
  5. The project itself should be built using the same agent workflows it enables

Alpha Status Note

"Alpha software — single-player mode. OpenShell is proof-of-life: one developer, one environment, one gateway. We are building toward multi-tenant enterprise deployments, but the starting point is getting your own environment up and running."

Policy Model

Policies are declarative YAML. Static sections (filesystem, process) are locked at sandbox creation; dynamic sections (network, inference) can be hot-reloaded on running sandboxes:

openshell policy set demo --policy examples/policy.yaml --wait

Protection Layers

| Layer | Protects | When Applied |
|---|---|---|
| Filesystem | Reads/writes outside allowed paths | Locked at creation |
| Network | Unauthorized outbound connections | Hot-reloadable |
| Process | Privilege escalation, dangerous syscalls | Locked at creation |
| Inference | Reroutes model API calls to controlled backends | Hot-reloadable |
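
The static/dynamic split in the table above can be sketched as a check on which top-level sections differ between two policy versions. This is a minimal illustration, not OpenShell's implementation: the section names come from the table, while the dictionary layout, the keys inside each section, and the function names are hypothetical.

```python
# Hypothetical model of the static/dynamic split; only the four section
# names come from the document, everything else is illustrative.
STATIC_SECTIONS = frozenset({"filesystem", "process"})
DYNAMIC_SECTIONS = frozenset({"network", "inference"})


def changed_sections(old: dict, new: dict) -> set:
    """Return the top-level policy sections whose content differs."""
    keys = set(old) | set(new)
    return {k for k in keys if old.get(k) != new.get(k)}


def can_hot_reload(old: dict, new: dict) -> bool:
    """True when every changed section is one of the dynamic sections."""
    return changed_sections(old, new) <= DYNAMIC_SECTIONS


# Example policies (the inner keys are assumptions, not the real schema).
old_policy = {
    "filesystem": {"read": ["/workspace"]},
    "network": {"allow": ["api.example.com"]},
}
network_only_change = {
    "filesystem": {"read": ["/workspace"]},
    "network": {"allow": ["api.example.com", "pypi.org"]},
}
filesystem_change = {
    "filesystem": {"read": ["/"]},
    "network": {"allow": ["api.example.com"]},
}
```

Under this model, a change confined to network or inference could be applied to a running sandbox, while any change to filesystem or process would require creating a new sandbox.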
02

Architecture

NVIDIA OpenShell — Architecture

Distribution

  • Binary (recommended): curl -LsSf https://raw.githubusercontent.com/NVIDIA/OpenShell/main/install.sh | sh
  • PyPI (uv): uv tool install -U openshell
  • Helm chart (experimental): helm install openshell oci://ghcr.io/nvidia/openshell/helm-chart

Directory Structure

crates/
  openshell-cli/       # openshell CLI binary (user-facing)
  openshell-server/    # Gateway control-plane API
  openshell-sandbox/   # Sandbox runtime (container supervision)
  openshell-policy/    # Policy engine (filesystem/network/process/inference)
  openshell-router/    # Privacy-aware LLM routing
  openshell-bootstrap/ # Gateway registration, mTLS bundle storage
  openshell-ocsf/      # OCSF v1.7.0 structured logging
  openshell-core/      # Common types, config, errors
  openshell-providers/ # Credential provider backends
  openshell-tui/       # Ratatui-based terminal dashboard
  openshell-driver-docker/     # Docker compute driver
  openshell-driver-podman/     # Podman compute driver
  openshell-driver-vm/         # libkrun MicroVM driver
  openshell-driver-kubernetes/ # K8s compute driver
  openshell-vfio/      # GPU passthrough (experimental)
  openshell-prover/    # Policy prover
python/openshell/      # Python SDK + CLI packaging
proto/                 # gRPC protobuf definitions
deploy/
  docker/
  helm/
  kubernetes/
.agents/
  skills/              # 19 workflow automation skills
  agents/              # (empty in public listing)
.claude/
  agents/              # Claude Code subagent definitions
    arch-doc-writer.md
    principal-engineer-reviewer.md
architecture/          # Architecture documentation (subsystem*.md)
docs/
fern/                  # Fern docs site config
examples/

Required Runtime

  • macOS, Windows with WSL2, or Linux
  • Docker, Podman, or MicroVM-compatible host (for sandbox execution)
  • No Python required for CLI (binary install)
  • Python (for SDK path)

Compute Drivers

  • Docker (default for most users)
  • Podman (alternative OCI)
  • libkrun MicroVM (openshell-driver-vm) — standalone subprocess, embeds own rootfs + runtime
  • Kubernetes (in-cluster pod-based sandboxes, experimental)

Install Complexity

One-liner binary install. Multi-step for Kubernetes/Helm deployment.

03

Components

NVIDIA OpenShell — Components

CLI Commands

| Command | Purpose |
|---|---|
| openshell sandbox create -- <agent> | Create sandbox + launch agent |
| openshell sandbox connect [name] | SSH into running sandbox |
| openshell sandbox list | List all sandboxes |
| openshell provider create ... | Create credential provider |
| openshell policy set <name> ... | Apply/update YAML policy |
| openshell policy get <name> | Show active policy |
| openshell inference set ... | Configure inference endpoint |
| openshell logs [name] --tail | Stream sandbox logs |
| openshell term | Launch TUI dashboard |

.agents/skills/ (19 skills)

| Skill | Purpose |
|---|---|
| build-from-issue | Plan + implement GitHub issue (human-gated) |
| create-github-issue | Create formatted GitHub issue |
| create-github-pr | Create PR with standard template |
| create-spike | Investigate unknown problem, produce spike doc |
| debug-inference | Troubleshoot inference connectivity |
| debug-openshell-cluster | Debug gateway + sandbox cluster issues |
| fix-security-issue | Implement security fix (requires state:agent-ready) |
| generate-sandbox-policy | Author YAML policy from requirements |
| helm-dev-environment | Set up Helm dev environment |
| openshell-cli | Run openshell CLI commands |
| review-github-pr | Review PR with principal engineer lens |
| review-security-issue | Security severity assessment + remediation plan |
| sbom | Generate software bill of materials |
| sync-agent-infra | Sync agent infrastructure |
| test-release-canary | Test release candidate |
| triage-issue | Assess and classify community issues |
| tui-development | TUI development workflows |
| update-docs | Update documentation |
| watch-github-actions | Monitor CI/CD pipeline |

.claude/agents/ (2 subagents)

| Agent | Purpose | Model |
|---|---|---|
| arch-doc-writer | Update/create architecture documentation files | opus |
| principal-engineer-reviewer | Code/plan/architecture review | inherit |

Rust Crates

16 crates across CLI, server, sandbox runtime, policy engine, drivers (Docker/Podman/VM/K8s), TUI, logging, routing, providers, bootstrap, core, prover, VFIO.

Architecture Docs

architecture/ directory contains subsystem documentation maintained by arch-doc-writer agent.

05

Prompts

NVIDIA OpenShell — Prompts

Verbatim Excerpt 1: build-from-issue Skill (from .agents/skills/build-from-issue/SKILL.md)

## Critical: `state:agent-ready` Label Is Human-Only

The `state:agent-ready` label is a **human gate**. It signals that a human has reviewed the plan and authorized the agent to build. Under **no circumstances** should this skill or any agent:

- Apply the `state:agent-ready` label
- Ask the user to let the agent apply it
- Suggest automating its application
- Bypass the check by proceeding without it

If the label is not present, the agent **must stop and wait**. This is a non-negotiable safety control — it ensures a human explicitly authorizes every build.

Technique: Iron-Law prohibition with explicit list of banned behaviors. Pattern matches superpowers' "Iron Law + rationalization table" approach but applied to safety/authorization rather than code quality.


Verbatim Excerpt 2: arch-doc-writer Agent (from .claude/agents/arch-doc-writer.md)

---
name: arch-doc-writer
description: "Use this agent when documentation in the `architecture/` directory needs to be updated or created for a specific file after implementing a feature, fix, refactor, or behavior change. Launch one instance of this agent per file that needs updating."
model: opus
color: yellow
memory: project
---

You are a principal-level technical writer with deep expertise in systems programming, distributed systems, and developer documentation. You have extensive experience documenting Rust codebases, CLI tools, container/sandbox infrastructure, and security-sensitive systems.

## Your Mission
You maintain the contents of documentation files in the `architecture/` directory of this project. Your goal is to keep documentation perfectly synchronized with the actual codebase so that humans and agents can trust it as a reliable source of truth.

Technique: Persona-based subagent with explicit model (opus) and memory scope (project). Role description includes domain expertise statement ("principal-level technical writer with deep expertise in...") — BMAD-style persona activation pattern.


Verbatim Excerpt 3: principal-engineer-reviewer Agent

---
name: principal-engineer-reviewer
description: >
  Use this agent to review existing code, audit plans, evaluate product requirements,
  or get architectural guidance that balances pragmatism, user experience, and security.
tools: Read, Grep, Glob, Bash, WebFetch, WebSearch
model: inherit
memory: project
---

You are a principal engineer reviewing code, plans, and architecture for the OpenShell project.
Your reviews balance three priorities equally:
1. **Pragmatism** — Does the solution match the complexity of the problem?
2. **User empathy** — How does this affect the people who use, operate, and maintain this system?
3. **Security** — What are the threat surfaces?

Technique: Named subagent with explicit tool list, model inheritance, and three-priority framework. CWE/OWASP/CAPEC references in extended instructions indicate security-aware review agent.

09

Uniqueness

NVIDIA OpenShell — Uniqueness

Differs From Seeds

OpenShell has no direct seed equivalent. It is closest to agent-os (both ship instruction files for agents plus a runtime philosophy), but the differences are categorical: agent-os is pure markdown conventions with bash scripts; OpenShell is a full Rust runtime with policy-enforced container/MicroVM isolation, hot-reloadable YAML policies, and a Privacy Router for inference routing. The .agents/skills/ system resembles superpowers (skills-based agent behavior) and the .claude/agents/ subagents resemble BMAD-METHOD's persona files, but OpenShell ships a complete infrastructure runtime alongside these, not just behavioral files. The L7 inference routing (stripping caller credentials, injecting backend credentials) is not seen in any other framework in the corpus.
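
The credential handling described for inference routing (strip caller credentials, inject backend credentials) can be sketched as a header rewrite. This is an illustration under stated assumptions: the function name, the set of header names treated as credentials, and the Bearer scheme are all hypothetical and are not taken from the OpenShell source.

```python
# Header names treated as caller credentials; an assumption for this sketch.
CREDENTIAL_HEADERS = frozenset({"authorization", "x-api-key"})


def rewrite_inference_headers(headers: dict, backend_key: str) -> dict:
    """Return a copy of `headers` with caller credentials replaced.

    Caller-supplied credential headers are dropped (case-insensitive match),
    then a single backend credential is injected.
    """
    cleaned = {
        name: value
        for name, value in headers.items()
        if name.lower() not in CREDENTIAL_HEADERS
    }
    # Hypothetical injection format; a real backend may expect something else.
    cleaned["Authorization"] = f"Bearer {backend_key}"
    return cleaned


caller_headers = {
    "authorization": "Bearer caller-token",
    "X-Api-Key": "caller-key",
    "Content-Type": "application/json",
}
rewritten = rewrite_inference_headers(caller_headers, "backend-key")
```

The point of the pattern is that the sandboxed agent never needs to hold the backend key, and whatever credential it does send never reaches the backend.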

Positioning

The only framework in this batch that ships both (a) a full sandbox runtime (currently alpha) with four-layer policy enforcement and (b) a complete agent-driven development workflow (19 skills + 2 Claude subagents) for its own development. The "built agent-first" claim is backed by the actual .agents/skills/ implementation.

Observable Failure Modes

  1. Alpha software: Documented as "proof-of-life; expect rough edges"
  2. NVIDIA-specific: Integration with NemoClaw (NVIDIA's OpenClaw) positions it in NVIDIA's ecosystem
  3. Vouch system friction: External contributors need human vouching before PRs accepted — slows community contributions
  4. Kubernetes experimental: K8s driver and Helm are marked experimental
  5. No multi-tenant yet: Current architecture is single-player (one developer, one gateway, one environment)
  6. GPU requires NVIDIA toolkit: GPU passthrough needs NVIDIA Container Toolkit on host

What Makes It Extraordinary

The build-from-issue skill with its "state:agent-ready is human-only, non-negotiable safety control" policy is the most sophisticated human-in-the-loop agent safety design in the corpus. The architecture documents self-maintained by the arch-doc-writer subagent are a novel approach to keeping documentation synchronized with implementation.

04

Workflow

NVIDIA OpenShell — Workflow

User Workflow (Running Agents in Sandboxes)

| Phase | Artifact | Description |
|---|---|---|
| Install | openshell binary | curl ... |
| Create Provider | Credential bundle | openshell provider create --from-existing |
| Create Sandbox | Running container/MicroVM | openshell sandbox create -- claude |
| Apply Policy | YAML policy enforced | openshell policy set name --policy file.yaml |
| Agent Runs | Agent + policy enforcement | Claude/Codex/OpenCode inside isolated sandbox |
| Monitor | TUI dashboard | openshell term |

Development Workflow (Contributing to OpenShell)

Human-gated state machine using .agents/skills/:

Community issue filed
  ↓
triage-issue (assess + classify)
  ↓
create-spike (investigate unknowns)
  ↓
[Human applies state:agent-ready label]
  ↓
build-from-issue (implement)
  ↓
review-github-pr (principal engineer review)
  ↓
create-github-pr (submit)

Approval Gates

  1. state:agent-ready label — Human-only gate before build-from-issue executes. Documented as "non-negotiable safety control."
  2. state:review-ready label — After plan is posted; human reviews before implementation
  3. Security issue gate — review-security-issue must complete before fix-security-issue runs
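
The first gate above can be sketched as a read-only check on an issue's labels. This is a minimal illustration, not the skill's actual logic: only the label name comes from the document, while the function names and return values are hypothetical. The sketch deliberately contains no code path that applies the label, which mirrors the rule that it is human-only.

```python
AGENT_READY = "state:agent-ready"  # label name taken from the document


def may_build(issue_labels) -> bool:
    """True only when a human has already applied the agent-ready label."""
    return AGENT_READY in set(issue_labels)


def next_step(issue_labels) -> str:
    """Decide what the agent does; it never modifies labels itself."""
    if may_build(issue_labels):
        return "build"
    # Label absent: the agent must stop and wait for a human.
    return "stop-and-wait"
```

For example, `next_step(["bug", "state:review-ready"])` yields "stop-and-wait" in this sketch, because only the agent-ready label authorizes a build.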

Policy Hot-Reload

Network and inference policy sections can be updated on running sandboxes without restart:

openshell policy set sandbox-name --policy new-policy.yaml --wait

Vouch System

External contributors must be vouched before PRs are accepted (.github/VOUCHED.td list, /vouch comment command).

06

Memory Context

NVIDIA OpenShell — Memory & Context

State Storage

  • Architecture docs (architecture/): Maintained by arch-doc-writer agent — persistent, per-subsystem markdown files
  • Plans (architecture/plans/): Git-ignored; spike/implementation plan documents
  • Build-from-issue plan comment: Single edited comment per GitHub issue with 🏗️ build-plan marker
  • OCSF structured logs: Security-sensitive events logged in OCSF v1.7.0 format

Memory Type

File-based (architecture docs, plan docs). External GitHub issue state for development workflow.

Agent Comment Markers

The build-from-issue skill uses:

  • > **🏗️ build-plan** — identifies the single plan comment (edited in-place as plan evolves)
  • > **🏗️ build-from-issue-agent** — all other agent comments (status updates, responses)

This creates a persistent memory artifact on GitHub issues that survives across agent sessions.
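
The single-plan-comment convention can be sketched as an upsert keyed on the marker. The marker strings are copied from the list above; representing comments as plain strings, the function name, and the choice to drop duplicate plan comments are assumptions made for this sketch. A real implementation would go through the GitHub API.

```python
PLAN_MARKER = "> **🏗️ build-plan**"
AGENT_MARKER = "> **🏗️ build-from-issue-agent**"


def upsert_plan_comment(comments: list, plan_body: str) -> list:
    """Return a new comment list containing exactly one plan comment.

    The first comment starting with PLAN_MARKER is replaced in place;
    if none exists, the plan comment is appended. Other comments,
    including agent status comments, are left untouched.
    """
    new_comment = f"{PLAN_MARKER}\n\n{plan_body}"
    result = []
    replaced = False
    for comment in comments:
        if comment.startswith(PLAN_MARKER):
            if not replaced:
                result.append(new_comment)
                replaced = True
            # Any further plan comments are dropped so only one remains.
        else:
            result.append(comment)
    if not replaced:
        result.append(new_comment)
    return result
```

Because the plan is edited in place instead of re-posted, a later agent session can locate the current plan by scanning for one marker.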

Cross-Session Handoff

Yes — via GitHub issue state (labels, plan comment, conversation history). Each invocation of build-from-issue reads prior state to determine next action.

Claude Code Subagent Memory Scope

Both .claude/agents/ subagents use memory: project — they can read project files but have no cross-session memory beyond files on disk.

Compaction

Not documented for agent workflows. Gateway state is managed by openshell-bootstrap (mTLS bundles, auth tokens).

OCSF Logging

Sandbox events logged in OCSF v1.7.0 structured format. Security-relevant events use OCSF; routine operations use plain tracing (Rust's tracing crate).

07

Orchestration

NVIDIA OpenShell — Orchestration

Multi-Agent Support

Yes — the project's development workflow uses multi-agent patterns (parallel subagents, sequential workflow chains). The sandbox runtime itself supports running multiple agents, each in their own sandbox.

Orchestration Pattern (Development Workflow)

Sequential (workflow chains with human gates):

  • triage-issue → create-spike → build-from-issue
  • review-security-issue → fix-security-issue
  • Policy generation: openshell-cli → generate-sandbox-policy

Parallel (arch-doc-writer): "Launch two arch-doc-writer agents — one for each file — to update the documentation in parallel."

Isolation Mechanism

Container (Docker/Podman default) or MicroVM (libkrun via openshell-driver-vm). Policy-enforced egress routing (HTTP CONNECT proxy intercepts all outbound). L7 policy enforcement (method + path level).
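
Method-and-path-level enforcement at an egress proxy can be sketched as a default-deny allowlist match. This is an illustration only: the rule shape (host, method, path glob), the example rules, and the function name are hypothetical and do not reflect OpenShell's policy schema.

```python
from fnmatch import fnmatchcase

# Hypothetical rule shape: (host, HTTP method, path glob).
ALLOW_RULES = [
    ("api.github.com", "GET", "/repos/*"),
    ("pypi.org", "GET", "/simple/*"),
]


def is_allowed(host: str, method: str, path: str, rules=ALLOW_RULES) -> bool:
    """Default-deny: a request passes only if some rule matches all three parts."""
    return any(
        host.lower() == rule_host
        and method.upper() == rule_method
        and fnmatchcase(path, path_glob)
        for rule_host, rule_method, path_glob in rules
    )
```

Matching on method and path, not just host, is what distinguishes L7 enforcement from a plain host allowlist: here a GET to a repository path passes while a POST to the same host is refused.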

Subagent Definition Format

  • .claude/agents/: persona-md (Claude Code subagent format with YAML frontmatter + markdown persona)
  • .agents/skills/: skill-md (SKILL.md files, skill-pack pattern)

Both formats coexist in the same repo.

Spawn Mechanism

  • Claude Code agents: claude-task-tool (arch-doc-writer description explicitly says "uses Task tool to launch")
  • Skills: harness-native discovery (CONTRIBUTING.md references "your harness can discover and load them natively")

Multi-Model Usage

Yes, per subagent:

  • arch-doc-writer: model=opus (explicit)
  • principal-engineer-reviewer: model=inherit (uses parent session model)
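
The per-subagent model selection above can be sketched as a small resolver over the frontmatter. This is a simplified illustration, not Claude Code's loader: the parser only handles flat `key: value` lines, the function names are hypothetical, and treating a missing `model` field as `inherit` is an assumption.

```python
def parse_frontmatter(text: str) -> dict:
    """Parse flat `key: value` frontmatter between the first two `---` lines."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip('"')
    return fields


def resolve_model(frontmatter: dict, parent_model: str) -> str:
    """Use the explicit model, or the parent session's model for `inherit`."""
    model = frontmatter.get("model", "inherit")  # default is an assumption
    return parent_model if model == "inherit" else model


writer_doc = """---
name: arch-doc-writer
model: opus
memory: project
---
You are a principal-level technical writer.
"""
writer = parse_frontmatter(writer_doc)
```

With these definitions, `resolve_model(writer, "parent-model")` returns the explicit value from the file, while a frontmatter with `model: inherit` falls back to whatever the parent session is using.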

Execution Mode

Event-driven — skills are invoked per user request or issue event, not continuous.

Consensus Mechanism

None automated. Human gate (state:agent-ready label) is the authorization point.

Crash Recovery

Unknown for sandbox runtime. Development workflow uses GitHub issue state as durable record.

Streaming Output

openshell logs --tail streams sandbox logs. TUI refreshes every 2 seconds.

08

UI CLI Surface

NVIDIA OpenShell — UI / CLI Surface

CLI Binary

  • Name: openshell
  • Not a thin wrapper — own Rust runtime
  • Install: binary, PyPI (uv tool install openshell), or Helm

CLI Subcommands

| Command | Description |
|---|---|
| sandbox create -- <agent> | Create sandbox + launch agent |
| sandbox connect [name] | SSH into sandbox |
| sandbox list | List all sandboxes |
| provider create | Create credential provider |
| policy set <name> | Apply YAML policy |
| policy get <name> | Show active policy |
| inference set | Configure inference endpoint |
| logs [name] --tail | Stream sandbox logs |
| term | Launch TUI dashboard |

Terminal TUI (openshell term)

  • Type: terminal-tui
  • Tech stack: Ratatui (Rust TUI library) — inspired by k9s (Kubernetes TUI)
  • Port: not applicable (terminal, not web)
  • Features:
    • Live gateway health display
    • Sandbox list with status
    • Provider display
    • Auto-refresh every 2 seconds
    • Keyboard navigation (Tab, j/k, Enter, : for command mode)

Agent Skills Surface

.agents/skills/ — 19 skill directories, each with SKILL.md. Harness-discoverable (not Claude-Code-specific).

Claude Code Subagents

.claude/agents/ — 2 subagent files:

  • arch-doc-writer.md (model: opus, memory: project)
  • principal-engineer-reviewer.md (model: inherit, memory: project, tools: Read/Grep/Glob/Bash/WebFetch/WebSearch)

Observability

  • OCSF v1.7.0 structured logging for security events
  • Plain tracing (Rust) for routine events
  • openshell logs --tail for streaming sandbox logs
  • TUI live dashboard

IDE Integration

None documented. Project uses Claude Code, OpenCode, Codex, GitHub Copilot as target agents.

Cross-Tool Portability

Low-Medium. The sandbox runtime runs any agent (Claude Code, Codex, OpenCode, GitHub Copilot, Ollama), but the .claude/agents/ subagents are Claude Code specific and the .agents/skills/ files depend on a harness that can discover SKILL.md files.
