stakpak/agent

stakpak-agent · stakpak/agent · ★ 1.6k · last commit 2026-05-26

Primitive shape: 2 total (Subagents: 1, MCP tools: 1)

stakpak/agent — Summary

Stakpak is a Rust-based DevOps AI agent CLI designed for 24/7 autonomous infrastructure operations. Its core value proposition is "all the upside of a PaaS, none of the lock-in" — an always-on agent that generates infrastructure code, debugs Kubernetes, configures CI/CD, and automates deployments without exposing LLM tools to raw credentials. Key security primitives: dynamic secret substitution (AI works with credentials without ever seeing values), Warden network-level policy guardrails (blocks destructive operations before execution), and a Docker-based sandbox that isolates subagent tool calls. The stakpak binary ships with 20+ subcommands including an Autopilot mode (stakpak up) that starts a background daemon with cron-scheduled tasks and channel integrations (Slack). A Ratatui TUI surfaces live output. The framework bundles DevOps knowledge as "Rulebooks" — configurable SOPs/playbooks that shape agent behavior.

Differs from seeds: Most similar to agent-os (always-on autonomous agent) but Stakpak is narrowly scoped to DevOps/infrastructure, not general-purpose. Unlike taskmaster-ai (task breakdown MCP server), Stakpak embeds the task execution in the CLI binary with Docker sandbox isolation. The secret substitution and Warden guardrails are unique in this corpus — no seed framework has network-level policy enforcement on AI tool calls.

stakpak/agent — Overview

Origin

Stakpak is an open source project from Stakpak (stakpak.dev). Written in Rust 2024 edition. The agent is designed to live permanently on servers or developer machines and keep applications running.

Philosophy (verbatim from README)

"Ship your code, on autopilot." "An open source agent that lives on your machines 24/7, keeps your apps running, and only pings when it needs a human." "You can't trust most AI agents with your DevOps. One mistake, and your production is toast. Stakpak is built different."

Key design pillars:

  • Secret Substitution — The LLM works with your credentials without ever seeing them
  • Warden Guardrails — Network-level policies block destructive operations before they run
  • DevOps Playbooks Baked-in — Curated library of DevOps knowledge in Stakpak Rulebooks

Target User

DevOps engineers and developers who want autonomous infrastructure management without giving an LLM root access to production.

Repo Facts

  • GitHub: https://github.com/stakpak/agent
  • Stars: 1,563 (2026-05-26)
  • Language: Rust
  • License: Apache-2.0
  • Version: 0.3.82
  • Last commit: 2026-05-26
  • Install: brew tap stakpak/stakpak && brew install stakpak or curl .../install.sh | sh

stakpak/agent — Architecture

Distribution

  • Homebrew: brew tap stakpak/stakpak && brew install stakpak
  • Install script: curl -sSL https://stakpak.dev/install.sh | sh
  • Quick start: stakpak init && stakpak autopilot up

Rust Workspace Structure

stakpak/
├── cli/                # CLI commands (the role a Cobra command tree plays in Go projects)
├── tui/                # Ratatui TUI
├── libs/
│   ├── shared/
│   ├── api/
│   ├── ai/
│   ├── agent-core/
│   ├── server/
│   ├── gateway/
│   ├── ak/             # Access Keys
│   ├── mcp/
│   │   ├── client/
│   │   ├── config/
│   │   ├── server/
│   │   └── proxy/
│   └── shell-tool-approvals/
└── Dockerfile          # Docker image for sandbox

Security Architecture

Stakpak Agent
├── Secret Substitution Layer
│   └── AI sees placeholder tokens, host injects real values at execution time
├── Warden Guardrails
│   └── Network-level policy blocking destructive ops before run
├── Docker Sandbox
│   └── Subagent tool calls isolated in container
└── mTLS
    └── End-to-end encrypted MCP communication

Config Files

  • ~/.stakpak/config.toml — Behavior profiles (model, allowed_tools, auto_approve, system_prompt, max_turns, credentials)
  • ~/.stakpak/autopilot.toml — Runtime wiring (schedules, channels, service settings)
  • railway.toml — Railway deployment config

Runtime Requirements

  • Rust (for building from source)
  • Docker (for sandbox mode + Autopilot)
  • macOS, Linux, or Windows

Sandbox Mode

Autopilot spawns a Docker-based sandbox container for subagent tool calls:

  • persistent (default): single container, reused for all sessions
  • ephemeral: spawned per-session only when sandbox: true requested
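
The difference between the two modes can be sketched as below. This is a toy model, not Stakpak's implementation: the SandboxManager class, its method names, and the simulated container IDs are all assumptions made for illustration, and no Docker calls are issued.

```python
from itertools import count
from typing import Optional


class SandboxManager:
    """Toy model of the two sandbox modes; container IDs are simulated strings."""

    def __init__(self, mode: str = "persistent") -> None:
        if mode not in ("persistent", "ephemeral"):
            raise ValueError(f"unknown sandbox mode: {mode}")
        self.mode = mode
        self._ids = count(1)
        self._shared: Optional[str] = None

    def _spawn(self) -> str:
        # Stand-in for starting a Docker container.
        return f"sandbox-{next(self._ids)}"

    def container_for(self, session_id: str, sandbox: bool = False) -> Optional[str]:
        # session_id is unused in this toy; a real manager would key cleanup on it.
        if self.mode == "persistent":
            # One container, created lazily and reused by every session.
            if self._shared is None:
                self._shared = self._spawn()
            return self._shared
        # Ephemeral: only sessions that request sandbox: true get a container.
        return self._spawn() if sandbox else None
```

In persistent mode every session shares one container, which trades isolation between sessions for lower startup cost; ephemeral mode pays the spawn cost only for sessions that ask for a sandbox.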

stakpak/agent — Components

CLI Binary (stakpak)

Key subcommands:

  • stakpak init — Understand apps and tech stack
  • stakpak up — Alias for stakpak autopilot up (start 24/7 daemon)
  • stakpak down — Alias for stakpak autopilot down
  • stakpak autopilot up/down/status/logs/doctor — Autopilot lifecycle
  • stakpak autopilot schedule add <name> --cron <expr> --prompt <p> --profile <profile>
  • stakpak autopilot channel add slack --bot-token ... --profile ops

TUI (tui/)

  • Ratatui-based terminal UI (the README carries a "Built with Ratatui" badge)
  • Live progress streaming for long-running operations

Autopilot Daemon

Features:

  • 24/7 background operation
  • Cron-scheduled task execution
  • Channel integration (Slack)
  • Preflight checks before startup (stakpak autopilot doctor)
  • Self-repair for stuck operations

Security Components

  • Dynamic Secret Substitution: Credentials stored in config, substituted at execution time; AI never sees raw values
  • Warden Guardrails: Network-level policy; blocks destructive operations before execution
  • Privacy Mode: Redacts sensitive data (IP addresses, AWS account IDs)
  • Secure Password Generation: Cryptographically secure with configurable complexity
  • Mutual TLS (mTLS): End-to-end encrypted MCP communication
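
As an illustration of what Privacy Mode redaction might look like, here is a minimal sketch. The two regular expressions (dotted-quad IPv4 addresses and 12-digit AWS account IDs) and the placeholder strings are assumptions; the source does not document which patterns Stakpak actually uses.

```python
import re

# Assumed patterns: dotted-quad IPv4 addresses and 12-digit AWS account IDs.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
AWS_ACCOUNT_ID = re.compile(r"\b\d{12}\b")


def redact(text: str) -> str:
    """Replace IPs and AWS account IDs with fixed placeholders."""
    text = IPV4.sub("[REDACTED_IP]", text)
    return AWS_ACCOUNT_ID.sub("[REDACTED_AWS_ACCOUNT]", text)
```

A pattern-based filter like this is easy to audit, but it can over-match (any 12-digit number) and under-match (IPv6), so a real implementation would likely need more context than a bare regex.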

DevOps Intelligence

  • Infrastructure Code Indexing: Automatic semantic indexing of Terraform, Kubernetes, Dockerfile, GitHub Actions
  • Documentation Research Agent: Built-in web search for technical docs
  • Subagents: Specialized research agents with different tool access levels (enabled with --enable-subagents)
  • Rule Books: Configurable SOPs, playbooks, organizational policies
  • Persistent Knowledge: Learns from interactions; remembers incidents, resources, environment details

Operational Tools

  • Asynchronous Task Management: Run background commands (port forwarding, servers) with tracking/cancellation
  • Real-time Progress Streaming: Long-running processes stream updates
  • Reversible File Operations: All modifications auto-backed up with recovery
  • Bulk Message Approval: Approve multiple tool calls at once
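
The "Reversible File Operations" item above can be pictured as a backup-then-write step. The sketch below assumes a simple one-backup-per-file scheme; the function names, the ".bak" suffix, and the backup directory are hypothetical and not taken from Stakpak.

```python
import shutil
from pathlib import Path
from typing import Optional


def write_with_backup(path: Path, new_content: str, backup_dir: Path) -> Optional[Path]:
    """Copy the current file into backup_dir, then overwrite it.

    Returns the backup path, or None if the file did not exist yet.
    """
    backup = None
    if path.exists():
        backup_dir.mkdir(parents=True, exist_ok=True)
        backup = backup_dir / (path.name + ".bak")
        shutil.copy2(path, backup)  # copy2 also preserves timestamps
    path.write_text(new_content)
    return backup


def restore(path: Path, backup: Path) -> None:
    """Undo a modification by copying the backup over the file."""
    shutil.copy2(backup, path)
```

Taking the backup before the write, rather than after a failure, is what makes every modification recoverable regardless of how the edit turns out.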

stakpak/agent — Prompts

Rule Books (verbatim from README)

Rule Books — Customize agent behavior with internal standard operating procedures, 
playbooks, and organizational policies

Rule Books are DevOps knowledge files that shape the agent's behavior. Their contents are not publicly browsable in the OSS repo (they may be bundled in the binary), but the README describes them as:

  • Standard operating procedures (SOPs)
  • Playbooks for specific infrastructure scenarios
  • Organizational policy enforcement

Secret Substitution Pattern (from README)

Dynamic Secret Substitution — AI can read/write/compare secrets without seeing 
actual values

The agent works with placeholder tokens like {{DB_PASSWORD}} in its tool calls; the host layer substitutes the real value before execution. The LLM sees: set_env DB_PASSWORD={{DB_PASSWORD}}; execution runs: export DB_PASSWORD=actual_secret.

Technique: Variable interpolation security pattern — the LLM manipulates tokens representing secrets without ever having access to the plaintext values. Prevents credential leakage through model context or logs.
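
A minimal sketch of the pattern, assuming the {{NAME}} placeholder syntax shown above. The function names are hypothetical, and the reverse step (putting placeholders back into tool output before the model sees it) is an assumption about how leakage through results could be prevented, not documented Stakpak behavior.

```python
import re

# Matches tokens such as {{DB_PASSWORD}}.
PLACEHOLDER = re.compile(r"\{\{([A-Z0-9_]+)\}\}")


def substitute(command: str, secrets: dict[str, str]) -> str:
    """Host side: swap {{NAME}} tokens for real values just before execution."""
    def lookup(match: re.Match) -> str:
        name = match.group(1)
        if name not in secrets:
            raise KeyError(f"unknown secret placeholder: {name}")
        return secrets[name]
    return PLACEHOLDER.sub(lookup, command)


def redact_output(output: str, secrets: dict[str, str]) -> str:
    """Assumed reverse step: restore placeholders before output reaches the model."""
    for name, value in secrets.items():
        output = output.replace(value, "{{" + name + "}}")
    return output
```

The model only ever composes and reads strings containing {{DB_PASSWORD}}; the mapping from name to value lives on the host and never enters the prompt or the transcript.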

Warden Policy (README quotes verbatim; technique inferred)

Warden Guardrails — Network-level policies block destructive operations before 
they run

Generate infrastructure code, debug Kubernetes, configure CI/CD, automate 
deployments, without giving an LLM the keys to production.

Technique: Pre-execution policy enforcement (not post-hoc review). Operations are classified and checked against rules before any destructive action executes. This is an "Iron Law" enforced at the runtime layer, not the prompt layer.
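
To make "pre-execution" concrete, the sketch below checks an outbound request against a deny list before anything is sent. Everything here is an assumption for illustration: the Rule shape, the glob matching, and the example deny rules are hypothetical, and the source does not describe Warden's real rule format or enforcement point.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    method: str  # HTTP method, or "*" for any
    host: str    # glob pattern
    path: str    # glob pattern


# Hypothetical deny list; not Warden's actual policy.
DENY = [
    Rule("DELETE", "*.amazonaws.com", "*"),
    Rule("DELETE", "kubernetes.default.svc", "/api/v1/namespaces/*"),
]


def is_allowed(method: str, host: str, path: str, deny=DENY) -> bool:
    """Return False if any deny rule matches the request."""
    for rule in deny:
        if (rule.method in ("*", method.upper())
                and fnmatchcase(host, rule.host)
                and fnmatchcase(path, rule.path)):
            return False
    return True


def guarded_send(method: str, host: str, path: str, send):
    """Check policy first; the send callable is never invoked for blocked requests."""
    if not is_allowed(method, host, path):
        raise PermissionError(f"blocked by policy: {method} {host}{path}")
    return send(method, host, path)
```

Because the check sits in front of the send call rather than in the prompt, the model cannot talk its way past it: a blocked request raises before any side effect occurs.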

CLI Startup Pattern

stakpak init # understand your apps and tech stack
stakpak autopilot up # start the autonomous agent, running 24/7 in the background

The init command performs codebase analysis before any automation — a structured discovery phase.

stakpak/agent — Uniqueness & Positioning

differs_from_seeds

Stakpak is closest to agent-os (always-on autonomous agent pattern) but domain-specialized to DevOps/infrastructure. Unlike agent-os (bash script bundle + markdown scaffolds), Stakpak is a 12-crate Rust workspace with Docker sandbox isolation, secret substitution, Warden guardrails, and mTLS-secured MCP. Unlike taskmaster-ai (task breakdown MCP server for coding), Stakpak manages Kubernetes, CI/CD, and deployments. The --enable-subagents mode resembles superpowers's subagent-via-Task-tool pattern but runs in Docker-isolated contexts. No seed has Warden (network-level pre-execution policy enforcement) or dynamic secret substitution.

Distinctive Positioning

  1. Dynamic secret substitution: The LLM manipulates placeholder tokens; real credentials injected at execution time. This is a novel approach — other frameworks either block secrets entirely or trust the LLM with them.

  2. Warden network-level guardrails: Operations blocked before execution based on policy rules, not post-hoc reviewed. "Network-level" framing suggests firewall/iptables-style enforcement rather than application logic.

  3. Infrastructure semantic indexing: Automatic indexing + semantic search over Terraform, Kubernetes, Dockerfile, and GitHub Actions is a domain-specific capability not found in general-purpose agents.

  4. DevOps-scoped Rule Books: SOPs and playbooks baked into the binary as curated DevOps knowledge — domain specialization rather than general-purpose instructions.

  5. mTLS MCP communication: Encrypted MCP connections as a production security requirement.

Observable Failure Modes

  • Docker required for Autopilot + subagents — heavyweight dependency
  • NEAR AI / NEAR account NOT required (unlike IronClaw), which simplifies auth; a point of contrast rather than a failure mode
  • Rulebook content is not open-source-visible — knowledge quality is opaque
  • 24/7 daemon on production-adjacent machines increases attack surface
  • No container isolation for main agent (only sandbox for subagents)
  • Slash-command surface is CLI-only — no Claude Code integration
  • systemd linger must be enabled for reliable server operation (ops burden)

stakpak/agent — Workflow

Setup Phase

Step                             | Artifact                         | Gate
---------------------------------+----------------------------------+-----------------------------
Install (brew install stakpak)   | Binary in PATH                   | Automated
stakpak init                     | App + tech stack understanding   | Automated analysis
Configure ~/.stakpak/config.toml | Behavior profiles + credentials  | Manual
stakpak autopilot up             | 24/7 daemon running              | Auto (with preflight checks)

Preflight Checks (stakpak autopilot doctor)

Blocking:

  • Credentials configured
  • Docker installed and accessible
  • Safe memory conditions

Warnings:

  • Port conflicts, systemd linger, low memory, disk space
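
The split between blocking checks and warnings can be sketched as follows. The Check and doctor names are hypothetical, and the two example probes (a docker binary on PATH, a 1 GiB free-disk threshold) are assumptions; the real stakpak autopilot doctor may probe differently.

```python
import shutil
from dataclasses import dataclass
from typing import Callable


@dataclass
class Check:
    name: str
    run: Callable[[], bool]  # returns True when the check passes
    blocking: bool           # a failed blocking check prevents startup


def doctor(checks: list[Check]) -> tuple[bool, list[str], list[str]]:
    """Return (ok_to_start, blocking_failures, warnings)."""
    failures, warnings = [], []
    for check in checks:
        if check.run():
            continue
        (failures if check.blocking else warnings).append(check.name)
    return (not failures, failures, warnings)


def default_checks() -> list[Check]:
    # Example probes only; thresholds are assumed, not taken from Stakpak.
    return [
        Check("docker installed", lambda: shutil.which("docker") is not None, blocking=True),
        Check("disk space", lambda: shutil.disk_usage("/").free > 1024**3, blocking=False),
    ]
```

Only blocking failures change the start decision; warnings are collected and reported but the daemon can still come up.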

Autopilot Operation

Step                              | Artifact
----------------------------------+----------------------------------------
Cron trigger fires                | Schedule config in autopilot.toml
Agent runs with specified profile | Tool calls with credential substitution
Warden checks each operation      | Policy enforcement (block/allow)
Results optionally sent to Slack  | Channel notification
Knowledge persisted               | Rulebook update / memory

Manual Task Execution

stakpak init           # Analyze codebase
stakpak autopilot up   # Start daemon
# Agent runs on schedule or via channel (Slack)

Approval Gates

  • auto_approve flag in profile config (default: interactive for manual, auto for Autopilot)
  • Warden blocking rule = hard gate (cannot proceed with blocked operation)
  • Bulk message approval: approve multiple tool calls at once in TUI
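
How the three gates might compose is sketched below, assuming Warden is evaluated first and auto_approve only matters for operations Warden allows. The Decision enum and both function names are hypothetical.

```python
from enum import Enum


class Decision(Enum):
    BLOCK = "block"  # Warden denied the operation
    RUN = "run"      # allowed and auto-approved
    ASK = "ask"      # allowed, waiting for a human


def gate(warden_allows: bool, auto_approve: bool) -> Decision:
    # Warden is a hard gate: a blocked operation never runs, whatever the profile says.
    if not warden_allows:
        return Decision.BLOCK
    return Decision.RUN if auto_approve else Decision.ASK


def approve_bulk(pending: list[str], selected: set[str]) -> tuple[list[str], list[str]]:
    """Split pending tool-call IDs into (approved, still_pending), keeping order."""
    approved = [call for call in pending if call in selected]
    remaining = [call for call in pending if call not in selected]
    return approved, remaining
```

The ordering matters: checking Warden before auto_approve means an Autopilot profile with auto-approval enabled still cannot execute a blocked operation.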

Subagent Mode

stakpak --enable-subagents <task>

Spawns specialized research subagents with different tool access levels in Docker sandbox.

stakpak/agent — Memory & Context

Persistent Knowledge

"Agent learns from interactions, remembers incidents, resources, and environment details to adapt to your workflow."

  • Knowledge accumulated during stakpak init (infrastructure indexing)
  • Incidents and environment details stored persistently
  • Rule Books can be updated with learned patterns

Infrastructure Code Indexing

  • Automatic local indexing of Terraform, Kubernetes, Dockerfile, GitHub Actions files
  • Semantic search over indexed artifacts
  • Enables contextually aware suggestions without re-reading files every session

Config-Based Memory

  • ~/.stakpak/config.toml — profile behavior (persistent preferences)
  • ~/.stakpak/autopilot.toml — runtime wiring (persistent schedules/channels)
  • Rule Books persist across sessions

Context Compaction

  • Profile-based max_turns setting limits context growth per session
  • Autopilot schedules create fresh contexts per run (natural compaction)
  • No explicit compaction hook documented
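
The effect of a max_turns cap combined with a fresh context per scheduled run can be sketched like this. The run_session function and the step callback are hypothetical stand-ins for the agent loop; only the max_turns name comes from the profile settings listed earlier.

```python
from typing import Callable, Optional


def run_session(step: Callable[[list[str]], Optional[str]], max_turns: int) -> list[str]:
    """Call step(history) until it returns None or the turn budget runs out."""
    history: list[str] = []  # every run starts from an empty context
    for _ in range(max_turns):
        message = step(history)
        if message is None:  # the agent signals it is done
            break
        history.append(message)
    return history
```

Since each call builds a new history list, context can grow to at most max_turns entries and nothing carries over between runs unless it is persisted elsewhere.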

Cross-Session Handoff

  • Infrastructure indexing persists to disk (semantic search index)
  • Credentials in config.toml persist
  • Scheduled tasks persist in autopilot.toml
  • Incident history: stored in persistent knowledge store (format not documented)

Memory Isolation

  • Per-profile memory (different profiles can have different system prompts + context)
  • Docker sandbox provides tool execution isolation (not memory isolation)

stakpak/agent — Orchestration

Multi-Agent

Yes (optional) — --enable-subagents flag spawns specialized research subagents with different tool access levels.

Orchestration Pattern

Sequential (default): Single agent processes tasks in order.

With subagents (--enable-subagents): Agent spawns specialized research subagents for code exploration and sandboxed analysis. Parent agent coordinates.
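
"Different tool access levels" can be pictured as a per-subagent allowlist that the parent enforces on every tool call. The Subagent class, the tool names read_file and run_command, and the two example agents are all hypothetical; the source does not list Stakpak's actual tools or access tiers.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Subagent:
    name: str
    allowed_tools: frozenset[str]

    def call(self, tools: dict[str, Callable[[str], str]], tool: str, arg: str) -> str:
        """Run a tool only if it is on this subagent's allowlist."""
        if tool not in self.allowed_tools:
            raise PermissionError(f"{self.name} may not use {tool}")
        return tools[tool](arg)


# Hypothetical tool registry with stubbed behavior.
TOOLS = {
    "read_file": lambda path: f"<contents of {path}>",
    "run_command": lambda cmd: f"<output of {cmd}>",
}

researcher = Subagent("researcher", frozenset({"read_file"}))
operator = Subagent("operator", frozenset({"read_file", "run_command"}))
```

A read-only researcher can explore code freely, while anything that executes commands is reserved for a subagent with a wider allowlist, which in Stakpak's case would also run inside the Docker sandbox.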

Isolation Mechanism

Docker sandbox (primary for subagents and Autopilot):

  • Subagent tool calls isolated in Docker container
  • persistent mode: single container reused for all sessions
  • ephemeral mode: container spawned per-session

Secret substitution (security primitive):

  • Not filesystem isolation but credential isolation
  • LLM never sees raw credentials; host layer substitutes at execution time

Warden policy (network-level):

  • Blocks destructive operations before execution
  • Network-level enforcement, not application-level checks

Multi-Model

Yes — profile-based model selection:

  • Each profile specifies model field
  • Different schedules can use different profiles (different models)
  • model_role_mapping not formally documented — single model per profile

Execution Mode

Background daemon (Autopilot mode):

  • 24/7 operation
  • Cron-scheduled tasks
  • Channel-triggered tasks (Slack)
  • Readiness and health checks (stakpak autopilot doctor)

Interactive (manual mode): Direct CLI invocation with TUI

Consensus

None. Single-agent primary with optional research subagents.

stakpak/agent — UI & CLI Surface

CLI Binary (stakpak)

Key subcommands:

  • stakpak init — Analyze codebase + tech stack
  • stakpak up / stakpak down — Start/stop Autopilot daemon
  • stakpak autopilot up/down/status/logs/doctor — Autopilot control
  • stakpak autopilot schedule add — Add cron schedule
  • stakpak autopilot channel add slack — Add Slack channel
  • Full list in cli/README.md

Terminal UI (TUI)

  • Ratatui (Rust TUI library)
  • Live progress streaming
  • Bulk message approval (approve multiple tool calls at once)
  • Interactive session management

Slack Integration

  • Autopilot channel: stakpak autopilot channel add slack --bot-token ... --app-token ...
  • Channel-triggered tasks
  • Optional result notifications

Observability

  • stakpak autopilot status — Current daemon status
  • stakpak autopilot logs — View daemon logs
  • stakpak autopilot doctor — Deployment readiness check with diagnostic output:
    • Docker status
    • Credential configuration
    • Memory headroom
    • Port conflicts
    • Systemd linger

MCP Integration

  • libs/mcp/ — Full MCP client, server, proxy, and config crates
  • mTLS for encrypted MCP communication
  • Used for tool access and external service integration

No Web Dashboard

Stakpak is CLI/TUI-only. No web dashboard.
