Sponsio

sponsio · SponsioLabs/Sponsio · ★ 445 · last commit 2026-05-26

Primitive shape 1 total
Skills 1
00

Summary

Sponsio — Summary

Sponsio is a Python/TypeScript runtime enforcement library that checks every AI agent tool call against deterministic, pure-code contracts. The README claims enforcement in under 0.01 ms (5,000× to 60,000× faster than LLM-as-judge guardrails) with zero LLM calls in the hot path. It integrates with LangChain, LangGraph, Claude Agent, OpenAI Agents, Google ADK, CrewAI, Vercel AI, MCP, or any custom tool-calling loop. Contracts are declared in YAML (sponsio.yaml), grouped into 16 bundled contract packs organized by capability tier (always-on, per-tool, per-incident), and backed by 44 underlying deterministic patterns. The framework ships a CLI (sponsio), a Claude Code skill (sponsio/SKILL.md), an OpenClaw plugin, and an optional dashboard host. On ODCV-Bench, Sponsio blocks 95.6% of misalignment on average; on RedCode-Exec (1,410 cases), it reaches 92% combined. Compared to the seeds, Sponsio is the most sophisticated policy engine in the batch: it is closer to a formal-methods runtime verification system than to the hook-based guardrails seen in clauder or pi-steering-hooks, and no seed framework matches its benchmark-backed contract approach.

01

Overview

Sponsio — Overview

Origin

Repo: SponsioLabs/Sponsio. Created April 2026, Python + TypeScript, Apache-2.0. 445 stars. Organization: SponsioLabs. Website: sponsio.dev. Multiple translated READMEs (EN, ZH-CN, JA).

Philosophy

From README:

"Runtime enforcement for AI agents. Sponsio checks every agent action against deterministic, pure-code contracts, enforced in under 0.01 ms with zero LLM cost at runtime."

"An agent contract is a runtime rule that is checked at every agent action, backed by formal methods."

From OSS_PROMISE.md (implied):

Sponsio's threat model draws on public security research; e.g. Simon Willison's "Lethal Trifecta" shaped multi-tool composition contracts.

Performance Claims

From README (with benchmark citations):

  • p50 0.139 ms per contract — "5,000×–60,000× faster than any LLM-as-judge guardrail"
  • p99 under 1.04 ms across all workloads
  • ODCV-Bench: 95.6% misalignment blocked on average (12 frontier LLMs × 80 trajectories)
  • RedCode-Exec (1,410 cases): 92% combined (bash 95%, python 90%)
  • Financial-Audit-Fraud-Finding scenario: frontier models commit fraud in 16/24 trials; Sponsio blocks 18/19
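
The speedup range can be sanity-checked against the two latency figures quoted in this profile (the "under 0.01 ms" headline and the 0.139 ms p50). The sketch below is back-of-envelope arithmetic only. It assumes the multiplier means judge latency divided by per-contract latency, which is an interpretation and not something the README states.

```python
# Back-of-envelope check: what LLM-judge latency does each quoted
# speedup imply? ASSUMPTION: speedup = judge_latency / contract_latency.

def implied_judge_latency_ms(contract_ms: float, speedup: float) -> float:
    """Judge latency (ms) implied by a per-contract latency and a speedup factor."""
    return contract_ms * speedup

SPEEDUPS = (5_000, 60_000)           # range quoted in the README
BASELINES_MS = {
    "headline (0.01 ms)": 0.01,      # "under 0.01 ms" figure
    "p50 (0.139 ms)": 0.139,         # p50 figure from the same README
}

implied = {
    label: tuple(round(implied_judge_latency_ms(ms, s), 1) for s in SPEEDUPS)
    for label, ms in BASELINES_MS.items()
}

for label, (low, high) in implied.items():
    print(f"{label}: implied judge latency {low} ms to {high} ms")
```

Under this reading, the 0.01 ms figure implies a judge taking 50 to 600 ms, while the p50 figure implies 695 ms to about 8.3 s.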

Core Design Principles

  • Deterministic only: all enforcement is pure-code logic, never LLM
  • Observe vs Enforce modes: start in observe (log violations), flip to enforce (block)
  • Formal methods backing: LTL (Linear Temporal Logic) monitor for temporal ordering contracts
  • Zero dependencies in core: base pip install sponsio is a small Python package
  • YAML-declarative: contracts are declared in sponsio.yaml, not written in Python
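
The observe/enforce distinction above can be pictured as a pure function over the violations found for one tool call. This is a minimal sketch, not Sponsio's API: `Mode`, `Decision`, and `decide` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    OBSERVE = "observe"   # log violations, never block
    ENFORCE = "enforce"   # block when any violation is present


@dataclass(frozen=True)
class Decision:
    allowed: bool             # may the tool call proceed?
    logged: tuple[str, ...]   # violations recorded in either mode


def decide(violations: list[str], mode: Mode) -> Decision:
    """Deterministic: the same violations and mode always give the same decision."""
    blocked = bool(violations) and mode is Mode.ENFORCE
    return Decision(allowed=not blocked, logged=tuple(violations))


violations = ["wire_transfer.amount must be in range [0, 50000]"]
observe = decide(violations, Mode.OBSERVE)
enforce = decide(violations, Mode.ENFORCE)
print(observe.allowed, enforce.allowed)
```

The violation is logged in both modes; only the `allowed` flag changes when the mode flips.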

Name

"Sponsio" is Latin for "guarantee" or "surety bond" — the name signals formal contractual commitment.

02

Architecture

Sponsio — Architecture

Distribution

  • PyPI: pip install sponsio (Python)
  • npm: npm install -D @sponsio/sdk (TypeScript)
  • Claude Code skill: bundled sponsio/SKILL.md
  • OpenClaw plugin: via sponsio plugin install or plugin_init

Install Methods

pip install sponsio                    # Base Python
pip install "sponsio[all]"             # All extras (yaml + llm + otel)
npm install -D @sponsio/sdk            # TypeScript/Node
sponsio init .                         # Interactive wizard

Directory Structure (repo)

sponsio/
  sponsio/                 # Python package
    cli.py                 # 30+ CLI commands
    contract.py            # Contract execution engine
    config.py              # YAML config loader
    core.py                # Core enforcement logic
    agents.py              # Agent wrapper utilities
    mcp.py                 # MCP integration
    contracts/             # Built-in contract packs
      core/                # universal.yaml, llm_safety.yaml, runaway.yaml
      capability/          # shell, filesystem, credentials, etc.
      incident/            # mcp-composition.yaml, etc.
      premium/             # advanced contracts
    patterns/              # 44 underlying patterns
    skills/
      sponsio/             # SKILL.md for Claude Code
    plugin/                # OpenClaw/Claude Code plugin
    prompts/               # Onboarding prompt templates
    integrations/          # LangChain, CrewAI, LangGraph, etc.
    runtime/               # Contract execution runtime
    daemon/                # Background daemon mode
    tracer/                # Trace streaming
    generation/            # Contract generation from natural language
  ts/                      # TypeScript SDK source
  docs/                    # Architecture, formal methods, benchmarks
  tests/
  pyproject.toml
  CLAUDE.md
  QUICKSTART.md

Required Runtime

  • Python >= 3.10 (core)
  • Node.js (TypeScript SDK)
  • Optional: API keys for LLM-assisted contract generation

Target AI Tools / Frameworks

Framework             Integration
LangChain             DashClawCallbackHandler analog / direct wrap
LangGraph             Node/Python SDK
Claude Code           Skills + plugin + Claude Agent SDK
OpenAI Agents         Python/Node SDK
Google ADK            Direct wrap
CrewAI                Task callback / agent wrapper
Vercel AI             TypeScript SDK
MCP (any host)        via sponsio mcp integration
Custom tool-calling   Python or TypeScript SDK

03

Components

Sponsio — Components

CLI Binary: sponsio

30+ subcommands including:

Command                                  Purpose
sponsio demo [--scenario]                Run demo trajectories (cleanup, backup, wire, freeze)
sponsio init .                           Interactive wizard: detect framework, write sponsio.yaml
sponsio validate "<rule>"                Turn plain English into a contract draft
sponsio check <trace>                    Check a trace against contracts
sponsio replay <trace>                   Replay an agent trajectory
sponsio report                           Review observe-mode violations
sponsio serve                            Run as HTTP/dashboard server
sponsio scan                             Scan agent code for risk patterns
sponsio packs                            List available contract bundles
sponsio patterns                         List 44 underlying patterns
sponsio skill [install]                  Manage Claude Code skill install
sponsio doctor                           Check sponsio.yaml configuration
sponsio onboard                          Full onboarding wizard
sponsio mode [observe/enforce]           Toggle enforcement mode
sponsio plugin [init/install/show/scan]  Plugin management
sponsio host [list]                      List supported IDE hosts
sponsio export                           Export session data
sponsio eval                             Run evaluation benchmarks

Contract Bundles (16)

Organized by tier:

Tier          Bundle                            Coverage
always-on     sponsio:core/universal            (empty by design; see yaml)
always-on     sponsio:core/llm_safety           Output safety contracts for LLM-response agents
always-on     sponsio:core/runaway              Runaway agent detection
per-tool      sponsio:capability/shell          Shell command enforcement
per-tool      sponsio:capability/filesystem     File operation enforcement
per-tool      sponsio:capability/credentials    Credential handling
per-incident  sponsio:incident/mcp-composition  Multi-tool composition safety
premium       (additional bundles)              Advanced scenarios

Patterns (44)

Underlying primitives that compose the bundles (e.g., path-traversal-guard, wire-transfer-limit, privilege-escalation-block).

Skills

sponsio/skills/sponsio/SKILL.md — Claude Code skill that covers the full lifecycle (initial setup, contract authoring, observe-mode tuning, flip to enforce, troubleshooting).

TypeScript SDK

87-method canonical surface (Python SDK: 235 methods).

Framework Integrations

from sponsio.integrations.langchain import DashClawCallbackHandler
from sponsio.integrations.crewai import DashClawCrewIntegration
05

Prompts

Sponsio — Prompts

Verbatim Excerpt 1: Universal Contract Bundle (YAML)

# sponsio/contracts/core/universal.yaml
# Pack stub — empty by design.
#
# This pack used to ship five stochastic output-safety contracts
# (injection_free, jailbreak_free, harmful, toxic_free, semantic_pii_free)
# and was auto-included by ``sponsio onboard``.  That was the wrong
# default for tool-call-only agents: every step pulled the judge LLM
# in, adding latency + cost for agents that never produce LLM
# responses to score (the entire point of those contracts).
#
# The contracts are still shipped — they moved to
# ``sponsio:core/llm_safety``.  Opt in from your config when your
# agent does produce LLM responses you want graded.

version: "1"
agents:
  "*":
    contracts: []

Technique: explicit empty pack with rationale. Sponsio documents why universal is empty — a design evolution recorded in code comments. This prevents cargo-culting of the old default while preserving backward compatibility.

Verbatim Excerpt 2: QUICKSTART Contract Enforcement Output

  ━━━ ◒◓ sponsio ━━━━━━━━━━━━━━━━━━━━━━━━━━
  ▎ contract · ap_copilot
  ▎ single wire capped at $50k
  ▎ enforce ▸ wire_transfer.amount must be in range [0, 50000]
  ▎ contract · ap_copilot
  ▎ compliance_approve must precede wire_transfer
  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

  -> wire_transfer(to='Acme Logistics LLC', amount=847000, ...)
  ✗ amount must be in range [0, 50000] — VIOLATED → blocked
  ✗ compliance_approve must precede wire_transfer — VIOLATED → blocked

Technique: structured violation output. Each violation shows: the contract name, the rule description, the actual call parameters, and the enforcement decision. This gives the agent enough context to understand why it was blocked and self-correct.
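
The four elements named above (contract name, rule, actual call, decision) map naturally onto a small record type. The sketch below reproduces the wire-transfer cap from the excerpt under that assumption. `Violation` and `check_amount_cap` are hypothetical names, not Sponsio identifiers, and the "logged" label for observe mode is likewise assumed.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Violation:
    contract: str   # which contract fired
    rule: str       # human-readable rule description
    call: str       # the actual tool call, rendered with its arguments
    decision: str   # "blocked" in enforce mode, "logged" otherwise (assumed labels)


def check_amount_cap(tool: str, args: dict[str, Any], cap: int, mode: str) -> Optional[Violation]:
    """Return a Violation if args['amount'] falls outside [0, cap], else None."""
    if 0 <= args["amount"] <= cap:
        return None
    rendered = f"{tool}(" + ", ".join(f"{k}={v!r}" for k, v in args.items()) + ")"
    return Violation(
        contract="ap_copilot",
        rule=f"{tool}.amount must be in range [0, {cap}]",
        call=rendered,
        decision="blocked" if mode == "enforce" else "logged",
    )


violation = check_amount_cap(
    "wire_transfer",
    {"to": "Acme Logistics LLC", "amount": 847000},
    cap=50000,
    mode="enforce",
)
print(violation)
```

Returning a structured record rather than a bare boolean is what lets the caller render a message the agent can act on.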

Verbatim Excerpt 3: README Comparison Table

On ODCV-Bench (12 frontier LLMs × 80 trajectories), unguarded models cheat in 
11.5%–66.7% of runs. **With Sponsio, 95.6% of misalignment is avoided on average; 
24/36 high-risk scenarios at 100%.** On the `Financial-Audit-Fraud-Finding` scenario, 
frontier models commit fraud in 16/24 trials; **Sponsio blocks 18/19**. On 
RedCode-Exec (1,410 cases), Sponsio reaches **92% combined** (bash 95% · python 90%).

Technique: benchmark-first positioning. Unlike most frameworks which provide anecdotal demos, Sponsio leads with verifiable benchmarks on public datasets, establishing scientific credibility.

09

Uniqueness

Sponsio — Uniqueness & Positioning

Differs from Seeds

None of the 11 seeds comes close to Sponsio's formal-methods approach to agent safety. The closest seed is kiro (which has IDE-native hook events and enforcement), but kiro is an IDE product, not a library. spec-kit's 18 hooks enforce a development workflow; Sponsio enforces behavioral correctness at runtime. The key innovations Sponsio introduces that appear nowhere in the seed corpus:

  • Fuzzy LTL temporal logic monitor for ordering contracts (e.g., "operation A must precede operation B")
  • Benchmark-backed claims on public datasets (ODCV-Bench, RedCode-Exec)
  • p50 0.139 ms enforcement speed (vs. 50-800 ms for LLM-as-judge)
  • Natural language to contract draft generation
  • Observe→Enforce mode progression
  • Framework-agnostic Python + TypeScript dual implementation

The OWASP threat model integration (ASI-09, ASI-10 references) signals enterprise security positioning.

Positioning

  • Signal type: deterministic runtime policy engine
  • Intervention point: pre-tool-execution (synchronous blocking or logging)
  • Unique features: LTL temporal contracts, benchmark validation, framework-agnostic, NL→contract generation, observe/enforce modes
  • Target user: AI agent teams in production needing formal safety guarantees, compliance teams

Observable Failure Modes

  • YAML contract authoring is still somewhat specialized (despite NL→contract generation)
  • False positives in observe mode require manual tuning before enforcing
  • The "universal" bundle is empty — users must actively choose and configure bundles
  • TypeScript SDK (87 methods) vs Python SDK (235 methods) surface discrepancy
  • v0.1.1 is very early — contract library may have gaps

Relationship to Batch 31

Sponsio is the most engineering-rigorous framework in the batch. It provides the clearest separation between the safety signal (deterministic contract logic) and the AI layer (never in the enforcement hot path). DashClaw (another batch member) provides similar governance but via a centralized dashboard + approval queue rather than inline contract checks.

04

Workflow

Sponsio — Workflow

Phases

Phase               Description                                                  Artifact
Init                sponsio init . wizard detects framework, writes sponsio.yaml sponsio.yaml in observe mode
Doctor check        sponsio doctor validates configuration                       Config health report
Code wiring         Developer adds 2-line wrap snippet to agent                  Instrumented agent
Observe mode        Agent runs, violations logged, nothing blocked               ~/.sponsio/sessions/*.jsonl
Review              sponsio report --since 24h                                   Violation report
Tune                Review false positives, adjust contracts                     Updated sponsio.yaml
Flip to enforce     sponsio mode enforce                                         Contracts now block violations
Continuous refresh  sponsio refresh                                              Contracts updated from production traces

Contract Lifecycle

# sponsio.yaml minimal example
agents:
  my_agent:
    workspace: "/srv/my-bot"
    include:
      - sponsio:core/universal
      - sponsio:capability/shell
      - sponsio:capability/filesystem
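
The include references above look like they mirror the contracts/ directory layout shown in the Architecture section (for example, the universal pack's header comment names sponsio/contracts/core/universal.yaml). The sketch below illustrates that apparent mapping. `pack_path` is a hypothetical helper, and the real resolver may work differently.

```python
from pathlib import PurePosixPath

# ASSUMPTION: a "sponsio:<tier>/<name>" reference maps to
# sponsio/contracts/<tier>/<name>.yaml inside the package.
CONTRACTS_ROOT = PurePosixPath("sponsio/contracts")


def pack_path(ref: str) -> PurePosixPath:
    """Map a bundled pack reference to its presumed YAML path."""
    prefix, sep, name = ref.partition(":")
    if sep != ":" or prefix != "sponsio" or not name:
        raise ValueError(f"not a bundled pack reference: {ref!r}")
    return CONTRACTS_ROOT / f"{name}.yaml"


includes = [
    "sponsio:core/universal",
    "sponsio:capability/shell",
    "sponsio:capability/filesystem",
]
paths = [str(pack_path(ref)) for ref in includes]
for path in paths:
    print(path)
```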

Natural Language Contract Generation

sponsio validate "never delete files in /prod"
# → outputs YAML contract draft for review

Demo Trajectories

Four built-in unsafe scenarios:

  • cleanup — agent deletes .env + .git/
  • backup — SRE cost-optimizer deletes prod DR backups
  • wire — AP copilot wires $847k to unverified vendor
  • freeze — Replit-style code-freeze violation + cover-up

Approval Gates

None in the default hot path (Sponsio is fully automated, no human gates). The sponsio:capability/shell bundle can add human escalation contracts for specific commands.

Execution Mode

Event-driven — checks fire synchronously on every tool call before execution.

06

Memory Context

Sponsio — Memory & Context

State Storage

Store            Path                                    Content
Session JSONL    ~/.sponsio/sessions/<agent_id>/*.jsonl  Per-session observe-mode violations
Contract config  sponsio.yaml (project root)             Agent contracts and bundle includes
Trace files      custom paths                            Raw agent trajectories for replay/check

Observe Mode Persistence

In observe mode, Sponsio logs every contract evaluation result to JSONL files:

  • Would-have-blocked decisions
  • Contract name, rule, actual value
  • Timestamp, agent_id, session_id

These records feed sponsio report --since 24h, which is used for tuning.
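
A report over such logs amounts to counting would-have-blocked records per contract. The sketch below shows that aggregation over in-memory sample lines. The JSON key names (`contract`, `would_block`, `agent_id`) are assumptions made for illustration; the actual session schema is not documented in this profile.

```python
import json
from collections import Counter
from io import StringIO
from typing import Iterable

# Hypothetical observe-mode records, one JSON object per line.
SAMPLE = StringIO(
    '{"contract": "wire_cap", "would_block": true, "agent_id": "ap_copilot"}\n'
    '{"contract": "wire_cap", "would_block": true, "agent_id": "ap_copilot"}\n'
    '{"contract": "approve_first", "would_block": true, "agent_id": "ap_copilot"}\n'
    '{"contract": "wire_cap", "would_block": false, "agent_id": "ap_copilot"}\n'
)


def count_would_block(lines: Iterable[str]) -> Counter:
    """Count would-have-blocked evaluations per contract name."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        record = json.loads(line)
        if record.get("would_block"):
            counts[record["contract"]] += 1
    return counts


counts = count_would_block(SAMPLE)
print(counts.most_common())
```

Contracts with the highest counts are the natural first candidates to review for false positives before flipping to enforce.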

Live Stream

sponsio host trace --follow

Pure-OSS live stream of contract evaluations in real time.

Cross-Session

Session data accumulates in ~/.sponsio/sessions/<agent_id>/. sponsio refresh mines these for contract updates.

No In-Context Memory

Sponsio does not inject anything into the agent's context window. It wraps the tool-call interface externally. The agent is unaware of Sponsio unless a contract violation message is returned to it.
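
External wrapping of this kind can be sketched as a decorator around a tool function: the check runs first, and on a violation the agent receives a message instead of the tool's result. This is an illustrative pattern, not Sponsio's wrap snippet. `guarded`, `no_prod_delete`, and `delete_file` are hypothetical names, and the rule text is borrowed from the sponsio validate example in this profile.

```python
from functools import wraps
from typing import Any, Callable, Optional

Check = Callable[[dict[str, Any]], Optional[str]]


def guarded(check: Check):
    """Wrap a tool so the check runs before it and can veto the call."""
    def decorator(tool: Callable[..., Any]) -> Callable[..., Any]:
        @wraps(tool)
        def wrapper(**kwargs: Any) -> Any:
            violation = check(kwargs)
            if violation is not None:
                # The tool never runs; the agent only sees this message.
                return f"BLOCKED: {violation}"
            return tool(**kwargs)
        return wrapper
    return decorator


def no_prod_delete(args: dict[str, Any]) -> Optional[str]:
    """Deterministic prefix check standing in for a real contract."""
    if str(args.get("path", "")).startswith("/prod"):
        return "never delete files in /prod"
    return None


@guarded(no_prod_delete)
def delete_file(path: str) -> str:
    return f"deleted {path}"  # stand-in: no real deletion happens


blocked = delete_file(path="/prod/db.sqlite")
allowed = delete_file(path="/tmp/scratch.txt")
print(blocked)
print(allowed)
```

Nothing is added to the prompt or context window; the only channel back to the agent is the tool's return value.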

Compaction

Not applicable — Sponsio is not a context-management layer.

07

Orchestration

Sponsio — Orchestration

Multi-Agent

Partial — contracts can be declared per-agent with different privilege levels. Subagent privilege boundary is a documented use case (W2c in the skill). But Sponsio itself doesn't orchestrate agents.

Orchestration Pattern

None (Sponsio wraps agents, doesn't orchestrate them).

Isolation Mechanism

Behavioral isolation via contracts — agents are bounded by declared capability contracts. Not filesystem/process isolation.

Execution Mode

Event-driven — synchronous enforcement on every tool call. p50 0.139ms per contract.

Multi-Model

No. Sponsio is model-agnostic (works with any framework's LLM), but does not route different models to different roles.

Cross-Tool Portability

High — works with LangChain, LangGraph, Claude Agent SDK, OpenAI Agents SDK, Google ADK, CrewAI, Vercel AI, MCP, custom tool-calling. Language portability: Python + TypeScript.

Consensus

None.

Prompt Chaining

Not applicable (Sponsio is a runtime enforcement layer, not a reasoning pipeline).

LTL Monitor

Sponsio's temporal contracts use a fuzzy LTL (Linear Temporal Logic) monitor that checks ordering constraints (e.g., "compliance_approve must precede wire_transfer"). This is the most sophisticated enforcement mechanism in Batch 31 — other frameworks use regex matching; Sponsio uses formal temporal logic.
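
An ordering constraint such as "compliance_approve must precede wire_transfer" can be monitored with a single bit of state. The sketch below is a crisp (non-fuzzy) precedence monitor written for illustration. It is not Sponsio's LTL implementation, and `PrecedenceMonitor` is a hypothetical name.

```python
class PrecedenceMonitor:
    """Checks that `trigger` never occurs before `prerequisite` has occurred."""

    def __init__(self, prerequisite: str, trigger: str) -> None:
        self.prerequisite = prerequisite
        self.trigger = trigger
        self._seen_prerequisite = False  # the monitor's entire state

    def step(self, event: str) -> bool:
        """Feed one event; return True if it is permitted at this point."""
        if event == self.prerequisite:
            self._seen_prerequisite = True
        if event == self.trigger:
            return self._seen_prerequisite
        return True


def run(trace: list[str]) -> list[bool]:
    """Evaluate a whole trace with a fresh monitor."""
    monitor = PrecedenceMonitor("compliance_approve", "wire_transfer")
    return [monitor.step(event) for event in trace]


bad = run(["lookup_vendor", "wire_transfer"])
good = run(["lookup_vendor", "compliance_approve", "wire_transfer"])
print(bad)
print(good)
```

Each step costs a couple of string comparisons and one boolean update, and the state does not grow with trace length.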

08

UI CLI Surface

Sponsio — UI & CLI Surface

CLI Binary

  • Exists: yes
  • Name: sponsio
  • Package: PyPI sponsio v0.1.1
  • Is thin wrapper: no (own Python runtime)
  • Install: pip install sponsio

Key Subcommands

Subcommand                     Description
sponsio init .                 Interactive wizard
sponsio demo                   Terminal demo trajectories
sponsio validate "<rule>"      NL-to-contract draft
sponsio doctor                 Config health check
sponsio report                 Observe-mode violation report
sponsio mode observe/enforce   Toggle enforcement
sponsio packs                  List contract bundles
sponsio patterns               List 44 patterns
sponsio serve                  HTTP dashboard server
sponsio skill install          Install Claude Code skill
sponsio refresh                Re-mine contracts from traces
sponsio eval                   Run benchmarks
sponsio plugin [init/install]  Plugin management

Dashboard

  • Exists: yes (via sponsio serve)
  • Type: web-dashboard
  • Port: configurable (DASHBOARD_DEFAULT_PORT from constants)
  • Access: HTTP server with dashboard UI
  • Features: agent activity, contract violations, observe-mode feeds

IDE Integration

  • Claude Code: sponsio skill install adds SKILL.md to Claude Code skills
  • OpenClaw: sponsio plugin install openclaw → Claude Code plugin
  • Any MCP host: via sponsio:mcp integration

Observability

  • ~/.sponsio/sessions/ JSONL for observe-mode violations
  • sponsio host trace --follow for live stream
  • sponsio report for aggregated analysis
  • OpenTelemetry (OTel) support via pip install "sponsio[all]"
