Sponsio

sponsio · SponsioLabs/Sponsio · ★ 445 · last commit 2026-05-26

Summary

Sponsio — Summary

Sponsio is a Python/TypeScript runtime enforcement library that checks every AI agent tool call against deterministic, pure-code contracts. The README claims enforcement in under 0.01 ms, 5,000× to 60,000× faster than LLM-as-judge guardrails, with zero LLM calls in the hot path. It integrates with LangChain, LangGraph, Claude Agent, OpenAI Agents, Google ADK, CrewAI, Vercel AI, MCP, or any custom tool-calling loop. Contracts are declared in YAML (sponsio.yaml), grouped into 16 bundled contract packs organized by capability tier (always-on, per-tool, per-incident), and backed by 44 underlying deterministic patterns. The framework ships a CLI (sponsio), a Claude Code skill (sponsio/SKILL.md), an OpenClaw plugin, and an optional dashboard host. On ODCV-Bench, Sponsio blocks 95.6% of misalignment on average; on RedCode-Exec (1,410 cases), it reaches 92% combined. Compared to seeds: Sponsio is the most sophisticated policy engine in the batch, closer to a formal-methods runtime verification system than to the hook-based guardrails seen in clauder or pi-steering-hooks. No seed framework matches its benchmark-backed formal contract approach.

Overview

Sponsio — Overview

Origin

Repo: SponsioLabs/Sponsio. Created April 2026, Python + TypeScript, Apache-2.0. 445 stars. Organization: SponsioLabs. Website: sponsio.dev. Multiple translated READMEs (EN, ZH-CN, JA).

Philosophy

From README:

"Runtime enforcement for AI agents. Sponsio checks every agent action against deterministic, pure-code contracts, enforced in under 0.01 ms with zero LLM cost at runtime."

"An agent contract is a runtime rule that is checked at every agent action, backed by formal methods."

From OSS_PROMISE.md (implied):

Sponsio's threat model draws on public security research; e.g. Simon Willison's "Lethal Trifecta" shaped multi-tool composition contracts.

Performance Claims

From README (with benchmark citations):

p50 0.139 ms per contract — "5,000×–60,000× faster than any LLM-as-judge guardrail"
p99 under 1.04 ms across all workloads
ODCV-Bench: 95.6% misalignment blocked on average (12 frontier LLMs × 80 trajectories)
RedCode-Exec (1,410 cases): 92% combined (bash 95%, python 90%)
Financial-Audit-Fraud-Finding scenario: frontier models commit fraud in 16/24 trials; Sponsio blocks 18/19
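
Latency figures such as the p50 and p99 above are typically obtained by timing many evaluations of a check and reading off percentiles. The sketch below shows one way to do that for a pure-code predicate. It is not Sponsio's benchmark harness: `amount_in_range` and `measure_latency_ms` are hypothetical names, and the printed numbers depend entirely on the machine running it.

```python
import statistics
import time
from typing import Callable


def amount_in_range(call: dict) -> bool:
    """Hypothetical contract: the wire amount must lie within [0, 50000]."""
    return 0 <= call.get("amount", 0) <= 50_000


def measure_latency_ms(check: Callable[[dict], bool], call: dict,
                       runs: int = 10_000) -> tuple[float, float]:
    """Time `runs` evaluations of `check` and return (p50, p99) in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter_ns()
        check(call)
        samples.append((time.perf_counter_ns() - start) / 1e6)  # ns -> ms
    cuts = statistics.quantiles(samples, n=100)  # 99 percentile cut points
    return cuts[49], cuts[98]  # 50th and 99th percentiles


p50, p99 = measure_latency_ms(amount_in_range, {"amount": 847_000})
print(f"p50={p50:.5f} ms  p99={p99:.5f} ms")
```

A measurement like this covers only the predicate itself; any per-call overhead from a surrounding framework would have to be timed separately.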

Core Design Principles

Deterministic only: all enforcement is pure-code logic, never LLM
Observe vs enforce modes: start in observe (log violations), flip to enforce (block)
Formal methods backing: LTL (Linear Temporal Logic) monitor for temporal ordering contracts
Zero dependencies in core: base pip install sponsio is a small Python package
YAML-declarative: contracts are declared in sponsio.yaml, not written in Python
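
The first two principles can be illustrated with a small sketch: contracts are plain predicates over a tool call, and the mode only decides whether a violation blocks the call or is merely recorded. Every name here (`Decision`, `evaluate`, the `no_rm_rf` contract) is hypothetical and chosen for illustration; Sponsio's actual engine and API may look quite different.

```python
from dataclasses import dataclass, field
from typing import Callable

Check = Callable[[dict], bool]  # returns True when the call satisfies the contract


@dataclass
class Decision:
    allowed: bool
    violations: list[str] = field(default_factory=list)


def evaluate(call: dict, contracts: dict[str, Check], mode: str) -> Decision:
    """Run every contract; only enforce mode turns a violation into a block."""
    if mode not in ("observe", "enforce"):
        raise ValueError(f"unknown mode: {mode!r}")
    violations = [name for name, check in contracts.items() if not check(call)]
    allowed = True if mode == "observe" else not violations
    return Decision(allowed=allowed, violations=violations)


# Hypothetical contract and call, for illustration only.
contracts = {"no_rm_rf": lambda call: "rm -rf" not in call.get("command", "")}
call = {"tool": "shell", "command": "rm -rf /srv/my-bot"}

observed = evaluate(call, contracts, mode="observe")  # violation recorded, call allowed
enforced = evaluate(call, contracts, mode="enforce")  # violation recorded, call blocked
```

Because both modes compute the same violation list, a team can review observe-mode output and expect the same rules to fire once enforcement is switched on.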

Name

"Sponsio" is Latin for "guarantee" or "surety bond" — the name signals formal contractual commitment.

Architecture

Sponsio — Architecture

Distribution

PyPI: pip install sponsio (Python)
npm: npm install -D @sponsio/sdk (TypeScript)
Claude Code skill: bundled sponsio/SKILL.md
OpenClaw plugin: via sponsio plugin install or plugin_init

Install Methods

pip install sponsio                    # Base Python
pip install "sponsio[all]"             # All extras (yaml + llm + otel)
npm install -D @sponsio/sdk            # TypeScript/Node
sponsio init .                         # Interactive wizard

Directory Structure (repo)

sponsio/
  sponsio/                 # Python package
    cli.py                 # 30+ CLI commands
    contract.py            # Contract execution engine
    config.py              # YAML config loader
    core.py                # Core enforcement logic
    agents.py              # Agent wrapper utilities
    mcp.py                 # MCP integration
    contracts/             # Built-in contract packs
      core/                # universal.yaml, llm_safety.yaml, runaway.yaml
      capability/          # shell, filesystem, credentials, etc.
      incident/            # mcp-composition.yaml, etc.
      premium/             # advanced contracts
    patterns/              # 44 underlying patterns
    skills/
      sponsio/             # SKILL.md for Claude Code
    plugin/                # OpenClaw/Claude Code plugin
    prompts/               # Onboarding prompt templates
    integrations/          # LangChain, CrewAI, LangGraph, etc.
    runtime/               # Contract execution runtime
    daemon/                # Background daemon mode
    tracer/                # Trace streaming
    generation/            # Contract generation from natural language
  ts/                      # TypeScript SDK source
  docs/                    # Architecture, formal methods, benchmarks
  tests/
  pyproject.toml
  CLAUDE.md
  QUICKSTART.md

Required Runtime

Python >= 3.10 (core)
Node.js (TypeScript SDK)
Optional: API keys for LLM-assisted contract generation

Target AI Tools / Frameworks

Framework	Integration
LangChain	DashClawCallbackHandler analog / direct wrap
LangGraph	Node/Python SDK
Claude Code	Skills + plugin + Claude Agent SDK
OpenAI Agents	Python/Node SDK
Google ADK	Direct wrap
CrewAI	Task callback / agent wrapper
Vercel AI	TypeScript SDK
MCP (any host)	via `sponsio mcp` integration
Custom tool-calling	Python or TypeScript SDK

Components

Sponsio — Components

CLI Binary: `sponsio`

30+ subcommands including:

Command	Purpose
`sponsio demo [--scenario]`	Run demo trajectories (cleanup, backup, wire, freeze)
`sponsio init .`	Interactive wizard: detect framework, write sponsio.yaml
`sponsio validate "<rule>"`	Turn plain English into a contract draft
`sponsio check <trace>`	Check a trace against contracts
`sponsio replay <trace>`	Replay an agent trajectory
`sponsio report`	Review observe-mode violations
`sponsio serve`	Run as HTTP/dashboard server
`sponsio scan`	Scan agent code for risk patterns
`sponsio packs`	List available contract bundles
`sponsio patterns`	List 44 underlying patterns
`sponsio skill [install]`	Manage Claude Code skill install
`sponsio doctor`	Check sponsio.yaml configuration
`sponsio onboard`	Full onboarding wizard
`sponsio mode [observe/enforce]`	Toggle enforcement mode
`sponsio plugin [init/install/show/scan]`	Plugin management
`sponsio host [list]`	List supported IDE hosts
`sponsio export`	Export session data
`sponsio eval`	Run evaluation benchmarks

Contract Bundles (16)

Organized by tier:

Tier	Bundle	Coverage
always-on	`sponsio:core/universal`	(empty by design — see yaml)
always-on	`sponsio:core/llm_safety`	Output safety contracts for LLM-response agents
always-on	`sponsio:core/runaway`	Runaway agent detection
per-tool	`sponsio:capability/shell`	Shell command enforcement
per-tool	`sponsio:capability/filesystem`	File operation enforcement
per-tool	`sponsio:capability/credentials`	Credential handling
per-incident	`sponsio:incident/mcp-composition`	Multi-tool composition safety
premium	(additional bundles)	Advanced scenarios

Patterns (44)

Underlying primitives composing the bundles (e.g., path-traversal-guard, wire-transfer-limit, privilege-escalation-block, etc.)
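
To give a feel for what a pattern of this kind might check, here is a minimal sketch of a path-traversal guard. It is an assumption about the general technique, not Sponsio's implementation: `within_workspace` is a hypothetical helper, it works on POSIX-style paths only, and it normalizes paths lexically without resolving symlinks.

```python
import posixpath


def within_workspace(workspace: str, candidate: str) -> bool:
    """Return True if `candidate`, resolved against `workspace`, stays inside it."""
    if not posixpath.isabs(workspace):
        raise ValueError("workspace must be an absolute path")
    root = posixpath.normpath(workspace)
    # join() lets an absolute candidate replace the root; normpath() collapses "..".
    target = posixpath.normpath(posixpath.join(root, candidate))
    # commonpath() compares whole path components, so "/srv/my-bot-evil"
    # is not mistaken for a child of "/srv/my-bot".
    return posixpath.commonpath([root, target]) == root


print(within_workspace("/srv/my-bot", "logs/run.txt"))
print(within_workspace("/srv/my-bot", "../secrets/.env"))
```

Comparing path components rather than string prefixes is the important design choice here; a plain `startswith` test would accept sibling directories that merely share a name prefix.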

Skills

sponsio/skills/sponsio/SKILL.md — Claude Code skill that covers the full lifecycle (initial setup, contract authoring, observe-mode tuning, flip to enforce, troubleshooting).

TypeScript SDK

87-method canonical surface (Python SDK: 235 methods).

Framework Integrations

from sponsio.integrations.langchain import DashClawCallbackHandler
from sponsio.integrations.crewai import DashClawCrewIntegration

Prompts

Sponsio — Prompts

Verbatim Excerpt 1: Universal Contract Bundle (YAML)

# sponsio/contracts/core/universal.yaml
# Pack stub — empty by design.
#
# This pack used to ship five stochastic output-safety contracts
# (injection_free, jailbreak_free, harmful, toxic_free, semantic_pii_free)
# and was auto-included by ``sponsio onboard``.  That was the wrong
# default for tool-call-only agents: every step pulled the judge LLM
# in, adding latency + cost for agents that never produce LLM
# responses to score (the entire point of those contracts).
#
# The contracts are still shipped — they moved to
# ``sponsio:core/llm_safety``.  Opt in from your config when your
# agent does produce LLM responses you want graded.

version: "1"
agents:
  "*":
    contracts: []

Technique: explicit empty pack with rationale. Sponsio documents why universal is empty — a design evolution recorded in code comments. This prevents cargo-culting of the old default while preserving backward compatibility.

Verbatim Excerpt 2: QUICKSTART Contract Enforcement Output

  ━━━ ◒◓ sponsio ━━━━━━━━━━━━━━━━━━━━━━━━━━
  ▎ contract · ap_copilot
  ▎ single wire capped at $50k
  ▎ enforce ▸ wire_transfer.amount must be in range [0, 50000]
  ▎ contract · ap_copilot
  ▎ compliance_approve must precede wire_transfer
  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

  -> wire_transfer(to='Acme Logistics LLC', amount=847000, ...)
  ✗ amount must be in range [0, 50000] — VIOLATED → blocked
  ✗ compliance_approve must precede wire_transfer — VIOLATED → blocked

Technique: structured violation output. Each violation shows: the contract name, the rule description, the actual call parameters, and the enforcement decision. This gives the agent enough context to understand why it was blocked and self-correct.
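
The four pieces of information named above (contract name, rule description, actual parameters, decision) can be carried in a small record. The sketch below builds such a record for the range rule from the excerpt; the `Violation` type and `check_range` function are hypothetical and only illustrate the shape of the data, not Sponsio's internal types.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Violation:
    contract: str   # contract name, e.g. "ap_copilot"
    rule: str       # human-readable rule description
    actual: dict    # the offending call parameters
    decision: str   # "blocked" in enforce mode, "logged" in observe mode


def check_range(contract: str, tool: str, field_name: str, lo: float, hi: float,
                call_tool: str, call_args: dict,
                mode: str = "enforce") -> Optional[Violation]:
    """Return a Violation if `field_name` is missing or outside [lo, hi], else None."""
    if call_tool != tool:
        return None  # the rule does not apply to this tool
    value = call_args.get(field_name)
    if isinstance(value, (int, float)) and lo <= value <= hi:
        return None
    return Violation(
        contract=contract,
        rule=f"{tool}.{field_name} must be in range [{lo}, {hi}]",
        actual={field_name: value},
        decision="blocked" if mode == "enforce" else "logged",
    )


violation = check_range(
    "ap_copilot", "wire_transfer", "amount", 0, 50000,
    call_tool="wire_transfer",
    call_args={"to": "Acme Logistics LLC", "amount": 847000},
)
print(violation)
```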

Verbatim Excerpt 3: README Comparison Table

On ODCV-Bench (12 frontier LLMs × 80 trajectories), unguarded models cheat in 
11.5%–66.7% of runs. **With Sponsio, 95.6% of misalignment is avoided on average; 
24/36 high-risk scenarios at 100%.** On the `Financial-Audit-Fraud-Finding` scenario, 
frontier models commit fraud in 16/24 trials; **Sponsio blocks 18/19**. On 
RedCode-Exec (1,410 cases), Sponsio reaches **92% combined** (bash 95% · python 90%).

Technique: benchmark-first positioning. Unlike most frameworks, which provide anecdotal demos, Sponsio leads with benchmarks on public datasets that others can check, establishing scientific credibility.

Uniqueness

Sponsio — Uniqueness & Positioning

Differs from Seeds

None of the 11 seeds come close to Sponsio's formal-methods approach to agent safety. The closest seed is kiro (which has IDE-native hook events and enforcement), but kiro is an IDE product, not a library. spec-kit's 18 hooks enforce a development workflow; Sponsio enforces behavioral correctness at runtime. Key innovations that appear nowhere in the seed corpus:

(1) Fuzzy LTL temporal logic monitor for ordering contracts (e.g., "operation A must precede operation B")
(2) Benchmark-backed claims on public datasets (ODCV-Bench, RedCode-Exec)
(3) p50 0.139 ms enforcement speed (vs. 50-800 ms for LLM-as-judge)
(4) Natural-language-to-contract draft generation
(5) Observe→Enforce mode progression
(6) Framework-agnostic dual implementation in Python and TypeScript

The OWASP threat model integration (ASI-09, ASI-10 references) signals enterprise security positioning.

Positioning

Signal type: deterministic runtime policy engine
Intervention point: pre-tool-execution (synchronous blocking or logging)
Unique features: LTL temporal contracts, benchmark validation, framework-agnostic, NL→contract generation, observe/enforce modes
Target user: AI agent teams in production needing formal safety guarantees, compliance teams

Observable Failure Modes

YAML contract authoring is still somewhat specialized (despite NL→contract generation)
False positives in observe mode require manual tuning before enforcing
The "universal" bundle is empty — users must actively choose and configure bundles
TypeScript SDK (87 methods) vs Python SDK (235 methods) surface discrepancy
v0.1.1 is very early — contract library may have gaps

Relationship to Batch 31

Sponsio is the most engineering-rigorous framework in the batch. It provides the clearest separation between the safety signal (deterministic contract logic) and the AI layer (never in the enforcement hot path). DashClaw (another batch member) provides similar governance but via a centralized dashboard + approval queue rather than inline contract checks.

Workflow

Sponsio — Workflow

Phases

Phase	Description	Artifact
Init	`sponsio init .` wizard detects framework, writes sponsio.yaml	`sponsio.yaml` in observe mode
Doctor check	`sponsio doctor` validates configuration	Config health report
Code wiring	Developer adds 2-line wrap snippet to agent	Instrumented agent
Observe mode	Agent runs, violations logged, nothing blocked	`~/.sponsio/sessions/*.jsonl`
Review	`sponsio report --since 24h`	Violation report
Tune	Review false positives, adjust contracts	Updated sponsio.yaml
Flip to enforce	`sponsio mode enforce`	Contracts now block violations
Continuous refresh	`sponsio refresh`	Contracts updated from production traces

Contract Lifecycle

# sponsio.yaml minimal example
agents:
  my_agent:
    workspace: "/srv/my-bot"
    include:
      - sponsio:core/universal
      - sponsio:capability/shell
      - sponsio:capability/filesystem
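
The include entries above follow a `sponsio:<group>/<name>` shape, where the group corresponds to a tier in the bundle table (core is always-on, capability is per-tool, incident is per-incident). A minimal sketch of parsing such a reference follows; `parse_bundle_ref` and `BundleRef` are hypothetical helpers, and how Sponsio actually resolves includes is an assumption not confirmed by the source.

```python
from typing import NamedTuple

# Group-to-tier mapping taken from the bundle table in this document.
TIER_BY_GROUP = {
    "core": "always-on",
    "capability": "per-tool",
    "incident": "per-incident",
    "premium": "premium",
}


class BundleRef(NamedTuple):
    group: str
    name: str
    tier: str


def parse_bundle_ref(ref: str) -> BundleRef:
    """Split 'sponsio:<group>/<name>' into its parts and look up the tier."""
    prefix, colon, rest = ref.partition(":")
    if prefix != "sponsio" or not colon:
        raise ValueError(f"not a sponsio bundle reference: {ref!r}")
    group, slash, name = rest.partition("/")
    if not slash or not name or group not in TIER_BY_GROUP:
        raise ValueError(f"malformed bundle reference: {ref!r}")
    return BundleRef(group=group, name=name, tier=TIER_BY_GROUP[group])


for ref in ("sponsio:core/universal", "sponsio:capability/shell"):
    print(parse_bundle_ref(ref))
```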

Natural Language Contract Generation

sponsio validate "never delete files in /prod"
# → outputs YAML contract draft for review

Demo Trajectories

Four built-in unsafe scenarios:

cleanup — agent deletes .env + .git/
backup — SRE cost-optimizer deletes prod DR backups
wire — AP copilot wires $847k to unverified vendor
freeze — Replit-style code-freeze violation + cover-up

Approval Gates

None in the default hot path (Sponsio is fully automated, no human gates). The sponsio:capability/shell bundle can add human escalation contracts for specific commands.

Execution Mode

Event-driven — checks fire synchronously on every tool call before execution.

Memory Context

Sponsio — Memory & Context

State Storage

Store	Path	Content
Session JSONL	`~/.sponsio/sessions/<agent_id>/*.jsonl`	Per-session observe-mode violations
Contract config	`sponsio.yaml` (project root)	Agent contracts and bundle includes
Trace files	custom paths	Raw agent trajectories for replay/check

Observe Mode Persistence

In observe mode, Sponsio logs every contract evaluation result to JSONL files:

Would-have-blocked decisions
Contract name, rule, actual value
Timestamp, agent_id, session_id

These records feed sponsio report --since 24h for tuning.
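
As a rough illustration of this persistence model, the sketch below appends evaluation results to a JSONL file and then selects those from the last 24 hours, which is the kind of filtering a `--since 24h` report implies. The record field names (`would_block`, `actual`, and so on) and both helper functions are assumptions modeled on the list above, not Sponsio's actual schema or code.

```python
import json
import tempfile
from datetime import datetime, timedelta, timezone
from pathlib import Path


def append_record(path: Path, record: dict) -> None:
    """Append one evaluation result as a single JSON line."""
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def records_since(path: Path, window: timedelta, now: datetime) -> list[dict]:
    """Return records whose timestamp falls within `window` before `now`."""
    cutoff = now - window
    kept = []
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            if datetime.fromisoformat(record["timestamp"]) >= cutoff:
                kept.append(record)
    return kept


now = datetime(2026, 5, 26, 12, 0, tzinfo=timezone.utc)
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "session.jsonl"
    for age_hours in (1, 48):  # one recent record, one older than the window
        append_record(log, {
            "timestamp": (now - timedelta(hours=age_hours)).isoformat(),
            "agent_id": "my_agent",
            "session_id": "s1",
            "contract": "ap_copilot",
            "rule": "wire_transfer.amount must be in range [0, 50000]",
            "actual": 847000,
            "would_block": True,
        })
    recent = records_since(log, timedelta(hours=24), now)

print(len(recent))
```

An append-only, one-record-per-line format keeps writes cheap on the enforcement path and lets a report tool stream the file without loading it whole.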

Live Stream

sponsio host trace --follow

Pure-OSS live stream of contract evaluations in real time.

Cross-Session

Session data accumulates in ~/.sponsio/sessions/<agent_id>/. sponsio refresh mines these for contract updates.

No In-Context Memory

Sponsio does not inject anything into the agent's context window. It wraps the tool-call interface externally. The agent is unaware of Sponsio unless a contract violation message is returned to it.
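
One way to picture this external wrapping is a decorator around a tool function: the check runs before the tool, and on a violation the tool never executes and a structured message is returned in its place. This is a generic sketch of the idea under stated assumptions; `guarded` and the example rule are hypothetical, and Sponsio's real wrap snippet is not shown in the source.

```python
import functools
from typing import Any, Callable

Rule = tuple[str, Callable[[dict], bool]]  # (description, predicate over call kwargs)


def guarded(tool_name: str, rules: list[Rule]):
    """Wrap a tool so every rule is checked synchronously before it runs."""
    def decorate(fn: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(fn)
        def wrapper(**kwargs: Any) -> Any:
            violated = [desc for desc, ok in rules if not ok(kwargs)]
            if violated:
                # The tool body is skipped; the agent sees only this message.
                return {"blocked": True, "tool": tool_name, "violations": violated}
            return fn(**kwargs)
        return wrapper
    return decorate


executed: list[str] = []  # records which calls actually reached the tool


@guarded("delete_file", [("path must not be under /prod",
                          lambda kw: not kw["path"].startswith("/prod/"))])
def delete_file(path: str) -> str:
    executed.append(path)
    return f"deleted {path}"


blocked = delete_file(path="/prod/db.sqlite")
ok = delete_file(path="/tmp/scratch.txt")
print(blocked)
print(ok)
```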

Compaction

Not applicable — Sponsio is not a context-management layer.

Orchestration

Sponsio — Orchestration

Multi-Agent

Partial — contracts can be declared per-agent with different privilege levels. Subagent privilege boundary is a documented use case (W2c in the skill). But Sponsio itself doesn't orchestrate agents.
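
A per-agent privilege boundary can be sketched as a mapping from agent id to permitted tools, with a `"*"` entry that applies to every agent (mirroring the `"*"` key in the universal pack excerpt). The policy contents, the agent names, and the choice to merge the wildcard entry with agent-specific entries are all assumptions made for illustration.

```python
# Hypothetical policy: "*" applies to all agents, other keys add per-agent tools.
POLICY: dict[str, set[str]] = {
    "*": {"read_file"},
    "planner": {"spawn_subagent"},
    "worker": {"write_file"},
}


def allowed_tools(agent_id: str) -> set[str]:
    """Union of the wildcard entry and the agent-specific entry, if any."""
    return POLICY.get("*", set()) | POLICY.get(agent_id, set())


def may_call(agent_id: str, tool: str) -> bool:
    return tool in allowed_tools(agent_id)


print(may_call("worker", "write_file"))
print(may_call("worker", "spawn_subagent"))
```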

Orchestration Pattern

None (Sponsio wraps agents, doesn't orchestrate them).

Isolation Mechanism

Behavioral isolation via contracts — agents are bounded by declared capability contracts. Not filesystem/process isolation.

Execution Mode

Event-driven — synchronous enforcement on every tool call. p50 0.139ms per contract.

Multi-Model

No. Sponsio is model-agnostic (works with any framework's LLM), but does not route different models to different roles.

Cross-Tool Portability

High — works with LangChain, LangGraph, Claude Agent SDK, OpenAI Agents SDK, Google ADK, CrewAI, Vercel AI, MCP, custom tool-calling. Language portability: Python + TypeScript.

Consensus

None.

Prompt Chaining

Not applicable (Sponsio is a runtime enforcement layer, not a reasoning pipeline).

LTL Monitor

Sponsio's temporal contracts use a fuzzy LTL (Linear Temporal Logic) monitor that checks ordering constraints (e.g., "compliance_approve must precede wire_transfer"). This is the most sophisticated enforcement mechanism in Batch 31 — other frameworks use regex matching; Sponsio uses formal temporal logic.
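
An ordering constraint of this kind corresponds to the standard precedence pattern, which in LTL can be written as (¬B) W A: B does not occur unless A has occurred first. The sketch below is a plain boolean monitor for that single pattern, stepped once per tool call. It is a simplified illustration, not Sponsio's monitor: the `PrecedenceMonitor` class is hypothetical and it does not model whatever "fuzzy" semantics the real implementation uses.

```python
class PrecedenceMonitor:
    """Online monitor for 'before must precede after', i.e. (not after) W before."""

    def __init__(self, before: str, after: str) -> None:
        self.before = before
        self.after = after
        self.seen_before = False  # the only state this property needs

    def step(self, event: str) -> bool:
        """Return True if `event` is permitted given the events seen so far."""
        if event == self.before:
            self.seen_before = True
            return True
        if event == self.after:
            return self.seen_before
        return True  # unrelated events never violate this property


monitor = PrecedenceMonitor("compliance_approve", "wire_transfer")
trace = ["lookup_vendor", "wire_transfer", "compliance_approve", "wire_transfer"]
verdicts = [monitor.step(event) for event in trace]
print(verdicts)
```

Because the monitor keeps a single boolean per contract, each step is constant time, which is consistent with the sub-millisecond per-call budget the document describes.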

UI CLI Surface

Sponsio — UI & CLI Surface

CLI Binary

Exists: yes
Name: sponsio
Package: PyPI sponsio v0.1.1
Is thin wrapper: no (own Python runtime)
Install: pip install sponsio

Key Subcommands

Subcommand	Description
`sponsio init .`	Interactive wizard
`sponsio demo`	Terminal demo trajectories
`sponsio validate "<rule>"`	NL-to-contract draft
`sponsio doctor`	Config health check
`sponsio report`	Observe-mode violation report
`sponsio mode observe/enforce`	Toggle enforcement
`sponsio packs`	List contract bundles
`sponsio patterns`	List 44 patterns
`sponsio serve`	HTTP dashboard server
`sponsio skill install`	Install Claude Code skill
`sponsio refresh`	Re-mine contracts from traces
`sponsio eval`	Run benchmarks
`sponsio plugin [init/install]`	Plugin management

Dashboard

Exists: yes (via sponsio serve)
Type: web-dashboard
Port: configurable (DASHBOARD_DEFAULT_PORT from constants)
Access: HTTP server with dashboard UI
Features: agent activity, contract violations, observe-mode feeds

IDE Integration

Claude Code: sponsio skill install adds SKILL.md to Claude Code skills
OpenClaw: sponsio plugin install openclaw → Claude Code plugin
Any MCP host: via sponsio mcp integration

Observability

~/.sponsio/sessions/ JSONL for observe-mode violations
sponsio host trace --follow for live stream
sponsio report for aggregated analysis
OpenTelemetry (OTel) support via pip install "sponsio[all]"

Distribution

Type: standalone-repo
License: Apache-2.0
Install: npm-install
Version: 0.1.1

Surfaces

CLI binary: sponsio
CLI subcmds: 20
Local UI: web-dashboard
Tech stack: Python (sponsio serve) + rich TUI for CLI

Components

Commands: 0
Skills: 1
Subagents: 0
Hooks: 0
MCP servers: 0
MCP tools: 0
Scripts: 0
Templates: 2

Workflow

Phases: 7
Approval gates: 0
Spec format: yaml
Spec storage: flat-files
Delta or full: whole-file

Orchestration

Multi-agent: No
Pattern: none
Max concurrent: 1
Isolation: none
Consensus: none
Prompt chaining: No

Multi-model

Multi-model: No
BYOK: Yes
Modal: text

Execution

Mode: event-driven
Crash recovery: No
Compaction: No
Session handoff: Yes
Streaming: Yes

Memory

Type: json-store
Persistence: session
Search: none
State files: 2 files

Quality

TDD: No
TDD mechanism: none
Validators: 6
Self-review: none

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: Yes
Audit format: jsonl
Replay: Yes

Tools

Primary: claude-code
Targets: 9
Portability: high

Signals

Stars: 445
Last commit: 2026-05-26
Contributors: 4
Maintainer: active
Quality score: 3.7/10
