Sponsio — Prompts
Verbatim Excerpt 1: Universal Contract Bundle (YAML)
# sponsio/contracts/core/universal.yaml
# Pack stub — empty by design.
#
# This pack used to ship five stochastic output-safety contracts
# (injection_free, jailbreak_free, harmful, toxic_free, semantic_pii_free)
# and was auto-included by ``sponsio onboard``. That was the wrong
# default for tool-call-only agents: every step pulled the judge LLM
# in, adding latency + cost for agents that never produce LLM
# responses to score (the entire point of those contracts).
#
# The contracts are still shipped — they moved to
# ``sponsio:core/llm_safety``. Opt in from your config when your
# agent does produce LLM responses you want graded.
version: "1"
agents:
  "*":
    contracts: []
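Technique: explicit empty pack with rationale. The header comments record why universal is now empty: the five judge-backed contracts pulled the judge LLM into every step, adding latency and cost for tool-call-only agents that never produce LLM responses to score. The contracts moved to sponsio:core/llm_safety as an opt-in. Keeping the file as a documented stub preserves backward compatibility for configs that still reference it, and the recorded rationale discourages cargo-culting the old default.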
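The effect of the empty stub can be sketched in a few lines of Python. This is a minimal illustration, not Sponsio's loader: the function resolve_contracts, the dict layout, and the shape of the llm_safety pack are assumptions made for the example. Only the wildcard structure and the five contract names come from the excerpt.

```python
from typing import Any, Dict, List

Pack = Dict[str, Any]

# Mirrors the stub in Excerpt 1: wildcard agent, no contracts.
UNIVERSAL: Pack = {"version": "1", "agents": {"*": {"contracts": []}}}

# Hypothetical shape for the opt-in pack. Only the five contract
# names are taken from the excerpt's comments.
LLM_SAFETY: Pack = {
    "version": "1",
    "agents": {
        "*": {
            "contracts": [
                "injection_free",
                "jailbreak_free",
                "harmful",
                "toxic_free",
                "semantic_pii_free",
            ]
        }
    },
}


def resolve_contracts(packs: List[Pack], agent: str) -> List[str]:
    """Collect the contracts that apply to `agent`, honoring the "*" wildcard."""
    resolved: List[str] = []
    for pack in packs:
        agents = pack.get("agents", {})
        for key in ("*", agent):
            for name in agents.get(key, {}).get("contracts", []):
                if name not in resolved:  # keep first occurrence only
                    resolved.append(name)
    return resolved


# The stub alone contributes nothing, so no judge-backed contract runs.
default_contracts = resolve_contracts([UNIVERSAL], "ap_copilot")

# Opting in adds the five stochastic contracts explicitly.
opted_in_contracts = resolve_contracts([UNIVERSAL, LLM_SAFETY], "ap_copilot")
```

Under these assumptions, an agent that loads only the universal pack resolves to an empty contract list, and the judge-backed checks appear only when the second pack is listed on purpose.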
Verbatim Excerpt 2: QUICKSTART Contract Enforcement Output
━━━ ◒◓ sponsio ━━━━━━━━━━━━━━━━━━━━━━━━━━
▎ contract · ap_copilot
▎ single wire capped at $50k
▎ enforce ▸ wire_transfer.amount must be in range [0, 50000]
▎ contract · ap_copilot
▎ compliance_approve must precede wire_transfer
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
-> wire_transfer(to='Acme Logistics LLC', amount=847000, ...)
✗ amount must be in range [0, 50000] — VIOLATED → blocked
✗ compliance_approve must precede wire_transfer — VIOLATED → blocked
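Technique: structured violation output. Each violation carries four pieces of information: the contract name (ap_copilot), the rule in plain language, the actual call with its arguments, and the enforcement decision (VIOLATED → blocked). Together these give the agent enough context to understand why the call was blocked and to self-correct, for example by lowering the amount or by calling compliance_approve first.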
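The two rules in the excerpt, a range bound on one argument and an ordering constraint between two tools, can be sketched as follows. This is an illustrative sketch and not Sponsio's engine: the Violation dataclass, the check_call function, and the history list are hypothetical names. The rule text and the example call are taken from the excerpt.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Violation:
    contract: str   # which contract fired, e.g. "ap_copilot"
    rule: str       # human-readable rule description
    call: str       # the offending call, rendered with its arguments
    decision: str   # enforcement outcome


def check_call(
    contract: str, tool: str, args: Dict[str, Any], history: List[str]
) -> List[Violation]:
    """Check one tool call against the two rules shown in the excerpt.

    `history` is the ordered list of tool names already called in this run.
    """
    rendered_args = ", ".join(f"{k}={v!r}" for k, v in args.items())
    call = f"{tool}({rendered_args})"
    violations: List[Violation] = []
    if tool != "wire_transfer":
        return violations

    # Range rule: a missing or non-numeric amount also counts as a violation.
    amount = args.get("amount")
    if not (isinstance(amount, (int, float)) and 0 <= amount <= 50000):
        violations.append(
            Violation(contract, "amount must be in range [0, 50000]", call, "blocked")
        )

    # Ordering rule: compliance_approve must appear earlier in the run.
    if "compliance_approve" not in history:
        violations.append(
            Violation(
                contract,
                "compliance_approve must precede wire_transfer",
                call,
                "blocked",
            )
        )
    return violations


blocked = check_call(
    "ap_copilot",
    "wire_transfer",
    {"to": "Acme Logistics LLC", "amount": 847000},
    history=[],
)
allowed = check_call(
    "ap_copilot",
    "wire_transfer",
    {"to": "Acme Logistics LLC", "amount": 1200},
    history=["compliance_approve"],
)
```

Returning every violated rule, rather than stopping at the first, matches the excerpt, where both rules are reported for the same call.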
Verbatim Excerpt 3: README Comparison Table
On ODCV-Bench (12 frontier LLMs × 80 trajectories), unguarded models cheat in
11.5%–66.7% of runs. **With Sponsio, 95.6% of misalignment is avoided on average;
24/36 high-risk scenarios at 100%.** On the `Financial-Audit-Fraud-Finding` scenario,
frontier models commit fraud in 16/24 trials; **Sponsio blocks 18/19**. On
RedCode-Exec (1,410 cases), Sponsio reaches **92% combined** (bash 95% · python 90%).
Technique: benchmark-first positioning. Rather than opening with anecdotal demos, the README leads with quantitative results on named benchmarks (ODCV-Bench and RedCode-Exec) and states the sample size next to each rate, so readers can check the claims against the underlying counts.
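The raw counts quoted in the excerpt convert to percentages with simple arithmetic, sketched below. The helper pct is a hypothetical name, and only the fractions stated in the excerpt are used. The excerpt does not say how the 19 blocked-or-missed attempts relate to the 16 fraudulent trials, so the sketch does not try to reconcile them.

```python
def pct(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage, rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


# Financial-Audit-Fraud-Finding: unguarded frontier models commit fraud in 16/24 trials.
fraud_rate = pct(16, 24)

# Same scenario with Sponsio: 18 of 19 blocked.
block_rate = pct(18, 19)

# High-risk scenarios where avoidance reaches 100%: 24 of 36.
full_avoidance_share = pct(24, 36)

print(fraud_rate, block_rate, full_avoidance_share)  # 66.7 94.7 66.7
```

The 16/24 fraud rate works out to 66.7%, the same figure as the upper end of the 11.5%–66.7% range quoted for unguarded models.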