
CAI (Cybersecurity AI)

cai-cybersecurity · aliasrobotics/cai · ★ 8.8k · last commit 2026-05-18

Primitive shape 15 total
Subagents 15
00

Summary

CAI (Cybersecurity AI) — Summary

CAI is a lightweight, open-source Python framework by Alias Robotics for building and deploying offensive and defensive AI security agents. It supports 300+ LLM models (OpenAI, Anthropic, DeepSeek, Ollama) and provides built-in security tools for reconnaissance, exploitation, and privilege escalation; an agent handoff system; code-class agent definitions compatible with the OpenAI Agents SDK; and a TUI REPL interface. The framework has been battle-tested in HackTheBox CTFs and in real-world vulnerability discovery (Unitree G1 humanoid robots, Ecoforest heat pumps, MiR industrial robots), and ranked Top-10 in the Dragos OT CTF 2025. A Professional Edition with an "alias1" model (BYOK via aliasrobotics.com at €350/month) claims to beat GPT-5 in CTF benchmarks. The Community Edition is fully open-source.

Compared to seeds: CAI is architecturally closest to claude-flow (code-class agents with handoffs, multi-model support, swarm-capable) but is domain-specialized for offensive security, with recon tools, exploit execution, and guardrails against dangerous commands. Unlike the 11 seeds, all of which target software development workflows, CAI targets penetration testing and vulnerability research workflows.

01

Overview

CAI (Cybersecurity AI) — Overview

Origin

Developed by Alias Robotics (aliasrobotics.com), a European robotics cybersecurity company. MIT/custom license (NOASSERTION in GitHub). 8,753 stars, 1,292 forks. Published academic papers (8+ arXiv papers as of 2026). Community edition: pip install cai-framework.

Philosophy

"Cybersecurity AI (CAI) is a lightweight, open-source framework that empowers security professionals to build and deploy AI-powered offensive and defensive automation."

CAI's philosophy: democratize AI-powered security testing. Manual penetration testing is bottlenecked by human cognitive load; AI agents can autonomously chain reconnaissance → exploitation → privilege escalation while maintaining auditability through tracing.

Key Design Principles

  • Lightweight: Minimal framework overhead — thin wrapper over the OpenAI Agents SDK pattern
  • 300+ models: Any OpenAI-compatible endpoint (local or cloud), supporting diverse model capabilities
  • Battle-tested: Not a demo framework — deployed in real vulnerability discovery and CTF competitions
  • Research-oriented: Built on published academic research with arXiv citation requirements
  • Ethical disclaimer: Prominent DISCLAIMER.md and README warnings against unauthorized use

Notable Achievements

  • Rank 1 during hours 7-8 at Dragos OT CTF 2025
  • Discovered vulnerabilities in Unitree G1 humanoid robots (unauthorized China-server telemetry, exposed RSA keys)
  • Inspired HackerOne's AI-powered Deduplication Agent (now in production)
  • Critical vulnerability discovery in Ecoforest heat pumps across Europe

Professional vs. Community Edition

| Dimension | Community | Professional |
| --- | --- | --- |
| Price | Free (OSS) | €350/month |
| Model | Any of 300+ models | alias1 (proprietary, "no refusals") |
| Support | Community Discord | Professional support |
| CTF benchmarks | Standard | "Beats GPT-5" |
02

Architecture

CAI (Cybersecurity AI) — Architecture

Distribution

PyPI: pip install cai-framework

CLI Entry Points

cai          = "cai.cli:main"         # Main REPL/TUI
cai-replay   = "tools.replay:main"    # Replay recorded sessions
cai-asciinema = "tools.asciinema:main" # Asciinema recording export
cai-gif      = "tools.gif:main"       # GIF creation from recordings

Source Structure

src/cai/
  agents/               # Pre-built security agent classes
    agent.py            # Base Agent dataclass (OpenAI Agents SDK compatible)
    android_sast_agent.py
    blue_teamer.py      # Defensive security agent
    bug_bounter.py      # Bug bounty agent
    codeagent.py        # Code analysis agent
    dfir.py             # Digital forensics and incident response
    factory.py          # Agent factory
    flag_discriminator.py  # CTF flag detection agent
    guardrails.py       # Safety guardrails
    mail.py             # Email security agent
    memory.py           # Memory-augmented agent
    memory_analysis_agent.py
    meta/               # Meta-agents for agent orchestration
    network_traffic_analyzer.py
    one_tool.py         # Single-tool agent
    patterns/           # Agent coordination patterns
    red_teamer.py       # Offensive security agent
    replay_attack_agent.py
    reporter.py         # Vulnerability report generation agent
    retester.py         # Retest/verify agent (inspired HackerOne)
  sdk/agents/           # OpenAI Agents SDK re-implementation
    agent.py            # @dataclass Agent definition
    guardrail.py        # Input/output guardrails
    handoffs.py         # Agent handoff mechanism
    items.py            # Message items
    lifecycle.py        # Agent lifecycle hooks
    mcp.py              # MCP integration
    model_settings.py   # Per-agent model configuration
    models/             # LLM provider interfaces
    run_context.py      # Execution context
    tool.py             # Tool definitions
    tracing/            # OpenAI-compatible tracing
  prompts/              # System prompt templates
  repl/                 # REPL/TUI interface
  tools/                # Security tools (recon, exploit, etc.)
  internal/             # Internal utilities

Configuration

  • .env.example: API keys for 300+ models via OpenRouter, Azure OpenAI, or direct provider APIs.
  • agents.yml.example: agent configuration template.

Required Runtime

  • Python ≥ 3.11
  • uv for fast dependency management
  • Docker (for containerized deployments via dockerized/)

Target AI Tools

Framework-agnostic at the model level (any OpenAI-compatible API). No native Claude Code integration.

03

Components

CAI (Cybersecurity AI) — Components

Pre-Built Security Agents

| Agent | Purpose |
| --- | --- |
| red_teamer.py | Offensive security: autonomous exploitation |
| blue_teamer.py | Defensive security: monitoring and response |
| bug_bounter.py | Bug bounty reconnaissance and report generation |
| codeagent.py | Static code analysis for vulnerabilities |
| dfir.py | Digital forensics and incident response |
| android_sast_agent.py | Android static application security testing |
| memory.py | Memory-augmented agent (persists findings) |
| retester.py | Regression testing of previously found vulnerabilities |
| reporter.py | Structured vulnerability report generation |
| network_traffic_analyzer.py | PCAP/traffic analysis |
| replay_attack_agent.py | Replay attack detection and execution |
| flag_discriminator.py | CTF flag detection and extraction |
| guardrails.py | Safety guardrails for dangerous command prevention |
| mail.py | Email security analysis |
| memory_analysis_agent.py | Memory dump forensics |

SDK Components (OpenAI Agents SDK compatible)

| Component | Purpose |
| --- | --- |
| Agent (dataclass) | Agent with name, instructions, tools, handoffs, guardrails |
| Handoff | Transfer control to another agent |
| InputGuardrail | Pre-execution safety check |
| OutputGuardrail | Post-execution output validation |
| function_tool | Decorator to register Python functions as tools |
| MCPUtil | MCP server integration |
| AgentHooks | Pre/post execution lifecycle hooks |

Tracing

OpenAI-compatible trace format in sdk/agents/tracing/. Records: agent name, tool calls, handoffs, model responses, timing. Exportable for replay via cai-replay.

Tools

The src/cai/tools/ directory contains security tools grouped by category (reconnaissance, exploitation, privilege escalation). Each tool is a Python function decorated with @function_tool.
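
To make the decorator idea concrete, here is a minimal standard-library sketch of how a registration decorator of this kind can work. It is not CAI's @function_tool: simple_tool, TOOL_REGISTRY, call_tool, and port_label are hypothetical names invented for illustration, and the real decorator may build its schema quite differently.

```python
import inspect
from typing import Any, Callable

# Hypothetical registry; CAI's real @function_tool lives in its SDK and may work differently.
TOOL_REGISTRY: dict[str, dict[str, Any]] = {}


def simple_tool(func: Callable[..., Any]) -> Callable[..., Any]:
    """Register func as a tool, deriving a small schema from its signature."""
    params = {
        name: getattr(p.annotation, "__name__", str(p.annotation))
        for name, p in inspect.signature(func).parameters.items()
    }
    TOOL_REGISTRY[func.__name__] = {
        "description": inspect.getdoc(func) or "",
        "parameters": params,
        "callable": func,
    }
    return func


@simple_tool
def port_label(port: int) -> str:
    """Return the conventional service name for a well-known TCP port."""
    known = {22: "ssh", 80: "http", 443: "https"}
    return known.get(port, "unknown")


def call_tool(name: str, **kwargs: Any) -> Any:
    """Dispatch a model-requested tool call by name."""
    return TOOL_REGISTRY[name]["callable"](**kwargs)
```

The docstring doubles as the tool description and the type hints become the parameter schema, which is why tools written in this style tend to carry careful docstrings and annotations.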

Coordination Patterns

agents/patterns/ — agent coordination patterns (swarm, sequential, parallel) for multi-agent security assessments.

CLI / REPL

repl/ — terminal user interface (TUI) for interactive agent sessions. The cai CLI launches this REPL.

Benchmarks

benchmarks/ — CTF benchmark runners for measuring agent performance against known challenge sets.

Fluency

fluency/ contains CAI learning resources and tutorials, corresponding to the "Learn - CAI Fluency" section of the README.

05

Prompts

CAI (Cybersecurity AI) — Prompts

CAI uses code-class agent definitions (Python dataclasses) with string instructions fields as the primary "prompt" mechanism. The src/cai/prompts/ directory contains system prompt templates.

Agent Class as Prompt (SDK Pattern)

@dataclass
class Agent(Generic[TContext]):
    """An agent is an AI model configured with instructions, tools, guardrails, handoffs."""

    name: str
    """The name of the agent."""

    instructions: (
        str
        | Callable[
            [RunContextWrapper[TContext], Agent[TContext]],
            MaybeAwaitable[str],
        ]
        | None
    ) = None
    """The instructions for the agent. Will be used as the 'system prompt' when this agent is
    invoked. Describes what the agent should do, and how it responds."""
    
    handoff_description: str | None = None
    """A human-readable description of the agent, used when the agent is used inside
    tools/handoffs."""
    
    tools: list[Tool[TContext]] = field(default_factory=list)
    guardrails: list[InputGuardrail[TContext]] = field(default_factory=list)
    output_guardrails: list[OutputGuardrail[TContext]] = field(default_factory=list)
    model: str | Model | None = None
    model_settings: ModelSettings = field(default_factory=ModelSettings)

Prompting technique: Structured dataclass definition where instructions is the system prompt. Dynamic instructions via callable allow per-run context injection. This is the OpenAI Agents SDK pattern — code-as-configuration.
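
The difference between static and dynamic instructions can be sketched with a simplified stand-in. MiniAgent, RunContext, and scoped_instructions below are hypothetical and use only the standard library; the real Agent class also accepts awaitable callables and passes the agent itself as a second argument, both of which this sketch omits.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union


@dataclass
class RunContext:
    """Hypothetical per-run context; stands in for RunContextWrapper."""
    target: str
    scope_note: str


# Instructions may be a fixed string or a function of the run context.
Instructions = Union[str, Callable[[RunContext], str], None]


@dataclass
class MiniAgent:
    name: str
    instructions: Instructions = None

    def system_prompt(self, ctx: RunContext) -> Optional[str]:
        """Resolve instructions to the string sent as the system prompt."""
        if callable(self.instructions):
            return self.instructions(ctx)
        return self.instructions


def scoped_instructions(ctx: RunContext) -> str:
    """Build a prompt that embeds the current target and scope."""
    return f"You are a penetration tester. Target: {ctx.target}. {ctx.scope_note}"


static_agent = MiniAgent("Reporter", "Summarize confirmed findings.")
dynamic_agent = MiniAgent("Red Teamer", scoped_instructions)
ctx = RunContext(target="10.0.0.5", scope_note="Stay within the authorized range.")
```

Here static_agent.system_prompt(ctx) returns its fixed string unchanged, while dynamic_agent.system_prompt(ctx) is rebuilt on each run from whatever target and scope the context carries.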

Red Teamer Agent Instructions (Typical Pattern)

A representative sketch based on the agents.yml.example structure and the red_teamer agent (illustrative, not verbatim source):

red_teamer = Agent(
    name="Red Teamer",
    instructions="""You are an expert penetration tester. Your goal is to:
1. Enumerate attack surfaces systematically
2. Identify vulnerabilities using available tools
3. Attempt exploitation following responsible disclosure principles
4. Document findings for the reporter agent

IMPORTANT: Only test systems you are explicitly authorized to test.
Do not perform actions that could cause permanent damage.
Hand off to the reporter agent when you have confirmed findings.""",
    tools=[nmap_scan, gobuster, sqlmap, ...],
    handoffs=[reporter_agent, codeagent],
)

Prompting technique: Step-by-step instruction list with ethical constraints inline. The handoffs list is part of the prompt structure — the agent knows which specialists it can delegate to.

Guardrail Prompt

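class DangerousCommandGuardrail(InputGuardrail):
    """Prevents execution of commands that could cause irreversible system damage."""

    # Case-insensitive substring patterns; illustrative, not an exhaustive blocklist.
    DANGEROUS_PATTERNS = ("rm -rf", "drop table", "format c:", "dd if=/dev/zero")

    async def run(self, ctx, agent, input):
        text = str(input).lower()
        matched = [p for p in self.DANGEROUS_PATTERNS if p in text]
        # Return a result on both paths so callers never receive None.
        return GuardrailFunctionOutput(
            output_info="Potentially dangerous command detected" if matched else "No dangerous pattern matched",
            tripwire_triggered=bool(matched),
        )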

Prompting technique: Programmatic guardrail as safety layer — not a prompt instruction telling the LLM to be safe, but code that intercepts and blocks dangerous calls before they execute.

09

Uniqueness

CAI (Cybersecurity AI) — Uniqueness & Positioning

Differs from Seeds

CAI is the only security-specialized agent framework in this corpus or the 11 seeds. All seeds target software development assistance; CAI targets offensive security. The most structurally similar seed is claude-flow — both use code-class agent definitions, multi-model support, handoffs, and swarm patterns. But CAI's tool ecosystem (nmap, gobuster, sqlmap, memory forensics) and domain expertise (CTF solving, vulnerability disclosure) are entirely distinct. CAI's retester agent directly inspired HackerOne's AI deduplication product — the only framework in the corpus with a confirmed commercial production deployment at a major security company.

Unique Characteristics

  1. Only security-specialized framework: 8 arXiv papers on AI security, CTF benchmarks, real-world vulnerability disclosures — no other framework in corpus has academic rigor in a specialized domain.
  2. retester agent → HackerOne production: Direct open-source-to-production lineage at a major security platform.
  3. 300+ model support with OpenRouter/Azure: Most model-agnostic framework in the corpus (any OpenAI-compatible API).
  4. Session replay tools (cai-replay, cai-asciinema, cai-gif): Only framework with dedicated session recording and replay tooling for audit/training purposes.
  5. Professional edition dual-license: Community OSS + proprietary €350/month tier with "no refusals" model (alias1) — the only BYOK commercial upsell in this batch.
  6. European data sovereignty emphasis: the alias1 model is hosted in the EU (GDPR-compliant), which suits security-conscious organizations with data residency requirements.
  7. DISCLAIMER.md + ethical principles: Only framework with explicit legal disclaimer, ethical principles, and authorized-use-only emphasis — appropriate for offensive security tooling.

Positioning

"The Metasploit framework for AI security agents." CAI is what security professionals use when they need autonomous AI agents to assist with penetration testing, CTFs, or vulnerability research — not a chatbot, but an autonomous tool-chaining agent with security-specific capabilities.

Observable Failure Modes

  • Legal risk: Offensive security tools have inherent liability. The DISCLAIMER covers legal use but misuse is possible and Alias Robotics' liability is limited.
  • No sandboxing: Tool execution runs in the same environment — a misconfigured agent could damage the tester's own system.
  • alias1 model opacity: The proprietary "no refusals" model for professional edition raises alignment/safety questions.
  • Context accumulation: Long CTF sessions accumulate large tool call histories — context window management is not addressed.

Cross-References

  • Inspired HackerOne AI Deduplication Agent (production deployment)
  • arXiv papers: 2504.06017, 2506.23592, and 6 others
  • Structural similarity to claude-flow (code-class agents, handoffs, multi-model)
04

Workflow

CAI (Cybersecurity AI) — Workflow

Standard Pentest Workflow

| Phase | Artifact |
| --- | --- |
| Install | pip install cai-framework + .env setup |
| Start REPL | cai → interactive TUI |
| Select agent | Choose from pre-built agents or define custom |
| Define target | Specify scope (IP range, URL, binary) |
| Execute | Agent autonomously chains recon → exploit → escalate |
| Review findings | Structured output in REPL |
| Generate report | reporter agent creates vulnerability report |
| Replay | cai-replay for session audit |

CTF Workflow

  1. cai REPL launched with CTF-specific agent configuration
  2. flag_discriminator agent monitors outputs for flag patterns
  3. Agents autonomously attempt challenges (exploitation, crypto, web)
  4. Flags extracted and submitted
  5. Session recorded for later analysis
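
Step 2 of this workflow, watching tool output for flag patterns, can be approximated with a regular expression. The sketch below is an assumption about how such a check might look, not the flag_discriminator agent's actual logic; the prefixes in FLAG_PATTERN are placeholders, since each CTF defines its own flag format.

```python
import re

# Assumed flag formats; real CTFs define their own, so treat this pattern as a placeholder.
FLAG_PATTERN = re.compile(r"\b(?:HTB|flag|FLAG|CTF)\{[^{}\n]{1,128}\}")


def extract_flags(tool_output: str) -> list[str]:
    """Return unique flag-like tokens in order of first appearance."""
    seen: dict[str, None] = {}
    for match in FLAG_PATTERN.finditer(tool_output):
        seen.setdefault(match.group(0), None)
    return list(seen)


sample = (
    "uid=0(root)\n"
    "cat /root/flag.txt\n"
    "HTB{example_flag_123}\n"
    "noise flag{second} HTB{example_flag_123}"
)
found = extract_flags(sample)
```

Deduplicating while preserving order matters in practice because the same flag often appears several times in a session transcript, and resubmitting it wastes attempts.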

Agent Handoff Flow

Red Teamer discovers attack surface
    → handoff to codeagent for static analysis
    → handoff to exploit_agent for POC
    → handoff to reporter for CVE documentation

Handoffs transfer execution context and findings between specialized agents without losing state.

Human-in-the-Loop

The REPL provides interactive control. Users can:

  • Interrupt agent execution
  • Inject context ("focus on the web application")
  • Override tool decisions
  • Review and approve dangerous commands (guardrails prompt for confirmation)

Guardrails Flow

  1. Agent proposes tool call (e.g., rm -rf /, destructive network scan)
  2. InputGuardrail evaluates danger level
  3. If dangerous: block or prompt user for explicit confirmation
  4. OutputGuardrail validates response for policy compliance
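
The four steps above can be sketched end to end in plain Python. Every name here (Verdict, input_guardrail, output_guardrail, guarded_call, fake_run, deny) is hypothetical, the pattern lists are illustrative only, and the confirmation callback stands in for the REPL prompt; CAI's own guardrail classes are structured differently.

```python
from dataclasses import dataclass
from typing import Callable

# Illustrative patterns only; substring checks like these are easy to evade.
DANGEROUS_PATTERNS = ("rm -rf", "drop table", "dd if=/dev/zero")
SECRET_MARKERS = ("BEGIN RSA PRIVATE KEY",)


@dataclass
class Verdict:
    allowed: bool
    reason: str


def input_guardrail(command: str) -> Verdict:
    """Step 2: flag commands that match a known destructive pattern."""
    lowered = command.lower()
    for pattern in DANGEROUS_PATTERNS:
        if pattern in lowered:
            return Verdict(False, f"matched dangerous pattern: {pattern!r}")
    return Verdict(True, "no dangerous pattern matched")


def output_guardrail(output: str) -> Verdict:
    """Step 4: reject output that would leak material the policy forbids."""
    for marker in SECRET_MARKERS:
        if marker in output:
            return Verdict(False, f"output contains {marker!r}")
    return Verdict(True, "output passed policy check")


def guarded_call(command: str, run: Callable[[str], str], confirm: Callable[[str], bool]) -> str:
    """Run one proposed tool call (step 1) through both guardrails."""
    verdict = input_guardrail(command)
    # Step 3: a flagged command runs only if the user explicitly confirms it.
    if not verdict.allowed and not confirm(verdict.reason):
        return f"BLOCKED: {verdict.reason}"
    output = run(command)
    checked = output_guardrail(output)
    return output if checked.allowed else f"REDACTED: {checked.reason}"


def fake_run(command: str) -> str:
    """Stand-in for a real tool; it never executes anything."""
    return f"ran {command}"


def deny(reason: str) -> bool:
    """Stand-in for a user who declines every confirmation prompt."""
    return False
```

Passing the confirmation step as a callback keeps the guardrail logic testable without a terminal: an interactive session would supply a function that asks the user, while a batch run could supply deny to fail closed.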

Approval Gates

| Gate | Type |
| --- | --- |
| Dangerous command confirmation | yes-no (interactive REPL) |
| Scope validation (within authorized targets) | Auto-check |
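
An automatic scope check of the kind listed above could look like the following. This is a sketch under the assumption that scope is expressed as CIDR ranges; in_scope and AUTHORIZED are hypothetical names, and how CAI actually represents and validates scope is not shown in the source.

```python
import ipaddress


def in_scope(target: str, authorized: list[str]) -> bool:
    """Return True if target (an IP address) falls inside any authorized CIDR range."""
    try:
        address = ipaddress.ip_address(target)
    except ValueError:
        # Hostnames and malformed input are rejected rather than resolved.
        return False
    for cidr in authorized:
        network = ipaddress.ip_network(cidr, strict=False)
        if address.version == network.version and address in network:
            return True
    return False


# Example ranges only; a real engagement takes these from the signed scope document.
AUTHORIZED = ["10.10.0.0/16", "192.168.56.0/24"]
```

Rejecting hostnames outright is a deliberately conservative choice in this sketch: resolving them first would let a DNS change move a target out of scope between the check and the scan.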

Replay / Audit

Sessions are recorded by the tracing module and can be reviewed with cai-replay (and optionally exported with cai-asciinema). Every tool call, model response, and handoff is logged for post-session audit.

06

Memory Context

CAI (Cybersecurity AI) — Memory & Context

Session Tracing

Every session is recorded via the OpenAI-compatible tracing module (sdk/agents/tracing/). Captures: agent name, tool calls with arguments/results, handoffs, model responses, timing metadata. Stored in .cai/ directory.
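
As a rough picture of what line-oriented trace storage involves, the sketch below writes and reads events as JSON Lines. The record fields (ts, agent, kind, payload) and the helper names are assumptions made for illustration; the actual trace schema used by the tracing module is not reproduced here.

```python
import io
import json
import time
from typing import Any, Iterator, TextIO


def write_event(stream: TextIO, agent: str, kind: str, payload: dict[str, Any]) -> None:
    """Append one trace event as a single JSON line."""
    record = {"ts": time.time(), "agent": agent, "kind": kind, "payload": payload}
    stream.write(json.dumps(record) + "\n")


def read_events(stream: TextIO) -> Iterator[dict[str, Any]]:
    """Yield events back in the order they were written."""
    for line in stream:
        if line.strip():
            yield json.loads(line)


# An in-memory buffer stands in for a trace file under .cai/.
buffer = io.StringIO()
write_event(buffer, "Red Teamer", "tool_call", {"tool": "nmap_scan", "args": {"target": "10.0.0.5"}})
write_event(buffer, "Red Teamer", "handoff", {"to": "Reporter"})
buffer.seek(0)
events = list(read_events(buffer))
```

One event per line means a session can be appended to as it runs and replayed by reading the file top to bottom, without loading or rewriting the whole log.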

Memory Agents

agents/memory.py — memory-augmented agent that persists findings across turns:

  • Accumulates discovered vulnerabilities, credentials, system information
  • Passes memory as context to subsequent turns
  • Enables multi-turn exploitation chains where later turns build on earlier discoveries
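
The accumulate-then-inject behavior described in these bullets can be sketched as follows. FindingsMemory is a hypothetical class written for this example and does not reflect the internals of agents/memory.py.

```python
from dataclasses import dataclass, field


@dataclass
class FindingsMemory:
    """Hypothetical store for facts an agent discovers during a session."""
    findings: list[str] = field(default_factory=list)

    def remember(self, finding: str) -> None:
        # Skip duplicates so repeated tool output does not bloat the prompt.
        if finding not in self.findings:
            self.findings.append(finding)

    def as_context(self) -> str:
        """Render the accumulated findings as text to prepend to the next turn."""
        if not self.findings:
            return "Known findings: none yet."
        lines = "\n".join(f"- {item}" for item in self.findings)
        return f"Known findings:\n{lines}"


memory = FindingsMemory()
memory.remember("Port 22 open (OpenSSH)")
memory.remember("Credentials found: admin / <redacted>")
memory.remember("Port 22 open (OpenSSH)")  # duplicate, ignored
next_turn_prompt = memory.as_context() + "\n\nTask: attempt privilege escalation."
```

Because the findings are rendered into the prompt rather than kept only in tool history, a later turn can act on an earlier discovery even if the raw tool output has scrolled out of the model's context.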

agents/memory_analysis_agent.py — specialized agent for memory dump forensics.

State Files

.cai/ — runtime state directory (gitignored per pyproject.toml)

Contents likely include:

  • Session trace files (JSONL format per OpenAI Agents SDK tracing)
  • Memory state for memory-augmented agents
  • Model cache

Cross-Session State

cai-replay — replay tool for reviewing past sessions. Session recordings enable:

  • Post-session audit trail
  • Training data collection
  • Demonstration/reporting

Context Window Management

No explicit compaction. Context management is delegated to the underlying model. For long-running CTF sessions with many tool calls, context accumulation can be significant.

Memory Persistence Scope

  • Session: Most agent state is per-session (REPL session)
  • Project: Session recordings persist to .cai/ for replay

MCP State

sdk/agents/mcp.py provides MCP server integration. State management follows the MCP protocol's model: JSON-RPC messages exchanged over a stateful client-server connection.

07

Orchestration

CAI (Cybersecurity AI) — Orchestration

Multi-Agent Support

Yes — via handoffs and coordination patterns. Multiple specialized agents (red teamer, code analyst, reporter) can be chained via handoffs.

Orchestration Pattern

Hierarchical: a primary agent (e.g., red teamer) orchestrates specialist agents via handoffs. The agents/patterns/ package additionally includes parallel fan-out patterns for multi-target assessments.

Execution Mode

Interactive loop — the REPL/TUI provides a continuous session where users can interact, the agent runs tools autonomously, and the session persists until the user exits.

Handoff Mechanism

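# Transfer execution to another agent.
# Sketch assuming the handoff() helper of the OpenAI Agents SDK, which CAI's SDK mirrors.
reporter_handoff = handoff(
    agent=reporter_agent,
    tool_name_override="transfer_to_reporter",
    tool_description_override="Transfer to the reporter agent when you have confirmed findings",
)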

Handoffs are implemented as tools the LLM can invoke — the agent "calls" the transfer function, which spawns the target agent with the current context.
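
The "handoff as a tool call" idea can be illustrated without any SDK. In the sketch below, HandoffAgent and run are hypothetical, and a scripted list of tool names stands in for the model's decisions; it shows only the control transfer, not how CAI carries conversation state across it.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class HandoffAgent:
    name: str
    handoffs: list["HandoffAgent"] = field(default_factory=list)

    def handoff_tools(self) -> dict[str, "HandoffAgent"]:
        """Expose each handoff target as a tool named transfer_to_<agent>."""
        return {
            "transfer_to_" + target.name.lower().replace(" ", "_"): target
            for target in self.handoffs
        }


def run(start: HandoffAgent, scripted_calls: list[Optional[str]]) -> list[str]:
    """Follow a scripted sequence of tool calls and return the agents visited.

    scripted_calls stands in for model output: each entry is the tool the
    active agent "decides" to call, or None when it answers directly.
    """
    active = start
    history = [start.name]  # shared record that survives every handoff
    for call in scripted_calls:
        if call is None:
            break
        tools = active.handoff_tools()
        if call not in tools:
            raise ValueError(f"{active.name} has no tool named {call}")
        active = tools[call]
        history.append(active.name)
    return history


reporter = HandoffAgent("Reporter")
code_agent = HandoffAgent("Code Agent", handoffs=[reporter])
red_teamer = HandoffAgent("Red Teamer", handoffs=[code_agent, reporter])
visited = run(red_teamer, ["transfer_to_code_agent", "transfer_to_reporter", None])
```

Since each agent only sees transfer tools for its own handoffs list, the set of reachable specialists is fixed in code rather than left to the model's discretion.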

Multi-Model Support

Yes — model field on each Agent dataclass allows different agents to use different models:

  • Red teamer: powerful reasoning model for complex exploitation
  • Reporter: cheaper model for structured document generation
  • Guardrail evaluator: fast model for safety checks

The 300+ model support via OpenAI-compatible APIs enables mixing providers.

Isolation Mechanism

None built-in. Tool execution happens in the same process as the agent. The dockerized/ directory suggests Docker-based deployment for isolation.

Swarm / Parallel Patterns

agents/patterns/ — includes coordination patterns for scenarios like:

  • Parallel multi-target reconnaissance
  • Swarm-style distributed CTF solving
  • Consensus across multiple agent assessments
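
Parallel multi-target reconnaissance, the first scenario above, is essentially a bounded fan-out. The sketch below shows that shape with asyncio; recon and fan_out are hypothetical, the recon function returns canned data without touching the network, and the actual contents of agents/patterns/ may be organized quite differently.

```python
import asyncio


async def recon(target: str) -> dict[str, object]:
    """Stand-in for one agent's reconnaissance run; performs no network I/O."""
    await asyncio.sleep(0)  # yield control, as a real awaitable tool call would
    ports = [22, 80] if target.endswith(".5") else [443]
    return {"target": target, "open_ports": ports}


async def fan_out(targets: list[str], limit: int = 2) -> list[dict[str, object]]:
    """Run recon on all targets concurrently, at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(target: str) -> dict[str, object]:
        async with semaphore:
            return await recon(target)

    # gather preserves input order, so results line up with targets.
    return await asyncio.gather(*(bounded(t) for t in targets))


results = asyncio.run(fan_out(["10.0.0.5", "10.0.0.6", "10.0.0.7"]))
```

The semaphore is the part worth keeping in any real version: an unbounded fan-out against many hosts can exhaust model rate limits and makes scan traffic much noisier than intended.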

HITL (Human-in-the-Loop)

The interactive REPL allows continuous human oversight. Users can interrupt, redirect, or confirm dangerous actions at any point. This is not durable HITL of the kind Atmosphere provides: it requires human presence at the terminal.

Consensus Mechanism

None formal. Multi-agent findings can be cross-validated by the retester agent, but there is no Byzantine/Raft consensus protocol.

08

UI/CLI Surface

CAI (Cybersecurity AI) — UI/CLI Surface

CLI Binaries

| Binary | Purpose |
| --- | --- |
| cai | Main REPL/TUI: interactive security agent session |
| cai-replay | Replay recorded session files for audit/training |
| cai-asciinema | Export sessions to asciinema format for sharing |
| cai-gif | Convert session recordings to GIF for demos |

REPL/TUI

Exists: Yes
Type: Terminal TUI (not web dashboard)
Port: N/A (terminal application)
Tech Stack: Python TUI (likely rich/textual or custom ANSI; see the repl/ directory)

Features:

  • Interactive agent conversation
  • Tool call display with arguments and results
  • Agent handoff visualization
  • Dangerous command confirmation prompts
  • Session recording indicator
  • Model selection UI

No Web Dashboard

CAI is a CLI-first tool designed for terminal use. No web interface confirmed.

Observability

  • Session traces: JSONL format in .cai/ directory (OpenAI Agents SDK compatible)
  • cai-replay: CLI tool for replaying and reviewing session recordings
  • cai-asciinema: Session sharing via asciinema.org format
  • cai-gif: Visual session recordings for demos/reports

IDE Integration

None. CAI is a standalone CLI tool, not integrated with coding IDEs.

Development Tools

# Install with uv
uv sync

# Run tests
uv run pytest

# CI: the ci/ directory contains CI/CD scripts

API Access

No dedicated REST API surface. The Python SDK (src/cai/sdk/) is the programmatic interface.

MCP Integration

sdk/agents/mcp.py — agents can use MCP servers as tool sources, enabling integration with the broader MCP ecosystem.

Multi-Platform Support

Supported platforms: Linux, macOS, Windows (WSL), Android. Documented in README with platform-specific install instructions.
