
Agentic OS (KbWen)

agentic-os-kbwen · KbWen/agentic-os · ★ 0 · last commit 2026-05-26

Governance-first AI coding framework with classification-locked phase gates, SSoT state machine, and 26 commands + 14 skills that enforce No Evidence = No Completion.

Best when: Gate enforcement through explicit FAIL conditions in prompt instructions (not runtime hooks) is sufficient to prevent phase skipping while preserving zero-ru…
Skip if: Unauthorized refactoring, Completion claims without verifiable evidence
vs seeds: spec-kit's 18-hook mirror pattern.
Primitive shape: 40 total (Commands 26, Skills 14)
00

Summary

agentic-os-kbwen — Agentic OS (KbWen)

Agentic OS by KbWen describes itself as "the governance-first operating system for AI coding agents." It enforces structured workflows through a mandatory phase system (bootstrap → plan → implement → review → test → ship) with hard delivery gates: no evidence means no completion, no gate means no progression. The framework ships 26 slash commands mapped to workflow phases, 14 professional skills (borrowed from the superpowers pattern), and an .agentcortex/ Single Source of Truth (SSoT) state system with per-task Work Logs. Tasks are classified as tiny-fix, quick-win, feature, hotfix, or architecture-change, and each classification requires a different set of phases. A Bash validation script (validate.sh, with optional Python 3.9+ checks) verifies metadata integrity and command sync at deploy time and in CI. The installer, installers/deploy_brain.sh, deploys the framework non-destructively into any existing project.

Compared to the seeds, Agentic OS is closest to superpowers (Archetype 1 — skills-only behavioral framework) but hybridizes with agent-os (Archetype 4) by adding 26 slash commands on top of the skill layer. The key deltas are: a mandatory gate engine that prevents phase skipping, an SSoT state machine with classification-locked task flows, and a deploy-time installer that works against any existing project rather than requiring a template clone.

01

Overview

Overview — Agentic OS (KbWen)

Origin

Agentic OS v1.1.2 is maintained by KbWen. It positions itself as a governance framework that addresses endemic AI agent problems: skipping steps, hallucinating completion, drifting from scope, losing context, and breaking things silently.

Philosophy

From README:

"No Evidence = No Completion. No Gate = No Progression. No Exceptions."

The framework enforces five principles:

  • No Evidence = No Completion — narrative claims are not proof
  • Scope Discipline — unauthorized refactoring is strictly prohibited
  • Destructive Command Blocking: rm -rf, git reset --hard, and force pushes require pre-approved rollback plans
  • OWASP Top 10 Auto-Scan — security checks run during /implement and /review
  • Confidence Gate — AI must declare confidence level; low confidence triggers escalation
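
To make the Destructive Command Blocking principle concrete, here is a minimal sketch of a check for the three command families named in the list. The names `DESTRUCTIVE_PATTERNS`, `is_destructive`, and `command_gate`, and the regular expressions themselves, are assumptions for illustration. The framework states this rule in prompt text and does not ship it as code.

```python
import re

# Hypothetical patterns covering the command families named in the principle.
DESTRUCTIVE_PATTERNS = [
    # rm with both the recursive and force flags in one option group (-rf, -fr, -Rf)
    re.compile(r"\brm\s+-(?=\w*r)(?=\w*f)\w+", re.IGNORECASE),
    re.compile(r"\bgit\s+reset\s+--hard\b"),
    # git push with --force (including --force-with-lease) or -f
    re.compile(r"\bgit\s+push\b.*(?:\s--force\b|\s-f\b)"),
]


def is_destructive(command: str) -> bool:
    """Return True if the command matches any destructive pattern."""
    return any(p.search(command) for p in DESTRUCTIVE_PATTERNS)


def command_gate(command: str, rollback_plan_approved: bool) -> str:
    """FAIL a destructive command unless a rollback plan was approved beforehand."""
    if is_destructive(command) and not rollback_plan_approved:
        return "FAIL"
    return "PASS"
```

Under these assumptions, `command_gate("rm -rf build/", rollback_plan_approved=False)` returns `"FAIL"`, while the same command with an approved rollback plan returns `"PASS"`.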

Manifesto-Style Quotes

From AGENTS.md:

"Correctness first. MUST NOT claim completion without verifiable evidence. Small, reversible changes. UNAUTHORIZED REFACTORING STRICTLY PROHIBITED."

"No Bypass Rule: MUST NOT skip Gate/Evidence checks — unknown status = FAIL."

"Read-Once Discipline: Read governance files once at session start; do NOT re-read in later turns."

From README gate diagram:

   Intent          Gate           Workflow         Evidence        Ship
  ┌──────┐      ┌──────┐       ┌──────────┐     ┌──────────┐   ┌──────┐
  │ User │ ───▸ │ Gate │ ───▸  │ Workflow  │ ──▸ │ Evidence │ ─▸│ Ship │
  │ says │      │Engine│       │ + Skills  │     │ Required │   │ SSoT │
  └──────┘      └──────┘       └──────────┘     └──────────┘   └──────┘
                  │ FAIL                           │ FAIL
                  ▼                                ▼
               ⛔ STOP                          ⛔ STOP

Task Classification System

| Classification | Required Phases |
| --- | --- |
| tiny-fix | Classify → Execute → Evidence → Done |
| quick-win | Bootstrap → Plan → Implement → Evidence → Ship |
| feature | Bootstrap → Spec → Plan → Implement → Review → Test → Handoff → Ship |
| hotfix | Bootstrap → Research → Plan → Implement → Review → Test → Ship |
| architecture-change | Bootstrap → ADR → Spec → Plan → Implement → Review → Test → Handoff → Ship |
02

Architecture

Architecture — Agentic OS (KbWen)

Distribution

  • Type: Standalone repository with a deploy installer
  • Install (first time):
    git clone https://github.com/KbWen/agentic-os.git
    ./agentic-os/installers/deploy_brain.sh --dry-run /path/to/your-project
    ./agentic-os/installers/deploy_brain.sh /path/to/your-project
    
  • Update: bash installers/deploy_brain.sh . (runs from within target project)
  • Non-destructive: existing AGENTS.md, CLAUDE.md etc. are NOT overwritten
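
The non-destructive and dry-run behavior described above can be sketched as follows. This is not the logic of deploy_brain.sh, which is a Bash script whose internals are not shown here. The function name `deploy` and the report layout are hypothetical, and the sketch only illustrates the idea of never overwriting a file that already exists in the target project.

```python
import shutil
from pathlib import Path


def deploy(source: Path, target: Path, dry_run: bool = False) -> dict:
    """Copy files from source into target without overwriting existing files."""
    report = {"copied": [], "skipped": []}
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src.relative_to(source)
        dest = target / rel
        if dest.exists():
            # An existing file (for example a user's own AGENTS.md) is left untouched.
            report["skipped"].append(rel.as_posix())
            continue
        report["copied"].append(rel.as_posix())
        if not dry_run:
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)
    return report
```

With `dry_run=True` the function returns the same report it would produce on a real run but writes nothing, which mirrors the preview step in the install commands above.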

Directory Tree (after deploy into target project)

<your-project>/
├── AGENTS.md                         # Global governance directives
├── CLAUDE.md                         # Claude Code wrapper
├── .agent/
│   ├── rules/
│   │   ├── engineering_guardrails.md # Classification tiers, gate rules
│   │   ├── security_guardrails.md    # OWASP checks, destructive command blocks
│   │   └── state_machine.md          # Phase transitions
│   └── workflows/
│       ├── bootstrap.md
│       ├── plan.md
│       ├── implement.md
│       ├── review.md
│       ├── test.md
│       └── ship.md
├── .agentcortex/
│   ├── context/
│   │   ├── current_state.md          # SSoT (Single Source of Truth)
│   │   └── work/
│   │       └── <branch-name>.md      # Per-task Work Log
│   ├── bin/
│   │   └── validate.sh               # Deploy-time + CI validation
│   └── docs/
│       └── guides/
│           └── token-governance.md
├── .agents/
│   ├── skills/                       # 14 canonical skills (SKILL.md format)
│   └── workflows/
├── .claude/
│   ├── commands/                     # 26 slash commands
│   └── settings.json
└── installers/
    └── deploy_brain.sh

Required Runtime

  • Bash (for installer and validate.sh)
  • Python 3.9+ (recommended for full validate.sh — degrades gracefully without it)
  • SHA-256 tool (sha256sum, shasum, or openssl)
  • Git

Target AI Tools

Claude Code (primary), Cursor, GitHub Copilot, Google Antigravity, Codex — via AGENTS.md + per-agent wrappers.

03

Components

Components — Agentic OS (KbWen)

Slash Commands (26, in .claude/commands/)

| Command | Purpose |
| --- | --- |
| adr.md | Architecture Decision Record workflow |
| app-init.md | Initialize project-specific rules extensions |
| ask-openrouter.md | Query OpenRouter models |
| audit.md | Read-only codebase audit (zero-risk entry path) |
| bootstrap.md | Mandatory task bootstrap — reads SSoT, classifies task |
| brainstorm.md | Structured brainstorming |
| claude-cli.md | Claude CLI integration |
| codex-cli.md | Codex CLI integration |
| decide.md | Structured decision-making |
| govern-docs.md | Governance document management |
| handoff.md | Hand off between AI sessions |
| help.md | Help and command reference |
| hotfix.md | Hotfix workflow |
| implement.md | Implementation phase workflow |
| plan.md | Planning phase workflow |
| research.md | Research phase workflow |
| retro.md | Retrospective workflow |
| review.md | Code review phase workflow |
| ship.md | Ship phase — requires handoff evidence |
| spec-intake.md | New feature specification intake |
| spec.md | Spec writing workflow |
| sync-docs.md | Sync documentation |
| test-classify.md | Test classification |
| test-skeleton.md | Generate test skeleton |
| test.md | Test phase workflow |
| worktree-first.md | Git worktree first workflow |

Skills (14, in .agents/skills/)

| Skill | Trigger / Description |
| --- | --- |
| api-design | API endpoint detection |
| auth-security | Auth code detection |
| database-design | Migration detection |
| dispatching-parallel-agents | Complex multi-module tasks |
| doc-lookup | Documentation needed |
| frontend-patterns | UI component detection |
| karpathy-principles | General coding guidance |
| production-readiness | Pre-ship readiness check |
| red-team-adversarial | Review/test classification |
| subagent-driven-development | Multi-agent coordination |
| systematic-debugging | Bug encounter |
| test-driven-development | Feature/architecture-change tasks |
| using-git-worktrees | Parallel branch work |
| verification-before-completion | /ship phase |

Validation Script

  • .agentcortex/bin/validate.sh — checks metadata integrity, encoding, command sync; runs at deploy time and in CI

State Files

| File | Purpose |
| --- | --- |
| .agentcortex/context/current_state.md | SSoT: global project state, ship history, decisions |
| .agentcortex/context/work/<branch>.md | Per-task Work Log: progress, evidence, gate receipts |
| docs/specs/_product-backlog.md | Multi-feature backlog |

Hooks

None — .claude/settings.json comment confirms: "Sentinel and Phase Summary enforcement is model self-attestation per AGENTS.md — no Python hooks shipped, keeping downstream zero-runtime-dep."

05

Prompts

Prompts — Agentic OS (KbWen)

Excerpt 1: /bootstrap Command

Source: .claude/commands/bootstrap.md
Technique: Sequential workflow delegation with mandatory stop-after-output gate

# /bootstrap

Execute the canonical workflow: `.agent/workflows/bootstrap.md`

## Required reads before execution

1. `AGENTS.md` — global directives (Intent Router, Gate Engine, Sentinel)
2. `.agent/rules/engineering_guardrails.md` — classification tiers and gate rules
3. `.agent/rules/state_machine.md` — phase transitions
4. `.agentcortex/context/current_state.md` — SSoT

## Execution

Follow every step in `.agent/workflows/bootstrap.md` sequentially.
The user's task description is: $ARGUMENTS

- Do NOT skip any steps.
- Do NOT proceed past bootstrap in the same turn.
- Output the bootstrap report, then STOP and ask user for next step.
- End response with ⚡ ACX.

Excerpt 2: /ship Command

Source: .claude/commands/ship.md
Technique: Hard evidence gate — missing fields = explicit FAIL, no fallback

# /ship

Execute the canonical workflow: `.agent/workflows/ship.md`

## Required reads before execution

1. `AGENTS.md` — global directives (Delivery Gates, No Evidence = No Ship)
2. `.agent/rules/engineering_guardrails.md` — §10.5 Handoff/Ship Hard Gate
3. `.agent/rules/security_guardrails.md` — final security check
4. Active Work Log — must contain handoff references:
   `ship:[doc=<path>][code=<path>][log=<path>]`

## Execution

Follow every step in `.agent/workflows/ship.md` sequentially.
If any gate field is missing, FAIL the gate and list missing fields — do NOT proceed.
End response with ⚡ ACX.

Excerpt 3: TDD Skill

Source: .agents/skills/test-driven-development/SKILL.md
Technique: Red-Green-Refactor cycle enforcement with Ironclad Rules

# Test-Driven Development

## Workflow

1. **Red**: Write a failing test describing the expected behavior.
2. **Green**: Write the minimal code to make the test pass.
3. **Refactor**: Clean up naming, structure, and duplication.
4. Repeat the cycle until acceptance criteria are met.

## Ironclad Rules

- Do NOT write massive amounts of features before writing tests.
- Focus on one small goal per cycle.
- All tests MUST pass after refactoring.

Excerpt 4: AGENTS.md Core Directive

Source: AGENTS.md
Technique: Token governance with read-once discipline and explicit re-read audit trail

**Read-Once Discipline**: Read governance files once at session start; do NOT re-read in later turns. **Safety Valve**: On genuine rule uncertainty, re-read ONE `##`-section only — MUST log in `## Drift Log` as `- Re-read: <file> §<section> — reason: <1-line>`. Un-logged re-reads = Token Leak violation.

**Response Brevity & Budget**: Short, information-dense output. No preamble/postamble. Expand only for gate blocks, plan artifacts, or ship evidence. Hard cap: ≤8 lines prose + required structured blocks.
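
The Drift Log entry format quoted in the Safety Valve rule lends itself to a mechanical check. The sketch below formats an entry in that shape and reports re-reads that have no matching entry. The function names `format_reread` and `unlogged_rereads` and the regular expression are assumptions for illustration, and the framework itself relies on the model to keep this log.

```python
import re

# Mirrors the quoted shape: - Re-read: <file> §<section> — reason: <1-line>
DRIFT_ENTRY = re.compile(
    r"^- Re-read: (?P<file>\S+) §(?P<section>.+?) — reason: (?P<reason>[^\n]+)$"
)


def format_reread(file: str, section: str, reason: str) -> str:
    """Build one Drift Log line; the reason must fit on a single line."""
    if "\n" in reason:
        raise ValueError("reason must be a single line")
    return f"- Re-read: {file} §{section} — reason: {reason}"


def unlogged_rereads(rereads: list[tuple[str, str]], drift_log: str) -> list[tuple[str, str]]:
    """Return the (file, section) re-reads that have no Drift Log entry."""
    logged = set()
    for line in drift_log.splitlines():
        match = DRIFT_ENTRY.match(line.strip())
        if match:
            logged.add((match["file"], match["section"]))
    return [r for r in rereads if r not in logged]
```

Any pair returned by `unlogged_rereads` corresponds to what the directive calls a Token Leak violation.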
09

Uniqueness

Uniqueness — Agentic OS (KbWen)

Differs From Seeds

Agentic OS is closest to superpowers (Archetype 1 — skills-only behavioral framework) in its 14 professional skills, several of which share names with superpowers skills (test-driven-development, systematic-debugging, using-git-worktrees, verification-before-completion). However, Agentic OS is a hybrid: it adds 26 slash commands on top of the skill layer, creating a command-and-skill architecture unlike any pure seed archetype. It further hybridizes with agent-os (Archetype 4) in its deploy-into-project installer pattern and SSoT state machine. The distinctive addition is the mandatory gate engine with classification-locked phase chains: where superpowers provides behavioral guidelines, Agentic OS states its phase gates as explicit FAIL conditions in the prompt ("If any gate field is missing, FAIL the gate and list missing fields — do NOT proceed"). Enforcement is at the prompt level only, so the gates hold only as long as the model follows the instruction. The ⚡ ACX end-of-response marker is a session-wide sentinel the validate.sh script can check.

Positioning

This framework occupies a "governance-first" niche between superpowers (pure behavioral) and kiro (IDE-enforced gates). It imposes kiro-like mandatory phase flows without requiring a specialized IDE, relying instead on model self-attestation with explicit failure conditions. The zero-runtime-dependency design choice (no Python hooks) is explicitly documented as a downstream compatibility decision.

Observable Failure Modes

  1. Model self-attestation gap: Gate enforcement relies on the model following instructions, not on code-level enforcement. A model that ignores "FAIL the gate" will proceed anyway.
  2. Classification lock friction: Reclassification requires rolling back to CLASSIFIED state, which could frustrate users with evolving requirements.
  3. Read-Once Discipline violations: The Drift Log requirement for re-reading governance files is easy to forget; token leakage from re-reads is a stated risk.
  4. Multi-agent Work Log collision: The "One Branch = One Owner" rule prevents concurrent corruption but requires users to maintain branch discipline.
  5. Skill auto-trigger ambiguity: 14 skills that "auto-activate based on task classification and workflow phase" could fire unexpectedly on misclassified tasks.
04

Workflow

Workflow — Agentic OS (KbWen)

Phase System

The framework enforces a mandatory phase system based on task classification:

/bootstrap → classify → route to phase chain → /ship

Phase Chains by Classification

| Classification | Phase Chain |
| --- | --- |
| tiny-fix | classify → execute → evidence → done |
| quick-win | bootstrap → plan → implement → evidence → ship |
| feature | bootstrap → spec → plan → implement → review → test → handoff → ship |
| hotfix | bootstrap → research → plan → implement → review → test → ship |
| architecture-change | bootstrap → ADR → spec → plan → implement → review → test → handoff → ship |
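
The phase chains above can be read as a lookup table plus a "no skipping" rule. The sketch below encodes the five chains exactly as listed and rejects any history that is not a prefix of the chain. The names `PHASE_CHAINS` and `next_phase` are hypothetical, and the framework expresses this rule in its prompt files and does not ship it as code.

```python
# Phase chains copied from the classification table (lowercased).
PHASE_CHAINS = {
    "tiny-fix": ["classify", "execute", "evidence", "done"],
    "quick-win": ["bootstrap", "plan", "implement", "evidence", "ship"],
    "feature": ["bootstrap", "spec", "plan", "implement",
                "review", "test", "handoff", "ship"],
    "hotfix": ["bootstrap", "research", "plan", "implement",
               "review", "test", "ship"],
    "architecture-change": ["bootstrap", "adr", "spec", "plan", "implement",
                            "review", "test", "handoff", "ship"],
}


def next_phase(classification: str, completed: list[str]):
    """Return the next required phase, or None when the chain is finished.

    Raises ValueError if the completed phases skip or reorder the chain.
    """
    chain = PHASE_CHAINS[classification]
    if completed != chain[:len(completed)]:
        raise ValueError(f"phase history {completed} does not follow {chain}")
    return chain[len(completed)] if len(completed) < len(chain) else None
```

For example, `next_phase("hotfix", ["bootstrap", "research"])` returns `"plan"`, while a feature task that tries to go from bootstrap straight to plan raises `ValueError` because spec was skipped.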

Phase-to-Artifact Map

| Phase | Artifact |
| --- | --- |
| bootstrap | .agentcortex/context/work/<branch>.md (Work Log created) |
| spec | docs/specs/<feature>.md |
| plan | Plan section in Work Log |
| adr | docs/adrs/<date>-<title>.md |
| implement | Code changes + checkpoints in Work Log |
| review | Review evidence in Work Log |
| test | Test evidence in Work Log |
| handoff | Handoff references in Work Log (ship:[doc=...][code=...][log=...]) |
| ship | current_state.md updated, backlog entry closed |

Approval Gates

  1. bootstrap: AI classifies task, user can override classification
  2. plan: Plan reviewed before implementation begins
  3. ship: Hard gate — Work Log must contain ship:[doc=<path>][code=<path>][log=<path>] references; missing fields = FAIL, do NOT proceed
  4. confidence gate: AI must declare confidence level; low confidence triggers escalation to user
  5. destructive command gate: rm -rf, git reset --hard, force pushes require pre-approved rollback plan
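
The ship gate in item 3 is the most mechanical of these gates, so it is a natural candidate for a small illustration. The sketch below scans Work Log text for the `ship:[doc=...][code=...][log=...]` reference and lists missing fields. The names `check_ship_gate`, `SHIP_LINE`, and `FIELD` are assumptions, as is the choice to treat an empty value as missing. Paths containing spaces or `]` are not handled.

```python
import re

REQUIRED_FIELDS = ("doc", "code", "log")
# One "ship:" tag followed by one or more [key=value] groups.
SHIP_LINE = re.compile(r"ship:((?:\[[a-z]+=[^\]\s]*\])+)")
FIELD = re.compile(r"\[([a-z]+)=([^\]\s]*)\]")


def check_ship_gate(work_log: str) -> tuple[str, list[str]]:
    """Return ("PASS", []) or ("FAIL", [missing fields]) for a Work Log."""
    found = {}
    for match in SHIP_LINE.finditer(work_log):
        for key, value in FIELD.findall(match.group(1)):
            if value:  # an empty value counts as missing
                found[key] = value
    missing = [name for name in REQUIRED_FIELDS if name not in found]
    return ("PASS" if not missing else "FAIL", missing)
```

A Work Log with no `ship:` reference at all fails with all three fields listed, in line with the "unknown status = FAIL" rule quoted from AGENTS.md.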

Gate Types

  • Evidence gates: Require verifiable artifacts (test results, coverage numbers, security scan output)
  • Handoff gates: Require explicit cross-reference tags in Work Log before shipping
  • Confidence gates: Self-attestation with escalation path
  • Classification lock: Task classification frozen after bootstrap; re-classification requires rollback to CLASSIFIED state
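
The classification lock can be pictured as a small state machine. In the sketch below, only the state name CLASSIFIED comes from the description above. The class `Task`, the exception `ClassificationLocked`, and the later state name used in the usage note are hypothetical.

```python
class ClassificationLocked(Exception):
    """Raised when reclassification is attempted outside the CLASSIFIED state."""


class Task:
    def __init__(self, classification: str):
        self.classification = classification
        self.state = "CLASSIFIED"  # the only state name taken from the source

    def advance(self, new_state: str) -> None:
        """Move to a later phase state; the classification is now frozen."""
        self.state = new_state

    def rollback(self) -> None:
        """Return to CLASSIFIED so the task can be reclassified."""
        self.state = "CLASSIFIED"

    def reclassify(self, classification: str) -> None:
        if self.state != "CLASSIFIED":
            raise ClassificationLocked(
                "classification is locked; roll back to CLASSIFIED first"
            )
        self.classification = classification
```

In this sketch, calling `reclassify` after `advance("PLANNED")` raises `ClassificationLocked`, and it succeeds again only after `rollback()`.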

Token Efficiency Rules (from AGENTS.md)

  • Read-Once Discipline: Read governance files once at session start; re-read only with explicit ## Drift Log entry
  • Context Pruning: At 8+ turns on same task, proactively suggest handoff + new conversation
  • Conditional Loading: tiny-fix skips guardrails (~5,000 tokens saved)
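
Two of these rules reduce to simple predicates, sketched below. The file list is taken from the required reads in the /bootstrap and /ship excerpts. Treating "guardrails" as the two `*_guardrails.md` files is an assumption, as are the names `files_to_load` and `should_suggest_handoff`.

```python
GOVERNANCE_FILES = [
    "AGENTS.md",
    ".agent/rules/engineering_guardrails.md",
    ".agent/rules/security_guardrails.md",
    ".agent/rules/state_machine.md",
]
HANDOFF_TURN_THRESHOLD = 8  # "At 8+ turns on same task"


def files_to_load(classification: str) -> list[str]:
    """Conditional Loading: tiny-fix skips the guardrail files (assumed mapping)."""
    if classification == "tiny-fix":
        return [f for f in GOVERNANCE_FILES if "guardrails" not in f]
    return list(GOVERNANCE_FILES)


def should_suggest_handoff(turns_on_task: int) -> bool:
    """Context Pruning: suggest handoff + new conversation at the threshold."""
    return turns_on_task >= HANDOFF_TURN_THRESHOLD
```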
06

Memory Context

Memory & Context — Agentic OS (KbWen)

Single Source of Truth (SSoT)

Agentic OS uses a two-file state architecture:

Global State

  • File: .agentcortex/context/current_state.md
  • Contains: Global project state, decisions, lessons, ship history, active spec index
  • Updated by: Only /ship command (via guard_context_write.py if Python available, or direct append with Drift Log entry)
  • Read by: Every session at start

Per-Task Work Logs

  • File: .agentcortex/context/work/<branch-name>.md
  • Contains: Per-task progress, evidence, gate receipts, phase summaries
  • Updated by: Agent during active task
  • Isolation: "One Branch = One Owner" — prevents concurrent Work Log corruption

Multi-Agent State Safety

  • Advisory Locking: Lock files signal active sessions without blocking
  • Ship Guard: Checks for SSoT conflicts before merging
  • Session Identity: Every AI session writes its model name and timestamp
  • Write Isolation: Agents write only to their own Work Log; only /ship updates SSoT
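
Advisory locking combined with session identity can be sketched as a file that records who is active without ever refusing access. The lock file name (`<branch>.lock.json`), its JSON layout, and the function `write_advisory_lock` are all hypothetical. The source only states that lock files signal active sessions without blocking and that each session writes its model name and timestamp.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_advisory_lock(work_dir: Path, branch: str, model: str) -> list[dict]:
    """Record this session and return other sessions already listed.

    Advisory only: the caller decides what to do with the returned sessions,
    and nothing here blocks or raises when another session is present.
    """
    lock_path = work_dir / f"{branch}.lock.json"  # hypothetical file name
    sessions = json.loads(lock_path.read_text()) if lock_path.exists() else []
    others = [s for s in sessions if s["model"] != model]
    sessions.append({
        "model": model,
        "started": datetime.now(timezone.utc).isoformat(),
    })
    lock_path.write_text(json.dumps(sessions, indent=2))
    return others
```

A non-empty return value is a signal to warn the user that "One Branch = One Owner" may be violated. It is not an error. Branch names containing `/` would need escaping before being used as a file name, which this sketch leaves out.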

Context Loading Discipline

From AGENTS.md:

  • Init Read: MUST read current_state.md (SSoT) + active Work Log at session start
  • Exception: tiny-fix tasks MAY skip SSoT and Work Log reads
  • Prohibited: Blind directory scanning (ls -R .agentcortex/context/)
  • Context Pruning: At 8+ turns on same task → suggest handoff + new conversation

Skill Context

Skills use metadata-first loading: agent reads name, description, path, and agents/openai.yaml metadata before loading full SKILL.md. This reduces per-session token cost when many skills are registered.
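
A minimal sketch of metadata-first loading follows. It assumes each SKILL.md begins with a front-matter block delimited by `---` lines containing `name:` and `description:` keys. That layout is an assumption, since the excerpt above does not show one, and the sketch does not read the agents/openai.yaml metadata. The function names are hypothetical.

```python
from pathlib import Path


def read_skill_metadata(skill_md: Path) -> dict:
    """Read only the front-matter block of a SKILL.md, stopping before the body."""
    meta = {}
    with skill_md.open(encoding="utf-8") as fh:
        if fh.readline().strip() != "---":
            return meta  # no front matter: nothing to index
        for line in fh:
            if line.strip() == "---":
                break  # end of metadata; the body is never read
            key, sep, value = line.partition(":")
            if sep:
                meta[key.strip()] = value.strip()
    return meta


def build_index(skills_dir: Path) -> dict:
    """Map skill name to description and path without loading skill bodies."""
    index = {}
    for skill_md in sorted(skills_dir.glob("*/SKILL.md")):
        meta = read_skill_metadata(skill_md)
        index[meta.get("name", skill_md.parent.name)] = {
            "description": meta.get("description", ""),
            "path": skill_md.as_posix(),
        }
    return index
```

The full SKILL.md would then be read only for the skill that is actually triggered, which is where the per-session token saving comes from.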

Compaction Handling

Explicit compaction guidance in AGENTS.md:

  • Phase Summary: 1-line compact summary per phase written to Work Log for low-token resume
  • Archived Work Logs: Reviewed by validate.sh
  • Token cap: Hard cap of ≤8 lines prose per response

Cross-Session Handoff

Yes — Work Logs persist between sessions. The handoff.md command writes structured handoff references. /resume_handoff (pattern referenced in commands) restores context from Work Log.

Product Backlog

  • File: docs/specs/_product-backlog.md
  • Purpose: Living index for multi-feature work; updated by bootstrap checks and ship
  • Token cost: ~200 tokens (free-read)
07

Orchestration

Orchestration — Agentic OS (KbWen)

Multi-Agent Support

Yes — framework has explicit multi-agent skills:

  • dispatching-parallel-agents skill: coordinated subagent execution for complex tasks
  • subagent-driven-development skill: multi-agent coordination for multi-module tasks

Multi-Agent Rules

  • One Branch = One Owner: prevents concurrent Work Log corruption
  • Advisory Locking: lock files signal active sessions without blocking
  • Ship Guard: checks for SSoT conflicts before merging
  • Session Identity: every AI session writes its model name and timestamp

Orchestration Pattern

Sequential (per phase chain) with optional parallel fan-out via dispatching-parallel-agents skill.

Isolation Mechanism

Git worktree per feature (via using-git-worktrees skill and worktree-first.md command). git-worktrees isolation is documented and skill-supported but not automatically enforced.

Multi-Model Support

No multi-model routing configured. Framework is model-agnostic; model selected by user's runtime.

Execution Mode

Interactive-loop — user invokes commands sequentially per phase. The framework enforces that each command stops after completion and does not proceed automatically to the next phase.

Consensus Mechanism

None (no distributed consensus).

Prompt Chaining

Yes — bootstrap output becomes input for plan command; plan artifacts gate implement; implement evidence gates review; review gates test; test gates ship. Each phase reads the Work Log from the previous phase.

Cross-Tool Portability

High — via AGENTS.md (shared), plus per-runtime adapters. README lists: Claude Code, Cursor, GitHub Copilot, Google Antigravity, Codex.

08

UI CLI Surface

UI & CLI Surface — Agentic OS (KbWen)

Dedicated CLI Binary

No dedicated CLI binary. Interaction is via AI agent slash commands (/bootstrap, /plan, etc.) in the user's chosen IDE or agent runtime.

Local Web Dashboard

None.

Deploy Installer

installers/deploy_brain.sh is a bash installer that:

  • Accepts --dry-run flag for preview
  • Deploys framework files non-destructively into any existing project
  • Auto-fetches latest version from GitHub on update runs
  • Prints deployment manifest and status

Validation Script

.agentcortex/bin/validate.sh:

  • Checks metadata integrity, encoding, command sync
  • Runs at deploy time and in CI (GitHub Actions)
  • Degrades gracefully without Python 3.9+ (--no-python flag)
  • Reports WARN instead of FAIL for Python-dependent checks when Python unavailable

CI Integration

GitHub Actions workflows at .github/workflows/validate.yml and .github/workflows/security.yml — CI status badges visible in README.

IDE Integration

No dedicated extension. Works in any agent runtime via file-based commands and skills.

Observability

  • Work Logs (.agentcortex/context/work/<branch>.md) provide human-readable per-task audit trail
  • SSoT (current_state.md) provides global project state history
  • validate.sh provides deploy-time and CI integrity checks
  • No structured JSONL log, no replay capability
