Agentic OS (KbWen)

agentic-os-kbwen · KbWen/agentic-os · ★ 0 · last commit 2026-05-26

Governance-first AI coding framework with classification-locked phase gates, SSoT state machine, and 26 commands + 14 skills that enforce No Evidence = No Completion.

Best when: Gate enforcement through explicit FAIL conditions in prompt instructions (not runtime hooks) is sufficient to prevent phase skipping while preserving zero-ru…

Skip if: Unauthorized refactoring, Completion claims without verifiable evidence

vs seeds

spec-kit's 18-hook mirror pattern.

Primitive shape 40 total

Commands 26 Skills 14

Summary

Agentic OS by KbWen describes itself as "the governance-first operating system for AI coding agents." It enforces structured workflows through a mandatory phase system (bootstrap → plan → implement → review → test → ship) with hard delivery gates: no evidence means no completion, no gate means no progression. The framework ships 26 slash commands mapped to workflow phases, 14 professional skills (borrowed from the superpowers pattern), and an .agentcortex/ Single Source of Truth (SSoT) state system with per-task Work Logs. Tasks are classified as tiny-fix, quick-win, feature, hotfix, or architecture-change, and each classification has its own required set of phases. A Bash validation script (validate.sh), which uses Python 3.9+ for its full set of checks when available, verifies metadata integrity and command sync at deploy time and in CI. The installer, installers/deploy_brain.sh, deploys the framework non-destructively into any existing project.

Compared to the seeds, Agentic OS is closest to superpowers (Archetype 1, a skills-only behavioral framework) but hybridizes with agent-os (Archetype 4) by adding 26 slash commands on top of the skill layer. The key deltas are: a mandatory gate engine intended to prevent phase skipping, an SSoT state machine with classification-locked task flows, and a deploy-time installer that works against any existing project rather than requiring a template clone.

Overview

Origin

Agentic OS v1.1.2 is maintained by KbWen. It positions itself as a governance framework that addresses five recurring AI agent failure modes: skipping steps, hallucinating completion, drifting from scope, losing context, and breaking things silently.

Philosophy

From README:

"No Evidence = No Completion. No Gate = No Progression. No Exceptions."

The framework enforces five principles:

No Evidence = No Completion — narrative claims are not proof
Scope Discipline — unauthorized refactoring is strictly prohibited
Destructive Command Blocking — rm -rf, git reset --hard, force pushes require pre-approved rollback plans
OWASP Top 10 Auto-Scan — security checks run during /implement and /review
Confidence Gate — AI must declare confidence level; low confidence triggers escalation

Manifesto-Style Quotes

From AGENTS.md:

"Correctness first. MUST NOT claim completion without verifiable evidence. Small, reversible changes. UNAUTHORIZED REFACTORING STRICTLY PROHIBITED."

"No Bypass Rule: MUST NOT skip Gate/Evidence checks — unknown status = FAIL."

"Read-Once Discipline: Read governance files once at session start; do NOT re-read in later turns."

From README gate diagram:

   Intent          Gate           Workflow         Evidence        Ship
  ┌──────┐      ┌──────┐       ┌──────────┐     ┌──────────┐   ┌──────┐
  │ User │ ───▸ │ Gate │ ───▸  │ Workflow  │ ──▸ │ Evidence │ ─▸│ Ship │
  │ says │      │Engine│       │ + Skills  │     │ Required │   │ SSoT │
  └──────┘      └──────┘       └──────────┘     └──────────┘   └──────┘
                  │ FAIL                           │ FAIL
                  ▼                                ▼
               ⛔ STOP                          ⛔ STOP

Task Classification System

Classification	Required Phases
tiny-fix	Classify → Execute → Evidence → Done
quick-win	Bootstrap → Plan → Implement → Evidence → Ship
feature	Bootstrap → Spec → Plan → Implement → Review → Test → Handoff → Ship
hotfix	Bootstrap → Research → Plan → Implement → Review → Test → Ship
architecture-change	Bootstrap → ADR → Spec → Plan → Implement → Review → Test → Handoff → Ship
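
The table above can be read as a lookup from a classification to an ordered phase chain. The following is a minimal Python sketch of that reading, not the framework's implementation (Agentic OS expresses these chains in prompt files, not code). The names `PHASE_CHAINS` and `next_phase` are hypothetical and introduced only for illustration.

```python
from typing import List, Optional

# Phase chains copied from the classification table, lowercased for lookup.
PHASE_CHAINS = {
    "tiny-fix": ["classify", "execute", "evidence", "done"],
    "quick-win": ["bootstrap", "plan", "implement", "evidence", "ship"],
    "feature": ["bootstrap", "spec", "plan", "implement",
                "review", "test", "handoff", "ship"],
    "hotfix": ["bootstrap", "research", "plan", "implement",
               "review", "test", "ship"],
    "architecture-change": ["bootstrap", "adr", "spec", "plan", "implement",
                            "review", "test", "handoff", "ship"],
}


def next_phase(classification: str, completed: List[str]) -> Optional[str]:
    """Return the next required phase, or None when the chain is finished.

    Hypothetical helper: raises ValueError when `completed` is not a prefix
    of the chain, which means a phase was skipped or reordered.
    """
    chain = PHASE_CHAINS[classification]
    if completed != chain[: len(completed)]:
        raise ValueError(
            f"phase order violated for {classification}: {completed}"
        )
    if len(completed) < len(chain):
        return chain[len(completed)]
    return None
```

For example, a quick-win task that has finished bootstrap and plan would be routed to implement, while a feature task that tries to go from bootstrap straight to plan (skipping spec) would be rejected.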

Architecture

Distribution

Type: Standalone repository with a deploy installer

Install (first time):

git clone https://github.com/KbWen/agentic-os.git
./agentic-os/installers/deploy_brain.sh --dry-run /path/to/your-project
./agentic-os/installers/deploy_brain.sh /path/to/your-project

Update: bash installers/deploy_brain.sh . (runs from within target project)
Non-destructive: existing AGENTS.md, CLAUDE.md etc. are NOT overwritten

Directory Tree (after deploy into target project)

<your-project>/
├── AGENTS.md                         # Global governance directives
├── CLAUDE.md                         # Claude Code wrapper
├── .agent/
│   ├── rules/
│   │   ├── engineering_guardrails.md # Classification tiers, gate rules
│   │   ├── security_guardrails.md    # OWASP checks, destructive command blocks
│   │   └── state_machine.md          # Phase transitions
│   └── workflows/
│       ├── bootstrap.md
│       ├── plan.md
│       ├── implement.md
│       ├── review.md
│       ├── test.md
│       └── ship.md
├── .agentcortex/
│   ├── context/
│   │   ├── current_state.md          # SSoT (Single Source of Truth)
│   │   └── work/
│   │       └── <branch-name>.md      # Per-task Work Log
│   ├── bin/
│   │   └── validate.sh               # Deploy-time + CI validation
│   └── docs/
│       └── guides/
│           └── token-governance.md
├── .agents/
│   ├── skills/                       # 14 canonical skills (SKILL.md format)
│   └── workflows/
├── .claude/
│   ├── commands/                     # 26 slash commands
│   └── settings.json
└── installers/
    └── deploy_brain.sh

Required Runtime

Bash (for installer and validate.sh)
Python 3.9+ (recommended for full validate.sh — degrades gracefully without it)
SHA-256 tool (sha256sum, shasum, or openssl)
Git

Target AI Tools

Claude Code (primary), Cursor, GitHub Copilot, Google Antigravity, Codex — via AGENTS.md + per-agent wrappers.

Components

Slash Commands (26, in `.claude/commands/`)

Command	Purpose
adr.md	Architecture Decision Record workflow
app-init.md	Initialize project-specific rules extensions
ask-openrouter.md	Query OpenRouter models
audit.md	Read-only codebase audit (zero-risk entry path)
bootstrap.md	Mandatory task bootstrap — reads SSoT, classifies task
brainstorm.md	Structured brainstorming
claude-cli.md	Claude CLI integration
codex-cli.md	Codex CLI integration
decide.md	Structured decision-making
govern-docs.md	Governance document management
handoff.md	Hand off between AI sessions
help.md	Help and command reference
hotfix.md	Hotfix workflow
implement.md	Implementation phase workflow
plan.md	Planning phase workflow
research.md	Research phase workflow
retro.md	Retrospective workflow
review.md	Code review phase workflow
ship.md	Ship phase — requires handoff evidence
spec-intake.md	New feature specification intake
spec.md	Spec writing workflow
sync-docs.md	Sync documentation
test-classify.md	Test classification
test-skeleton.md	Generate test skeleton
test.md	Test phase workflow
worktree-first.md	Git worktree first workflow

Skills (14, in `.agents/skills/`)

Skill	Trigger Description
api-design	API endpoint detection
auth-security	Auth code detection
database-design	Migration detection
dispatching-parallel-agents	Complex multi-module tasks
doc-lookup	Documentation needed
frontend-patterns	UI component detection
karpathy-principles	General coding guidance
production-readiness	Pre-ship readiness check
red-team-adversarial	Review/test classification
subagent-driven-development	Multi-agent coordination
systematic-debugging	Bug encounter
test-driven-development	Feature/architecture-change tasks
using-git-worktrees	Parallel branch work
verification-before-completion	/ship phase

Validation Script

.agentcortex/bin/validate.sh — checks metadata integrity, encoding, command sync; runs at deploy time and in CI

State Files

File	Purpose
`.agentcortex/context/current_state.md`	SSoT: global project state, ship history, decisions
`.agentcortex/context/work/<branch>.md`	Per-task Work Log: progress, evidence, gate receipts
`docs/specs/_product-backlog.md`	Multi-feature backlog

Hooks

None — .claude/settings.json comment confirms: "Sentinel and Phase Summary enforcement is model self-attestation per AGENTS.md — no Python hooks shipped, keeping downstream zero-runtime-dep."

Prompts

Excerpt 1: /bootstrap Command

Source: .claude/commands/bootstrap.md
Technique: Sequential workflow delegation with mandatory stop-after-output gate

# /bootstrap

Execute the canonical workflow: `.agent/workflows/bootstrap.md`

## Required reads before execution

1. `AGENTS.md` — global directives (Intent Router, Gate Engine, Sentinel)
2. `.agent/rules/engineering_guardrails.md` — classification tiers and gate rules
3. `.agent/rules/state_machine.md` — phase transitions
4. `.agentcortex/context/current_state.md` — SSoT

## Execution

Follow every step in `.agent/workflows/bootstrap.md` sequentially.
The user's task description is: $ARGUMENTS

- Do NOT skip any steps.
- Do NOT proceed past bootstrap in the same turn.
- Output the bootstrap report, then STOP and ask user for next step.
- End response with ⚡ ACX.

Excerpt 2: /ship Command

Source: .claude/commands/ship.md
Technique: Hard evidence gate (missing fields = explicit FAIL, no fallback)

# /ship

Execute the canonical workflow: `.agent/workflows/ship.md`

## Required reads before execution

1. `AGENTS.md` — global directives (Delivery Gates, No Evidence = No Ship)
2. `.agent/rules/engineering_guardrails.md` — §10.5 Handoff/Ship Hard Gate
3. `.agent/rules/security_guardrails.md` — final security check
4. Active Work Log — must contain handoff references:
   `ship:[doc=<path>][code=<path>][log=<path>]`

## Execution

Follow every step in `.agent/workflows/ship.md` sequentially.
If any gate field is missing, FAIL the gate and list missing fields — do NOT proceed.
End response with ⚡ ACX.

Excerpt 3: TDD Skill

Source: .agents/skills/test-driven-development/SKILL.md
Technique: Red-Green-Refactor cycle enforcement with Ironclad Rules

# Test-Driven Development

## Workflow

1. **Red**: Write a failing test describing the expected behavior.
2. **Green**: Write the minimal code to make the test pass.
3. **Refactor**: Clean up naming, structure, and duplication.
4. Repeat the cycle until acceptance criteria are met.

## Ironclad Rules

- Do NOT write massive amounts of features before writing tests.
- Focus on one small goal per cycle.
- All tests MUST pass after refactoring.

Excerpt 4: AGENTS.md Core Directive

Source: AGENTS.md
Technique: Token governance with read-once discipline and explicit re-read audit trail

**Read-Once Discipline**: Read governance files once at session start; do NOT re-read in later turns. **Safety Valve**: On genuine rule uncertainty, re-read ONE `##`-section only — MUST log in `## Drift Log` as `- Re-read: <file> §<section> — reason: <1-line>`. Un-logged re-reads = Token Leak violation.

**Response Brevity & Budget**: Short, information-dense output. No preamble/postamble. Expand only for gate blocks, plan artifacts, or ship evidence. Hard cap: ≤8 lines prose + required structured blocks.
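
The Drift Log entry format quoted above is regular enough to generate and check mechanically. The following is a minimal Python sketch under that assumption; `format_drift_entry` and `is_logged` are hypothetical helpers, and nothing in the source indicates the framework ships such code (enforcement is described as model self-attestation).

```python
import re

# Entry format quoted from AGENTS.md:
#   - Re-read: <file> §<section> — reason: <1-line>
DRIFT_ENTRY = re.compile(
    r"^- Re-read: (?P<file>\S+) §(?P<section>.+?) — reason: (?P<reason>.+)$"
)


def format_drift_entry(file: str, section: str, reason: str) -> str:
    """Build one Drift Log line; the reason must fit on a single line."""
    if "\n" in reason:
        raise ValueError("reason must be a single line")
    return f"- Re-read: {file} §{section} — reason: {reason}"


def is_logged(drift_log: str, file: str, section: str) -> bool:
    """Return True if the log text contains an entry for this re-read."""
    for line in drift_log.splitlines():
        match = DRIFT_ENTRY.match(line.strip())
        if match and match["file"] == file and match["section"] == section:
            return True
    return False
```

A re-read with no matching entry is what the directive calls a Token Leak violation; a check like `is_logged` would only detect it after the fact.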

Uniqueness

Differs From Seeds

Agentic OS is closest to superpowers (Archetype 1, a skills-only behavioral framework) in its 14 professional skills, many of which share names with superpowers skills (test-driven-development, systematic-debugging, using-git-worktrees, verification-before-completion). However, Agentic OS is a hybrid: it adds 26 slash commands on top of the skill layer, creating a command-and-skill architecture unlike any pure seed archetype. It further hybridizes with agent-os (Archetype 4) in its deploy-into-project installer pattern and SSoT state machine. The distinctive addition is the mandatory gate engine with classification-locked phase chains: where superpowers provides behavioral guidelines, Agentic OS instructs the AI to treat every gate as a hard stop. Enforcement happens at the prompt level rather than in code: "If any gate field is missing, FAIL the gate and list missing fields — do NOT proceed". The ⚡ ACX end-of-response marker acts as a sentinel on each response; per the .claude/settings.json comment, Sentinel enforcement is model self-attestation rather than a scripted check.

Positioning

This framework occupies a "governance-first" niche between superpowers (pure behavioral) and kiro (IDE-enforced gates). It imposes kiro-like mandatory phase flows without requiring a specialized IDE, relying instead on model self-attestation with explicit failure conditions. The zero-runtime-dependency design choice (no Python hooks) is explicitly documented as a downstream compatibility decision.

Observable Failure Modes

Model self-attestation gap: Gate enforcement relies on the model following instructions, not on code-level enforcement. A model that ignores "FAIL the gate" will proceed anyway.
Classification lock friction: Reclassification requires rolling back to CLASSIFIED state, which could frustrate users with evolving requirements.
Read-Once Discipline violations: The Drift Log requirement for re-reading governance files is easy to forget; token leakage from re-reads is a stated risk.
Multi-agent Work Log collision: The "One Branch = One Owner" rule prevents concurrent corruption but requires users to maintain branch discipline.
Skill auto-trigger ambiguity: 14 skills that "auto-activate based on task classification and workflow phase" could fire unexpectedly on misclassified tasks.

Workflow

Phase System

The framework enforces a mandatory phase system based on task classification:

/bootstrap → classify → route to phase chain → /ship

Phase Chains by Classification

Classification	Phase Chain
tiny-fix	Classify → Execute → Evidence → Done
quick-win	bootstrap → plan → implement → evidence → ship
feature	bootstrap → spec → plan → implement → review → test → handoff → ship
hotfix	bootstrap → research → plan → implement → review → test → ship
architecture-change	bootstrap → ADR → spec → plan → implement → review → test → handoff → ship

Phase-to-Artifact Map

Phase	Artifact
bootstrap	`.agentcortex/context/work/<branch>.md` (Work Log created)
spec	`docs/specs/<feature>.md`
plan	Plan section in Work Log
adr	`docs/adrs/<date>-<title>.md`
implement	Code changes + checkpoints in Work Log
review	Review evidence in Work Log
test	Test evidence in Work Log
handoff	Handoff references in Work Log (`ship:[doc=...][code=...][log=...]`)
ship	`current_state.md` updated, backlog entry closed

Approval Gates

bootstrap: AI classifies task, user can override classification
plan: Plan reviewed before implementation begins
ship: Hard gate — Work Log must contain ship:[doc=<path>][code=<path>][log=<path>] references; missing fields = FAIL, do NOT proceed
confidence gate: AI must declare confidence level; low confidence triggers escalation to user
destructive command gate: rm -rf, git reset --hard, force pushes require pre-approved rollback plan
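
The ship gate above amounts to a presence check on three tagged fields. The following is a minimal Python sketch of that check, assuming the handoff reference sits on a Work Log line that starts with `ship:`; the function name `check_ship_gate` and the line-prefix assumption are illustrative, not taken from the framework.

```python
import re
from typing import List

# Fields required by the ship gate: ship:[doc=<path>][code=<path>][log=<path>]
REQUIRED_FIELDS = ("doc", "code", "log")


def check_ship_gate(work_log: str) -> List[str]:
    """Return the missing handoff fields; an empty list means the gate passes."""
    found = set()
    for line in work_log.splitlines():
        line = line.strip()
        if not line.startswith("ship:"):
            continue
        # Collect every [key=value] pair on the line; empty values do not count.
        for key, value in re.findall(r"\[(\w+)=([^\]]*)\]", line):
            if value.strip():
                found.add(key)
    return [field for field in REQUIRED_FIELDS if field not in found]
```

A non-empty result corresponds to the "FAIL the gate and list missing fields" instruction: the caller would report the list and stop rather than proceed to ship.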

Gate Types

Evidence gates: Require verifiable artifacts (test results, coverage numbers, security scan output)
Handoff gates: Require explicit cross-reference tags in Work Log before shipping
Confidence gates: Self-attestation with escalation path
Classification lock: Task classification frozen after bootstrap; re-classification requires rollback to CLASSIFIED state
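
The classification lock can be pictured as a small state object that refuses re-classification once work has moved past the CLASSIFIED state. The following Python sketch is illustrative only: the class `TaskState`, its method names, and the `UNCLASSIFIED` initial state are assumptions, since the source describes the rule but not its representation.

```python
class TaskState:
    """Hypothetical in-memory model of the classification lock."""

    def __init__(self) -> None:
        self.classification = None
        self.phase = "UNCLASSIFIED"  # assumed name for the initial state

    def classify(self, classification: str) -> None:
        # Classification may only be set or changed before work has begun.
        if self.phase not in ("UNCLASSIFIED", "CLASSIFIED"):
            raise RuntimeError(
                "classification is locked; roll back to CLASSIFIED first"
            )
        self.classification = classification
        self.phase = "CLASSIFIED"

    def advance(self, phase: str) -> None:
        # Unknown status is treated as FAIL, mirroring the No Bypass Rule.
        if self.classification is None:
            raise RuntimeError("task has not been classified")
        self.phase = phase

    def rollback(self) -> None:
        """Return to CLASSIFIED so the task can be re-classified."""
        self.phase = "CLASSIFIED"
```

In this model a quick-win that turns out to be a feature has to call `rollback()` before `classify("feature")` succeeds, which is the friction noted under Observable Failure Modes.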

Token Efficiency Rules (from AGENTS.md)

Read-Once Discipline: Read governance files once at session start; re-read only with explicit ## Drift Log entry
Context Pruning: At 8+ turns on same task, proactively suggest handoff + new conversation
Conditional Loading: tiny-fix skips guardrails (~5,000 tokens saved)

Memory & Context

Single Source of Truth (SSoT)

Agentic OS uses a two-file state architecture:

Global State

File: .agentcortex/context/current_state.md
Contains: Global project state, decisions, lessons, ship history, active spec index
Updated by: Only /ship command (via guard_context_write.py if Python available, or direct append with Drift Log entry)
Read by: Every session at start

Per-Task Work Logs

File: .agentcortex/context/work/<branch-name>.md
Contains: Per-task progress, evidence, gate receipts, phase summaries
Updated by: Agent during active task
Isolation: "One Branch = One Owner" — prevents concurrent Work Log corruption

Multi-Agent State Safety

Advisory Locking: Lock files signal active sessions without blocking
Ship Guard: Checks for SSoT conflicts before merging
Session Identity: Every AI session writes its model name and timestamp
Write Isolation: Agents write only to their own Work Log; only /ship updates SSoT
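
Advisory locking and session identity can be combined in one small file operation. The following Python sketch shows one way this might look; the lock file location, its JSON layout, and the function `claim_session` are all assumptions made for illustration, as the source does not specify the lock format.

```python
import json
import time
from pathlib import Path
from typing import Optional


def claim_session(lock_path: Path, model_name: str) -> Optional[dict]:
    """Record this session in the lock file if the file does not exist yet.

    Returns the existing holder's identity when another session already
    wrote the lock, so the caller can warn and continue (the lock is
    advisory and never blocks). Otherwise writes this session's identity
    and returns None.
    """
    if lock_path.exists():
        return json.loads(lock_path.read_text(encoding="utf-8"))
    identity = {
        "model": model_name,
        "started_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
    }
    lock_path.write_text(json.dumps(identity), encoding="utf-8")
    return None
```

The check-then-write sequence is not atomic, which is acceptable only because the lock is a signal rather than a guarantee; the "One Branch = One Owner" rule is what actually keeps two sessions off the same Work Log.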

Context Loading Discipline

From AGENTS.md:

Init Read: MUST read current_state.md (SSoT) + active Work Log at session start
Exception: tiny-fix tasks MAY skip SSoT and Work Log reads
Prohibited: Blind directory scanning (ls -R .agentcortex/context/)
Context Pruning: At 8+ turns on same task → suggest handoff + new conversation

Skill Context

Skills use metadata-first loading: agent reads name, description, path, and agents/openai.yaml metadata before loading full SKILL.md. This reduces per-session token cost when many skills are registered.
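
Metadata-first loading can be sketched as reading only the header of each skill file and stopping before the body. The Python example below assumes a SKILL.md that begins with a front-matter block delimited by `---` lines containing simple `key: value` pairs; that layout and the helper `read_skill_metadata` are assumptions for illustration, not confirmed details of the framework.

```python
from pathlib import Path


def read_skill_metadata(skill_md: Path) -> dict:
    """Read only the front-matter block of a skill file.

    The body after the closing '---' is never read, so registering many
    skills costs only a few lines each until one is actually needed.
    """
    meta = {"path": str(skill_md)}
    with skill_md.open(encoding="utf-8") as handle:
        if handle.readline().strip() != "---":
            return meta  # no front matter: only the path is known
        for line in handle:
            if line.strip() == "---":
                break  # end of metadata; stop before the skill body
            key, sep, value = line.partition(":")
            if sep:
                meta[key.strip()] = value.strip()
    return meta
```

An agent following this pattern would build an index of names and descriptions first, then open the full SKILL.md only for the skill whose description matches the current task.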

Compaction Handling

Explicit compaction guidance in AGENTS.md:

Phase Summary: 1-line compact summary per phase written to Work Log for low-token resume
Archived Work Logs: Reviewed by validate.sh
Token cap: Hard cap of ≤8 lines prose per response

Cross-Session Handoff

Yes — Work Logs persist between sessions. The handoff.md command writes structured handoff references. /resume_handoff (pattern referenced in commands) restores context from Work Log.

Product Backlog

File: docs/specs/_product-backlog.md
Purpose: Living index for multi-feature work; updated by bootstrap checks and ship
Token cost: ~200 tokens (free-read)

Orchestration

Multi-Agent Support

Yes — framework has explicit multi-agent skills:

dispatching-parallel-agents skill: coordinated subagent execution for complex tasks
subagent-driven-development skill: multi-agent coordination for multi-module tasks

Multi-Agent Rules

One Branch = One Owner: prevents concurrent Work Log corruption
Advisory Locking: lock files signal active sessions without blocking
Ship Guard: checks for SSoT conflicts before merging
Session Identity: every AI session writes its model name and timestamp

Orchestration Pattern

Sequential (per phase chain) with optional parallel fan-out via dispatching-parallel-agents skill.

Isolation Mechanism

Git worktree per feature (via using-git-worktrees skill and worktree-first.md command). git-worktrees isolation is documented and skill-supported but not automatically enforced.

Multi-Model Support

No multi-model routing configured. Framework is model-agnostic; model selected by user's runtime.

Execution Mode

Interactive-loop — user invokes commands sequentially per phase. The framework enforces that each command stops after completion and does not proceed automatically to the next phase.

Consensus Mechanism

None (no distributed consensus).

Prompt Chaining

Yes — bootstrap output becomes input for plan command; plan artifacts gate implement; implement evidence gates review; review gates test; test gates ship. Each phase reads the Work Log from the previous phase.

Cross-Tool Portability

High — via AGENTS.md (shared), plus per-runtime adapters. README lists: Claude Code, Cursor, GitHub Copilot, Google Antigravity, Codex.

UI & CLI Surface

Dedicated CLI Binary

No dedicated CLI binary. Interaction is via AI agent slash commands (/bootstrap, /plan, etc.) in the user's chosen IDE or agent runtime.

Local Web Dashboard

None.

Deploy Installer

installers/deploy_brain.sh is a bash installer that:

Accepts --dry-run flag for preview
Deploys framework files non-destructively into any existing project
Auto-fetches latest version from GitHub on update runs
Prints deployment manifest and status

Validation Script

.agentcortex/bin/validate.sh:

Checks metadata integrity, encoding, command sync
Runs at deploy time and in CI (GitHub Actions)
Degrades gracefully without Python 3.9+ (--no-python flag)
Reports WARN instead of FAIL for Python-dependent checks when Python unavailable
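
The graceful-degradation behavior described above (WARN instead of FAIL when Python is missing) can be modeled as a severity downgrade on checks that depend on an unavailable tool. The following Python sketch illustrates the idea only; validate.sh itself is a Bash script, and the `run_checks` function, the check tuple layout, and the example check names are hypothetical.

```python
import shutil
from typing import Callable, List, Optional, Tuple

# (name, needs_python, check function returning True on success)
Check = Tuple[str, bool, Callable[[], bool]]


def run_checks(
    checks: List[Check], python_available: Optional[bool] = None
) -> List[Tuple[str, str]]:
    """Run each check and report PASS, FAIL, or WARN per check name."""
    if python_available is None:
        # Detect the interpreter on PATH when the caller does not say.
        python_available = shutil.which("python3") is not None
    results = []
    for name, needs_python, check in checks:
        if needs_python and not python_available:
            # The check cannot run, so it is reported as WARN, not FAIL.
            results.append((name, "WARN"))
        else:
            results.append((name, "PASS" if check() else "FAIL"))
    return results
```

The design choice here is that a missing optional dependency never turns into a hard failure, which keeps deployment possible on machines without Python while still surfacing that some checks were skipped.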

CI Integration

GitHub Actions workflows at .github/workflows/validate.yml and .github/workflows/security.yml — CI status badges visible in README.

IDE Integration

No dedicated extension. Works in any agent runtime via file-based commands and skills.

Observability

Work Logs (.agentcortex/context/work/<branch>.md) provide human-readable per-task audit trail
SSoT (current_state.md) provides global project state history
validate.sh provides deploy-time and CI integrity checks
No structured JSONL log, no replay capability

Related frameworks

same archetype · same primary tool · same memory type

BMAD-METHOD ★ 48k

A4 Markdown scaffold

Provides a full agile delivery lifecycle with named expert-persona AI collaborators that elicit the human's best thinking rather…

Agent OS ★ 4.6k

A4 Markdown scaffold

Extracts implicit codebase conventions into token-efficient markdown standards files and injects them selectively into AI agent…

Claude Conductor ★ 367

A4 Markdown scaffold

Gives Claude Code a persistent, cross-linked, auto-analyzed documentation system so it retains codebase context across sessions.

Spec-Driver (Greenfield Spec-Driven Development) ★ 25

A4 Markdown scaffold

Prevents spec rot in AI-assisted development by making implementation changes flow back into evergreen, authoritative specs via…

Anthropic Knowledge Work Plugins ★ 16k

A4 Markdown scaffold

Role-specialized plugin bundles with live MCP connectors that turn Claude into a domain expert for enterprise knowledge workers.

Codex Integration for Claude Code (skill-codex) ★ 1.3k

A4 Markdown scaffold

Single Claude Code skill that handles Codex CLI invocation correctly (stdin blocking, thinking token suppression, session resume)…

Distribution

Type: standalone-repo
License: MIT
Install: clone-and-configure
Version: v1.1.2

Surfaces

CLI binary: No
CLI subcmds: 0
Local UI: No

Components

Commands: 26
Skills: 14
Subagents: 0
Hooks: 0
MCP servers: 0
MCP tools: 0
Scripts: 2
Templates: 0

Workflow

Phases: 10
Approval gates: 5
Spec format: markdown
Spec storage: per-feature-folder
Delta or full: whole-file

Orchestration

Multi-agent: Yes
Pattern: sequential
Isolation: git-worktree
Consensus: none
Prompt chaining: Yes

Multi-model

Multi-model: No
BYOK: Yes
Modal: text

Execution

Mode: interactive-loop
Crash recovery: No
Compaction: Yes
Session handoff: Yes
Streaming: No

Memory

Type: file-based
Persistence: project
Search: none
State files: 3 files

Quality

TDD: Optional
TDD mechanism: dedicated-skill
Self-review: none

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: Yes
Audit log: Yes
Audit format: structured-md
Replay: No

Tools

Primary: claude-code
Targets: 5
Portability: high

Signals

Stars: 0
Last commit: 2026-05-26
Contributors: 1
Maintainer: active
Quality score: 4.7/10
