VNX Orchestration

vnx-orchestration · Vinix24/vnx-orchestration · ★ 34 · last commit 2026-05-26

Governance-first multi-agent orchestration: every agent action produces a receipt, quality gates use deterministic file-based verdicts (not LLM judgment), and full provenance is maintained across all providers.

Best when: Multi-agent frameworks solve orchestration; VNX solves governance: the audit trails and deterministic gates that make multi-agent workflows trustworthy, wit…
Skip if: LLM judging its own work; using OAuth tokens for API access (compliance risk)
vs seeds: claude-flow in multi-agent scope and infrastructure complexity, but VNX focuses on governance (deterministic file-based quality gate…
Primitive shape: 47 total (Skills 21, Subagents 21, Hooks 5)
00

Summary

VNX Orchestration — Summary

VNX (vnx-orchestration) is a governance-first multi-agent orchestration runtime for AI CLI workers (Claude Code, Codex CLI, Gemini CLI, Kimi CLI, LiteLLM bridge). It has 34 stars, is under active development, and is the most governance-focused framework in this batch. Every agent action produces a structured receipt in an append-only NDJSON ledger, quality gates use deterministic file-based verdicts (not LLM judgment), and a multi-provider code review pipeline (Codex gate + Gemini review) must pass before merge is allowed.

The framework ships a Python CLI (vnx), a skills directory with 21 named skills/agent roles, a dashboard web UI, a tmux 2x2 grid operator mode, a headless subprocess mode for CI, and a receipt/ledger system with 1,400+ entries from production use. VNX explicitly addresses Anthropic's April 2026 OAuth policy (which affected OpenClaw) by spawning claude CLI processes via subprocess rather than using OAuth tokens.

Closest seed: claude-flow, which VNX resembles in multi-agent coordination complexity and explicit governance. The difference is emphasis: VNX is infrastructure-first (receipts, quality gates, tmux grid, dashboard), while claude-flow is protocol-first (MCP tools, consensus algorithms).

01

Overview

VNX Orchestration — Overview

Origin

Created by Vinix24 (Vincent van Deth). Single contributor. Active to 2026-05-26. Release v1.0.0-rc3 milestone (multi-provider, May 2026). 34 stars.

Philosophy

"Most multi-agent frameworks solve orchestration. VNX solves governance — the audit trails, quality gates, and human checkpoints that make multi-agent workflows trustworthy."

"Governance, provenance, and operator control built in. No framework to import. No cloud dependency."

Core Design Principles

  1. Receipts: every agent action writes a structured receipt to an append-only NDJSON ledger
  2. Quality gates are deterministic: file size limits, test coverage thresholds, open blocker counts — not LLM judgment
  3. Gate verdicts: APPROVE, HOLD, or ESCALATE — LLM never judges its own work
  4. Provider-agnostic: 5 providers (Claude, Codex, Gemini, Kimi, LiteLLM bridge) in production
  5. CLI subprocess only: spawns claude binary via subprocess; never touches OAuth tokens or api.anthropic.com
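
The second and third principles can be illustrated with a small sketch of a gate that derives its verdict from measured facts alone. The threshold values, the parameter names, and the mapping from each check to a verdict are assumptions made for illustration; the source names only the kinds of checks (file size, coverage, open blockers) and the three verdicts.

```python
# Hypothetical thresholds: the source names the kinds of checks, not their values.
MAX_FILE_LINES = 500
MIN_COVERAGE_PCT = 80.0


def gate_verdict(largest_file_lines: int, coverage_pct: float, open_blockers: int) -> str:
    """Return APPROVE, HOLD, or ESCALATE from measured inputs only (no model call)."""
    if open_blockers > 0:
        # Assumed mapping: unresolved blockers need a human decision.
        return "ESCALATE"
    if largest_file_lines > MAX_FILE_LINES or coverage_pct < MIN_COVERAGE_PCT:
        # Assumed mapping: a measurable shortfall the worker can fix.
        return "HOLD"
    return "APPROVE"
```

Because the function reads only numbers, the same inputs always give the same verdict, which is the property the principles describe.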

Compliance Positioning

The README includes a formal compliance audit table confirming VNX does not call Anthropic OAuth endpoints, does not use subscription credentials for API calls, and only launches claude CLI processes. This was written specifically in response to the April 2026 Anthropic policy that affected OpenClaw (340K stars).

Modes

  • Starter Mode: single terminal, one provider, sequential dispatch
  • Operator Mode: 4-terminal tmux grid (T0 orchestrator + T1/T2/T3 workers), multi-provider, parallel tracks
  • Demo Mode: replay real sessions without API keys
  • Headless Mode: all terminals as claude -p subprocesses (fully autonomous CI/overnight)

Key Quote

"~$200/month Max subscription versus ~$3,000/month in equivalent API tokens for the same workload. CLI subprocess trades some observability for a ~15x cost reduction."

02

Architecture

VNX Orchestration — Architecture

Distribution

  • Python package with CLI (vnx)
  • Install: git clone && ./install.sh /path/to/project
  • pyproject.toml-based; Python runtime

Required Runtime

  • Python >= 3.x
  • Claude CLI, Codex CLI, Gemini CLI (one or more)
  • tmux (Operator Mode)
  • Git

Directory Tree

vnx-orchestration/
├── vnx_cli/              — Python CLI
│   ├── main.py           — argparse entry point
│   └── commands/         — init, doctor, status, dispatch_agent, pool, update, version
├── agents/               — agent persona SKILL.md files
│   ├── t0-orchestrator/
│   ├── planner/
│   ├── architect/
│   ├── backend-developer/
│   ├── frontend-developer/
│   ├── database-engineer/
│   ├── test-engineer/
│   ├── quality-engineer/
│   ├── security-engineer/
│   ├── reviewer/
│   ├── debugger/
│   ├── performance-profiler/
│   ├── python-optimizer/
│   ├── api-developer/
│   ├── intelligence-engineer/
│   ├── monitoring-specialist/
│   ├── data-analyst/
│   ├── supabase-expert/
│   ├── excel-reporter/
│   ├── vnx-manager/
│   └── skills.yaml       — skill manifest
├── skills/               — (same directory, 21 agent roles)
├── hooks/                — session lifecycle scripts
│   ├── sessionstart.sh
│   ├── vnx_context_monitor.sh
│   ├── vnx_handover_detector.sh
│   ├── vnx_hook_payload_logger.sh
│   └── vnx_rotate.sh
├── dashboard/            — Web dashboard (Python served)
│   ├── index.html
│   ├── serve_dashboard.py
│   └── api_*.py
├── ledger/               — Receipt storage
│   ├── engine/
│   └── t0_ledger_interface.py
├── database/             — State database
├── bin/                  — vnx binary
├── lib/                  — shared libraries
├── scripts/              — orchestration scripts
│   ├── headless_trigger.py
│   └── headless_orchestrator.py
├── schemas/              — Receipt schema definitions
├── configs/              — Provider configurations
└── claudedocs/           — Roadmap and planning docs

Target AI Tools

  • Claude Code (T0 orchestrator: Claude Opus only per t0-orchestrator skill)
  • Codex CLI (T1/T2/T3 workers, Sonnet-pinned)
  • Gemini CLI (T1/T2/T3 workers)
  • Kimi CLI
  • LiteLLM bridge (DeepSeek V4 Pro/Flash, GLM-5.1)

State Storage

  • .vnx-data/ — dispatch receipts, unified_reports, unified_ledger.ndjson
  • .vnx/ — project state
  • database/ — intelligence schema, code_snippets, project state
03

Components

VNX Orchestration — Components

CLI Binary: vnx (Python)

| Subcommand | Purpose |
|---|---|
| vnx init --starter | Scaffold starter mode project |
| vnx init --operator | Scaffold operator mode (4-terminal tmux) |
| vnx doctor | Validate prerequisites and structure |
| vnx doctor --json | JSON output |
| vnx doctor --strict | Fail on warnings |
| vnx status | Show dispatch and agent status |
| vnx start | Launch 2x2 tmux grid |
| vnx start claude-codex | T1: Codex CLI, T2: Claude Code |
| vnx start claude-gemini | T1: Gemini CLI, T2: Claude Code |
| vnx start full-multi | T1: Codex CLI, T2: Gemini CLI |
| vnx demo | Demo mode with sample state |
| vnx gate-check --pr N | Run quality gates for PR |
| vnx cost-report | API spend per agent/task type |
| vnx update | Update VNX |
| vnx version | Show version |

Agent Skills (21 reported; 20 listed below)

| Agent | Role |
|---|---|
| t0-orchestrator | Master orchestration authority (Claude Opus only) |
| planner | Task decomposition and planning |
| architect | System architecture design |
| backend-developer | General server-side code |
| frontend-developer | UI/UX, dashboards |
| database-engineer | Schema changes, migrations, SQLite, FTS5 |
| test-engineer | Testing strategy and implementation |
| quality-engineer | Code quality enforcement |
| security-engineer | Security review |
| reviewer | Code review |
| debugger | Debugging specialist |
| performance-profiler | Performance analysis |
| python-optimizer | Python-specific optimization |
| api-developer | API endpoints and scripts |
| intelligence-engineer | VNX intelligence schema, central state DBs |
| monitoring-specialist | Monitoring and alerting |
| data-analyst | Data analysis |
| supabase-expert | Supabase integration |
| excel-reporter | Excel/reporting |
| vnx-manager | VNX self-management |

Hooks (5 shell scripts)

| Script | Event/Purpose |
|---|---|
| sessionstart.sh | SessionStart: context priming |
| vnx_context_monitor.sh | Monitor context usage |
| vnx_handover_detector.sh | Detect when handover needed |
| vnx_hook_payload_logger.sh | Log hook payloads |
| vnx_rotate.sh | Trigger context rotation on threshold |

Quality Gates

  • Codex gate: OpenAI gpt-5.2-codex — bugs, data loss, security
  • Gemini review: Google gemini-2.5-flash — architecture, patterns, trade-offs
  • CI gate: CI green required
  • Gate verdicts: pass, fail, blocked with severity-rated findings
  • Triple gate policy: codex pass + gemini pass + CI green → merge allowed
  • Gate locks are file-based (not LLM-based)

Dashboard (Web)

dashboard/index.html + serve_dashboard.py — served locally

  • API endpoints: api_agent_stream.py, api_health.py, api_intelligence.py, api_operator.py, api_recommendations.py, api_register_stream.py, api_token_stats.py
  • Token dashboard sub-module

Ledger

NDJSON append-only receipt log: .vnx-data/unified_ledger.ndjson
Schema defined in schemas/
1,400+ entries in production use

05

Prompts

VNX Orchestration — Prompts

Verbatim Excerpt 1: t0-orchestrator/SKILL.md (Decision Framework)

Prompting technique: Deterministic 8-step decision tree; efficiency rule with fast-path; skill routing table

## 3. Decision Framework

Apply this 8-step decision tree in order. The first matching rule wins.

1. GHOST CHECK     → receipt.dispatch_id starts with "unknown-" or is empty → WAIT
2. DUPLICATE CHECK → dispatch_id already in recent_receipts                  → WAIT
3. REJECTION GATE  → status=failure OR risk > 0.8 OR blocking findings       → REJECT
4. ESCALATION GATE → architectural change OR new dependency OR policy         → ESCALATE
5. INVESTIGATION   → risk 0.3–0.8 OR advisory=hold                           → DISPATCH follow-up to T3
6. TERMINAL CHECK  → all terminals busy (none ready)                         → WAIT
7. COMPLETION CHECK → completion_pct=100 AND no blockers AND no pending      → COMPLETE
8. DEFAULT         → receipt valid AND work pending                           → DISPATCH

**Efficiency rule**: Be efficient — accept clean work, investigate anomalies, reject failures.
- Fast path: risk ≤ 0.3 + success + no blockers → skip deep verification, go directly to DISPATCH.
- Verification (spot-check 3 claims) only when risk > 0.3.
- If status=failure or blocking findings → REJECT immediately.

Verbatim Excerpt 2: t0-orchestrator/SKILL.md (Skill Routing Table)

Prompting technique: Domain-specific routing with historical learning rationale

### Skill Routing (specialist dispatch)

| Work type | Skill | Why |
|---|---|---|
| Schema changes, migrations, SQLite, FTS5, multi-tenant patterns | **`database-engineer`** | Has migration defense checklist + SQLite gotcha references; learned from P4's 5-round chain |
| VNX intelligence schema, central state DBs, dispatch lifecycle | **`intelligence-engineer`** | Knows VNX-specific table semantics |
| API endpoints, scripts, refactoring | `backend-developer` | Generalist; has Codex Defense Checklist as baseline |
| UI/UX, dashboards, frontend frameworks | `frontend-developer` | |
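
The routing table above is prompt text for the orchestrator, not code, but its logic can be approximated as a first-match lookup. This is only a sketch: the keyword sets, the tag-based matching, and the fallback to the generalist are assumptions, and only the skill names and work types come from the table.

```python
# Keyword sets are hypothetical; skill names and work types come from the table.
ROUTES = [
    ({"schema", "migration", "sqlite", "fts5", "multi-tenant"}, "database-engineer"),
    ({"intelligence", "state-db", "dispatch-lifecycle"}, "intelligence-engineer"),
    ({"api", "script", "refactoring"}, "backend-developer"),
    ({"ui", "ux", "dashboard", "frontend"}, "frontend-developer"),
]


def route_skill(tags: set[str]) -> str:
    """Return the first skill whose keyword set overlaps the task tags."""
    lowered = {t.lower() for t in tags}
    for keywords, skill in ROUTES:
        if keywords & lowered:
            return skill
    # Assumed fallback: the table describes backend-developer as the generalist.
    return "backend-developer"
```

Keeping the rows in table order means a task tagged with both a specialist and a generalist keyword goes to the specialist listed first.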

Verbatim Excerpt 3: Context Rotation (from README)

Prompting technique: Structured handover forcing agent to document state before context clear

Agent hits 65% context → blocked from further tool calls
  → Agent writes structured ROTATION-HANDOVER.md
    → VNX sends /clear to terminal
      → Fresh session resumes with handover + original task
09

Uniqueness

VNX Orchestration — Uniqueness

Differs From Seeds

VNX has the most complex governance infrastructure in the batch. The closest seed is claude-flow (multi-agent, multi-model, extensive tooling), but the two differ fundamentally: VNX focuses on governance (receipts, provenance, deterministic gates), while claude-flow focuses on protocols (consensus algorithms, MCP tools). VNX's deterministic file-based quality gates (not LLM judgment) are unique in the entire corpus: no seed has an "LLM never judges its own work" gate design. The NDJSON receipt ledger, with 1,400+ production entries and replay capability, exceeds any seed's observability story. VNX is also the only framework in the corpus that explicitly addresses a specific vendor policy (Anthropic's April 2026 OAuth ban) with a compliance audit.

Positioning

  • Only framework with a formal compliance audit published in the README
  • Strongest observability story in this batch (NDJSON ledger, token cost reports, web dashboard)
  • Explicit 15x cost savings rationale (CLI subprocess vs API tokens)
  • Production-proven: 1,400+ ledger entries cited
  • Multi-provider in production: 5 providers simultaneously

Observable Failure Modes

  • tmux complexity: 4-terminal grid requires tmux literacy
  • Python runtime: non-standard for skill packs (most are Node.js/bash)
  • Single contributor: 1 contributor despite active development; bus factor = 1
  • Dutch-language content: some CHANGELOG entries are in Dutch, suggesting a personal-use pattern
  • ADR-bounded: API contract stability promised but still rc3; 14 ADRs may not cover all edge cases

Notable Quotes

From README on governance:

"The same governance applies regardless of mode. Receipts, quality gates, and provenance records are generated identically whether a worker is interactive or headless. Headless does not mean ungoverned."

04

Workflow

VNX Orchestration — Workflow

Orchestration Cycle (Receipt → Review → Dispatch)

  1. Read latest receipt(s)
  2. Read QUALITY advisory
  3. Review open items for the PR
  4. Validate evidence quality (tests, logs, behavior proof)
  5. Close/defer/wontfix items with explicit reasons
  6. Complete PR only if blocker/warn criteria are satisfied
  7. Check dispatch guard (terminals + queue + dependencies)
  8. Verify required review-gate evidence
  9. Choose one action: WAIT / DISPATCH one manager block / ESCALATE

T0 Decision Framework (8-step)

1. GHOST CHECK     → receipt.dispatch_id starts with "unknown-" → WAIT
2. DUPLICATE CHECK → dispatch_id already in recent_receipts → WAIT
3. REJECTION GATE  → status=failure OR risk > 0.8 OR blocking findings → REJECT
4. ESCALATION GATE → architectural change OR new dependency OR policy → ESCALATE
5. INVESTIGATION   → risk 0.3–0.8 OR advisory=hold → DISPATCH follow-up to T3
6. TERMINAL CHECK  → all terminals busy → WAIT
7. COMPLETION CHECK → completion_pct=100 AND no blockers → COMPLETE
8. DEFAULT         → receipt valid AND work pending → DISPATCH
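
The first-match-wins ordering of the eight rules can be sketched as a plain function. The field names `blocking_findings`, `needs_escalation`, `blockers`, and `pending`, the `terminals_ready` counter, and the returned action strings are hypothetical; `dispatch_id`, `status`, `risk`, `advisory`, and `completion_pct` appear in the source. The empty-ID case in rule 1 follows the SKILL.md excerpt, and the investigation band is taken as above 0.3 because the efficiency rule sends risk ≤ 0.3 down the fast path.

```python
def t0_decide(receipt: dict, recent_ids: set[str], terminals_ready: int) -> str:
    """Apply the eight rules in order; the first matching rule wins."""
    dispatch_id = receipt.get("dispatch_id", "")
    risk = receipt.get("risk", 0.0)

    if not dispatch_id or dispatch_id.startswith("unknown-"):
        return "WAIT"                   # 1. ghost check
    if dispatch_id in recent_ids:
        return "WAIT"                   # 2. duplicate check
    if receipt.get("status") == "failure" or risk > 0.8 or receipt.get("blocking_findings"):
        return "REJECT"                 # 3. rejection gate
    if receipt.get("needs_escalation"):
        return "ESCALATE"               # 4. architectural change, new dependency, or policy
    if 0.3 < risk <= 0.8 or receipt.get("advisory") == "hold":
        return "DISPATCH_FOLLOWUP_T3"   # 5. investigation
    if terminals_ready == 0:
        return "WAIT"                   # 6. terminal check
    if (receipt.get("completion_pct") == 100
            and not receipt.get("blockers")
            and not receipt.get("pending")):
        return "COMPLETE"               # 7. completion check
    return "DISPATCH"                   # 8. default
```

Ordering matters here: a duplicate receipt with high risk returns WAIT rather than REJECT, because rule 2 is evaluated before rule 3.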

Context Rotation (Automatic)

Agent hits 65% context → blocked from further tool calls
  → Agent writes structured ROTATION-HANDOVER.md
    → VNX sends /clear to terminal
      → Fresh session resumes with handover + original task

Quality Gate Pipeline

PR created
  → Codex gate (OpenAI gpt-5.2-codex): bugs, security, data loss
  → Gemini review (gemini-2.5-flash): architecture, patterns
  → CI pipeline
  → All pass? → merge allowed (file-based lock enforced)
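
A file-based merge check along these lines might look like the following sketch. The directory layout, the file names (`codex.json`, `gemini.json`, `ci.json`), and the `verdict` key are assumptions; the source states only that verdicts are JSON, that all three gates must pass, and that the lock is file-based.

```python
import json
from pathlib import Path

REQUIRED_GATES = ("codex", "gemini", "ci")  # hypothetical file stems


def merge_allowed(verdict_dir: Path) -> bool:
    """True only when every required gate has a verdict file that reads 'pass'."""
    for gate in REQUIRED_GATES:
        path = verdict_dir / f"{gate}.json"
        if not path.is_file():
            return False  # a missing verdict counts as not passed
        try:
            verdict = json.loads(path.read_text(encoding="utf-8")).get("verdict")
        except (json.JSONDecodeError, AttributeError):
            return False  # a malformed verdict counts as not passed
        if verdict != "pass":
            return False
    return True
```

The check fails closed: absence of evidence blocks the merge, so a gate that never ran cannot be mistaken for one that passed.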

Phases and Artifacts

| Phase | Actor | Artifact | Gate |
|---|---|---|---|
| Dispatch | T0 orchestrator | Manager block | Decision framework |
| Execution | T1/T2 workers | Code changes | none |
| Receipt | T1/T2 | NDJSON receipt | Auto |
| Quality review | Codex gate + Gemini | JSON verdict | Deterministic |
| PR completion | T0 | PR merged | Triple gate |

Context Rotation Handover

ROTATION-HANDOVER.md written by agent before context cleared:

  • Current task context
  • What was done
  • What remains
  • Dependencies/blockers
06

Memory Context

VNX Orchestration — Memory & Context

Primary: NDJSON Ledger

.vnx-data/unified_ledger.ndjson — append-only receipt log
1,400+ entries in production use
Schema: schemas/ directory
Every agent action: dispatch_id, what was dispatched, what was produced, files changed, git commit, duration, cost
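
As a rough illustration of what appending to an NDJSON ledger involves, the sketch below writes one JSON object per line in append mode. The key names other than `dispatch_id` and all example values are placeholders invented for this sketch; the real schema lives in the schemas/ directory and is not reproduced in this profile.

```python
import json
from pathlib import Path

# Placeholder receipt: key names (except dispatch_id) and values are illustrative only.
EXAMPLE_RECEIPT = {
    "dispatch_id": "example-001",
    "dispatched": "task description",
    "produced": "summary of output",
    "files_changed": ["path/to/file.py"],
    "git_commit": "0000000",
    "duration_s": 0.0,
    "cost_usd": 0.0,
}


def append_receipt(ledger: Path, receipt: dict) -> None:
    """Append one receipt as a single JSON line; existing lines are never rewritten."""
    ledger.parent.mkdir(parents=True, exist_ok=True)
    line = json.dumps(receipt, sort_keys=True)
    with ledger.open("a", encoding="utf-8") as fh:  # "a" opens the file for appending
        fh.write(line + "\n")
```

One object per line keeps each receipt independently parseable, so a partially written final line does not corrupt earlier entries.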

Database

database/ — intelligence schema, code_snippets/snippet_metadata, intelligence_injections, project_id propagation
Used by intelligence-engineer skill for VNX-specific table semantics

Context Rotation

Automatic at 65% context usage:

  1. Agent writes ROTATION-HANDOVER.md (current task, done, remaining, blockers)
  2. VNX sends /clear to terminal
  3. Fresh session resumes with handover document + original task
  4. Receipt ledger maintains chain across rotations
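
The rotation steps can be sketched as a threshold check plus a handover renderer. The 65% figure and the four handover topics come from the source; the function names, the token-count inputs, and the Markdown heading layout are assumptions, since the actual ROTATION-HANDOVER.md format is not shown here.

```python
ROTATION_THRESHOLD = 0.65  # from the source: rotation triggers at 65% context usage


def should_rotate(tokens_used: int, context_limit: int) -> bool:
    """True once context usage reaches the rotation threshold."""
    if context_limit <= 0:
        raise ValueError("context_limit must be positive")
    return tokens_used / context_limit >= ROTATION_THRESHOLD


def render_handover(task: str, done: list[str], remaining: list[str], blockers: list[str]) -> str:
    """Render the four handover topics as Markdown text (layout is hypothetical)."""
    def bullets(items: list[str]) -> str:
        return "\n".join(f"- {item}" for item in items) or "- (none)"

    return (
        "# ROTATION-HANDOVER\n\n"
        f"## Current task\n{task}\n\n"
        f"## Done\n{bullets(done)}\n\n"
        f"## Remaining\n{bullets(remaining)}\n\n"
        f"## Blockers\n{bullets(blockers)}\n"
    )
```

Writing the handover before the context is cleared is what lets the fresh session pick up the task without access to the earlier conversation.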

State Files

| Path | Content | Persistence |
|---|---|---|
| .vnx-data/unified_ledger.ndjson | Append-only receipts | Project |
| .vnx-data/unified_reports/ | Agent work reports | Project |
| .vnx/ | Project state | Project |
| ROTATION-HANDOVER.md | Context rotation handover | Temporary |
| PR_QUEUE.md | PR dispatch queue | Project |
| HANDOFF.md | Session handoff doc | Project |

Cross-Session Handoff

HANDOFF.md in root — session handoff documentation
Ledger provides complete audit trail for replay

Replay Capability

Yes — NDJSON ledger is designed for replay; vnx demo --replay governance-pipeline replays real sessions
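
To show why an append-only NDJSON file lends itself to replay, here is a minimal reader that yields receipts in the order they were written. It is not what `vnx demo --replay` does internally, which this profile does not describe; the function names are hypothetical and only the `dispatch_id` field is taken from the source.

```python
import json
from pathlib import Path
from typing import Iterator


def replay(ledger: Path) -> Iterator[dict]:
    """Yield receipts in append order, skipping blank lines."""
    with ledger.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


def receipts_for(ledger: Path, dispatch_id: str) -> list[dict]:
    """Collect every receipt that belongs to one dispatch."""
    return [r for r in replay(ledger) if r.get("dispatch_id") == dispatch_id]
```

Because the reader streams line by line, it does not need to load the whole ledger into memory.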

07

Orchestration

VNX Orchestration — Orchestration

Multi-Agent: Yes

4-terminal model:

  • T0: Orchestrator (Claude Opus, tmux or subprocess)
  • T1, T2: Worker terminals (any of 5 providers, tmux or subprocess)
  • T3: Investigation terminal (for follow-up on risk 0.3-0.8)

Orchestration Pattern

hierarchical (T0 dispatches to T1/T2/T3) + parallel (T1 and T2 run simultaneously)

Multi-Model: Yes

Per-terminal provider assignment:

| Terminal | Default Model | Note |
|---|---|---|
| T0 | Claude Opus (claude CLI) | Policy: Claude Opus only |
| T1 | Sonnet-pinned (Codex CLI) | Unless reconfigured |
| T2 | Sonnet-pinned (Codex CLI) | Unless reconfigured |
| T3 | Investigation terminal | Flexible |

Mix presets:

  • vnx start claude-codex — T1: Codex, T2: Claude Code
  • vnx start claude-gemini — T1: Gemini, T2: Claude Code
  • vnx start full-multi — T1: Codex, T2: Gemini

5 provider families: Claude (Opus/Sonnet/Haiku), Codex (gpt-5.2-codex), Gemini (2.5 Pro/Flash), Kimi CLI (K2.6), LiteLLM bridge (DeepSeek, GLM-5.1)

Isolation Mechanism

process (each terminal is an independent CLI subprocess with its own context window)
Plus: each dispatch is scoped (150-300 lines) to limit blast radius

Execution Mode

interactive-loop (tmux Operator Mode) + background-daemon (headless subprocess mode) + scheduled (CI cron)

Headless Mode

Any terminal can run as claude -p subprocess. Fully autonomous overnight execution with same governance controls.
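
Launching a worker as a `claude -p` subprocess could be sketched as below. The helper names, the timeout default, and the choice to capture output are assumptions; only the `claude -p` invocation is taken from the source, and running it requires the claude CLI to be installed and on PATH.

```python
import subprocess


def build_headless_command(prompt: str, binary: str = "claude") -> list[str]:
    """Build the argv for a non-interactive run using the -p flag named in the source."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return [binary, "-p", prompt]


def run_headless(prompt: str, timeout_s: float = 600.0) -> subprocess.CompletedProcess:
    """Run one worker and capture its output; needs the claude CLI on PATH."""
    return subprocess.run(
        build_headless_command(prompt),
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=False,  # the caller inspects returncode instead of catching an exception
    )
```

Passing the arguments as a list avoids shell interpolation of the prompt text, which matters when prompts are generated from task descriptions.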

Quality Gate Pattern

Triple gate (Codex + Gemini + CI) before merge. File-based locks prevent bypass. LLM never judges its own work.

Context Compaction

Automatic rotation at 65% context with ROTATION-HANDOVER.md protocol.

08

Ui Cli Surface

VNX Orchestration — UI & CLI Surface

CLI Binary: vnx (Python)

Binary: vnx (from bin/vnx, Python)
Install: git clone && ./install.sh /path/to/project
Is thin wrapper: No — full governance runtime

Subcommands

init, doctor, status, start, demo, gate-check, cost-report, update, version

Local Web Dashboard

Exists: Yes
Tech stack: HTML + Python (serve_dashboard.py + Flask/FastAPI)
Features:

  • Agent stream monitoring
  • Intelligence queries
  • Operator control panel
  • Recommendations
  • Token statistics
  • Token dashboard sub-module
  • PR queue visualization (PR_QUEUE.md + API)

Dashboard screenshot shown in README: "VNX multi-terminal orchestration — T0 orchestrator coordinating Claude Code, Codex CLI, and Claude Opus across parallel tracks"

tmux Grid (Operator Mode)

4-terminal 2x2 tmux grid:

  • T0 pane (orchestrator)
  • T1 pane (worker)
  • T2 pane (worker)
  • T3 pane (investigation)

Ctrl+G opens dispatch queue in terminal — shows pending tasks with role, priority, git ref.

Headless Mode

No tmux needed — all terminals can be claude -p subprocesses for CI/cron/autonomous operation.

Observability

  • .vnx-data/unified_ledger.ndjson — 1,400+ entry production receipt log
  • vnx cost-report — API spend per agent, per task type
  • Dashboard: token stats, agent streams
  • Hook payload logger: vnx_hook_payload_logger.sh

Related frameworks

same archetype · same primary tool · same memory type

OpenHarness ★ 13k

Open-source Python agent runtime providing complete harness infrastructure: tools, memory, governance, swarm coordination, and…

Trae Agent ★ 12k

Research-friendly open-source CLI coding agent by ByteDance, designed for academic ablation studies and modular LLM provider…

Sweep AI ★ 7.7k

Autonomous GitHub bot that converts issues to pull requests using a sequential multi-agent pipeline.

Agent Governance Toolkit (microsoft) ★ 2.3k

Enterprise-grade AI agent governance: YAML policy enforcement, 12-vector prompt injection defense, zero-trust identity,…

TDD Guard ★ 2.1k

Mechanically enforces the Red-Green-Refactor TDD cycle by blocking file writes that violate TDD principles via a PreToolUse hook…

Agentic Coding Flywheel Setup (ACFS) ★ 1.5k

Take a complete beginner from laptop to three AI coding agents running on a VPS in 30 minutes via an idempotent manifest-driven…