
Loki Mode

loki-mode · asklokesh/loki-mode · ★ 944 · last commit 2026-05-25

Takes a spec (PRD/issue/OpenAPI) and autonomously produces a production-ready Git repository through RARV cycles, 11 blocking quality gates, and 41 specialized agents.

Best when: Autonomous agents should never stop or ask questions; RARV cycles with graduated retry (3→simplify→5→dead-letter) and 11 quality gates are the only acceptab…
Skip if: Asking clarifying questions during autonomous execution, Shipping code that fails any of the 11 quality gates
vs seeds
claude-flow's Node.js, ships a web das…
Primitive shape: 116 total (Commands 30, Skills 20, Subagents 41, MCP tools 25)
00

Summary

Loki Mode — Summary

Loki Mode is an autonomous multi-agent system that takes a spec input (Markdown PRD, GitHub issue, OpenAPI/YAML doc, or one-line brief) and produces a Git repository with source code, tests, Docker configs, CI/CD pipelines, and audit logs — with minimal human intervention. Its central execution model is RARV (Reason-Act-Reflect-Verify) cycles driven by 41 specialized agent types organized into 8 domain swarms (Engineering, Operations, Business, Data, Product, Growth, Review, Orchestration), with 11 quality gates that must pass before code is considered done. It ships an npm CLI (loki), a Python MCP server (25 tools for task queue, memory, state management, and code search), a vanilla-JS web dashboard (port 57374), and an episodic/semantic/procedural memory system with ChromaDB vector search. At version 7.7.11 with 944 stars, Loki Mode is the most technically ambitious framework in this batch — it spans the full stack from PRD ingestion through deployment with automated rollback, blind 3-reviewer code review, anti-sycophancy checks, and a legacy code healing mode. Compared to seeds, it most closely resembles claude-flow's MCP-anchored, multi-agent swarm architecture but with a different primary language (Bash→Bun migration) and a stronger focus on PRD-to-deployed-product automation rather than task graph management.

01

Overview

Loki Mode — Overview

Origin

Created by asklokesh (asklokesh/loki-mode). BUSL-1.1 license (free for personal/internal/academic, commercial requires license). Version 7.7.11, pushed 2026-05-25 (very active). Python + TypeScript/Bun.

Philosophy

From SKILL.md:

"You are an autonomous agent. You make decisions. You do not ask questions. You do not stop." "Spec in, product out."

Loki Mode is built on the premise that a truly autonomous system should be able to convert any spec format into shipped code without hand-holding. The RARV (Reason-Act-Reflect-Verify) cycle is the fundamental execution primitive: every action is preceded by reasoning about priority, followed by reflection on the outcome, and gated by automated verification. A failed verification triggers a retry with a different approach; after 3 failures the agent switches to a simpler approach, and after 5 failures the task is logged to the dead-letter queue and the agent moves on. The policy rules out both silent failure and infinite loops.

The 11 quality gates are not optional checkpoints; they are blocking conditions. Critical/High/Medium severity findings block merges. The blind 3-reviewer system with anti-sycophancy check (if all 3 agree, run a Devil's Advocate reviewer) is explicitly designed to prevent the model from rubber-stamping its own work.

Key design philosophies

  1. Autonomy over interactivity: Unlike other frameworks in this batch, Loki Mode is explicitly designed to run unattended. It requires --dangerously-skip-permissions.
  2. Production quality as default: 11 quality gates, test mutation detector, backward compatibility gate, documentation coverage gate.
  3. Memory compounds over time: The compound learning system extracts novel solutions (bug fixes, non-obvious patterns) to ~/.loki/solutions/ for future reuse.
  4. Multi-provider resilience: Claude, Codex, Cline, Aider with automatic failover.
  5. Legacy healing as first-class concern: loki heal provides archaeology, stabilize, isolate, modernize, and validate phases for legacy codebases.

Runtime migration

Loki is undergoing a Bash-to-Bun migration. Most commands still run on the Bash runtime (autonomy/loki). Read-only commands (version, status, doctor, etc.) have been ported to Bun (loki-ts/). Rollback: LOKI_LEGACY_BASH=1.
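
The routing rule described above can be sketched as a small decision function. This is a minimal illustration, not the actual bin/loki shim: `choose_runtime` and `BUN_PORTED_COMMANDS` are hypothetical names, and the ported set contains only the three commands the source names explicitly (the rest are elided as "etc.").

```python
import os
import shutil

# Hypothetical set: the source names version, status, and doctor as ported
# and leaves the remaining read-only commands unspecified.
BUN_PORTED_COMMANDS = {"version", "status", "doctor"}


def choose_runtime(command, env=None, bun_available=None):
    """Return "bun" or "bash" for a given loki subcommand."""
    env = os.environ if env is None else env
    if bun_available is None:
        bun_available = shutil.which("bun") is not None

    # Rollback switch from the source: force the legacy Bash runtime.
    if env.get("LOKI_LEGACY_BASH") == "1":
        return "bash"

    # Only ported commands run on Bun, and only when Bun is installed.
    if command in BUN_PORTED_COMMANDS and bun_available:
        return "bun"

    # Everything else still runs on the Bash runtime (autonomy/loki).
    return "bash"
```

Under these assumptions, `choose_runtime("status", env={}, bun_available=True)` selects Bun, while the same call with `LOKI_LEGACY_BASH=1` in the environment falls back to Bash.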

02

Architecture

Loki Mode — Architecture

Distribution

  • npm package: loki-mode
  • Binary: loki
  • Version: 7.7.11
  • Install: bun install -g loki-mode (recommended) or npm install -g loki-mode
  • Also: brew tap asklokesh/tap && brew install loki-mode
  • Also: docker pull asklokesh/loki-mode:7.5.11

Required runtime

  • Bun 1.3.0+ (recommended; bash fallback if unavailable)
  • Python 3.x (for MCP server and memory engine)
  • Docker (optional, for sandbox mode)

Directory tree (source)

asklokesh/loki-mode/
├── bin/loki                   # Shim: routes to Bun CLI or bash fallback
├── autonomy/loki              # Main Bash runtime (30+ subcommands)
├── loki-ts/                   # TypeScript/Bun runtime (migration target)
│   └── dist/loki.js           # Bundled Bun binary (~152KB)
├── mcp/
│   └── server.py              # Python MCP server (25 tools)
├── memory/
│   └── engine.py              # Episodic/semantic/procedural memory + ChromaDB
├── dashboard-ui/              # Vanilla JS web components (port 57374)
├── dashboard/                 # Dashboard API server
├── skills/                    # 20 skill/reference markdown files
├── references/                # 24 reference documents
├── agents/                    # Agent type references
├── swarm/                     # Swarm coordination
├── vscode-extension/          # VS Code extension (bundled)
├── .loki/                     # Per-project state directory
│   ├── state/orchestrator.json
│   ├── queue/pending.json
│   ├── session.json
│   └── PAUSE / STOP           # Control files
└── SKILL.md                   # Claude Code skill entry point

State files (per project)

.loki/
├── state/orchestrator.json    # currentPhase, tasksCompleted, tasksFailed
├── queue/pending.json         # task queue
├── queue/dead-letter.json     # tasks moved here after 5 failures
├── session.json               # session registration (pid, provider, status)
├── docs/                      # API documentation coverage
├── healing/                   # Healing mode state
│   ├── friction-map.json
│   ├── behavioral-baseline/
│   └── characterization-tests/
├── metrics/                   # Performance benchmarks
└── solutions/                 # Compound learning repository

Target AI tools

Claude Code (primary), Codex, Cline, Aider (multi-provider via abstraction layer)

MCP server port

Default STDIO; HTTP mode available (--transport http)

Dashboard port

57374 (hardcoded default)

03

Components

Loki Mode — Components

CLI Subcommands (30+)

Key subcommands identified from autonomy/loki help text:

| Command | Purpose |
| --- | --- |
| loki start [PRD ISSUE … | Start an autonomous run from a spec |
| loki quick "<brief>" | One-line task without scaffolding |
| loki init <dir> | Initialize project |
| loki issue <url/ref> | Build from GitHub/GitLab/Jira issue |
| loki plan <prd> | Generate plan without executing |
| loki heal | Legacy codebase healing mode |
| loki memory list/index | Memory management |
| loki dashboard | Start web dashboard |
| loki web | Start web server |
| loki doctor | Health check |
| loki provider show/list | Provider management |
| loki self-update | In-place upgrade |
| loki checkpoint | Checkpoint/restore |
| loki audit | Audit report |
| loki metrics | Metrics report |
| loki status | Current project status |
| loki config | Configuration management |
| loki stop | Stop running session |
| loki cluster | Multi-machine cluster management |
| loki magic | Magic module commands |

MCP Server Tools (25)

Exposed via mcp/server.py:

| Tool | Purpose |
| --- | --- |
| loki_memory_retrieve | Retrieve from memory |
| loki_memory_store_pattern | Store a solution pattern |
| loki_task_queue_list | List task queue |
| loki_task_queue_add | Add task to queue |
| loki_task_queue_update | Update task status |
| loki_state_get | Get orchestrator state |
| loki_metrics_efficiency | Efficiency metrics |
| loki_consolidate_memory | Consolidate memory |
| loki_complete_task | Mark task complete |
| loki_start_project | Initialize project |
| loki_project_status | Project status |
| loki_agent_metrics | Agent performance metrics |
| loki_checkpoint_restore | Checkpoint management |
| loki_quality_report | Quality gate report |
| loki_code_search | ChromaDB code search |
| loki_code_search_stats | Code search statistics |
| mem_search | Memory search |
| mem_timeline | Memory timeline |
| mem_get | Get memory entry |
| loki_get_hotspots | Code hotspots |
| loki_get_co_changes | Co-change analysis |
| loki_get_doc_coverage | Documentation coverage |
| loki_findings | Review findings |
| loki_learnings | Compound learnings |
| loki_counter_evidence_template | Counter-evidence for review |

Agent Types (41 across 8 swarms)

Engineering (8): eng-frontend, eng-backend, eng-database, eng-mobile, eng-api, eng-qa, eng-perf, eng-infra

Operations (8): ops-devops, ops-sre, ops-security, ops-monitor, ops-incident, ops-release, ops-cost, ops-compliance

Business (8): biz-marketing, biz-sales, biz-finance, biz-legal, biz-support, biz-hr, biz-investor, biz-partnerships

Data (3): data-analyst, data-ml, data-infra

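Product (3+): product management, UX research roles

Growth, Review, Orchestration: agent types not itemized here. Together with Product, these swarms account for the remaining 14 of the 41 agent types.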

Skills / Reference Files (20)

skills/00-index.md, skills/agents.md, skills/quality-gates.md, skills/memory.md, skills/parallel-workflows.md, skills/healing.md, skills/testing.md, skills/providers.md, skills/documentation.md, skills/github-integration.md, and 10 more.

Dashboard UI Components

Vanilla JS Web Components (Shadow DOM):

  • loki-task-board — Kanban drag-and-drop
  • loki-session-control — Start/stop/pause buttons
  • loki-log-stream — Live log streaming
  • loki-metrics-panel — Performance metrics
05

Prompts

Loki Mode — Prompts

Excerpt 1: RARV Cycle — Autonomy Rule

Source: SKILL.md

## PRIORITY 2: Execute (RARV Cycle)

Every action follows this cycle. No exceptions.

REASON: What is the highest priority unblocked task?
   |
   v
ACT: Execute it. Write code. Run commands. Commit atomically.
   |
   v
REFLECT: Did it work? Log outcome.
   |
   v
VERIFY: Run tests. Check build. Validate against spec.
   |
   +--[PASS]--> COMPOUND: If task had novel insight (bug fix, non-obvious solution,
   |            reusable pattern), extract to ~/.loki/solutions/{category}/{slug}.md
   |            with YAML frontmatter (title, tags, symptoms, root_cause, prevention).
   |            See skills/compound-learning.md for format.
   |            Then mark task complete. Return to REASON.
   |
   +--[FAIL]--> Capture error in "Mistakes & Learnings".
                Rollback if needed. Retry with new approach.
                After 3 failures: Try simpler approach.
                After 5 failures: Log to dead-letter queue, move to next task.


Technique: State machine with explicit failure escalation tiers. The RARV cycle is a fixed 4-step loop where VERIFY is non-optional and COMPOUND is an automatic learning step on success. Failure follows a graduated retry policy (retry with a new approach, simplify after 3 failures, dead-letter after 5) rather than infinite looping or silent failure.
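
The escalation tiers can be sketched as a bounded loop. This is an illustrative reading of the SKILL.md excerpt, not Loki's implementation: `run_task`, `TaskResult`, and the `verify` callable (standing in for ACT + REFLECT + VERIFY) are hypothetical, and the strategy labels are invented for the example.

```python
from dataclasses import dataclass, field

SIMPLIFY_AFTER = 3     # failures before switching to a simpler approach
DEAD_LETTER_AFTER = 5  # failures before the task is parked


@dataclass
class TaskResult:
    status: str                 # "complete" or "dead-letter"
    failures: int               # number of failed verifications
    strategies: list = field(default_factory=list)


def run_task(verify, max_failures=DEAD_LETTER_AFTER):
    """Drive one task until verification passes or it is dead-lettered.

    `verify(strategy, attempt)` is a hypothetical callable that performs the
    work and returns True when verification passes.
    """
    failures = 0
    strategies = []
    while failures < max_failures:
        # Escalate to a simpler approach once 3 attempts have failed.
        strategy = "simplified" if failures >= SIMPLIFY_AFTER else "normal"
        strategies.append(strategy)
        if verify(strategy, failures + 1):
            return TaskResult("complete", failures, strategies)
        failures += 1
    # 5 failures: give up on this task so the loop can move to the next one.
    return TaskResult("dead-letter", failures, strategies)
```

The loop always terminates: a task either passes verification or reaches the dead-letter state after a fixed number of failures, which is the property the technique relies on.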


Excerpt 2: Context Load — Every Turn

Source: SKILL.md PRIORITY 1

## PRIORITY 1: Load Context (Every Turn)

Execute these steps IN ORDER at the start of EVERY turn:

  1. IF first turn of session:
    • Read skills/00-index.md
    • Load 1-2 modules matching your current phase
    • Register session: Write .loki/session.json with: {"pid": null, "startedAt": "", "provider": "", "invokedVia": "skill", "status": "running", "updatedAt": ""}
  2. Read .loki/state/orchestrator.json
    • Extract: currentPhase, tasksCompleted, tasksFailed
  3. Read .loki/queue/pending.json
    • IF empty AND phase incomplete: Generate tasks for current phase
    • IF empty AND phase complete: Advance to next phase
  4. Check .loki/PAUSE - IF exists: Stop work, wait for removal.
     Check .loki/STOP - IF exists: End session, update session.json status to "stopped".
  5. EVERY TURN: Update .loki/session.json "updatedAt" field to current ISO timestamp. This keeps the dashboard aware the skill session is alive.


Technique: Mandatory per-turn context reload. Every turn starts with reading the persistent state files (orchestrator.json, pending.json), checking control files (PAUSE, STOP), and updating the heartbeat timestamp. This enables the dashboard to detect stale sessions and enables external control of the agent via filesystem signals.
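
A rough sketch of the per-turn reload follows. It assumes that pending.json holds a JSON list and that STOP takes precedence over PAUSE when both files exist; neither detail is stated in the source. `load_turn_context` and `_read_json` are hypothetical helper names.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def _read_json(path, default):
    """Read a JSON file, returning `default` when it does not exist."""
    return json.loads(path.read_text()) if path.exists() else default


def load_turn_context(project_dir):
    """Return (action, state, tasks) for the start of one turn."""
    loki = Path(project_dir) / ".loki"
    session_path = loki / "session.json"
    session = _read_json(session_path, default={})

    # Filesystem control signals. Assumption: STOP wins over PAUSE.
    if (loki / "STOP").exists():
        action = "stop"
        session["status"] = "stopped"
    elif (loki / "PAUSE").exists():
        action = "pause"
    else:
        action = "run"

    state = _read_json(loki / "state" / "orchestrator.json", default={})
    tasks = _read_json(loki / "queue" / "pending.json", default=[])

    # Heartbeat: refresh updatedAt every turn so observers can detect liveness.
    session["updatedAt"] = datetime.now(timezone.utc).isoformat()
    loki.mkdir(parents=True, exist_ok=True)
    session_path.write_text(json.dumps(session, indent=2))
    return action, state, tasks
```

Because every input is a file, an operator can pause or stop the agent with `touch .loki/PAUSE` or `touch .loki/STOP` from any terminal, without any channel into the running session.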


Excerpt 3: Quality Gate 4 — Anti-Sycophancy

Source: skills/quality-gates.md

## Gate 4: Anti-Sycophancy Check

If [blind reviewer] unanimous approval, run Devil's Advocate reviewer

**Checks:**
- If all 3 reviewers approve → spawn additional "Devil's Advocate" reviewer
- Devil's Advocate is explicitly instructed to find problems, not validate
- If Devil's Advocate finds issues → treat as reviewer finding
- If Devil's Advocate finds nothing → unanimous approval confirmed

Technique: Adversarial subagent as sycophancy check. The Devil's Advocate reviewer is only spawned when unanimous approval would otherwise pass — turning reviewer consensus into a trigger for additional scrutiny rather than a fast exit.
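
The trigger logic can be illustrated with plain callables standing in for reviewer agents. This is a sketch under simplifying assumptions: a reviewer "approves" when it returns no findings, and severity handling (Gate 6) is not modeled. `review_with_devils_advocate` is a hypothetical name.

```python
def review_with_devils_advocate(reviewers, devils_advocate, change):
    """Run blind reviewers; on unanimous approval, run one adversarial pass.

    Each reviewer is a hypothetical callable taking the change and returning
    a list of findings. An empty list counts as approval.
    """
    findings = []
    for reviewer in reviewers:
        # Reviewers only see the change, never each other's findings.
        findings.extend(reviewer(change))

    if findings:
        # Not unanimous: the Devil's Advocate is not needed.
        return {"approved": False, "findings": findings,
                "devils_advocate_ran": False}

    # Unanimous approval is the trigger for extra scrutiny, not a fast exit.
    extra = list(devils_advocate(change))
    return {"approved": not extra, "findings": extra,
            "devils_advocate_ran": True}
```

With three reviewers that all return empty lists, the result is approved only if the Devil's Advocate also returns nothing; any finding it raises is treated like an ordinary reviewer finding.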

09

Uniqueness

Loki Mode — Uniqueness

Differs from Seeds

Closest seed is claude-flow (MCP-anchored, 41+ agent types, SQLite+vector memory, swarm patterns). Loki Mode differs from claude-flow in three ways: (1) Loki uses Python + Bash/Bun as its primary runtime vs claude-flow's Node.js, and ships a web dashboard as a first-class UI rather than TUI; (2) Loki's 11 quality gates are more prescriptive and blocking — the anti-sycophancy check (run Devil's Advocate on unanimous reviewer approval) and test mutation detector (flag assertion value changes alongside implementation changes) are novel gate types not seen in claude-flow; (3) Loki's compound learning system (~/.loki/solutions/) builds a global knowledge base of novel solutions across projects, whereas claude-flow's memory is per-project. Loki Mode is also unique in shipping a legacy healing mode (loki heal) with a friction map, characterization tests, and backward compatibility gate — no seed addresses legacy codebase modernization.

Positioning

Loki Mode is the "autonomous startup builder": it targets developers who want to describe what they're building and walk away, trusting the system to produce tested, documented, deployed code. It is the most fully-featured autonomous agent framework in this batch, with the largest CLI surface, the most MCP tools, and the richest quality gate system, and it is the only framework in the batch with a web dashboard and Docker support.

Observable Failure Modes

  1. BUSL-1.1 license: Not truly open source for commercial use. Creates adoption friction for enterprise teams.
  2. Bash runtime deprecation timeline: Phase 6 (sunset bash) has no firm date. Users must track UPGRADING.md carefully.
  3. Gemini CLI deprecated in v7.5.18: Teams relying on Gemini had no warning period.
  4. Autonomy by default: --dangerously-skip-permissions is required for normal operation. Security-conscious teams may be uncomfortable.
  5. Context window size: 41 agent types, 25 MCP tools, 20 skill files — loading the full system for a simple task is expensive.
  6. Python dependency: The MCP server and memory engine require Python in addition to Bun, making the install more complex than other npm tools.
04

Workflow

Loki Mode — Workflow

Main Execution Flow

loki start <spec>
    ↓
detect_complexity() → simple (3 phases) | complex (8 phases)
    ↓
assemble agent team (5-10 for simple, more for complex)
    ↓
RARV cycles per task:
    REASON: highest priority unblocked task
    ACT: execute → write code, run commands, commit atomically
    REFLECT: did it work? log outcome
    VERIFY: run tests, check build, validate against spec
        ↓ PASS → COMPOUND: extract novel insight to ~/.loki/solutions/
        ↓ FAIL → retry with new approach → after 3 failures: simplify → after 5 failures: dead-letter queue
    ↓
11 Quality Gates (blocking)
    ↓
Output: Git repo with source, tests, Docker, CI/CD, audit logs

11 Quality Gates

| Gate | Check | Block Level |
| --- | --- | --- |
| 1 | Input Guardrails (scope validation, injection detection) | Critical |
| 2 | Static Analysis (CodeQL, ESLint/Pylint, type checking) | High |
| 3 | Blind Review (3 parallel reviewers, no cross-visibility) | High |
| 4 | Anti-Sycophancy (if all 3 agree → run Devil's Advocate) | High |
| 5 | Output Guardrails (code quality, spec compliance, no secrets) | Critical |
| 6 | Severity-Based Blocking (Critical/High/Medium = BLOCK) | Variable |
| 7 | Test Coverage (Unit: 100% pass, >80% coverage; Integration: 100% pass) | High |
| 8 | Mock Detector (flags tautological assertions, high internal mock ratio) | Medium |
| 9 | Test Mutation Detector (detects assertion value changes alongside impl changes) | Medium |
| 10 | Backward Compatibility (friction map, behavioral preservation, healing mode) | High |
| 11 | Documentation Coverage (README, API docs, staleness check within 10 commits) | Medium |
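
Two of the gates above have thresholds concrete enough to sketch. The example below assumes findings are dicts with a `severity` key and that any level other than Critical/High/Medium (for example "low") does not block; the source does not name the non-blocking levels. `severity_gate` and `coverage_gate` are hypothetical names.

```python
BLOCKING_SEVERITIES = {"critical", "high", "medium"}  # Gate 6


def severity_gate(findings):
    """Gate 6 sketch: block on any Critical, High, or Medium finding."""
    blocking = [f for f in findings
                if f["severity"].lower() in BLOCKING_SEVERITIES]
    return {"blocked": bool(blocking), "blocking_findings": blocking}


def coverage_gate(unit_passed, unit_total, coverage_pct,
                  integration_passed, integration_total):
    """Gate 7 sketch: 100% pass rates and more than 80% unit coverage."""
    return (unit_passed == unit_total
            and coverage_pct > 80
            and integration_passed == integration_total)
```

For instance, a single Medium finding is enough to block under `severity_gate`, and `coverage_gate` fails at exactly 80% coverage because the table states the threshold as strictly greater than 80%.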

Session Control

  • .loki/PAUSE file → agent stops work, waits for removal
  • .loki/STOP file → agent ends session, writes terminal status
  • Session heartbeat: session.json updated every turn; sessions not updated in 5 minutes marked stale
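
Stale-session detection from the heartbeat can be sketched as a timestamp comparison. This assumes `updatedAt` is an ISO 8601 string, as the SKILL.md excerpt suggests, and treats a missing or empty value as stale; `is_stale` is a hypothetical name, and how the dashboard actually performs this check is not shown in the source.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(minutes=5)


def is_stale(session, now=None):
    """Return True when the session heartbeat is older than 5 minutes."""
    now = now or datetime.now(timezone.utc)
    updated = session.get("updatedAt")
    if not updated:
        # Never updated (the registration template starts with "").
        return True
    # Accept a trailing "Z", which older fromisoformat versions reject.
    last = datetime.fromisoformat(updated.replace("Z", "+00:00"))
    if last.tzinfo is None:
        # Assumption: naive timestamps are UTC.
        last = last.replace(tzinfo=timezone.utc)
    return now - last > STALE_AFTER
```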

Phase Detection

  • Simple complexity: 3 phases
  • Complex complexity: 8 phases
  • Override: --simple or --complex flags

Compound Learning

Every RARV pass that yields a novel insight extracts a solution to ~/.loki/solutions/{category}/{slug}.md with YAML frontmatter (title, tags, symptoms, root_cause, prevention). Because ~/.loki/solutions/ is global, this builds over time into institutional knowledge shared across projects.
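
A sketch of what writing one solution file might look like. The five frontmatter keys come from the source; their value shapes (tags and symptoms as lists, the rest as plain strings) are assumptions, and the authoritative format is the one in skills/compound-learning.md. The helper names are hypothetical, and values are assumed to be simple scalars that need no YAML quoting.

```python
from pathlib import Path


def render_solution(title, tags, symptoms, root_cause, prevention, body):
    """Render a solution document with YAML frontmatter (assumed layout)."""
    lines = ["---", f"title: {title}", "tags: [" + ", ".join(tags) + "]",
             "symptoms:"]
    lines += [f"  - {s}" for s in symptoms]
    lines += [f"root_cause: {root_cause}", f"prevention: {prevention}",
              "---", "", body, ""]
    return "\n".join(lines)


def solution_path(root, category, slug):
    """Map a category and slug to {root}/{category}/{slug}.md."""
    return Path(root) / category / f"{slug}.md"


def write_solution(root, category, slug, text):
    """Write the rendered solution, creating the category directory."""
    path = solution_path(root, category, slug)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text)
    return path
```

In Loki the root would be ~/.loki/solutions/; it is a parameter here so the sketch can be exercised against a temporary directory.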

06

Memory Context

Loki Mode — Memory & Context

Memory System

Three memory types implemented in memory/engine.py:

| Type | Storage | Purpose |
| --- | --- | --- |
| Episodic | File-based + vector | Records of specific task outcomes, fix attempts |
| Semantic | ChromaDB vector DB | Concept-level knowledge, searchable by similarity |
| Procedural | File-based | Workflows, patterns, compound learning solutions |

State Files

| File | Purpose |
| --- | --- |
| .loki/state/orchestrator.json | currentPhase, tasksCompleted, tasksFailed |
| .loki/queue/pending.json | Active task queue |
| .loki/queue/dead-letter.json | Tasks moved here after 5 failures |
| .loki/session.json | Session registration with heartbeat timestamp |
| .loki/PAUSE | Control file: agent pauses when present |
| .loki/STOP | Control file: agent terminates when present |
| ~/.loki/solutions/ | Global compound learning repository |

Memory Persistence

Project-level (.loki/) + global (~/.loki/solutions/). Solutions extracted during compound learning are global, persisting across projects.

Search Mechanism

Vector search via ChromaDB (loki_code_search MCP tool). Also BM25 full-text search for code search (rank-bm25 dependency in pyproject.toml).

Context Compaction

Loki reads only 1-2 skill modules matching the current phase at session start (lazy loading from skills/00-index.md). Persisted state also reduces the need for compaction: a new session reads fresh state files rather than replaying conversation history.

Cross-Session Handoff

Yes — state files, task queue, and session registration all persist. loki status reads from .loki/state/ to show progress from any terminal.

07

Orchestration

Loki Mode — Orchestration

Multi-Agent

Yes. 41 agent types across 8 swarms. The orchestrator typically spawns 5-10 agents for simple projects and more for complex ones. Gate 3 (Blind Review) spawns 3 parallel review agents with no cross-visibility, and Gate 4 adds a 4th Devil's Advocate reviewer on unanimous approval.

Orchestration Pattern

Parallel fan-out with hierarchical coordination. The orchestrator dispatches tasks to specialized agents (fan-out), collects results, and runs aggregate review gates (hierarchical). Gate 3's blind parallel review is explicitly a parallel-fan-out with isolation.
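
The fan-out-then-aggregate shape can be sketched with a thread pool. This is only an illustration of the pattern: real agents are spawned through the Claude Code Task tool, whereas here each agent is a hypothetical callable, and isolation is approximated by handing each one its own copy of the task.

```python
from concurrent.futures import ThreadPoolExecutor


def fan_out(agents, task):
    """Run every agent on its own copy of the task and collect the results.

    `agents` maps a name to a callable taking a task dict. No agent receives
    another agent's output; aggregation happens only after all have finished.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(agents))) as pool:
        futures = {name: pool.submit(agent, dict(task))
                   for name, agent in agents.items()}
        # Collecting results is the hierarchical step: only the caller
        # (the orchestrator role) sees all outputs together.
        return {name: future.result() for name, future in futures.items()}
```

A caller would pass something like `{"security": review_security, "perf": review_perf}` and then feed the combined dict into whatever gate evaluates the results; both reviewer names are placeholders.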

Agent Spawn Mechanism

Claude Code Task tool (general-purpose subagent_type with role-specific prompts). From skills/agents.md:

Task(
    subagent_type="general-purpose",
    model="opus",
    description="Security review: auth module",
    prompt="You are a security reviewer. Focus on: ..."
)

Isolation Mechanism

Git worktrees for parallel mode (loki start --parallel). Also Docker sandbox mode (loki start --sandbox).

Multi-Model

Yes. 4 providers with automatic failover: Claude, Codex, Cline, Aider. Model tiers (abstract) mapped to providers via loki provider configuration. Gemini CLI deprecated as of v7.5.18.
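
Automatic failover across providers can be sketched as an ordered try-next loop. The four provider names come from the source; the priority order, the `ProviderError` type, and the client callables are assumptions made for the example, not Loki's abstraction layer.

```python
# Assumed priority order; the source lists the providers but not their order.
PROVIDER_ORDER = ["claude", "codex", "cline", "aider"]


class ProviderError(Exception):
    """Hypothetical error raised when a provider cannot serve a request."""


def run_with_failover(prompt, clients, order=PROVIDER_ORDER):
    """Try each configured provider in order; return (name, response)."""
    errors = {}
    for name in order:
        client = clients.get(name)
        if client is None:
            continue  # provider not configured
        try:
            return name, client(prompt)
        except ProviderError as exc:
            # Record the failure and fall through to the next provider.
            errors[name] = str(exc)
    raise ProviderError(f"all providers failed: {errors}")
```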

Execution Mode

Continuous/background daemon. loki start runs autonomously until complete. loki start --bg runs in background. --watch mode polls for new tasks continuously.

Consensus Mechanism

Blind review quorum (3 reviewers + optional Devil's Advocate) for code review gating. This is a simulated multi-reviewer quorum, not a distributed consensus protocol such as Raft or Byzantine fault tolerance.

Crash Recovery

Yes — session registration with heartbeat, task queue with dead-letter, checkpoint/restore (loki checkpoint). The STOP control file provides graceful termination.

Context Compaction

Yes — lazy skill module loading. Only 1-2 skill modules loaded per session phase. Session.json heartbeat enables fresh state on restart.

Cross-Session Handoff

Yes — full state persistence in .loki/.

Prompt Chaining Pattern

Yes — each RARV cycle's output (code commits, test results) feeds the next cycle's REASON step via the task queue and orchestrator state.

08

UI CLI Surface

Loki Mode — UI & CLI Surface

CLI Binary

  • Name: loki
  • Package: loki-mode (npm)
  • Entry: bin/loki (shim → Bun or Bash)
  • Version: 7.7.11

Key CLI Subcommands

start, quick, init, issue, plan, heal, memory, dashboard, doctor, provider, self-update, checkpoint, audit, metrics, status, config, stop, cluster, magic + ~15 more

Local Web Dashboard

  • Port: 57374
  • Tech stack: Vanilla JavaScript ES6 Modules, Shadow DOM Web Components, no framework
  • Components: loki-task-board (Kanban), loki-session-control (start/stop/pause), loki-log-stream (live logs), loki-metrics-panel
  • Launch: loki dashboard or loki start --api
  • Features: Kanban task tracking, session control, live log streaming, metrics

VS Code Extension

Ships vscode-extension/ — not published separately, bundled with the npm package.

MCP Server

  • mcp/server.py — Python FastMCP server
  • 25 tools
  • Transports: STDIO (default) or HTTP (--transport http)
  • Config: .mcp.json in project root

Observability

  • .loki/session.json with heartbeat — dashboard reads this to detect stale sessions
  • .loki/state/orchestrator.json — phase progress
  • .loki/metrics/migration_bench_soak.jsonl — performance benchmarks
  • loki audit — structured audit report
  • loki metrics — efficiency report

Docker Support

docker pull asklokesh/loki-mode:7.5.11
docker run --rm asklokesh/loki-mode:7.5.11 start prd.md

Also: docker-compose.yml for multi-service (API + dashboard), sandbox containers.
