CogniLayer v4

cognilayer · LakyFx/CogniLayer · ★ 28 · last commit 2026-05-18

Primitive shape 56 total
Commands 10 Hooks 5 MCP tools 41
00

Summary

CogniLayer v4 — Summary

CogniLayer v4 is a Python MCP server with 41 tools spanning memory management, code intelligence (tree-sitter AST), safety gates (Identity Card system), and a novel multi-agent orchestration protocol. The claimed token savings are 80-200K+ tokens/session via a subagent memory protocol where research subagents compress 40K-token findings into 500-token summaries stored in SQLite, plus semantic search replacing file reads (3 targeted queries at ~800 tokens vs 15 file reads at ~60K tokens).

The "14 Fact Types" memory model (error_fix, gotcha, api_contract, decision, pattern, procedure, etc.) with heat-decay relevance scoring and "Session Bridges" for crash recovery distinguish it from simpler key-value memory stores. It installs via python install.py, ships a Textual TUI dashboard with 8 tabs, integrates with both Claude Code and Codex CLI, and generates AGENTS.md instructions for Codex compatibility.

Compared to the seed frameworks, CogniLayer is closest to ccmemory (typed facts, graph relationships) but adds code intelligence (tree-sitter code_context, code_impact), a TUI dashboard, multi-agent orchestration tools, and a safety layer (Identity Card deployment guards). Unlike ccmemory's Neo4j + Docker, CogniLayer uses SQLite with optional vector search (fastembed + sqlite-vec). The explicit "80-200K+ tokens saved" claim is verbatim from the README and is the most specific token-reduction claim in this batch.

01

Overview

CogniLayer v4 — Overview

Origin

LakyFx. License: NOASSERTION (Elastic License 2.0 per badge in README). Version 4.3.0. Python 3.11+. Bilingual (English + Czech, via i18n.py).

Philosophy

From README:

"Stop re-explaining your codebase to AI. Infinite speed memory + code graph for Claude Code & Codex CLI. 17 MCP tools, subagent protocol, hybrid search, TUI dashboard, crash recovery. Save 80-200K+ tokens/session."

The core insight is the Subagent Memory Protocol:

"Without CogniLayer, the subagent returns a 40K-token dump into parent context. With CogniLayer's Subagent Memory Protocol: Parent receives ~500 tokens. 40K tokens compressed to 500."

Key design principles:

  1. 14 Fact Types — not dumb notes but typed semantic memory (error_fix, gotcha, api_contract, decision, pattern, procedure, etc.)
  2. Heat Decay — hot facts surface first, cold facts fade; each search hit boosts relevance
  3. Code Intelligence: code_context + code_impact powered by tree-sitter AST, not grep
  4. Session Bridges — every session starts with what happened last time, including the blocker
  5. Safety Gates — Identity Card system blocks deploy to wrong server; audit trail on every safety change
  6. Agent Interop — Claude Code and Codex CLI share the same SQLite brain

README Claims

  • "3 targeted queries (~800 tokens) replace 15 file reads (~60K tokens)"
  • "40K tokens compressed to 500" (subagent protocol)
  • "80-200K+ tokens saved per session"
  • "Longer sessions with subagents save more"
02

Architecture

CogniLayer v4 — Architecture

Distribution

Standalone repo — clone and run python install.py.

Install Methods

git clone https://github.com/LakyFx/CogniLayer.git
cd CogniLayer
python install.py          # Claude Code
python install.py --codex  # Codex CLI only
python install.py --both   # Both

Directory Structure

CogniLayer/
├── mcp-server/              # MCP server (Python)
│   ├── server.py            # 41 tools registered
│   ├── tools/               # One file per tool
│   ├── search/              # FTS5 + vector search
│   ├── code/                # Tree-sitter AST
│   ├── indexer/             # File indexer
│   ├── db.py                # SQLite access
│   ├── embedder.py          # fastembed wrapper
│   ├── i18n.py              # EN/CS translation
│   └── agent_orchestration.py  # Multi-agent policy
├── hooks/                   # Claude Code hooks
│   ├── on_session_start.py
│   ├── on_session_end.py
│   ├── on_pre_compact.py
│   ├── on_file_change.py
│   └── register.py          # Hook registration
├── commands/                # Slash commands (CS + EN)
├── tui/                     # Textual TUI
│   ├── app.py
│   ├── screens/
│   └── widgets/
├── config.yaml              # Configuration
├── install.py               # Installer
├── diagnose.py              # Diagnostic tool
└── bin/
    └── cognilayer           # TUI launcher script

State Files

  • ~/.cognilayer/memory.db — SQLite database (facts, code index, embeddings via sqlite-vec)
  • ~/.cognilayer/logs/cognilayer.log — server log
  • ~/.cognilayer/active_session.json — current session state
  • ~/.cognilayer/sessions/ — per-session state files
  • ~/.cognilayer/config.yaml — configuration
  • ~/.cognilayer/codex/ — Codex CLI AGENTS instructions

Required Runtime

  • Python 3.11+
  • pip packages: mcp, pyyaml, textual
  • Optional: fastembed, sqlite-vec (vector search), tree-sitter-language-pack (code intelligence)

Target AI Tools

Claude Code (hooks + MCP), Codex CLI (AGENTS.md static instructions + MCP).

03

Components

CogniLayer v4 — Components

MCP Tools (41, from server.py)

Memory Tools (9)

memory_search, memory_write, operational_lesson_search, operational_lesson_record, memory_delete, project_context, session_bridge, decision_log, session_init

Knowledge Graph Tools (2)

memory_link, memory_chain

File Intelligence Tools (2)

file_search, file_index

Code Intelligence Tools (4)

code_index, code_search, code_context, code_impact

Safety / Identity Tools (3)

verify_identity, identity_set, recommend_tech

Multi-Agent Orchestration Tools (21)

agent_policy_read, agent_run_start, agent_run_finish, agent_event_write, claim_scope, release_scope, list_claims, register_handoff, agent_memory_stage, agent_memory_promote, agent_run_heartbeat, agent_delegate_plan, agent_run_digest, agent_run_list, agent_handoff_inbox, agent_handoff_resolve, agent_context_brief, agent_writer_run_start, agent_review_run_start, agent_research_wave_start, agent_research_wave_collect, agent_research_wave_finalize

Hooks (5, Claude Code)

Hook | File | Purpose
SessionStart | on_session_start.py | Inject session bridge, crash recovery state
SessionEnd | on_session_end.py | Save session state, update memory
PreCompact | on_pre_compact.py | Backup state before compaction
FileChange | on_file_change.py | Trigger code re-indexing on edit
(codex) | generate_agents_md.py | Generate AGENTS.md for Codex CLI

Slash Commands (in commands/cs/ and commands/en/)

/status, /recall, /harvest, /onboard, /onboard-all, /forget, /identity, /consolidate, /tui, /cognihelp

TUI Dashboard

cognilayer binary → Textual TUI with 8 tabs:

  • Overview, Facts, Code, Sessions, Safety, Search, Config, Help

Scripts

  • install.py — installer (registers hooks + MCP in Claude Code / Codex)
  • diagnose.py — diagnostic tool with --fix flag
  • hooks/register.py — hook registration
  • hooks/install_codex_command_lessons.py — Codex integration
  • hooks/codex_command_lessons.py — command lessons for Codex

Subagents

No explicitly spawned subagents; the multi-agent tools provide a protocol for agents to coordinate when running in parallel (e.g., agent_research_wave_start / collect / finalize for fan-out research).

05

Prompts

CogniLayer v4 — Prompts

Prompt File 1: SessionStart Hook Output (from on_session_start.py)

Technique: Dynamic context injection with structured sections delimited by a magic string.

COGNILAYER_START = "# === COGNILAYER (do not delete) ==="
COGNILAYER_END = "# === END COGNILAYER ==="

The hook injects a block like:

# === COGNILAYER (do not delete) ===
## Session Bridge
Progress: Migrated 3/5 API endpoints to v2 format.
Done: /users, /products, /orders. Open: /payments, /shipping.
Blocker: /payments needs Stripe SDK v12 upgrade first.

## Memory Briefs
- gotcha: Stripe SDK v12 changed webhook signature verification (async)
- error_fix: Fixed duplicate orders by adding idempotency key check
# === END COGNILAYER ===

Technique used: Magic-string delimited injection block; structured fact types; actionable blocker highlighting.
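
The delimited-block technique can be sketched as an idempotent "replace or append" over a text file. This is a minimal illustration, not the code in on_session_start.py: the two constants are taken from the source, while the function name upsert_block and the append-when-missing behavior are assumptions made for the example.

```python
COGNILAYER_START = "# === COGNILAYER (do not delete) ==="
COGNILAYER_END = "# === END COGNILAYER ==="


def upsert_block(document: str, body: str) -> str:
    """Replace the delimited block if present, otherwise append a new one.

    Hypothetical helper: the real hook may behave differently.
    """
    block = f"{COGNILAYER_START}\n{body.strip()}\n{COGNILAYER_END}"
    start = document.find(COGNILAYER_START)
    end = document.find(COGNILAYER_END, start + 1) if start != -1 else -1
    if start == -1 or end == -1:
        # No complete block yet: append one, separated by a blank line.
        prefix = document.rstrip("\n")
        return f"{prefix}\n\n{block}\n" if prefix else f"{block}\n"
    # Keep everything outside the markers untouched.
    return document[:start] + block + document[end + len(COGNILAYER_END):]
```

Because the markers bound the managed region, running the function on every session start rewrites only that region and leaves user-written text around it alone.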

Prompt File 2: Codex CLI AGENTS.md (generated by hooks/generate_agents_md.py)

Technique: Static instruction file generated at install time; tells Codex what tools to use and how.

From the codex workflow files:

# ~/.cognilayer/codex/onboard.md
Scan the project and save key facts to memory using memory_write.
Fact types to capture: api_contract, pattern, decision, gotcha.

# ~/.cognilayer/codex/harvest.md
After completing work, extract and save learnings to memory.

Technique used: Role-playing instruction files (Codex reads AGENTS.md as instructions), one file per workflow phase.

Prompt File 3: Tool Description (i18n.py)

All tool descriptions are internationalized. Example (from server.py → i18n.py):

"tool.memory_search.desc": "Search factual memory for relevant knowledge. Use before reading files to check what's already known."

Technique used: Constraint injection ("Use before reading files") in MCP tool description to shape agent retrieval behavior.

09

Uniqueness

CogniLayer v4 — Uniqueness

Differs From Seeds

Closest seed: ccmemory — both use typed fact storage (ccmemory: Decision/Correction/FailedApproach nodes; CogniLayer: 14 named fact types including error_fix, gotcha, api_contract). Key deltas: (1) CogniLayer uses SQLite + optional sqlite-vec rather than Neo4j + Docker, lowering infrastructure bar. (2) CogniLayer adds a full code intelligence layer (code_context, code_impact via tree-sitter AST) not present in ccmemory. (3) CogniLayer ships 21 multi-agent orchestration tools (claim_scope, handoff, research_wave) compared to ccmemory's zero. (4) CogniLayer ships a Textual TUI dashboard; ccmemory has no local UI. (5) CogniLayer targets both Claude Code AND Codex CLI; ccmemory targets only Claude Code. Against claude-flow (the other multi-agent seed): CogniLayer's orchestration is memory-first (agents coordinate via a shared fact store), while claude-flow uses hive-mind consensus + HNSW vector store + 305 MCP tools.

Positioning

"The knowledge layer + code intelligence + safety + multi-agent coordination, all in one SQLite database." CogniLayer is trying to be a complete "second brain" for AI coding agents rather than just a memory store.

Observable Failure Modes

  1. Elastic License — NOASSERTION license, appears to be Elastic License 2.0 which prohibits SaaS use. Users building commercial products need to check licensing carefully.
  2. Token savings unverified — "80-200K+" is a plausible range but no reproducible benchmark is cited; actual savings depend heavily on workflow.
  3. 41 tools = large tool list overhead — injecting 41 tool descriptions into Claude's context uses tokens before any work begins.
  4. Optional vector search — without fastembed + sqlite-vec, search degrades to FTS5 only; semantic retrieval for different-wording queries fails.
  5. Code index staleness: the on_file_change.py hook triggers re-indexing, but changes made outside Claude Code (git pull, etc.) require a manual code_index call.
  6. Multi-agent protocol complexity — 21 orchestration tools require careful protocol coordination; misuse (e.g., forgetting release_scope after claim_scope) can deadlock agents.
04

Workflow

CogniLayer v4 — Workflow

Phases

Phase | What Happens | Artifact
Install | python install.py | ~/.cognilayer/, hooks registered, MCP configured
Onboard | /onboard scans project, builds initial memory | ~100-500 facts in memory.db
SessionStart | Hook injects session bridge, crash recovery state | Context in Claude's conversation
Active Work | Agent uses code_context, memory_search, file_search | Code graph + memory consulted
Subagent Research | Subagent uses agent_research_wave_start/collect/finalize | ~500 token summary stored in DB
Harvest | /harvest extracts knowledge from current session | New facts written to memory.db
SessionEnd | Hook saves session state | Session bridge for next session
Consolidate | /consolidate clusters facts, detects contradictions | Organized memory

Subagent Memory Protocol (Key Innovation)

Parent starts research wave: agent_research_wave_start("MCP frameworks")
→ Subagents write findings: agent_research_wave_collect(wave_id, "Saved 3 facts, search 'MCP'")
→ Finalize: agent_research_wave_finalize(wave_id)
← Parent gets: 500-token summary instead of 40K tokens
→ On-demand detail: memory_search("MCP ecosystem")

Safety Gate

Before any deployment:

  1. verify_identity checks if current environment matches the Identity Card
  2. If mismatch: deploy blocked
  3. Override requires explicit identity_set — creates audit trail
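
The three steps above can be sketched as a small guard. This is a hedged illustration only: the IdentityCard fields, the function signatures, and the audit-log format are hypothetical, and the real verify_identity and identity_set tools may take different arguments.

```python
from dataclasses import dataclass, field


@dataclass
class IdentityCard:
    # Hypothetical shape of an Identity Card; field names are assumptions.
    project: str
    deploy_host: str
    audit_log: list[str] = field(default_factory=list)


def verify_identity(card: IdentityCard, target_host: str) -> bool:
    """Return True only when the deploy target matches the card."""
    return target_host == card.deploy_host


def identity_set(card: IdentityCard, new_host: str, who: str) -> None:
    """Explicit override: change the card and leave an audit entry."""
    card.audit_log.append(f"{who}: deploy_host {card.deploy_host} -> {new_host}")
    card.deploy_host = new_host


def deploy(card: IdentityCard, target_host: str) -> str:
    """Refuse to deploy when the target does not match the card."""
    if not verify_identity(card, target_host):
        return f"BLOCKED: {target_host} does not match {card.deploy_host}"
    return f"deploying {card.project} to {target_host}"
```

In this sketch the only way past a block is identity_set, so every override leaves a record of who changed the target and what it was before.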

Approval Gates

None automated — safety gates are advisory (block deploy, but agent could override).

Crash Recovery

On SessionStart hook:

  1. Check ~/.cognilayer/active_session.json
  2. If stale session found, load last known state
  3. Inject "Progress: Migrated 3/5 API endpoints..." into context

Spec Format

None — CogniLayer stores typed facts, not structured specs.

06

Memory Context

CogniLayer v4 — Memory & Context

Memory Model

sqlite + optional vector-db — primary store is SQLite (memory.db). Optional vector search via fastembed + sqlite-vec extension. Tree-sitter code index stored in same SQLite.

Persistence Scope

global: ~/.cognilayer/memory.db stores all facts across all projects. Facts are tagged by project for filtering.

14 Fact Types

CogniLayer distinguishes knowledge by semantic type:

Type | Meaning
error_fix | Something that broke and how it was fixed
gotcha | Subtle behaviors or surprising edge cases
api_contract | Expected API behavior, request/response shapes
decision | Architectural or implementation choices made
pattern | Reusable code patterns in this codebase
procedure | Step-by-step processes (deploy, test, etc.)
fact | General project facts
reference | External documentation links
+ 6 more (not exhaustively listed in README)

Heat Decay

Facts have a heat score. Each memory_search hit that retrieves a fact boosts its heat. Old, never-retrieved facts decay. Hot facts surface first in search results.
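
The source does not give the scoring formula, so the following is only one plausible way to model "boost on hit, decay when idle". Exponential decay, the 30-day half-life, and the fixed boost of 1.0 are all assumptions chosen for illustration, as are the names Fact, current_heat, record_hit, and rank.

```python
import math
from dataclasses import dataclass

HALF_LIFE_DAYS = 30.0  # assumption: heat halves after 30 idle days
HIT_BOOST = 1.0        # assumption: each search hit adds a fixed amount


@dataclass
class Fact:
    text: str
    heat: float = 1.0
    last_touched_day: float = 0.0


def current_heat(fact: Fact, today: float) -> float:
    """Heat after exponential decay since the fact was last retrieved."""
    idle_days = max(0.0, today - fact.last_touched_day)
    return fact.heat * math.pow(0.5, idle_days / HALF_LIFE_DAYS)


def record_hit(fact: Fact, today: float) -> None:
    """Apply decay up to today, then boost the fact because a search returned it."""
    fact.heat = current_heat(fact, today) + HIT_BOOST
    fact.last_touched_day = today


def rank(facts: list[Fact], today: float) -> list[Fact]:
    """Order facts so the hottest come first."""
    return sorted(facts, key=lambda f: current_heat(f, today), reverse=True)
```

Storing only the last heat value and the last-touched time keeps the update cheap: decay is computed lazily at read time instead of by a background job.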

Code Intelligence Layer

Separate from the fact memory, CogniLayer maintains a code graph:

  • code_index — tree-sitter parses project source code into AST (10+ languages)
  • code_context — returns definition + callers + callees for a symbol
  • code_impact — returns blast radius (depth 1, 2, 3) for a symbol change

This is stored in the same SQLite database alongside fact memory.
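
A blast radius grouped by depth is a breadth-first walk over the reverse call graph. The sketch below shows that idea on a hand-written graph; the CALLERS data and the blast_radius function are hypothetical and do not reflect how code_impact stores or queries its index.

```python
from collections import deque

# Hypothetical reverse call graph: symbol -> symbols that call it.
CALLERS = {
    "parse_price": ["build_cart", "import_catalog"],
    "build_cart": ["checkout"],
    "import_catalog": [],
    "checkout": ["handle_order_request"],
    "handle_order_request": [],
}


def blast_radius(symbol: str, max_depth: int = 3) -> dict[int, list[str]]:
    """Group transitive callers of `symbol` by their distance from it."""
    seen = {symbol}
    by_depth: dict[int, list[str]] = {}
    queue = deque([(symbol, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested depth
        for caller in CALLERS.get(current, []):
            if caller not in seen:
                seen.add(caller)
                by_depth.setdefault(depth + 1, []).append(caller)
                queue.append((caller, depth + 1))
    return by_depth
```

Depth 1 holds the direct callers that will certainly see the change; deeper levels are progressively less likely to be affected, which is why a depth cap is useful.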

Context Compaction Handling

yes — explicit on_pre_compact.py hook backs up session state before compaction.

Cross-Session Handoffs (Session Bridges)

yes — the core feature:

  • on_session_end.py writes a session state summary
  • on_session_start.py reads the latest session state and injects it
  • Session bridge format: "Progress: [what done]. Open: [what remains]. Blocker: [current obstacle]"

Subagent Memory Protocol

When a research subagent runs, instead of returning 40K tokens to parent context:

  1. Subagent calls agent_research_wave_collect(wave_id, summary_500_tokens)
  2. Full findings stored in memory.db
  3. Parent gets only the 500-token summary
  4. On-demand: memory_search("topic") retrieves specific findings

Claimed savings: "40K tokens compressed to 500. The findings persist in DB across sessions — not just for this conversation, but forever."
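
The data flow behind this protocol can be sketched with an in-memory SQLite database: full findings go to storage, and only a short summary travels back to the parent. The table layout and the functions subagent_collect, parent_finalize, and search_details are stand-ins invented for this example; they are not the signatures of the real agent_research_wave_* or memory_search tools.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical schema, for illustration only.
db.execute("CREATE TABLE findings (wave_id TEXT, topic TEXT, detail TEXT)")
db.execute("CREATE TABLE summaries (wave_id TEXT, summary TEXT)")


def subagent_collect(wave_id: str, topic: str, details: list[str], summary: str) -> None:
    """Subagent side: persist full findings, hand back only a short summary."""
    db.executemany(
        "INSERT INTO findings VALUES (?, ?, ?)",
        [(wave_id, topic, d) for d in details],
    )
    db.execute("INSERT INTO summaries VALUES (?, ?)", (wave_id, summary))


def parent_finalize(wave_id: str) -> str:
    """Parent side: receive summaries only; the bulk stays in the database."""
    rows = db.execute(
        "SELECT summary FROM summaries WHERE wave_id = ? ORDER BY rowid", (wave_id,)
    ).fetchall()
    return "\n".join(r[0] for r in rows)


def search_details(keyword: str) -> list[str]:
    """On-demand retrieval of specific findings (plain LIKE match for brevity)."""
    rows = db.execute(
        "SELECT detail FROM findings WHERE detail LIKE ? ORDER BY rowid",
        (f"%{keyword}%",),
    ).fetchall()
    return [r[0] for r in rows]
```

The parent's context cost is bounded by the summary length regardless of how much the subagent wrote, and detail is paid for only when a later query asks for it.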

Search Mechanism

hybrid — FTS5 (full-text search) + optional vector embeddings via fastembed + sqlite-vec. When fastembed is not installed, falls back to FTS5 only.
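
To make the FTS5-only fallback concrete, here is a minimal keyword search using Python's built-in sqlite3 module. It assumes the local SQLite build has FTS5 enabled; the table name, columns, and keyword_search function are illustrative and are not CogniLayer's schema. The sample rows reuse the facts shown in the session bridge example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# FTS5 virtual table; requires an SQLite build with FTS5 enabled.
db.execute("CREATE VIRTUAL TABLE facts USING fts5(fact_type, content)")
db.executemany(
    "INSERT INTO facts VALUES (?, ?)",
    [
        ("gotcha", "Stripe SDK v12 changed webhook signature verification"),
        ("error_fix", "Fixed duplicate orders by adding idempotency key check"),
    ],
)


def keyword_search(query: str, limit: int = 5) -> list[tuple[str, str]]:
    """Full-text search ordered by FTS5's built-in relevance rank (best first)."""
    return db.execute(
        "SELECT fact_type, content FROM facts WHERE facts MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()
```

With the default tokenizer, keyword_search("webhook") finds the Stripe fact, but keyword_search("signing") returns nothing because no stored token matches that word. That gap is what the optional embedding path is meant to cover.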

State Files

  • ~/.cognilayer/memory.db — all facts, code index, embeddings
  • ~/.cognilayer/active_session.json — current session tracking
  • ~/.cognilayer/sessions/ — per-session state archives
  • ~/.cognilayer/logs/cognilayer.log — server log
  • ~/.cognilayer/config.yaml — configuration
  • ~/.cognilayer/codex/ — Codex CLI instruction files

Token Reduction Claims (verbatim from README)

  • "80-200K+ tokens saved per session"
  • "3 targeted queries (~800 tokens) replace 15 file reads (~60K tokens)"
  • "40K tokens compressed to 500" (subagent research protocol)
  • "Longer sessions with subagents save more"
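
The README does not say how the per-session total is composed. As a rough plausibility check, the sketch below assumes the two quoted savings simply add up per occurrence; that additive model and the function name are assumptions, while the four token figures are the README's.

```python
FILE_READ_TOKENS = 60_000       # README: 15 file reads
QUERY_TOKENS = 800              # README: 3 targeted queries
SUBAGENT_DUMP_TOKENS = 40_000   # README: raw subagent findings
SUBAGENT_SUMMARY_TOKENS = 500   # README: compressed summary


def estimated_savings(search_rounds: int, subagent_runs: int) -> int:
    """Tokens saved, assuming savings add linearly (an illustrative model)."""
    per_search = FILE_READ_TOKENS - QUERY_TOKENS                # 59,200
    per_subagent = SUBAGENT_DUMP_TOKENS - SUBAGENT_SUMMARY_TOKENS  # 39,500
    return search_rounds * per_search + subagent_runs * per_subagent
```

Under this model, one search round plus one subagent run gives 98,700 tokens, and one search round plus four subagent runs gives 217,200, which brackets the claimed 80-200K+ range. It says nothing about whether real sessions match these per-event figures.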
07

Orchestration

CogniLayer v4 — Orchestration

Multi-Agent

yes — the 21 multi-agent orchestration tools provide a full coordination protocol:

  • agent_research_wave_start/collect/finalize — fan-out research with compressed handoff
  • claim_scope / release_scope / list_claims — distributed lock management for shared resources
  • agent_handoff_inbox / agent_handoff_resolve — async handoff queue
  • agent_writer_run_start / agent_review_run_start — writer + reviewer agent roles
  • agent_run_heartbeat — liveness tracking
  • agent_delegate_plan — task delegation with planning

Orchestration Pattern

parallel-fan-out: agent_research_wave_start spawns parallel research; agent_research_wave_collect aggregates results; agent_research_wave_finalize closes the wave. The 21 orchestration tools suggest a hierarchical pattern with a parent agent delegating to typed sub-agents (writer, reviewer, researcher).

Multi-Agent Policy

agent_policy_read — reads a JSON policy file that controls which multi-agent operations are permitted. This is the governance layer for multi-agent runs.

Isolation Mechanism

none — all agents share the same memory.db. Scope isolation is logical (via claim_scope locking, not process/container isolation).
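
Logical locking over a shared database can be sketched as a claims table with one row per scope. The expiring lease shown here is an assumption added for illustration, as one common way to keep a forgotten release from blocking other agents indefinitely; the source does not say whether CogniLayer's claim_scope expires claims, and the schema and signatures below are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical schema: one live claim per scope.
db.execute("CREATE TABLE claims (scope TEXT PRIMARY KEY, agent TEXT, expires_at REAL)")


def claim_scope(scope: str, agent: str, now: float, ttl: float = 300.0) -> bool:
    """Try to take a lease on `scope`; expired leases are reclaimed first."""
    with db:  # one transaction: purge an expired lease, then attempt the insert
        db.execute(
            "DELETE FROM claims WHERE scope = ? AND expires_at <= ?", (scope, now)
        )
        try:
            db.execute(
                "INSERT INTO claims VALUES (?, ?, ?)", (scope, agent, now + ttl)
            )
        except sqlite3.IntegrityError:
            return False  # another agent holds a live lease
    return True


def release_scope(scope: str, agent: str) -> None:
    """Release only a lease the caller actually holds."""
    with db:
        db.execute("DELETE FROM claims WHERE scope = ? AND agent = ?", (scope, agent))
```

The primary key on scope is what enforces exclusivity, so the isolation is only as strong as each agent's willingness to call claim_scope before touching the resource.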

Execution Mode

event-driven (hooks) + interactive-loop (MCP tools on demand).

Multi-Model

No — model-agnostic (works with Claude Code and Codex CLI). No model role mapping.

Context Compaction Handling

yes — explicit on_pre_compact.py hook.

Crash Recovery

yes: on_session_start.py checks active_session.json for stale sessions and auto-recovers:

Next session starts with full context - no re-reading, no re-explaining

Auto Validators

None — no code quality gates.

Prompt Chaining

yes — the research wave protocol is explicit prompt chaining: wave start prompt → subagent research prompts → collection prompt → finalization prompt. Each stage's output IS the next stage's input (parent gets the summary that was written by the collector).

08

Ui Cli Surface

CogniLayer v4 — UI & CLI Surface

CLI Binary

  • Name: cognilayer
  • Type: Shell script in bin/cognilayer that launches the Textual TUI
  • Not a thin wrapper: launches the Textual Python TUI application

TUI Dashboard

Built with Textual (Python TUI framework):

cognilayer                    # All projects
cognilayer --project my-app   # Specific project
cognilayer --demo             # Demo mode with sample data

8 Tabs:

  1. Overview — stats at a glance
  2. Facts — searchable, filterable, color-coded by heat
  3. Code — AST code graph explorer
  4. Sessions — session bridges and history
  5. Safety — Identity Card management
  6. Search — interactive memory search
  7. Config — configuration editor
  8. Help — command reference

Slash Commands (Claude Code)

/status, /recall [query], /harvest, /onboard, /onboard-all, /forget [query], /identity, /consolidate, /tui, /cognihelp

Diagnostic Tool

python diagnose.py          # Check everything
python diagnose.py --fix    # Check + auto-fix missing dependencies

Observability

  • ~/.cognilayer/logs/cognilayer.log — server log
  • /status command — memory stats and project health
  • TUI Overview tab — stats at a glance
  • agent_run_digest MCP tool — agent run summary
  • agent_run_list MCP tool — list active runs

Transport

stdio (MCP to Claude Code / Codex CLI).

IDE Integration

Claude Code plugin + Codex CLI via AGENTS.md static instructions. No VS Code extension.
