ccmemory

ccmemory · patrickkidd/ccmemory · ★ 1 · last commit 2026-01-22

Accumulates decisions, corrections, and failed approaches from Claude Code sessions into a queryable Neo4j graph so each new session starts with full project context rather than from zero.

Best when: Graph databases with typed relationships and vector indexes are superior to markdown files or plain vector search for AI memory because they enable completen…
Skip if: CLAUDE.md-based memory (creates meta-loop: instructions to update instructions), Pure vector/RAG search (returns probably-relevant results, not guaranteed completeness)
Primitive shape: 6 total (Skills 1, Hooks 4, MCP tools 1)
00

Summary

ccmemory — Summary

ccmemory is a Claude Code plugin that provides persistent, graph-structured institutional memory across AI coding sessions, using a Neo4j database as the backing store. It addresses the "stranger problem" of AI assistants: every new session starts from zero, so users must re-explain project context, decisions, and past failures. What makes it distinct is its use of a Neo4j knowledge graph (rather than flat files, SQLite, or vector search alone) to capture not just facts but the relational "why": decisions with rationale, corrections that fix wrong beliefs, failed approaches that prevent repeated mistakes, and explicit relationships (CONTINUES, SUPERSEDES, CITES, CONFLICTS_WITH) between nodes. The primary audience is individual developers and engineering teams who run long-lived projects in Claude Code and want the AI's effectiveness to compound over time rather than reset each session. The project is experimental and early-stage (v0.1.1, 1 GitHub star, no releases, created January 2026) and requires running Docker-based infrastructure; it is not production-ready.

01

Overview

ccmemory — Overview

Tagline

Persistent memory for Claude Code. The longer you use it, the smarter it gets.

Origin Story

ccmemory was created by Dr. Patrick Stinson (patrickkidd) and was first committed in January 2026. The project is explicitly inspired by the Foundation Capital essay "AI's Trillion-Dollar Opportunity: Context Graphs" (Gupta & Garg), which argues that context graphs — not model scale — are the next frontier for AI productivity. The repo's CLAUDE.md instructs the development AI itself to read that essay and a companion piece ("How to Build a Context Graph" by Koratana) before making design decisions.

Problem Framing (in author's own words)

From doc/PROJECT_VISION.md:

Every time you start a conversation with ChatGPT, Claude, or any AI assistant, you're talking to a stranger. It doesn't know your project. It doesn't know what you tried last week. It doesn't remember that "the client" means Acme Corp and they have unusual requirements.

From the README:

Every AI conversation starts from zero. You explain your project, your preferences, your constraints—then the session ends and it's all forgotten. Tomorrow, you're talking to a stranger again.

ccmemory fixes this. Decisions, corrections, and context accumulate over time. Session 50 is dramatically more effective than session 1.

Philosophy

The project's guiding principles, from CLAUDE.md:

  • Event clock, not state clock — Capture reasoning/decisions, not just current state
  • Decision traces, not containers — Organize by time + entity links, not sessions
  • Schema as output — Let structure emerge from use, don't over-specify upfront
  • World models, not retrieval — Goal is simulation ("what if?"), not just search

The PROJECT_VISION.md explicitly differentiates ccmemory from both enterprise RAG (e.g., Microsoft Copilot Work mode) and from plain instructions files like CLAUDE.md:

Copilot Work is a librarian who searches your company's files. ccmemory is a colleague who was in every meeting, remembers every decision, and learns your preferences over time.

On the choice of graphs over pure vector search:

Search returns probably relevant results ranked by similarity. Ask for "all decisions about authentication" and you might get 8 of 10. A graph stores explicit relationships and returns everything connected — not a ranked guess.
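
The contrast in that quote can be illustrated with a toy, in-memory sketch. Everything here is hypothetical (the decision IDs, the two-dimensional "embeddings", and the function names); it is not ccmemory's schema or its Cypher queries, only a way to see why an explicit topic link returns every match while a top-k similarity cut-off may not.

```python
# Toy illustration only: hypothetical data, not ccmemory's schema or queries.
from math import sqrt

# Each decision carries explicit topic links (graph edges) and a made-up embedding.
DECISIONS = {
    "d1": {"topics": {"auth"}, "vec": (0.9, 0.1)},
    "d2": {"topics": {"auth"}, "vec": (0.8, 0.3)},
    "d3": {"topics": {"auth"}, "vec": (0.1, 0.9)},  # about auth, but worded very differently
    "d4": {"topics": {"billing"}, "vec": (0.85, 0.2)},
}


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))


def by_topic(topic):
    """Graph-style lookup: every decision explicitly linked to the topic."""
    return sorted(key for key, d in DECISIONS.items() if topic in d["topics"])


def top_k(query_vec, k):
    """Vector-style lookup: the k nearest decisions by cosine similarity."""
    ranked = sorted(
        DECISIONS,
        key=lambda key: cosine(query_vec, DECISIONS[key]["vec"]),
        reverse=True,
    )
    return ranked[:k]


print(by_topic("auth"))      # all three auth decisions
print(top_k((1.0, 0.0), 3))  # misses d3 and pulls in the billing decision
```

In this toy data the similarity search with k=3 returns a billing decision and drops one of the auth decisions, which is the "8 of 10" failure the quote describes.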

The "Cognitive Coefficient"

The project introduces a named metric — the cognitive coefficient — to measure how much accumulated context improves session effectiveness. This grows as more decisions are captured, corrections are learned from, and failed approaches prevent repeated mistakes. The coefficient is queryable via the getMetrics MCP tool.

Team Mode

The project includes a team collaboration layer: decisions are created as developmental (private) and can be promoted to curated (team-visible) status via a web dashboard, enabling shared institutional memory across a team pointing to the same Neo4j instance.
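
The promotion workflow can be pictured with a minimal sketch. The `Decision` class, its field names, and the `visible_to` and `promote` helpers are assumptions made for illustration; in ccmemory the equivalent filtering presumably happens in Neo4j queries and through the promoteDecisions tool.

```python
# Illustrative sketch: class and function names are hypothetical, not ccmemory's API.
from dataclasses import dataclass


@dataclass
class Decision:
    decision_id: str
    owner: str
    status: str = "developmental"  # new decisions start private to their creator


def visible_to(decisions, user_id):
    """A user sees their own decisions plus every curated (team-visible) one."""
    return [d for d in decisions if d.status == "curated" or d.owner == user_id]


def promote(decision):
    """Mark a validated decision as team-visible."""
    decision.status = "curated"
    return decision
```

With this model, a teammate sees nothing of another developer's work until `promote` is called on it.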

02

Architecture

ccmemory — Architecture

Distribution Type

claude-plugin (with bundled MCP server deployed via Docker)

Install Method

# 1. Clone and start the containers
git clone https://github.com/patrickkidd/ccmemory
cd ccmemory
docker compose up -d

# 2. Install the Claude Code plugin
/plugin marketplace add patrickkidd/ccmemory
/plugin install ccmemory@patrickkidd/ccmemory

File/Directory Layout

ccmemory/
├── .claude-plugin/
│   └── plugin.json              # Plugin manifest — declares skills + MCP SSE endpoint
├── .claude/                     # Claude Code project settings
├── .github/workflows/           # CI workflows
├── .mcp.json                    # Local MCP config (chrome-devtools for development)
├── CLAUDE.md                    # Development instructions for Claude itself
├── dashboard/                   # Flask web dashboard (port 8765)
├── doc/                         # Design docs, cookbook, vision, telemetry specs
├── docker-compose.yml           # Orchestrates 4 containers
├── hooks/
│   ├── hooks.json               # Hook event → script mapping
│   ├── activity_log.sh          # Shared logging helper
│   ├── ensure-running.sh        # SessionStart: starts Docker containers if needed
│   ├── session_start.sh         # SessionStart: fetches context from MCP server
│   ├── prompt_submit.sh         # UserPromptSubmit: checks for pending backfills
│   ├── message_response.sh      # Stop: sends transcript to MCP for detection
│   └── session_end.sh           # SessionEnd: finalizes session in MCP
├── mcp-server/
│   ├── Dockerfile
│   ├── init.cypher              # Neo4j schema (constraints, indexes, vector indexes)
│   ├── pyproject.toml
│   └── src/ccmemory/
│       ├── server.py            # FastMCP + Starlette HTTP server
│       ├── graph.py             # Neo4j client
│       ├── embeddings.py        # Ollama local embeddings (all-minilm, 768 dims)
│       ├── backfill.py          # Conversation/markdown import
│       ├── context.py           # Session context manager
│       ├── detection/           # LLM-based decision/correction detection
│       ├── tools/
│       │   ├── record.py        # MCP record tools (recordDecision, etc.)
│       │   ├── query.py         # MCP query tools (searchSemantic, etc.)
│       │   ├── reference.py     # MCP reference tools (cacheUrl, cachePdf)
│       │   └── backfill.py      # MCP backfill tools
│       └── reranker.py          # LLM reranking for semantic search
├── scripts/                     # Utility scripts
├── skills/ccmemory/
│   └── SKILL.md                 # Agent instructions for how to use ccmemory tools
└── tests/                       # pytest test suite

Container Topology (docker-compose.yml)

| Container | Image | Port | Role |
| --- | --- | --- | --- |
| ccmemory-ollama | ollama/ollama:latest | 11434 | Local embedding model server (all-minilm) |
| ccmemory-neo4j | neo4j:5.15-community | 7474 (HTTP), 7687 (Bolt) | Graph database with APOC plugin |
| ccmemory-mcp | Custom Python build | 8766 | FastMCP + Starlette HTTP hooks API |
| ccmemory-dashboard | Custom Python build | 8765 | Flask web dashboard |

Required Dependencies

  • Docker (required — all services run in containers)
  • Claude Code CLI
  • Python 3.x (for local development only; containers are self-contained)
  • An LLM API key: ANTHROPIC_API_KEY, OPENAI_API_KEY, or GOOGLE_API_KEY (for detection + reranking)

Configuration Files

| File | Purpose |
| --- | --- |
| .claude-plugin/plugin.json | Plugin manifest: declares SSE MCP endpoint and skills path |
| hooks/hooks.json | Wires Claude Code hook events to shell scripts |
| skills/ccmemory/SKILL.md | Agent-level instructions for tool use behavior |
| CLAUDE.md | Development CLAUDE.md for the ccmemory repo itself |
| ~/.ccmemory/config.json | Runtime API key storage (auto-created on first run) |

MCP Transport

The MCP server communicates with Claude Code via SSE (Server-Sent Events) over HTTP on localhost:8766/sse. This is declared in plugin.json:

"mcpServers": {
  "ccmemory": {
    "type": "sse",
    "url": "http://localhost:8766/sse"
  }
}
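
As a small illustration of how a client or a sanity-check script could read this declaration, the sketch below parses the fragment and lists the SSE endpoints. The fragment is wrapped in outer braces so it parses on its own, and `sse_endpoints` is a hypothetical helper, not part of ccmemory or Claude Code.

```python
# Sketch only: sse_endpoints is a hypothetical helper for reading the manifest.
import json

# The plugin.json fragment from above, wrapped in braces so it is valid JSON by itself.
PLUGIN_JSON = """
{
  "mcpServers": {
    "ccmemory": {
      "type": "sse",
      "url": "http://localhost:8766/sse"
    }
  }
}
"""


def sse_endpoints(manifest_text):
    """Return {server_name: url} for every server declared with type "sse"."""
    servers = json.loads(manifest_text).get("mcpServers", {})
    return {
        name: cfg["url"]
        for name, cfg in servers.items()
        if cfg.get("type") == "sse"
    }


print(sse_endpoints(PLUGIN_JSON))
```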
03

Components

ccmemory — Components

Commands

(none — no slash commands defined in plugin.json or the repo)

Skills

Count: 1

| Name | File | Purpose |
| --- | --- | --- |
| ccmemory | skills/ccmemory/SKILL.md | Agent instructions that define when and how Claude should call MCP tools to record context (decisions, corrections, exceptions, insights, failed approaches) and query the graph; also defines session startup behavior for pending imports and error recovery |

Subagents

(none — no separate subagent definition files; all agent behavior is encoded in SKILL.md)

Hooks

Count: 4 event types, 5 hook entries

| Event | Script | What It Does |
| --- | --- | --- |
| SessionStart (1st) | ensure-running.sh | Checks whether the MCP health endpoint responds; if not, starts all Docker containers via docker compose up -d, pulls the all-minilm embedding model if missing, and waits up to 30s for MCP to become healthy. Prompts the user for an LLM API key on first run and stores it in ~/.ccmemory/config.json. |
| SessionStart (2nd) | session_start.sh | Reads cwd from hook stdin; enumerates recent JSONL conversation files (5 KB–500 KB, up to 200 files) for backfill awareness; POSTs to /hooks/session-start on the MCP server; outputs the returned context Markdown string to Claude's context window. |
| UserPromptSubmit | prompt_submit.sh | On the first prompt of each session only (tracked by a ~/.ccmemory/prompted-<session_id> sentinel file): counts pending JSONL conversation files; if >0, injects a SYSTEM REMINDER instructing Claude to use AskUserQuestion to offer import options. |
| Stop | message_response.sh | After each Claude response, POSTs the full transcript to /hooks/message-response; the MCP server runs LLM-based detection on the conversation for decisions, corrections, etc.; outputs the detection count if any. |
| SessionEnd | session_end.sh | POSTs session info to /hooks/session-end so the MCP server can finalize the session record. |
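
The "wait up to 30s for MCP to become healthy" step in ensure-running.sh is a standard poll-until-ready loop. The sketch below shows that pattern in Python under stated assumptions: the source does not show the health endpoint's path, so the URL is left as a parameter, and `wait_until_healthy` and `http_probe` are hypothetical names rather than anything from the repo.

```python
# Sketch of a poll-until-healthy loop; function names are hypothetical.
import time
import urllib.error
import urllib.request


def wait_until_healthy(probe, timeout_s=30.0, interval_s=1.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Call probe() until it returns True or timeout_s has elapsed."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def http_probe(url, timeout_s=2.0):
    """Build a probe that reports True when url answers with a 2xx status."""
    def probe():
        try:
            with urllib.request.urlopen(url, timeout=timeout_s) as resp:
                return 200 <= resp.status < 300
        except (urllib.error.URLError, OSError):
            return False
    return probe
```

Passing the probe in as a callable keeps the loop testable without a running server; a caller would combine the two as `wait_until_healthy(http_probe(health_url))`, where `health_url` is whatever endpoint the MCP server actually exposes.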

MCP Servers

Count: 1

ccmemory MCP Server

Transport: SSE at http://localhost:8766/sse

Record Tools (from tools/record.py):

| Tool | Purpose |
| --- | --- |
| recordDecision | Store a decision with description, rationale, options_considered, revisit_trigger, sets_precedent, topics |
| recordCorrection | Store wrong_belief → right_belief correction with severity |
| recordException | Store an exception to a normal rule with justification |
| recordInsight | Store a realization, analysis, or strategic conclusion |
| recordQuestion | Store a meaningful Q&A exchange |
| recordFailedApproach | Store an approach that was tried and failed, with lesson learned |
| recordReference | Store a URL or file path with context |

Query Tools (from tools/query.py):

| Tool | Purpose |
| --- | --- |
| queryContext | Get recent context for current project (limit, include_team flags) |
| searchPrecedent | Full-text search across all context types |
| searchSemantic | Ollama embedding retrieval + LLM reranking semantic search |
| queryByTopic | Get all context related to a specific topic string |
| traceDecision | Get full context around a specific decision (backward + forward graph traversal) |
| queryStaleDecisions | Find decisions with old revisit_trigger dates |
| queryFailedApproaches | Get failed approaches to avoid repeating mistakes |
| getMetrics | Return cognitive coefficient and node counts |
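
The "backward + forward graph traversal" behind traceDecision can be sketched over an in-memory edge list. This is an illustration only: the edge data, the assumption that an edge points from the newer decision to the older one, and the `trace_decision` function are all hypothetical, and the real tool runs against Neo4j.

```python
# In-memory sketch of a two-direction trace; not ccmemory's implementation.
from collections import deque

# Hypothetical edges as (source, relationship, target).
# Assumption: the source is the newer node, e.g. d3 SUPERSEDES d2.
EDGES = [
    ("d3", "SUPERSEDES", "d2"),
    ("d2", "CONTINUES", "d1"),
    ("d4", "CITES", "d3"),
    ("d9", "CITES", "d8"),  # unrelated component
]


def _reachable(start, adjacency):
    """Breadth-first search: all nodes reachable from start, excluding start."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return sorted(seen)


def trace_decision(decision_id, edges=EDGES):
    """Follow outgoing edges to earlier context and incoming edges to later context."""
    outgoing, incoming = {}, {}
    for src, _rel, dst in edges:
        outgoing.setdefault(src, []).append(dst)
        incoming.setdefault(dst, []).append(src)
    return {
        "earlier": _reachable(decision_id, outgoing),
        "later": _reachable(decision_id, incoming),
    }


print(trace_decision("d2"))
```

Because the walk follows explicit edges, nodes in the unrelated component (d8, d9) never appear in the trace for d2.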

Reference Tools (from tools/reference.py):

| Tool | Purpose |
| --- | --- |
| cacheUrl | Fetch and cache a URL as markdown to Domain 2 |
| cachePdf | Extract PDF content to markdown |
| indexReference | Rebuild reference knowledge index (embeddings) |
| queryReference | Semantic search over cached references |
| listReferences | List all cached reference files |

Backfill Tools (from tools/backfill.py):

| Tool | Purpose |
| --- | --- |
| ccmemory_backfill_conversation | Import a JSONL conversation file from ~/.claude/projects/ |
| ccmemory_backfill_markdown | Import a markdown file into Domain 2 reference index |

Management Tools:

| Tool | Purpose |
| --- | --- |
| promoteDecisions | Promote developmental (private) decisions to curated (team-visible) status |

Scripts/Binaries

| Script | Purpose |
| --- | --- |
| hooks/ensure-running.sh | Docker container lifecycle management for SessionStart |
| hooks/session_start.sh | Context retrieval for SessionStart |
| hooks/prompt_submit.sh | Backfill reminder for UserPromptSubmit |
| hooks/message_response.sh | Post-response detection trigger for Stop |
| hooks/session_end.sh | Session finalization for SessionEnd |
| hooks/activity_log.sh | Shared logging utility sourced by all hooks |
05

Prompts

ccmemory — Prompts (Verbatim)

1. Skill Prompt: skills/ccmemory/SKILL.md (full, verbatim)

This is the primary agent instruction file. It teaches Claude when and how to use the MCP tools.

# ccmemory: Context Graph Skill

You have access to a persistent context graph that captures decisions, corrections, insights, and other valuable context from Claude Code sessions. The graph has two domains:

- **Domain 1 (Your Specifics)**: High-confidence lived experience — decisions you made, corrections to Claude's understanding, exceptions to rules, failed approaches
- **Domain 2 (Reference Knowledge)**: Curated reference material — cached URLs, PDFs, indexed documentation

## Available Tools

### Recording Context

Use these tools to explicitly capture important context:

- `recordDecision` — Record a decision with rationale, options considered, and revisit triggers
- `recordCorrection` — Record when the user corrects your understanding (highest value!)
- `recordException` — Record when normal rules don't apply in this context
- `recordInsight` — Record realizations, analyses, or strategic conclusions
- `recordQuestion` — Record meaningful Q&A exchanges
- `recordFailedApproach` — Record what was tried and didn't work
- `recordReference` — Record URLs or file paths mentioned

### Querying Context

Use these tools to retrieve relevant context:

- `queryContext` — Get recent context for the current project
- `searchPrecedent` — Full-text search across all context types
- `searchSemantic` — Semantic similarity search using embeddings
- `queryByTopic` — Get all context related to a specific topic
- `traceDecision` — Get full context around a specific decision
- `queryStaleDecisions` — Find decisions that may need review
- `queryFailedApproaches` — Get failed approaches to avoid repeating mistakes
- `getMetrics` — Get context graph metrics (cognitive coefficient, etc.)

### Reference Knowledge

- `cacheUrl` — Fetch and cache a URL as markdown
- `cachePdf` — Extract PDF content to markdown
- `indexReference` — Rebuild the reference knowledge index
- `queryReference` — Semantic search over cached references
- `listReferences` — List all cached reference files

### Management

- `promoteDecisions` — Promote developmental decisions to curated (team-visible) status

### Backfilling Historical Data

- `ccmemory_backfill_conversation` — Import a JSONL conversation file
- `ccmemory_backfill_markdown` — Import a markdown file

**CRITICAL:** When backfilling, only import from the CURRENT project's folder.
The JSONL files are at `~/.claude/projects/` in folders matching the full path with slashes replaced by dashes.

Example: If cwd is `/Users/patrick/theapp`, import ONLY from `~/.claude/projects/-Users-patrick-theapp/`
Do NOT import from `-Users-patrick-theapp-planner` or any other folder.

## Behaviors

### When the user makes a decision

1. Record it immediately with `recordDecision`
2. Include the rationale if stated
3. Note any revisit triggers ("if X changes, reconsider")
4. Check for related prior decisions with `searchPrecedent`

### When the user corrects your understanding

**This is the highest-value capture.** When you get something wrong and the user corrects you:

1. Immediately call `recordCorrection` with:
   - `wrong_belief`: What you incorrectly believed
   - `right_belief`: The correct understanding
   - `severity`: How significant the error was

Example triggers:
- "No, that's not right"
- "Actually, in this project we..."
- "That's the wrong approach because..."

### When the user grants an exception

Record with `recordException` when:
- "In this case, skip the normal..."
- "Just this once, we'll..."
- "Despite the rule about X, here we should Y"

### When something doesn't work

Record with `recordFailedApproach` when:
- "That didn't work"
- "Let's try something else"
- After debugging reveals a dead end

### When insights emerge

Record with `recordInsight` for:
- Realizations about the situation
- Strategic conclusions
- Pattern recognition
- Synthesized understanding

### Session Startup: Check for Pending Imports

**IMPORTANT:** At session start, if the injected context contains a "## Pending History Import" section showing conversations not yet imported, you MUST use `AskUserQuestion` to offer the user import options:

- Option 1: "Import 10 conversations" (Recommended) — imports the 10 most recent quality conversations
- Option 2: "Import all conversations" — imports all pending conversations
- Option 3: "Skip import" — proceed without importing

If the user chooses to import, use `ccmemory_list_conversations` to get the session list, then call `ccmemory_backfill_conversation` for each session (reading the JSONL file content first).

### Proactive Context Use

At the start of each session, context is automatically injected. Additionally:

1. **Check for related context** before giving advice — use `searchPrecedent` or `searchSemantic`
2. **Surface failed approaches** before suggesting solutions — use `queryFailedApproaches`
3. **Reference prior decisions** when they're relevant
4. **Flag stale decisions** that may need review

## The Cognitive Coefficient

The system tracks a "cognitive coefficient" — a measure of how much the accumulated context improves effectiveness. This grows as:

- More decisions are captured and reused
- Corrections are learned from
- Failed approaches prevent repeated mistakes
- The graph density increases

Current project metrics can be retrieved with `getMetrics`.

## Team Mode

In team mode (`CCMEMORY_USER_ID` set):
- `developmental` decisions are only visible to their creator
- `curated` decisions are visible to all team members
- Use `promoteDecisions` to make decisions team-visible after they're validated

## Error Handling: Session Lost

If any ccmemory tool returns a response with `"error": "session_not_found"` and `"ask_user": true`, the MCP server session was lost (typically due to server restart).

**You MUST use AskUserQuestion** to ask the user:

> "The ccmemory server session was lost. Would you like to:"
> - **Retry** — Re-establishes the session (the session_start hook will run again on your next message)
> - **Continue without saving** — Skip saving this context to the knowledge graph

If the user chooses "Retry", inform them the session will re-establish on the next interaction. If they choose "Continue without saving", proceed normally but note that the current context won't be persisted.

2. Hook Script: hooks/session_start.sh (verbatim)

This hook fires at SessionStart, retrieves accumulated context from Neo4j via the MCP server, and injects it into the Claude session.

#!/bin/bash
# SessionStart hook - calls MCP server HTTP endpoint
# Reads JSON from stdin, forwards to server, outputs context to stdout

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "$SCRIPT_DIR/activity_log.sh"

CCMEMORY_URL="${CCMEMORY_URL:-http://localhost:8766}"
HOOK_NAME="session_start"

hookStart "$HOOK_NAME"

# Quality filter thresholds
MIN_SIZE=5000      # 5KB minimum
MAX_SIZE=500000    # 500KB maximum

input=$(cat)
activityLogDebug "hook:$HOOK_NAME" "stdin: ${input:0:200}..."

cwd=$(echo "$input" | jq -r '.cwd // ""' 2>/dev/null)
if [ -z "$cwd" ]; then
    activityLogError "hook:$HOOK_NAME" "No cwd in session start input"
    echo "# Context Graph: unknown"
    echo "Error: No cwd in session start input"
    hookEnd "$HOOK_NAME"
    exit 0
fi

activityLogDebug "hook:$HOOK_NAME" "cwd=$cwd"

folder_name=$(echo "$cwd" | tr '/' '-')
folder_name="${folder_name#-}"
folder_name="-$folder_name"
claude_dir="$HOME/.claude/projects/$folder_name"

conversation_stems="[]"
if [ -d "$claude_dir" ]; then
    # Find files in quality range, sorted by recency, limit 200
    stems=$(find "$claude_dir" -name "*.jsonl" -size +${MIN_SIZE}c -size -${MAX_SIZE}c -print0 2>/dev/null | \
        xargs -0 ls -t 2>/dev/null | \
        head -200 | \
        xargs -I{} basename {} .jsonl 2>/dev/null)
    if [ -n "$stems" ]; then
        stem_count=$(echo "$stems" | wc -l | tr -d ' ')
        activityLogDebug "hook:$HOOK_NAME" "Found $stem_count conversation files in quality range"
        conversation_stems=$(echo "$stems" | jq -R . 2>/dev/null | jq -s . 2>/dev/null) || conversation_stems="[]"
    fi
fi

payload=$(echo "$input" | jq --argjson stems "$conversation_stems" '. + {conversation_stems: $stems}' 2>/dev/null) || payload="$input"

activityLogInfo "hook:$HOOK_NAME" "POST ${CCMEMORY_URL}/hooks/session-start"
response=$(curl -s -X POST "${CCMEMORY_URL}/hooks/session-start" \
    -H "Content-Type: application/json" \
    -d "$payload" \
    --connect-timeout 5 \
    --max-time 10 2>/dev/null)

if [ -z "$response" ]; then
    project=$(basename "$cwd")
    activityLogError "hook:$HOOK_NAME" "Server not responding"
    echo "# Context Graph: $project"
    echo "Server not running. Start with: ccmemory start"
    hookEnd "$HOOK_NAME"
    exit 0
fi

activityLogDebug "hook:$HOOK_NAME" "Response: ${response:0:200}..."

context=$(echo "$response" | jq -r '.context // ""' 2>/dev/null)
if [ -n "$context" ]; then
    context_len=${#context}
    activityLogInfo "hook:$HOOK_NAME" "Got context: ${context_len} chars"
    echo "$context"
fi

hookEnd "$HOOK_NAME"
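
The two pieces of logic in this script that are easiest to get wrong are the cwd-to-folder mapping and the size filter. The sketch below restates them in Python for clarity; the function names are hypothetical, and this is a reading aid, not a replacement for the hook.

```python
# Python restatement of the folder mapping and size filter; names are hypothetical.
from pathlib import Path

MIN_SIZE = 5000    # bytes; find's "-size +5000c" means strictly larger
MAX_SIZE = 500000  # bytes; find's "-size -500000c" means strictly smaller


def project_folder(cwd):
    """Map a working directory to its ~/.claude/projects folder name."""
    name = cwd.replace("/", "-")  # tr '/' '-'
    if name.startswith("-"):      # ${folder_name#-} strips one leading dash
        name = name[1:]
    return "-" + name             # then exactly one dash is put back


def conversation_stems(claude_dir, limit=200):
    """Stems of *.jsonl files in the size range, newest first, capped at limit."""
    files = [
        p for p in Path(claude_dir).rglob("*.jsonl")
        if p.is_file() and MIN_SIZE < p.stat().st_size < MAX_SIZE
    ]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return [p.stem for p in files[:limit]]


print(project_folder("/Users/patrick/theapp"))  # -Users-patrick-theapp
```

The strip-then-prefix step guarantees exactly one leading dash whether or not the input path is absolute.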

3. Hook Script: hooks/prompt_submit.sh (verbatim)

Fires on UserPromptSubmit once per session to prompt the user to import conversation history:

#!/bin/bash
# UserPromptSubmit hook - checks for pending backfills and reminds Claude to ask
# Only runs once per session (on first prompt)

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "$SCRIPT_DIR/activity_log.sh"

CCMEMORY_URL="${CCMEMORY_URL:-http://localhost:8766}"
STATE_DIR="$HOME/.ccmemory"
mkdir -p "$STATE_DIR"
HOOK_NAME="prompt_submit"

hookStart "$HOOK_NAME"

input=$(cat)

session_id=$(echo "$input" | jq -r '.session_id // ""' 2>/dev/null)
cwd=$(echo "$input" | jq -r '.cwd // ""' 2>/dev/null)

activityLogDebug "hook:$HOOK_NAME" "session_id=$session_id cwd=$cwd"

# Track if we've already reminded this session
state_file="$STATE_DIR/prompted-$session_id"
if [ -f "$state_file" ]; then
    activityLogDebug "hook:$HOOK_NAME" "Already prompted this session, skipping"
    hookEnd "$HOOK_NAME"
    exit 0
fi

# Check for pending backfills (quick local check)
folder_name=$(echo "$cwd" | tr '/' '-')
folder_name="${folder_name#-}"
folder_name="-$folder_name"
claude_dir="$HOME/.claude/projects/$folder_name"

MIN_SIZE=5000
MAX_SIZE=500000

pending_count=0
if [ -d "$claude_dir" ]; then
    pending_count=$(find "$claude_dir" -name "*.jsonl" -size +${MIN_SIZE}c -size -${MAX_SIZE}c 2>/dev/null | wc -l | tr -d ' ')
fi

activityLogDebug "hook:$HOOK_NAME" "Pending imports: $pending_count"

# Mark this session as prompted
touch "$state_file"

# If there are pending imports, inject a reminder
if [ "$pending_count" -gt 0 ]; then
    activityLogInfo "hook:$HOOK_NAME" "Injecting import reminder for $pending_count files"
    cat << EOF
SYSTEM REMINDER: There are $pending_count conversation history files that could be imported into ccmemory.
Before responding to the user's request, use AskUserQuestion to offer:
- "Import 10 recent conversations" (Recommended)
- "Import all $pending_count conversations"
- "Skip import"
This is a one-time prompt per session. After user responds, proceed normally.
EOF
fi

hookEnd "$HOOK_NAME"
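
The once-per-session behavior rests on a sentinel file named after the session ID. A minimal Python sketch of the same idea follows; `should_prompt` is a hypothetical name, and unlike the shell version (which tests for the file and touches it later) this variant creates the marker atomically, which is a design choice of the sketch rather than something the hook does.

```python
# Sketch of a once-per-session sentinel; should_prompt is a hypothetical helper.
from pathlib import Path


def should_prompt(state_dir, session_id):
    """Return True the first time a session_id is seen, False on later calls."""
    state_dir = Path(state_dir)
    state_dir.mkdir(parents=True, exist_ok=True)
    marker = state_dir / f"prompted-{session_id}"
    try:
        marker.touch(exist_ok=False)  # raises FileExistsError if the marker already exists
    except FileExistsError:
        return False
    return True
```

As in the hook, the marker is written whether or not any imports are pending, so the check runs at most once per session either way.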

4. MCP Tool Definition: recordDecision (from tools/record.py, verbatim)

@mcp.tool()
@logTool
async def recordDecision(
    description: str,
    rationale: Optional[str] = None,
    options_considered: Optional[str] = None,
    revisit_trigger: Optional[str] = None,
    sets_precedent: bool = False,
    topics: Optional[list[str]] = None,
) -> dict:
    """Record a decision to the context graph.

    Args:
        description: What was decided
        rationale: Why this choice was made
        options_considered: What alternatives were evaluated
        revisit_trigger: Conditions that should prompt reconsideration
        sets_precedent: Whether this decision should guide future similar decisions
        topics: Topics/components this decision relates to (e.g., ['auth', 'api'])
    """
    project = getCurrentProject()
    if not project:
        return _projectError()

    client = getClient()
    decision_id = f"decision-{uuid.uuid4().hex[:8]}"

    text_for_embedding = f"{description} {rationale or ''}"
    embedding = getEmbedding(text_for_embedding)

    kwargs = {
        "detection_method": "explicit_command",
        "detection_confidence": 1.0,
    }
    if rationale:
        kwargs["rationale"] = rationale
    if options_considered:
        kwargs["options_considered"] = options_considered
    if revisit_trigger:
        kwargs["revisit_trigger"] = revisit_trigger
    if sets_precedent:
        kwargs["sets_precedent"] = sets_precedent

    result = client.createDecision(
        decision_id=decision_id,
        project=project,
        description=description,
        embedding=embedding,
        topics=topics or [],
        **kwargs,
    )

    return {"decision_id": decision_id, "status": "recorded", **result}

5. MCP Tool Definition: searchSemantic (from tools/query.py, verbatim)

Demonstrates the two-stage retrieval pattern: Ollama embeddings for candidates, then LLM reranking.

@mcp.tool()
@logTool
async def searchSemantic(
    query: str, limit: int = 10, include_team: bool = True
) -> dict:
    """Semantic similarity search across decisions, corrections, and insights.

    Uses local embeddings for candidate retrieval, then Claude for reranking.

    Args:
        query: Natural language query
        limit: Maximum results to return
        include_team: Whether to include curated team decisions
    """
    client = getClient()
    project = _getProject()

    embedding = getEmbedding(query)
    raw_limit = min(limit * 2, 20)
    results = client.searchSemantic(
        embedding, project, limit=raw_limit, include_team=include_team
    )

    candidates = []
    for category, items in results.items():
        for item in items:
            candidates.append(
                {"data": item[0], "score": item[1], "category": category}
            )

    candidates.sort(key=lambda x: x["score"], reverse=True)

    reranked = await rerank(query, candidates, limit=limit)

    formatted = {}
    for item in reranked:
        cat = item.get("category", "unknown")
        if cat not in formatted:
            formatted[cat] = []
        formatted[cat].append({"data": item["data"], "score": item["score"]})

    return {"project": project, "query": query, "results": formatted}
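
To make the two-stage shape concrete without Neo4j, Ollama, or an LLM, here is a self-contained sketch. The over-fetch rule `min(limit * 2, 20)` is taken from the tool above; everything else is an assumption: the corpus, the two-dimensional vectors, and especially `keyword_rerank`, which is a simple stand-in for the LLM reranker and not how ccmemory's `rerank` works.

```python
# Toy retrieve-then-rerank sketch; keyword_rerank is a stand-in for the LLM reranker.
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))


def retrieve_candidates(query_vec, corpus, limit):
    """Stage 1: over-fetch by vector similarity."""
    raw_limit = min(limit * 2, 20)  # same over-fetch rule as searchSemantic
    scored = [
        {"data": text, "score": cosine(query_vec, vec)} for text, vec in corpus
    ]
    scored.sort(key=lambda c: c["score"], reverse=True)
    return scored[:raw_limit]


def keyword_rerank(query, candidates, limit):
    """Stage 2 stand-in: order by shared query words; ties keep stage-1 order."""
    words = set(query.lower().split())
    ranked = sorted(
        candidates,
        key=lambda c: len(words & set(c["data"].lower().split())),
        reverse=True,
    )
    return ranked[:limit]


CORPUS = [
    ("use uv for installs", (1.0, 0.0)),
    ("pin python to 3.12", (0.9, 0.1)),
    ("neo4j stores decisions", (0.0, 1.0)),
]

candidates = retrieve_candidates((1.0, 0.0), CORPUS, limit=1)
best = keyword_rerank("python version", candidates, limit=1)
print(best[0]["data"])
```

Here stage 1 alone would rank the uv decision first; fetching two candidates for a limit of one gives the second stage room to promote the better match.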

6. Project CLAUDE.md (excerpt — first-time setup instructions for Claude)

From CLAUDE.md in the root of the repo (this CLAUDE.md is for developing ccmemory itself, not for users):

Before making design decisions, consult these in order:

  1. doc/clarifications/ — Binding decisions that override other docs
  2. doc/PROJECT_VISION.md — Intended behavior and architecture
  3. Gupta/Koratana Articles — External strategy ccmemory implements

Key principles from these sources:

  • Event clock, not state clock — Capture reasoning/decisions, not just current state
  • Decision traces, not containers — Organize by time + entity links, not sessions
  • Schema as output — Let structure emerge from use, don't over-specify upfront
  • World models, not retrieval — Goal is simulation ("what if?"), not just search

Prompting Techniques Observed

  1. Trigger-phrase enumeration: SKILL.md lists exact natural language phrases that should trigger tool calls (e.g., "No, that's not right", "Actually, in this project we...", "That didn't work") — grounding abstract behavior rules in concrete linguistic signals.

  2. Priority labeling: Corrections are explicitly labeled "highest value!" in the tool list and "the highest-value capture" in the Behaviors section, to guide agent attention ordering.

  3. SYSTEM REMINDER injection via hook: The prompt_submit.sh hook injects a SYSTEM REMINDER block directly into the user's prompt stream — a hook-based prompt injection pattern that doesn't require any changes to CLAUDE.md.

  4. Error recovery prompting: The skill includes a complete error-recovery sub-protocol for when the MCP session is lost, including exact AskUserQuestion text to present to the user.

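  5. Two-stage retrieval: Semantic search uses local Ollama embeddings for fast candidate retrieval, then passes the candidates to an LLM reranker (the tool's docstring names Claude, although the dependency list also accepts OpenAI or Google API keys for reranking). This is a retrieve-then-rerank pattern expressed entirely within an MCP tool.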

09

Uniqueness

ccmemory — Uniqueness & Positioning

What This Does That No Other Seed Framework Does

Neo4j graph database as the memory substrate — No other framework in this research corpus uses a property graph database with vector indexes for AI memory. Other frameworks use file-based memory (CLAUDE.md, memory banks in markdown), SQLite, or no persistence at all. ccmemory stores structured nodes (Decision, Correction, FailedApproach, etc.) with typed relationships (SUPERSEDES, CONTINUES, CITES, CONFLICTS_WITH) enabling graph traversal queries that file-based systems cannot perform — e.g., "all decisions that depend on this choice" or "trace the full reasoning chain behind this rule."

Automatic detection pipeline — After every Claude response (via the Stop hook), the MCP server runs LLM-based analysis of the transcript to detect and store decisions and corrections with confidence scores, without any explicit user action. Most memory frameworks require the user or the AI to explicitly invoke a save command.

The "cognitive coefficient" metric — ccmemory introduces a named, queryable metric for the value of accumulated context. No other framework in the corpus attempts to measure or expose how much its memory layer is improving AI effectiveness.

Team shared memory with promotion workflow — Decisions start as developmental (private per developer) and are explicitly promoted to curated (team-visible) via a dashboard or promoteDecisions tool. This is a multi-user memory governance model that no other seed framework addresses.

Docker-native four-container architecture — Ollama (local embeddings) + Neo4j + MCP server + Dashboard as a compose stack is the most operationally complex memory infrastructure in the corpus. This is a deliberate design choice: the author explicitly argues against CLAUDE.md-based approaches because they create "instructions to write instructions."

What ccmemory Explicitly Drops

  • No spec/plan/task workflow — ccmemory drops the "Brainstorm → Spec → Plan → Code" pipeline entirely. It has no /specify, /plan, or /tasks commands. It is purely a memory system, not a development methodology.
  • No file-based memory — ccmemory explicitly argues against flat markdown memory files (CLAUDE.md updates, memory banks) in PROJECT_VISION.md: "Say 'we use uv here' once → captured forever. No meta-loop."
  • No STDIO MCP transport — Uses SSE-over-HTTP (not stdio), which requires the server to be running independently as a container.

One-Sentence Positioning

ccmemory is the only Claude Code plugin in this corpus that uses a Neo4j graph database to accumulate a typed, queryable knowledge graph of decisions, corrections, and failed approaches across sessions, so that each session builds on the last through graph traversal rather than text search alone.

Derivative: ccmemory-plain

The author also maintains patrickkidd/ccmemory-plain (created 2025-12-26, predating the main repo) — a simpler variant that stores memory in markdown files within the project rather than Neo4j. It uses hooks + prompts + scripts with no database or Docker requirement. The plain variant's memory targets are CLAUDE.md, doc/[topic].md, and .ccmemory/session.md. It is the "no infrastructure" alternative for users who want basic session continuity without Neo4j.

Key differences between ccmemory and ccmemory-plain:

| Aspect | ccmemory (Neo4j) | ccmemory-plain (File-based) |
| --- | --- | --- |
| Storage | Neo4j graph + Ollama embeddings | Markdown files in project |
| Search | Full-text + vector semantic + graph traversal | grep / Claude reads |
| Infrastructure | Docker required | None |
| Team sharing | Yes (curated/developmental split) | No |
| Semantic search | Yes | No |
| Install complexity | High | Low |

Failure Modes / Criticisms

No Reddit or HN discussion found. Potential failure modes based on architecture review:

  1. High operational overhead: Requires Docker, 4 running containers (Ollama, Neo4j, MCP server, Dashboard), and an LLM API key just for the detection subsystem. A developer on a plane or without Docker has no memory.
  2. Cold-start problem: Until enough sessions have been logged, the context graph is sparse. Sessions 1-5 provide less value than the README implies.
  3. Detection reliability: The LLM-based auto-detection for corrections/decisions operates on transcripts post-hoc (Stop hook) — it may miss real-time nuance or misclassify. The 0.7 confidence threshold is not validated publicly.
  4. No portability: The graph lives in a named Docker volume. Migrating to a new machine or sharing snapshots is not documented.
  5. Neo4j licensing: Neo4j 5.x Community Edition is licensed under GPL-3.0, whose constraints may affect enterprise use.
04

Workflow

ccmemory — Workflow

Overview

ccmemory does not impose a spec-driven or planning workflow. It is a passive memory layer that augments any existing workflow without replacing it. There are no workflow phases in the sense of "Brainstorm → Spec → Plan → Implement." Instead, ccmemory operates as background infrastructure across all sessions.

Session Lifecycle Phases

Phase 1: Session Startup

  • Trigger: Claude Code session begins
  • Hook: SessionStart fires ensure-running.sh then session_start.sh
  • What happens: Containers are started if not running; MCP server retrieves recent context for the current project and injects it into Claude's context window as Markdown
  • Artifact produced: Context Markdown injected at session start (e.g., # Context Graph: your-project / ## Recent Decisions)

Phase 2: First Prompt — Backfill Check

  • Trigger: First user message in a session
  • Hook: UserPromptSubmit fires prompt_submit.sh
  • What happens: Checks for JSONL conversation history files not yet imported; if found, injects a SYSTEM REMINDER instructing Claude to offer import via AskUserQuestion
  • Human approval gate: User chooses "Import 10 conversations", "Import all N conversations", or "Skip import"
  • Artifact produced: Backfilled session nodes in Neo4j (if user approves)

Phase 3: Active Work (Any Task)

  • Trigger: Ongoing conversation
  • Agent behavior: The ccmemory skill instructs Claude to proactively use record tools when it detects decisions, corrections, exceptions, or failed approaches in the conversation
  • Key behaviors:
    • When user makes a decision → recordDecision called immediately
    • When user corrects Claude's understanding → recordCorrection called (highest priority)
    • When user grants an exception → recordException called
    • When an approach fails → recordFailedApproach called
    • Before giving advice → searchPrecedent or searchSemantic consulted
    • Before suggesting solutions → queryFailedApproaches consulted
  • No human approval gate: Recording is automatic/proactive

Phase 4: Post-Response Detection (Automatic)

  • Trigger: After every Claude response (Stop event)
  • Hook: Stop fires message_response.sh
  • What happens: Full transcript posted to MCP server; LLM-based detection scans for decisions/corrections/etc. above 0.7 confidence; embeddings generated for semantic search
  • Artifact produced: New Decision/Correction/etc. nodes in Neo4j with embeddings

Phase 5: Session End

  • Trigger: Session terminates
  • Hook: SessionEnd fires session_end.sh
  • What happens: MCP server finalizes the session record
  • Artifact produced: Session record updated in Neo4j

Human Approval Gates

| Gate | Where |
| --- | --- |
| Backfill import choice | First prompt of each session (if pending files exist) |
| Decision promotion to "curated" | Dashboard web interface or the promoteDecisions tool |

TDD Enforcement

No — ccmemory does not mention or enforce TDD. It is a memory system, not a development methodology.

Multi-Agent Execution

No — There are no multiple agent personas or parallel execution. Claude Code runs as a single agent with the ccmemory skill loaded. The MCP server runs independently as infrastructure.

Git Worktrees / Isolated Workspaces

No — Not mentioned or supported.

Spec Format

None — ccmemory does not generate or consume specs. Memory is stored as graph nodes in Neo4j.

Files Generated Per Feature

ccmemory does not generate per-feature files. Instead, it accumulates the following node types in Neo4j per project:

  • Decision nodes (with rationale, options, revisit_trigger, topics, embedding)
  • Correction nodes (wrong_belief, right_belief, severity, embedding)
  • Exception nodes (rule_broken, justification, embedding)
  • Insight nodes (summary, detail, implications, category, embedding)
  • Question nodes (question, answer, context, embedding)
  • FailedApproach nodes (approach, outcome, lesson, embedding)
  • Reference nodes (uri, context, type)
  • ProjectFact nodes (fact, context, category, embedding)
  • Chunk nodes (reference knowledge chunks from cached URLs/PDFs)

Backfill: One-Time Historical Import

For projects with existing history, ccmemory offers two manual import flows:

  1. Conversation import: Read JSONL files from ~/.claude/projects/ and call ccmemory_backfill_conversation
  2. Markdown import: Read .md files from the project and call ccmemory_backfill_markdown

Decision log files using ## YYYY-MM-DD: Title format are parsed into structured Decision nodes automatically.
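
A parser for that heading convention might look like the following sketch. It is not ccmemory's importer: the `parse_decision_log` function and the returned dictionary shape are hypothetical, and the sketch assumes every heading carries a valid calendar date (an invalid one would raise `ValueError`).

```python
import re
from datetime import date

# Matches headings such as "## 2026-01-04: Use Postgres".
HEADING = re.compile(r"^## (\d{4})-(\d{2})-(\d{2}): (.+)$")

def parse_decision_log(text):
    """Split a Markdown decision log into dated entries with title and body."""
    entries = []
    current = None
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            year, month, day, title = match.groups()
            current = {
                "date": date(int(year), int(month), int(day)),
                "title": title.strip(),
                "body": [],
            }
            entries.append(current)
        elif current is not None:
            # Lines before the first dated heading are ignored.
            current["body"].append(line)
    for entry in entries:
        entry["body"] = "\n".join(entry["body"]).strip()
    return entries

log = """# Decisions

## 2026-01-04: Use Postgres
Needed concurrent writes.

## 2026-01-09: Adopt uv
Faster installs.
"""
for entry in parse_decision_log(log):
    print(entry["date"].isoformat(), entry["title"])
```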

06

Memory Context

ccmemory — Memory & Context

Memory Model

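neo4j — Neo4j 5.15 Community Edition graph database with vector indexes over Ollama-generated embeddings. The source reports 768-dimension embeddings from the all-minilm model; all-minilm normally emits 384-dimension vectors, so either the model name or the dimension is likely misreported.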

Persistence Scope

global — Memory persists indefinitely across all sessions and projects. Nodes from every project share one store and are distinguished by a project property. Data is stored in a Docker volume (ccmemory_data) and survives container restarts.

Two-Domain Architecture

Domain 1: "Your Specifics"

High-confidence lived experience from actual Claude Code conversations:

| Node Type | Properties | Captured When |
| --- | --- | --- |
| Decision | description, rationale, options_considered, revisit_trigger, sets_precedent, topics, embedding, status (developmental/curated), detection_method, detection_confidence | User makes a choice or commitment |
| Correction | wrong_belief, right_belief, severity, embedding | User corrects Claude's understanding |
| Exception | rule_broken, justification, embedding | User grants an exception to a rule |
| Insight | summary, detail, implications, category, embedding | Strategic realizations emerge |
| Question | question, answer, context, embedding | Meaningful Q&A exchanged |
| FailedApproach | approach, outcome, lesson, embedding | Something tried and didn't work |
| Reference | uri, context, type | URL or file path mentioned |
| ProjectFact | fact, context, category, embedding | General project facts |

Domain 2: Reference Knowledge

Cached external material chunked and semantically indexed:

| Node Type | Properties | Source |
| --- | --- | --- |
| Chunk | content, section, source_file, project, embedding | Imported URLs, PDFs, markdown files |

Relationship Types (Graph Edges)

CONTINUES    — Decision extends/follows from a prior decision (same trace)
CITES        — Decision references a related prior decision (similarity > 0.85, auto)
SUPERSEDES   — Decision replaces/invalidates a prior decision (explicit LLM detection only)
DEPENDS_ON   — Decision requires another decision to hold
CONSTRAINS   — Decision limits what another decision can do
CONFLICTS_WITH — Decisions are incompatible
IMPACTS      — Decision affects another trace/topic
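
The automatic CITES rule can be illustrated with a small sketch. It assumes the similarity in question is cosine similarity over embeddings (the schema's vector indexes are described as cosine-based, but the source does not confirm the linking code uses the same measure). The `auto_cites` helper, the decision IDs, and the three-dimensional toy vectors are all hypothetical; real embeddings are far larger.

```python
import math

CITES_THRESHOLD = 0.85  # from the relationship description above

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def auto_cites(new_id, new_vec, prior):
    """Propose a CITES edge to each prior decision above the similarity threshold."""
    return [
        (new_id, "CITES", prior_id)
        for prior_id, vec in prior.items()
        if cosine(new_vec, vec) > CITES_THRESHOLD
    ]

prior = {
    "use-postgres": [1.0, 0.0, 0.0],
    "use-uv": [0.0, 1.0, 0.0],
}
edges = auto_cites("pool-connections", [0.9, 0.1, 0.0], prior)
print(edges)
```

Only the first prior decision clears the threshold here, so a single edge is proposed.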

Note: The Session node type is deprecated per internal clarification. The schema previously organized nodes under session containers; the current design organizes by project + timestamp directly, making nodes a DAG with cross-references.

Neo4j Schema Highlights

From mcp-server/init.cypher:

  • Unique constraints on all node types' id properties
  • Composite indexes on (project, timestamp) for all node types
  • Full-text indexes on key text fields (description, rationale, wrong_belief, right_belief, etc.)
  • Vector indexes on embedding properties for all Domain 1 node types — 768 dimensions, cosine similarity, requires Neo4j 5.13+
  • Telemetry nodes (TelemetryEvent, Retrieval) for the "cognitive coefficient" metric

Detection Pipeline

After each Claude response (Stop hook), the MCP server:

  1. Receives the full transcript from message_response.sh
  2. Runs LLM-based detection (using Anthropic/OpenAI/Google API key configured at install)
  3. Detections with confidence > 0.7 are stored as graph nodes
  4. Ollama generates embeddings for each stored node

Detection method is stored on each node (detection_method: explicit_command for tool calls, or LLM-assigned for automatic detection).
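
The thresholding step (step 3) can be sketched as a simple filter. This is not the MCP server's code: the `select_detections` function and the candidate dictionaries are hypothetical, and the sketch assumes a strict comparison because the source writes the rule as "confidence > 0.7".

```python
CONFIDENCE_THRESHOLD = 0.7  # from the pipeline description above

def select_detections(candidates, threshold=CONFIDENCE_THRESHOLD):
    """Keep candidates strictly above the threshold, most confident first."""
    kept = [c for c in candidates if c["confidence"] > threshold]
    return sorted(kept, key=lambda c: c["confidence"], reverse=True)

# Invented candidates standing in for LLM detection output.
candidates = [
    {"type": "Decision", "text": "Use Postgres", "confidence": 0.92},
    {"type": "Correction", "text": "Tests run with pytest", "confidence": 0.70},
    {"type": "FailedApproach", "text": "SQLite locked under load", "confidence": 0.81},
    {"type": "Decision", "text": "Maybe add caching", "confidence": 0.40},
]
stored = select_detections(candidates)
print([c["type"] for c in stored])
```

Under this reading, a candidate at exactly 0.70 is dropped, which is one reason the unvalidated threshold matters in practice.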

Context Compaction Strategy

Yes — ccmemory addresses context compaction explicitly:

  • On SessionStart, only recent context is fetched (not the full graph), keeping the injected context Markdown to a manageable size
  • The queryContext tool has a limit parameter (default: 20 items)
  • Conversation files used for backfill are filtered by size (5 KB–500 KB) to exclude trivially short or excessively large sessions
  • Context is delivered as structured Markdown at session start, not injected per-turn
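
Two of these limits are easy to express in code. The sketch below is illustrative only: the function names and the `timestamp` field are hypothetical, 1 KB is assumed to mean 1024 bytes, and both size bounds are assumed to be inclusive, none of which the source specifies.

```python
MIN_BYTES = 5 * 1024     # 5 KB lower bound (assumed 1 KB = 1024 bytes)
MAX_BYTES = 500 * 1024   # 500 KB upper bound
DEFAULT_LIMIT = 20       # default for the queryContext limit parameter

def eligible_for_backfill(files):
    """Filter (name, size_in_bytes) pairs to those inside the size window."""
    return [name for name, size in files if MIN_BYTES <= size <= MAX_BYTES]

def recent_items(items, limit=DEFAULT_LIMIT):
    """Return the newest items first, capped at limit."""
    ordered = sorted(items, key=lambda item: item["timestamp"], reverse=True)
    return ordered[:limit]

files = [
    ("tiny.jsonl", 1024),
    ("edge.jsonl", 5 * 1024),
    ("normal.jsonl", 200_000),
    ("huge.jsonl", 600 * 1024),
]
print(eligible_for_backfill(files))

items = [{"id": i, "timestamp": i} for i in range(30)]
print(len(recent_items(items)))
```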

Cross-Session Handoffs

Yes — This is the core purpose of ccmemory:

  • SessionStart hook fetches accumulated context from Neo4j and injects it as Markdown at the top of every new session
  • Example output:
    # Context Graph: your-project
    ## Recent Decisions
    - Use Postgres for concurrent write support (Jan 4)
    
  • Context is scoped to the current project (by cwd basename)
  • Team mode allows cross-developer handoffs via curated decision status
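
A rough sketch of how such a context block could be assembled is shown below. It is not the hook's implementation: `project_name`, `render_context`, the dictionary fields, and the year in the sample date are hypothetical; only the output layout and the cwd-basename scoping come from the source.

```python
import os
from datetime import date

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

def project_name(cwd):
    """Scope context by the basename of the working directory."""
    return os.path.basename(os.path.normpath(cwd))

def render_context(cwd, decisions):
    """Render recent decisions as the Markdown injected at session start."""
    lines = [f"# Context Graph: {project_name(cwd)}", "## Recent Decisions"]
    for decision in decisions:
        when = decision["when"]
        stamp = f"{MONTHS[when.month - 1]} {when.day}"
        lines.append(f"- {decision['description']} ({stamp})")
    return "\n".join(lines)

markdown = render_context(
    "/home/dev/your-project",
    [{"description": "Use Postgres for concurrent write support",
      "when": date(2026, 1, 4)}],
)
print(markdown)
```

Scoping by basename keeps setup trivial, but two checkouts with the same directory name would share one context under this scheme.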

References to Memory Concepts

  • README: "institutional memory" — the project's primary value proposition
  • SKILL.md: "context graph" — the structured name for the accumulated knowledge
  • PROJECT_VISION.md: "cognitive coefficient" — named metric for memory effectiveness
  • PROJECT_VISION.md: "decision traces" — organizing principle (not sessions, but time-ordered chains of reasoning)
  • PROJECT_VISION.md: "world models, not retrieval" — goal framing distinguishing from RAG
07

Target Tools

ccmemory — Target Tools

Officially Supported

Claude Code (primary)

Evidence: The README explicitly states:

Requirements:

  • Docker
  • Claude Code CLI

The plugin is installed via Claude Code's plugin system:

/plugin marketplace add patrickkidd/ccmemory
/plugin install ccmemory@patrickkidd/ccmemory

The entire hook system (SessionStart, UserPromptSubmit, Stop, SessionEnd) is based on Claude Code's hook event model. Skills are in Claude Code's SKILL.md format. The plugin manifest (plugin.json) uses Claude Code's plugin specification format.

Install path: .claude-plugin/plugin.json declares the plugin; hooks live in hooks/hooks.json; skills in skills/ccmemory/SKILL.md.

Compatibility caveats:

  • Requires Docker to be installed and running (the MCP server + Neo4j + Ollama all run as containers)
  • Requires at least one LLM API key (Anthropic, OpenAI, or Google) for the detection and reranking subsystem — the MCP server itself is not purely local
  • Uses SSE transport for MCP, which requires Claude Code to support SSE-type MCP connections

Not Supported

No other AI tools are mentioned in the README, CLAUDE.md, or any discovered source files. The plugin is exclusively designed for Claude Code.

| Tool | Supported | Notes |
| --- | --- | --- |
| claude-code | yes | Primary and only target |
| cursor | no | Not mentioned |
| codex | no | Not mentioned |
| aider | no | Not mentioned |
| cline | no | Not mentioned |
| copilot | no | Not mentioned |
| opencode | no | Not mentioned |
| goose | no | Not mentioned |
| windsurf | no | Not mentioned |
| gemini-cli | no | Not mentioned |
| roo | no | Not mentioned |
08

Signals

ccmemory — Signals

GitHub Stats

  • Stars: 1
  • Forks: 0
  • Watchers: 1
  • Open Issues: 0
  • Contributors: 2 (patrickkidd / Dr. Patrick Stinson; and "claude" / Claude AI as a co-contributor)
  • Created: 2026-01-04
  • Last pushed: 2026-01-22
  • Last updated on GitHub: 2026-05-03
  • Total commits: 41

Language Breakdown

  • Python: 72.0%
  • HTML: 21.0%
  • Shell: 4.1%
  • Cypher: 2.5%
  • Dockerfile: 0.4%

Maintainer Status

Active during the initial build (41 commits across ~18 days in January 2026). The push date shows no activity after January 22, 2026, so the project may be dormant.

Reddit / HN Sentiment

Unknown — No Reddit or HN mentions found in the research corpus. The project has 1 star and zero public discussion threads identified.

Package Registry

The repo has 1 published GitHub Container Registry package: ghcr.io/patrickkidd/ccmemory (the MCP server Docker image).

Observations

  • The project was built very rapidly (41 commits in ~18 days, January 2026), with Claude listed as a contributor — consistent with AI-assisted development
  • Zero forks and 1 star indicate very early/experimental status with no community adoption
  • No releases have been published despite a version: "0.1.1" in plugin.json
  • The project has comprehensive internal documentation (doc/PROJECT_VISION.md, doc/DEVELOPMENT.md, doc/NEO4J_COOKBOOK.md, doc/TELEMETRY.md, doc/IMPLEMENTATION_PLAN.md), which suggests serious intent even though the project is still at pre-release maturity
