claude-mem (thedotmack)

claude-mem · thedotmack/claude-mem · ★ 78k · last commit 2026-05-26

Primitive shape: 25 total

Skills: 15 · Hooks: 6 · MCP tools: 4

Summary

claude-mem is a production-grade persistent memory compression system for Claude Code, with adapters for Gemini CLI, OpenCode, Codex, Cursor, and Warp, plus an OpenClaw gateway for teams. It has 78,000+ GitHub stars and is distributed through npm and the Claude Code plugin marketplace. It captures every tool use (Read, Edit, Bash, and so on) as an "observation," compresses observations into semantic summaries via AI, and injects relevant context from past sessions into new ones, so Claude can recall codebase decisions, past errors, and prior work without re-explanation. A background worker service on port 37777 serves a local web viewer UI for browsing memory, and the plugin ships 15 named skills covering search, codebase learning, project planning, weekly digests, and more. The design goal is "progressive disclosure": layered memory retrieval that keeps token cost under control. Compared to the seeds, it is closest to claude-flow in being an MCP-anchored infrastructure package with a background service, but it focuses exclusively on memory persistence rather than full orchestration.

Overview

Origin

Created by thedotmack and first published in 2025. It grew rapidly to 78k stars, becoming one of the most starred memory plugins in the Claude Code ecosystem. It is currently at v6.5.0 and under active development, including a beta "Endless Mode" for very long-running sessions.

Philosophy

"Progressive Disclosure" — memory retrieval is layered to control token costs. The system uses a 3-layer workflow:

1. search: compact index with IDs (~50-100 tokens/result)
2. timeline: chronological context around interesting results
3. get_observations: full details ONLY for filtered IDs (~500-1,000 tokens/result)

This prevents dumping all memory into every session, which would be expensive and noisy.
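
The token economics of the three layers can be checked with rough arithmetic. The sketch below uses the midpoints of the per-result ranges quoted above (75 and 750 tokens). The result count, the number of IDs kept, and the omission of the timeline layer's cost are illustrative assumptions, not measurements from claude-mem.

```python
# Rough token arithmetic for progressive disclosure.
# Per-result costs are midpoints of the ranges quoted above; the result count
# and the number of IDs kept are illustrative assumptions.
INDEX_TOKENS = 75      # search: ~50-100 tokens per result
DETAIL_TOKENS = 750    # get_observations: ~500-1,000 tokens per result


def eager_cost(n_results: int) -> int:
    """Tokens spent if full details are fetched for every search hit."""
    return n_results * DETAIL_TOKENS


def layered_cost(n_results: int, n_kept: int) -> int:
    """Tokens spent on a compact index plus details for the kept IDs only.

    The timeline layer is left out because the source gives no figure for it.
    """
    return n_results * INDEX_TOKENS + n_kept * DETAIL_TOKENS


if __name__ == "__main__":
    print(eager_cost(20))        # 15000
    print(layered_cost(20, 2))   # 3000
```

Under these assumptions the layered path costs a fifth of the eager one. The saving approaches the 10x per-result gap only when almost no IDs are kept for the detail fetch.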

Core Insight (from how-it-works skill)

Every Read, Edit, and Bash that Claude makes turns into a compressed observation.
Observations get summarized at session end. Relevant ones get auto-injected into
future prompts so the next session starts with context from the last one — no
re-explaining the codebase, no re-discovering decisions.

Privacy Model

"Nothing leaves your machine except calls to whichever AI provider you configured for compression." All state lives in ~/.claude-mem/.

Multi-Tool Strategy

Explicit adapters for:

Claude Code (primary)
Gemini CLI (npx claude-mem install --ide gemini-cli)
OpenCode (npx claude-mem install --ide opencode)
Codex (separate codex-plugin directory)
Warp terminal (WARP.md)
OpenClaw gateway (team use)
Cursor (cursor-hooks/)

Beta Features

Endless Mode — experimental feature for sessions that exceed normal context limits
Version switching via npx claude-mem switch <version>

Architecture

Distribution

npm package (claude-mem). Install via npx claude-mem install (not npm install -g).

Install

npx claude-mem install                        # Claude Code
npx claude-mem install --ide gemini-cli       # Gemini CLI
npx claude-mem install --ide opencode         # OpenCode
# Or via Claude Code plugin marketplace:
/plugin marketplace add thedotmack/claude-mem
/plugin install claude-mem
# Or for OpenClaw:
curl -fsSL https://install.cmem.ai/openclaw.sh | bash

Important: npm install -g claude-mem installs the SDK/library only. It does NOT register hooks or start the worker service.

Directory Structure

plugin/
├── .claude-plugin/         # Claude Code plugin manifest
├── .codex-plugin/          # Codex plugin manifest
├── hooks/
│   ├── hooks.json          # 6 lifecycle hook entries
│   └── codex-hooks.json    # Codex-specific hooks
├── modes/                  # Agent modes (e.g., learn mode)
├── scripts/                # Hook scripts (bun-runner.js, worker-service.cjs)
├── skills/                 # 15 named skills
│   ├── babysit/
│   ├── design-is/
│   ├── do/
│   ├── how-it-works/
│   ├── knowledge-agent/
│   ├── learn-codebase/
│   ├── make-plan/
│   ├── mem-search/
│   ├── oh-my-issues/
│   ├── pathfinder/
│   ├── smart-explore/
│   ├── timeline-report/
│   ├── version-bump/
│   ├── weekly-digests/
│   └── wowerpoint/
└── ui/
    ├── viewer.html         # Local web viewer
    └── viewer-bundle.js    # Bundled React/UI

State Storage

~/.claude-mem/
├── (SQLite DB — observations, sessions, summaries)
├── (Vector index — Chroma)
└── (logs, settings)
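
To make the SQLite side of this layout concrete, here is a minimal sketch of a store holding observations, sessions, and summaries. The table names follow the three kinds of record listed above, but every column name and the helper functions are hypothetical. The actual claude-mem schema is not documented here.

```python
import sqlite3

# Hypothetical schema: table and column names are assumptions for illustration,
# not claude-mem's actual layout.
SCHEMA = """
CREATE TABLE sessions (
    id      INTEGER PRIMARY KEY,
    project TEXT NOT NULL
);
CREATE TABLE observations (
    id         INTEGER PRIMARY KEY,
    session_id INTEGER NOT NULL REFERENCES sessions(id),
    tool       TEXT NOT NULL,      -- e.g. Read, Edit, Bash
    title      TEXT NOT NULL
);
CREATE TABLE summaries (
    session_id INTEGER PRIMARY KEY REFERENCES sessions(id),
    body       TEXT NOT NULL
);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the illustrative tables in a new SQLite database."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_observation(conn, session_id: int, tool: str, title: str) -> int:
    """Insert one observation row and return its ID."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO observations (session_id, tool, title) VALUES (?, ?, ?)",
            (session_id, tool, title),
        )
    return cur.lastrowid
```

The default in-memory path keeps the sketch away from any real ~/.claude-mem/ data.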

Required Runtime

Node.js ≥18
Bun (for worker service runtime)

Worker Service

Background HTTP service on port 37777. Manages:

Observation storage
AI compression
Memory search API
Web viewer endpoint

Target AI Tools

Claude Code, Gemini CLI, OpenCode, Codex, Cursor, Warp, OpenClaw

Components

Lifecycle Hooks (6 hook entries)

Hook Event	Action
`Setup`	Version check + dependency installer
`SessionStart` (matcher: startup/clear/compact)	Start worker service + inject session context
`UserPromptSubmit`	Session initialization / first-message setup
`PostToolUse` (matcher: *)	Record observation per tool call
`PreToolUse` (matcher: Read)	File-context lookup before reading
`Stop`	Summarize session + compress observations
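
The table above can be read as a dispatch table keyed by event and matcher. The sketch below models it in Python. Event names, matchers, and actions are copied from the table, but the matching rule is an assumption: "*" matches anything, and any other matcher is treated as a "/"-separated list of exact alternatives. The real hooks.json format and Claude Code's matcher semantics may differ.

```python
# Event names, matchers, and actions are copied from the hook table.
# The matching rule itself is an assumption: "*" (or no matcher) matches
# anything, otherwise the matcher is a "/"-separated list of alternatives.
HOOKS = [
    ("Setup", None, "version check + dependency installer"),
    ("SessionStart", "startup/clear/compact", "start worker + inject context"),
    ("UserPromptSubmit", None, "session initialization"),
    ("PostToolUse", "*", "record observation"),
    ("PreToolUse", "Read", "file-context lookup"),
    ("Stop", None, "summarize session + compress observations"),
]


def matches(matcher, subject):
    """Return True if the hook's matcher accepts the given subject."""
    if matcher is None or matcher == "*":
        return True
    return subject in matcher.split("/")


def actions_for(event, subject=None):
    """List the actions that would fire for an event and its subject."""
    return [a for e, m, a in HOOKS if e == event and matches(m, subject)]
```

For example, actions_for("PostToolUse", "Bash") yields the observation recorder, while actions_for("PreToolUse", "Edit") yields nothing because only Read is matched.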

MCP Tools (4)

Exposed via .mcp.json:

Tool	Purpose
`search`	Get compact index with IDs (~50-100 tokens/result)
`timeline`	Get chronological context around an anchor observation
`get_observations`	Fetch full details for specific IDs
(4th tool — unknown)	unknown

Skills (15)

Skill	Purpose
`babysit`	Monitor ongoing work, surface issues
`design-is`	Capture design decisions
`do`	Task execution helper
`how-it-works`	Explain claude-mem to the user
`knowledge-agent`	Knowledge base query agent
`learn-codebase`	Front-load entire repo into memory (~5 min)
`make-plan`	Create structured project plans
`mem-search`	Natural language search of past sessions
`oh-my-issues`	Issue tracking helper
`pathfinder`	Codebase navigation
`smart-explore`	Intelligent codebase exploration
`timeline-report`	Generate timeline of work done
`version-bump`	Version management helper
`weekly-digests`	Weekly work summary generation
`wowerpoint`	(unknown — presentation?)

Web Viewer

Local UI served by worker at http://localhost:37777. Features:

Real-time memory stream
Observation browsing
Observation detail via http://localhost:37777/api/observation/{id}
Citation references for past observations

Worker Service Architecture

The Node.js/Bun worker service (plugin/scripts/worker-service.cjs) does all the heavy lifting. Claude hooks call bun-runner.js, which delegates to the worker. The worker exposes an HTTP API on port 37777.
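
As a small illustration of talking to the worker's HTTP API, the sketch below builds the observation URL given above and fetches it. Only the port and the /api/observation/{id} path come from this document. The function names are hypothetical and the response format is not documented here, so the body is returned as raw bytes.

```python
from urllib.request import urlopen

WORKER_BASE = "http://localhost:37777"  # port stated in this document


def observation_url(observation_id: int) -> str:
    """Build the citation URL for one observation."""
    return f"{WORKER_BASE}/api/observation/{observation_id}"


def fetch_observation(observation_id: int, timeout: float = 2.0) -> bytes:
    """Return the raw response body; requires a running worker.

    The response format is not documented here, so the body is returned
    undecoded rather than parsed.
    """
    with urlopen(observation_url(observation_id), timeout=timeout) as resp:
        return resp.read()
```

fetch_observation raises urllib.error.URLError when no worker is listening, which is a quick way to tell whether the service is up.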

Prompts

mem-search skill (verbatim, key sections)

---
name: mem-search
description: Search claude-mem's persistent cross-session memory database. Use when
user asks "did we already solve this?", "how did we do X last time?", or needs work
from previous sessions.
---

# Memory Search

## 3-Layer Workflow (ALWAYS Follow)

**NEVER fetch full details without filtering first. 10x token savings.**

### Step 1: Search - Get Index with IDs

search(query="authentication", limit=20, project="my-project")

Returns: Table with IDs, timestamps, types, titles (~50-100 tokens/result)

| ID | Time | T | Title | Read |
|----|------|---|-------|------|
| #11131 | 3:48 PM | 🟣 | Added JWT authentication | ~75 |
| #10942 | 2:15 PM | 🔴 | Fixed auth token expiration | ~50 |

### Step 2: Timeline - Get Context Around Interesting Results

timeline(anchor=11131, depth_before=3, depth_after=3, project="my-project")

### Step 3: Fetch - Get Full Details ONLY for Filtered IDs

Review titles from Step 1 and context from Step 2. Pick relevant IDs.

Prompting technique: Token-budget-aware search protocol. It forces progressive disclosure through an explicit 3-step constraint ("NEVER fetch full details without filtering first"). The "10x token savings" framing gives the model an economic reason to respect the constraint.

how-it-works skill (verbatim excerpt)

## What it does

Every Read, Edit, and Bash that Claude makes turns into a compressed observation.
Observations get summarized at session end. Relevant ones get auto-injected into
future prompts so the next session starts with context from the last one — no
re-explaining the codebase, no re-discovering decisions.

## When it kicks in

Memory injection starts on your second session in a project.

Prompting technique: Transparent mechanism explanation. The skill explains what the system does rather than instructing behavior, supporting user trust and adoption.

Uniqueness

Differs From Seeds

Most similar to claude-flow (seed) in architecture — both use MCP tools plus a background service plus lifecycle hooks. The delta: claude-mem is exclusively a memory/context system, while claude-flow is a full orchestration framework. claude-mem's worker service architecture (background daemon on port 37777 + HTTP API) is unique in the memory-system category; all other memory systems in this batch use synchronous hook scripts. Also similar to ccmemory (seed) in purpose but uses SQLite+Chroma vs Neo4j+Ollama, and adds the progressive disclosure protocol to manage token costs.

Unique Aspects

Background daemon architecture: The only memory system in this batch that runs a persistent background service (port 37777). This enables async AI compression without blocking agent turns.
Progressive disclosure protocol: The explicit 3-layer search (search → timeline → get_observations) with "10x token savings" framing is a novel UX pattern for memory retrieval.
Cross-platform first: Explicit adapters for 7+ AI tools (Claude Code, Gemini CLI, OpenCode, Codex, Cursor, Warp, OpenClaw) — widest portability in this batch.
Private content tagging: <private> tags for selective exclusion are a unique feature.
Scale indicator: 78K GitHub stars — 3x the next largest in this batch. Community adoption signal.

Observable Failure Modes

Port conflicts: Port 37777 must be free; conflicts with other services will prevent the worker from starting.
Bun dependency: Worker service requires Bun runtime, adding a dependency most memory systems avoid.
AI compression costs: Every session end triggers AI API calls for compression; heavy use accumulates cost.
Service startup latency: Worker must start before hooks can function; slow machines may see hook timeouts.
Global memory pollution: Cross-project global memory can inject irrelevant context from unrelated projects.
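
The port-conflict failure mode is easy to check for before installing. The sketch below tests whether anything is already listening on a local TCP port. The function name is hypothetical and this is not part of claude-mem's own tooling.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        # connect_ex returns 0 on a successful connection, an errno otherwise
        return s.connect_ex((host, port)) == 0
```

Calling port_in_use(37777) before installation shows whether another service already holds the worker's port. After installation a True result may simply mean the worker itself is running.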

Workflow

Session Lifecycle

SessionStart → worker service starts → inject relevant past context
                                        (from second session onward)
                ↓
UserPromptSubmit → session init / first message processing
                ↓
PostToolUse (every tool) → record observation
PreToolUse (Read) → file context lookup
                ↓
Stop → compress observations → AI-generated session summary → store

Memory Injection Logic

Context injection starts on the second session in a project. First session seeds memory. Subsequent sessions receive auto-injected context for relevant past work.

3-Layer Search Workflow (from mem-search skill)

Layer	Tool	Tokens	Purpose
1	`search`	~50-100/result	Get compact index with IDs
2	`timeline`	moderate	Get chronological context around anchor
3	`get_observations`	~500-1,000/result	Full details for filtered IDs only

Rule: Never fetch full details without filtering first. 10x token savings.
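
The three layers can be mimicked with a toy in-memory store to show how filtering narrows what gets fetched. The tool names and the parameters query, limit, anchor, depth_before, and depth_after come from the mem-search excerpt. The sample data, the substring matching, the use of ID order as a stand-in for time, the ids parameter, and the omission of project are all assumptions. The real tools run against the worker's SQLite and Chroma stores.

```python
# Toy in-memory stand-in for the three tools. Data, matching logic, and the
# `ids` parameter are assumptions; only the tool names and the other
# parameter names come from the skill excerpt.
OBSERVATIONS = {
    10941: ("Read session middleware", "Full text of observation 10941"),
    10942: ("Fixed auth token expiration", "Full text of observation 10942"),
    10943: ("Ran auth test suite", "Full text of observation 10943"),
    11131: ("Added JWT authentication", "Full text of observation 11131"),
}


def search(query, limit=20):
    """Layer 1: compact index of (id, title) pairs whose title matches."""
    hits = [(i, title) for i, (title, _) in sorted(OBSERVATIONS.items())
            if query.lower() in title.lower()]
    return hits[:limit]


def timeline(anchor, depth_before=3, depth_after=3):
    """Layer 2: IDs around the anchor (ID order stands in for time)."""
    ids = sorted(OBSERVATIONS)
    pos = ids.index(anchor)
    return ids[max(0, pos - depth_before): pos + depth_after + 1]


def get_observations(ids):
    """Layer 3: full details, only for the IDs that survived filtering."""
    return {i: OBSERVATIONS[i][1] for i in ids}
```

A typical pass calls search("auth"), inspects timeline(10942, depth_before=1, depth_after=1), and then calls get_observations([10942]) for the single ID worth reading in full.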

Phases + Artifacts

Phase	Artifact
Tool use	Observation row in SQLite
Session end	AI-compressed summary in SQLite
Memory search	Timeline + observation details
Codebase learn	Full repo observations (via `/learn-codebase` skill)

Approval Gates

None automated. The babysit skill provides human-in-the-loop monitoring.

Private Content

Add <private> tags to exclude sensitive content from storage.
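
A minimal sketch of this kind of exclusion, assuming that everything between <private> and </private> is dropped before storage. The function name is hypothetical, and how claude-mem treats unclosed or nested tags is not stated here.

```python
import re

# Assumption: <private>...</private> spans are removed before storage.
# Handling of unclosed or nested tags in claude-mem itself is not documented.
_PRIVATE = re.compile(r"<private>.*?</private>", re.DOTALL)


def strip_private(text: str) -> str:
    """Remove every <private>...</private> span from text."""
    return _PRIVATE.sub("", text)
```

The non-greedy pattern removes each tagged span separately, so text between two private spans is kept.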

Opt-Out

Per-session: include SKIP_MEMORY in prompt
Global: configure exclusion patterns

Memory & Context

Memory Type

hybrid — SQLite (primary) + Chroma vector database (semantic search).

Persistence Scope

global — All data stored in ~/.claude-mem/ globally. Cross-project by default.

State Files

Path	Content
`~/.claude-mem/*.db`	SQLite: observations, sessions, summaries
`~/.claude-mem/chroma/`	Chroma vector index for semantic search

Context Compaction Strategy

Explicit and designed in:

PostToolUse → each tool call = one observation row
Stop hook → AI compresses all observations from the session into a summary
SessionStart → injects compressed summaries from past sessions, not raw observations

This is fundamentally different from file-based approaches: compaction is the normal operation, not an edge case.
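
The three hook stages can be sketched as a small pipeline. All names below are hypothetical, and a plain string join stands in for the AI compression call. The point is only the data flow: raw observations are collected per session, reduced to one summary at Stop, and only summaries from earlier sessions are injected at SessionStart.

```python
from collections import defaultdict

# All names here are hypothetical. The summarizer is a plain string join
# standing in for the AI compression call.
observations = defaultdict(list)   # session_id -> list of observation strings
summaries = {}                     # session_id -> compressed summary


def post_tool_use(session_id, tool, target):
    """One tool call becomes one observation."""
    observations[session_id].append(f"{tool} {target}")


def stop(session_id):
    """Compress the session's observations into a single summary."""
    summaries[session_id] = "; ".join(observations[session_id])


def session_start(session_id):
    """Inject summaries of earlier sessions, never raw observations."""
    return [s for sid, s in sorted(summaries.items()) if sid < session_id]
```

With this model the first session receives nothing at start, which matches the statement that injection begins on the second session in a project.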

Progressive Disclosure Protocol

Memory retrieval uses 3 layers to control token costs:

Compact index (IDs + titles): ~50-100 tokens per result
Timeline view (context around anchor): moderate tokens
Full details (for specific IDs): ~500-1,000 tokens per result

Cross-Session Handoffs

Yes — automatically at every session boundary via SessionStart/Stop hooks.

Private Content

Use <private> tags in conversation to exclude sensitive content from storage.

Context Injection Filtering

The SessionStart hook injects only the context that the worker's semantic search deems relevant to the current session or project, not all memory. This limits context pollution, although the global store can still surface material from unrelated projects.

Worker Service Role

The background worker on port 37777 handles the heavy lifting: AI compression calls, vector indexing, semantic search. Claude hooks only make HTTP calls to the worker, keeping hook latency minimal.
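
The benefit of this split is that a hook can hand work off and return at once. The sketch below models that with a thread and a queue in a single process. It is an analogy only: the real worker is a separate HTTP service, all names here are hypothetical, and str.upper stands in for slow AI compression.

```python
import queue
import threading

# Hypothetical in-process model: a thread and a queue stand in for the HTTP
# worker, and str.upper stands in for slow AI compression.
jobs = queue.Queue()
results = []


def worker():
    """Drain the queue until the shutdown sentinel arrives."""
    while True:
        item = jobs.get()
        if item is None:          # sentinel: shut down
            jobs.task_done()
            break
        results.append(item.upper())
        jobs.task_done()


def hook(observation: str) -> None:
    """Hand the observation off and return immediately."""
    jobs.put(observation)


def run_demo():
    t = threading.Thread(target=worker, daemon=True)
    t.start()
    hook("edit auth.py")
    hook("bash npm test")
    jobs.put(None)
    jobs.join()                   # wait only so the demo can inspect results
    return list(results)
```

The hook function never waits on the slow step. Only run_demo blocks, and only so the results can be inspected.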

Orchestration

Multi-Agent

No explicit multi-agent orchestration. OpenClaw integration enables team-shared memory but is not agent-to-agent coordination.

Orchestration Pattern

none — Pure memory infrastructure. Skills provide task assistance but don't spawn subagents.

Isolation

none — In-place global memory. No per-feature isolation.

Multi-Model

Yes (implicit). The worker service calls a configurable AI provider for compression (Claude by default, but also OpenRouter, Gemini). This is not role-based multi-model routing but provider flexibility.

Execution Mode

background-daemon — The worker service runs persistently on port 37777. Claude hooks communicate with it via HTTP. This is the distinctive architecture: it's the only memory system in this batch that runs a persistent background service.

Crash Recovery

The worker service can be restarted via SessionStart hook. SQLite provides transactional safety.

Notes

The background daemon architecture is the key differentiator: unlike hook-only systems (ccmemory, ccmemory-plain) that run scripts synchronously per event, claude-mem's worker service can perform long-running tasks (AI compression, vector indexing) asynchronously without blocking the agent.

UI & CLI Surface

Dedicated CLI Binary

npx claude-mem — the installer/manager CLI. Subcommands:

install — sets up plugin, hooks, worker service
install --ide gemini-cli — Gemini CLI mode
install --ide opencode — OpenCode mode
uninstall — clean removal
switch <version> — version switching (beta channel)

Note: npm install -g claude-mem gives the SDK only, NOT the installer.

Local Web Dashboard

Yes — served by worker on http://localhost:37777

Feature	Details
Type	Web dashboard (browser-based)
Port	37777
Tech	HTML/JS (viewer.html + viewer-bundle.js bundled in plugin/ui/)
Features	Real-time memory stream, observation browsing, observation detail view
Citation API	`http://localhost:37777/api/observation/{id}`

IDE Integration

Plugins for:

Claude Code (.claude-plugin/)
Codex (.codex-plugin/)
Cursor (cursor-hooks/)
Warp (WARP.md)
Gemini CLI (auto-detected via ~/.gemini)
OpenClaw gateway (dedicated installer script)

Observability

Web viewer at port 37777 for real-time memory
All observations browsable with citation IDs
Session summaries accessible
how-it-works skill provides user-facing explanation

Notes on Installation

Installation is multi-step by design: the worker service requires the Bun runtime and a free port 37777. The npx claude-mem install route is intentional, because npm install -g claude-mem installs only the SDK and skips hook registration and service setup.

Related frameworks

same archetype · same primary tool · same memory type

pi (badlogic/earendil) ★ 55k

A8 Cross-runtime harness

A minimal, hackable, multi-provider terminal coding agent that adapts to your workflows via npm-installable TypeScript Extensions…

Agent Skills (Addy Osmani) ★ 46k

A8 Cross-runtime harness

Encodes senior-engineer software development lifecycle as 23 auto-routed skills and 7 slash commands for any AI coding agent.

wshobson/agents Plugin Marketplace ★ 36k

A8 Cross-runtime harness

Single Markdown source for 83 domain-specialized plugins that auto-generates idiomatic artifacts for five AI coding harnesses.

TabbyML/Tabby ★ 34k

A8 Cross-runtime harness

Self-hosted AI coding assistant server (alternative to GitHub Copilot) with admin dashboard, RAG-based completions, and multi-IDE…

Compound Engineering ★ 17k

A8 Cross-runtime harness

Make each unit of engineering work compound into easier future work via brainstorm→plan→execute→review→learn cycles.

Qodo (PR-Agent) ★ 11k

A8 Cross-runtime harness

Open-source AI PR reviewer with single-call tool architecture, PR compression for large diffs, self-reflection quality gate, and…

Distribution

Type: npm-package
License: Apache-2.0
Install: multi-step
Version: 6.5.0

Surfaces

CLI binary: claude-mem
CLI subcmds: 5
Local UI: web-dashboard
UI port: 37777
Tech stack: HTML/JS bundled (viewer.html + viewer-bundle.js)

Components

Commands: 0
Skills: 15
Subagents: 0
Hooks: 6
MCP servers: 1
MCP tools: 4
Scripts: 2
Templates: 0

Workflow

Phases: 5
Approval gates: 0
Spec format: none
Spec storage: db
Delta or full: none

Orchestration

Multi-agent: No
Pattern: none
Max concurrent: 1
Isolation: none
Consensus: none
Prompt chaining: No

Multi-model

Multi-model: Yes
BYOK: Yes
Modal: text

Execution

Mode: background-daemon
Crash recovery: Yes
Compaction: Yes
Session handoff: Yes
Streaming: No

Memory

Type: hybrid
Persistence: global
Search: hybrid
State files: 2 files

Quality

TDD: No
TDD mechanism: none
Self-review: none

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: Yes
Audit format: sqlite
Replay: No

Tools

Primary: claude-code
Targets: 7
Portability: high

Signals

Stars: 78k
Last commit: 2026-05-26
Maintainer: active
Quality score: 4/10