Skip to content
/

memsearch CC Plugin

memsearch-cc · zilliztech/memsearch · ★ 1.8k · last commit 2026-05-21

Primitive shape 5 total
Skills 1 Hooks 4
00

Summary

memsearch CC Plugin — Summary

memsearch (zilliztech/memsearch) is a cross-platform semantic memory system for AI coding agents, built by Zilliz (the company behind Milvus vector database). The Claude Code plugin (plugins/claude-code/) is a lightweight wrapper around the memsearch Python CLI: 4 hooks (SessionStart, UserPromptSubmit, Stop, SessionEnd) that automatically capture conversation turns, a single memory-recall skill (context: fork), and shell scripts that call memsearch CLI commands. Memories are stored as daily .md files in .memsearch/memory/ using Milvus as the vector index (ONNX bge-m3 int8 embeddings by default — no API key required). The system uses 3-layer progressive disclosure: search → expand → transcript. One key design choice: the plugin has NO MCP server — it relies entirely on Claude Code's native hooks + skills + shell scripts.

Differs from seeds: closest to ccmemory (persistent memory via Claude Code plugin, graph + vector search) but memsearch uses Markdown as the source of truth (not Neo4j), Milvus as the shadow vector index (not the primary store), and ONNX local embeddings (not cloud API). Unlike ccmemory's MCP-anchored architecture, memsearch-cc is hooks-and-skills-only with no MCP server.

01

Overview

memsearch CC Plugin — Origin and Philosophy

Origin

Organization: zilliztech (Zilliz, the Milvus vector DB company). 1,841 stars. Apache-2.0 license? (README says MIT badge but org is commercial). Created 2026, active. 14 contributors.

Philosophy: Markdown is the Source of Truth

From the Claude Code plugin README:

"Markdown is the source of truth — inspired by OpenClaw. Your memories are just .md files — human-readable, editable, version-controllable. Milvus is a 'shadow index': a derived, rebuildable cache."

"Progressive retrieval, hybrid search, smart dedup, live sync — 3-layer recall (search → expand → transcript); dense vector + BM25 sparse + RRF reranking; SHA-256 content hashing skips unchanged content; file watcher auto-indexes in real time."

Key Design Decisions

  1. No API key required: default embedding is ONNX bge-m3 int8 — runs locally with no GPU. Quality comparable to OpenAI text-embedding-3-small (~1% lower on benchmark).

  2. Milvus as shadow index: the Markdown files are the durable store; Milvus is a rebuildable cache derived from them. If Milvus data is lost, memsearch index --force regenerates from .md files.

  3. Plugin has NO MCP server: unlike most memory frameworks that add MCP servers, memsearch-cc is hooks + skills + shell scripts only. This avoids sidecar service complexity.

  4. Cross-platform sharing: memories captured in Claude Code are queryable from Codex, OpenClaw, OpenCode via the same memsearch CLI/server. One memory store, multiple harnesses.

  5. Auto-recall via skill autonomy: memory-recall skill uses context: fork — runs in a forked subagent context so recall doesn't pollute the main agent's context window.

02

Architecture

memsearch CC Plugin — Architecture

Distribution

# Claude Code marketplace
/plugin marketplace add zilliztech/memsearch
/plugin install memsearch
# Restart Claude Code

Or standalone Python:

uv tool install memsearch
# OR
pip install memsearch

Required Runtime

  • Python 3.10+
  • ChromaDB (default vector backend) or Milvus
  • ONNX runtime (for local embeddings — auto-downloaded ~300MB)
  • memsearch CLI binary

Directory Structure

plugins/claude-code/
├── .claude-plugin/
│   └── plugin.json          # memsearch version 0.4.4
├── hooks/
│   ├── hooks.json           # SessionStart, UserPromptSubmit, Stop, SessionEnd
│   ├── common.sh
│   ├── session-start.sh
│   ├── user-prompt-submit.sh
│   ├── stop.sh
│   ├── session-end.sh
│   └── parse-transcript.sh
├── skills/
│   └── memory-recall/       # Single skill: memory-recall (context: fork)
├── prompts/
├── scripts/
└── transcript.py

Data Flow

Claude Code conversation turn
      ↓ UserPromptSubmit hook
shell script → memsearch CLI → chunking → ONNX embeddings → Milvus index
                                                           → .memsearch/memory/YYYY-MM-DD.md

Query time:
memory-recall skill (forked subagent)
      ↓ memsearch search <query>
3-layer recall:
  1. search: dense vector + BM25 sparse + RRF rerank
  2. expand: related chunks via graph traversal
  3. transcript: original session transcript lookup
      ↓
Context injected into main session

Memory File Location

.memsearch/memory/          # Daily .md files with summaries
.memsearch/memory/YYYY-MM-DD.md

Verify working: ls .memsearch/memory/

Plugin Architecture Summary

memsearch Python library (core: chunker, embeddings, vector store)
      ↓
memsearch CLI (search, index, watch, expand, transcript, config)
      ↓
plugins/claude-code (hooks + skill — wraps CLI in shell scripts)
03

Components

memsearch CC Plugin — Components

Hooks (plugins/claude-code/hooks/hooks.json)

Event Handler Purpose
SessionStart session-start.sh Load memory context from previous sessions; inject relevant past work
UserPromptSubmit user-prompt-submit.sh Capture user prompt turn; index in real time
Stop stop.sh End-of-session: mine conversation transcript; update memory files; async (timeout 120s)
SessionEnd session-end.sh Cleanup session artifacts; async (timeout 10s)

Skills

Skill Format Purpose
memory-recall skill-md (context: fork) Trigger semantic memory search; runs in forked subagent to avoid polluting main context

Two trigger modes:

  1. Explicit: /memory-recall what did we discuss about Redis?
  2. Autonomous: Claude auto-invokes when it senses the question needs history

Shell Scripts

Script Purpose
common.sh Shared variables and helper functions
session-start.sh Load and inject memory context at session start
user-prompt-submit.sh Real-time capture of conversation turns
stop.sh Mine transcript; update daily memory file
session-end.sh Session cleanup
parse-transcript.sh Parse Claude Code JSONL transcript format

memsearch CLI Commands (Python library)

Command Purpose
memsearch init <path> Initialize palace
memsearch mine <path> Mine content into memory
memsearch mine <path> --mode convos Mine Claude Code sessions
memsearch search "<query>" Semantic search
memsearch index Rebuild vector index
memsearch watch <dir> Auto-index on file changes
memsearch config set <key> <value> Configure settings
memsearch sweep <dir> Store per-message drawers from transcripts

Plugin Marketplace Entry

{
  "name": "memsearch",
  "description": "Automatic semantic memory for Claude Code — remembers what you worked on across sessions",
  "version": "0.4.4",
  "source": "./plugins/claude-code",
  "category": "productivity"
}
05

Prompts

memsearch CC Plugin — Prompts

The memsearch Claude Code plugin has minimal prompt content — the skill and hook shell scripts are functional code, not prompt files.

Prompt 1: memory-recall Skill

From the Claude Code plugin skills directory (context: fork). The skill triggers a semantic search and injects results:

Technique: Forked subagent context (context: fork) so recall doesn't pollute the main context window. The skill invokes memsearch search <query> and formats results for injection.

The skill description from plugin docs:

"Auto-invokes the skill when it senses the question needs history: 'We discussed Redis caching before, what was the TTL we chose?' → agent recognizes past-tense reference → auto-invokes memory-recall."

Prompt 2: Session Start Context Injection (session-start.sh)

Technique: Hook fires at session start; shell script calls memsearch search with session context; injects top results as context block before the first user message.

The effect (from README illustration):

Without plugin:
Wednesday: "The /orders endpoint is slow"
→ Claude suggests solutions from scratch (forgot Monday's Redis cache)

With plugin:
Wednesday: "The /orders endpoint is slow"
→ Plugin injects: "Added Redis caching middleware with 5min TTL..."
→ Claude: "We already have Redis caching — let me add the /orders endpoint to it"

No SKILL.md prompt files directly readable from plugin directory

The skills/memory-recall/ directory contents were not fetched (skill-md format assumed based on plugin structure). The actual prompt content in the SKILL.md file was not captured.

09

Uniqueness

memsearch CC Plugin — Uniqueness and Positioning

Differs from Seeds

Closest to ccmemory (persistent memory Claude Code plugin with vector search) but differs in three fundamental ways:

  1. Markdown-first vs Neo4j-first: ccmemory uses Neo4j as the primary store. memsearch uses Markdown .md files as primary storage with Milvus/ChromaDB as a rebuildable shadow index. Loss of the vector index doesn't lose the memory — just memsearch index --force to rebuild.

  2. No MCP server in the plugin: ccmemory is Archetype 3 (MCP-anchored toolserver). memsearch-cc is hooks + skills + shell scripts only — no sidecar MCP server process required for Claude Code users.

  3. Zero external dependencies for default path: ccmemory requires Neo4j + cloud embeddings (or Ollama). memsearch-cc works with a local ONNX model download (~300MB) and ChromaDB (bundled). No cloud API key required.

Also compared to agentmemory (same batch): agentmemory is a Node.js server with 12 hooks and 53 MCP tools; memsearch-cc is a Python plugin with 4 hooks and 1 skill. memsearch-cc is architecturally simpler.

Positioning

memsearch targets developers who want cross-platform semantic memory without infrastructure overhead. The Zilliz backing means enterprise-grade Milvus support is available for users who want cloud-scale vector search, while the default path (ChromaDB + ONNX) needs nothing external. The "Markdown is the source of truth" philosophy appeals to developers who distrust black-box vector stores.

Observable Failure Modes

  1. ONNX model download: first-run ~300MB download can fail in offline/restricted environments.
  2. Milvus coupling: changing embedding models requires memsearch index --force re-indexing — expensive for large memory stores.
  3. Stop hook async timeout: 120s timeout for transcript mining may be insufficient for very long sessions with large transcripts.
  4. Cross-session memory quality: automatic summarization quality depends on the embedding model; irrelevant or verbose sessions can pollute the memory store.
  5. Plugin-only path has no MCP: Cursor/Windsurf users must use the MCP server path, not the plugin — two separate installation paths.
04

Workflow

memsearch CC Plugin — Workflow

Phases

Phase Trigger Artifact
1. Install /plugin install memsearch Plugin active in Claude Code
2. Session start SessionStart hook Past memory context injected into session
3. Conversation Normal Claude Code usage UserPromptSubmit hook captures each turn
4. Session end Stop hook (async) Transcript mined → .memsearch/memory/YYYY-MM-DD.md updated
5. Memory recall /memory-recall <query> or autonomous 3-layer search results injected

Auto-Save Behavior

The Stop hook fires after every assistant response and mines the conversation transcript. No explicit save action needed. The async: true + 120s timeout prevents blocking the agent while mining occurs.

3-Layer Progressive Recall

When memory-recall skill fires:

  1. Search: hybrid vector (dense bge-m3) + BM25 sparse + RRF reranking → top-K results
  2. Expand: graph traversal from top results to related chunks
  3. Transcript: original session transcript lookup for full context

Approval Gates

None. All hooks and skill invocations are automatic.

Backfill Workflow (for existing sessions)

memsearch mine ~/.claude/projects/ --mode convos
# Optionally scope per project:
memsearch mine ~/.claude/projects/<project>/ --mode convos --wing <project>
06

Memory Context

memsearch CC Plugin — Memory and Context

Storage Architecture

Two-layer storage:

  1. Markdown files (.memsearch/memory/YYYY-MM-DD.md) — primary durable store; human-readable; version-controllable
  2. Milvus/ChromaDB — shadow index; derived from Markdown; rebuildable via memsearch index --force

Embedding

  • Default: ONNX bge-m3 int8 (multilingual, 100+ languages)
  • Alternative: all-MiniLM-L6-v2 (English-only, ~30MB)
  • No API key required for default embedding
  • Can switch: memsearch config set embedding.provider onnx

Dense vector (bge-m3) + BM25 sparse retrieval + RRF (Reciprocal Rank Fusion) reranking. SHA-256 content hashing for deduplication — unchanged content is skipped on re-index.

Memory Persistence

  • Scope: global (per working directory; --wing <project> for project-scoped)
  • Retention: indefinite (daily Markdown files accumulate)
  • Format: Markdown summaries per day

Cross-Session Handoff

Complete. SessionStart hook loads relevant past context from ALL previous sessions. The memory is continuous — no session-to-session re-explanation needed.

Cross-Platform Memory

The same .memsearch/memory/ directory is accessible from:

  • Claude Code (via plugin hooks + skill)
  • Codex CLI (via plugin hooks)
  • OpenClaw (via plugin)
  • OpenCode (via plugin)
  • Cursor, Windsurf, Roo Code (via MCP server — memsearch also ships an MCP mode)
  • REST API (for Aider and other HTTP clients)

Backfill

Existing Claude Code JSONL transcripts can be backfilled:

memsearch mine ~/.claude/projects/ --mode convos
07

Orchestration

memsearch CC Plugin — Orchestration

No multi-agent orchestration. memsearch-cc is a memory layer for single-agent sessions.

  • Multi-agent: no
  • Orchestration pattern: none
  • Isolation: none (all agents share the same .memsearch/memory/ directory)
  • Execution mode: event-driven (hooks fire on Claude Code lifecycle events)
  • Consensus: none

The memory-recall skill uses context: fork — it spawns a forked subagent to run the search query, preventing the retrieval process from consuming main context window tokens. This is not multi-agent orchestration; it is a context isolation technique.

08

Ui Cli Surface

memsearch CC Plugin — UI and CLI Surface

memsearch CLI

Installed globally: uv tool install memsearch or pip install memsearch

Key commands: init, mine, search, index, watch, expand, transcript, config, sweep, wake-up

No Local Web Dashboard (Plugin-only)

The Claude Code plugin has no web dashboard.

memsearch Python Library

Full Python API available for programmatic access: import mempalace (note: memsearch is separate from mempalace — no relation despite similar API surface).

Plugin Marketplace

Installable via Claude Code's plugin marketplace:

/plugin marketplace add zilliztech/memsearch
/plugin install memsearch

Cross-Platform Support

Platform Integration
Claude Code hooks + skill + marketplace plugin
Codex CLI plugin (hooks + scripts)
OpenClaw plugin
OpenCode plugin
Cursor/Windsurf/Roo Code/Cline MCP server (memsearch serve-mcp)
Claude Desktop MCP server
Aider REST API

Embedding Model Download

First run: ~300MB ONNX model download (bge-m3 int8). No GPU required. No API key required.

Related frameworks

same archetype · same primary tool · same memory type

MemPalace ★ 53k

Verbatim local-first AI memory with 96.6% R@5 retrieval on LongMemEval using zero API calls — structured into a palace hierarchy…

Beads (Yegge) ★ 24k

Dolt-powered distributed graph issue tracker where AI agents track tasks with hierarchical IDs and dependency edges, claim work…

deepagents (LangChain) ★ 23k

Opinionated Python agent harness on top of LangGraph with sub-agents, filesystem, memory, and context compaction bundled in

agentmemory ★ 18k

Persistent, searchable memory for AI coding agents that captures every tool interaction, compresses it via LLM, and injects…

Open Multi-Agent ★ 6.3k

Give a natural-language goal to a coordinator agent and get a dynamically decomposed, parallelized task DAG executed by…

Basic Memory ★ 3.1k

Gives AI agents a persistent, human-readable knowledge graph of project decisions, observations, and relations stored as plain…