Batch 12 — Memory + Context Engineering (Compaction, Knowledge Graphs, Pruning)

Roster (10)

slug	stars	distribution	cli_binary	local_ui	memory_type	compression	tier
basic-memory	3087	mcp-server	`bm`/`basic-memory` (15 cmds)	yes (React, cloud)	sqlite+hybrid	none	A
claude-self-reflect	214	claude-plugin	`csr-engine`	no	sqlite+vector	none (quality focus)	A
claude-supermemory	2584	claude-plugin	none	no	cloud-API (proprietary)	none	A
cognilayer	28	standalone-repo	`cognilayer` (TUI)	no	sqlite	session compaction	A
symdex	190	mcp-server	`symdex` (10 cmds)	no	sqlite	none (retrieval only)	A
kratos-mcp	34	npm-package	none	no	sqlite (FTS only)	none	C (LEGACY)
iwe	1086	cli-tool	`iwe`+`iwes`+`iwec` (18 cmds)	no	file-based (Markdown)	none	A
lean-ctx	2186	mcp-server	`lean-ctx` (20 cmds)	yes (browser, :9377)	hybrid+file-based	60-99% (heuristic)	A
entroly	398	claude-plugin	`entroly` (40+ cmds)	yes (browser, :9377)	hybrid vault/	70-95% (knapsack DP+BM25)	A
swe-pruner	282	standalone-repo	`swe-pruner`	no	none	23-54% (neural 0.6B model)	B

Intra-batch Patterns

1. Divergent Architectures for the Same Problem

All 10 frameworks aim to improve LLM effectiveness via context engineering, but they operate at completely different layers:

Storage layer (basic-memory, iwe, kratos-mcp): Build a knowledge graph or document store; agent retrieves what it needs.
Inference preprocessing (swe-pruner, lean-ctx, entroly): Filter or compress context before the LLM call.
Proxy layer (entroly): Intercept the actual HTTP API call; invisible to the agent.
Session quality (claude-self-reflect, cognilayer): Improve quality over time via reflection/RL, not raw compression.
Cloud delegation (claude-supermemory): Offload memory entirely to a hosted API.
Structural indexing (symdex): Index code structure (AST, imports) for precision retrieval.
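
To make the structural-indexing layer concrete, here is a minimal sketch using Python's standard `ast` module. It is not symdex's implementation; the function name `index_source`, the `SAMPLE` module, and the shape of the returned dictionary are assumptions made for illustration only.

```python
import ast

def index_source(source: str) -> dict:
    """Collect imports, functions, and classes (at any nesting depth) from Python source."""
    tree = ast.parse(source)
    index = {"imports": [], "functions": [], "classes": []}
    for node in ast.walk(tree):  # breadth-first walk over every AST node
        if isinstance(node, ast.Import):
            index["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for relative imports such as "from . import x"
            index["imports"].append(node.module or ".")
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            index["functions"].append((node.name, node.lineno))
        elif isinstance(node, ast.ClassDef):
            index["classes"].append((node.name, node.lineno))
    return index

# Hypothetical sample module used only to exercise the indexer.
SAMPLE = '''import os
from pathlib import Path

class Loader:
    def load(self, path):
        return Path(path).read_text()

def main():
    print(os.getcwd())
'''

if __name__ == "__main__":
    print(index_source(SAMPLE))
```

An agent holding such an index can request a single symbol by name and line number instead of reading whole files, which is the precision-retrieval idea described for this layer.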

2. Three Distinct Compression Philosophies

Philosophy	Representatives	Mechanism
Heuristic selection	lean-ctx	BM25 + entropy scoring + token budget
Mathematical optimization	entroly	0/1 knapsack DP on entropy scores
Neural classification	swe-pruner	Fine-tuned 0.6B model per-chunk relevance
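
The difference between heuristic selection and mathematical optimization can be sketched as follows. This is an illustrative comparison, not the code of lean-ctx or entroly: the `Chunk` tuple shape, the relevance scores, and both function names are hypothetical, and the scores stand in for whatever BM25 or entropy scoring a real tool would compute.

```python
from typing import List, Tuple

# Hypothetical chunk shape: (chunk_id, token_count, relevance_score); token_count must be > 0.
Chunk = Tuple[str, int, float]

def greedy_select(chunks: List[Chunk], budget: int) -> List[str]:
    """Heuristic: take chunks in order of score per token while they still fit the budget."""
    chosen, used = [], 0
    for cid, tokens, _score in sorted(chunks, key=lambda c: c[2] / c[1], reverse=True):
        if used + tokens <= budget:
            chosen.append(cid)
            used += tokens
    return chosen

def knapsack_select(chunks: List[Chunk], budget: int) -> List[str]:
    """Exact 0/1 knapsack DP: maximise total score with total tokens <= budget."""
    n = len(chunks)
    # best[i][b] = best total score using only the first i chunks within b tokens
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i, (_cid, tokens, score) in enumerate(chunks, start=1):
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]  # option 1: skip chunk i
            if tokens <= b:              # option 2: include chunk i if it fits
                best[i][b] = max(best[i][b], best[i - 1][b - tokens] + score)
    # Backtrack to recover which chunks produced the optimum.
    chosen, b = [], budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            cid, tokens, _score = chunks[i - 1]
            chosen.append(cid)
            b -= tokens
    return chosen[::-1]

if __name__ == "__main__":
    demo = [("a", 6, 7.0), ("b", 5, 5.0), ("c", 5, 5.0)]
    print(greedy_select(demo, 10))    # greedy commits to the densest chunk first
    print(knapsack_select(demo, 10))  # DP considers every feasible combination
```

With a 10-token budget, the greedy pass commits to chunk `a` (highest score per token) and can fit nothing else, while the DP finds that `b` and `c` together score higher. The DP costs O(n × budget) time and memory, which is one plausible reason to implement the hot path in a compiled language.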

3. Token Reduction Claims Spread

swe-pruner: 23-54% (SWE-Bench Verified, peer-reviewed paper)
entroly: 70-95% (self-measured, verify-claims)
lean-ctx: 60-99% (benchmark suite, reproducible via benchmarks.md)
The highest claims (up to 99%) come from the frameworks with the least external validation; swe-pruner reports the lowest range but has the most rigorous evaluation methodology (arXiv paper, SWE-Bench Verified).

4. Cross-Tool Portability Gradient

lean-ctx and swe-pruner have the highest cross_tool_portability (HTTP interfaces, no Claude lock-in). entroly is nominally multi-model via proxy but ships as a Claude plugin first. basic-memory is explicitly positioned as IDE-agnostic but has a freemium wall for team features. iwe is the only framework built as a writing/notes tool that gained agent capabilities rather than the reverse.

Most Interesting Finds

entroly's HTTP proxy-level interception: The only framework in the batch (and likely in the entire corpus) that operates at the API proxy layer. By routing calls through :9377, entroly compresses context and attaches WITNESS hallucination-detection certificates to every response, creating an audit trail (at a stated $0 cost and 2 ms overhead) without the agent issuing any special calls. The combination of knapsack DP (Rust), a PRISM RL loop, and WITNESS NLI checking (AUROC 0.80 on HaluEval-QA) in a single package is architecturally novel. The Python+Rust hybrid (PyO3), with hot paths in native Rust, is the most sophisticated runtime in the batch.
lean-ctx's LEAN-CTX.md constraint: The most aggressive tool-override rule found in the batch ("CRITICAL: NEVER use native Claude Code Read/Grep/Shell tools") forces the agent to route all file access through lean-ctx's 62 MCP tools, ensuring every context fragment passes through the compression layer. This is the clearest example in the batch of a framework using its prompt file as an architectural enforcement mechanism.
swe-pruner's neural vs. heuristic gap: The only framework with a peer-reviewed paper and SWE-Bench numbers. The 14.84x compression claim on LongCodeQA is the highest single-benchmark ratio in the batch, but the CUDA requirement and absent license make it research-prototype-only in practice.
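
The batch mixes ratio-style claims (14.84x) with percentage-style claims (23-54%, 60-99%), so it helps to put them on one scale. The helper names below are hypothetical; the arithmetic is just reduction = 1 − 1/ratio. Converting the numbers does not make claims from different benchmarks comparable, it only aligns the units.

```python
def reduction_percent(ratio: float) -> float:
    """Convert a compression ratio (e.g. 14.84 for 14.84x) to percent of tokens removed."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return 100.0 * (1.0 - 1.0 / ratio)

def ratio_from_reduction(percent: float) -> float:
    """Convert percent of tokens removed to a compression ratio."""
    if not 0 <= percent < 100:
        raise ValueError("percent must be in [0, 100)")
    return 100.0 / (100.0 - percent)

if __name__ == "__main__":
    print(round(reduction_percent(14.84), 2))  # LongCodeQA ratio as a percentage
    for pct in (23, 54, 99):
        print(pct, round(ratio_from_reduction(pct), 2))
```

On this scale, 14.84x corresponds to roughly 93.3% of tokens removed, a 54% reduction to about 2.2x, and a 99% reduction to 100x.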

Items Written Below Tier A

kratos-mcp — README explicitly states "This repo is now legacy. Please use ceorkm/kratos-cli." Last meaningful commit 2024. No MCP registration, no active maintainer. Written as full 11-file report (code still works) but marked tier: C (LEGACY).
swe-pruner — Tier B: has a paper and reproducible benchmarks, but no license file, no version tags, CUDA-only requirement, and no native Claude integration. Demoted from A.

Cross-References Discovered

lean-ctx and entroly both serve browser dashboards on port :9377, so running both on the same machine causes a port conflict unless one of them is moved to a different port.
symdex could feed swe-pruner: symdex retrieves structurally-relevant code chunks; swe-pruner prunes those chunks by neural relevance. They are naturally complementary preprocessing stages.
kratos-mcp's successor is ceorkm/kratos-cli (not in this batch; flagged for a future batch if evaluated).
claude-supermemory is built on the Supermemory.ai API (mem0.ai-adjacent space); if Supermemory.ai discontinues the service, the plugin stops working. It is a pure cloud dependency, unlike every other framework in the batch.
basic-memory's AGPL-3.0 license means any system that bundles it must also be released under the AGPL, a licensing risk unique in the batch; all others are MIT, Apache-2.0, or unlicensed.
cognilayer v4.3.0 claims 21 multi-agent orchestration tools (more than any other framework here) and 5 lifecycle hooks including PreCompact — the most complete lifecycle coverage in the batch alongside ccmemory (seed).
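
The :9377 overlap between lean-ctx and entroly noted in the first cross-reference can be checked before launching either tool. The sketch below uses only Python's standard `socket` module; the function name `port_in_use` is hypothetical, and the check assumes the dashboards listen on the local loopback interface.

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something already accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        # connect_ex returns 0 on success and an errno value otherwise
        return s.connect_ex((host, port)) == 0

if __name__ == "__main__":
    if port_in_use(9377):
        print("Port 9377 is already taken; start the second tool on another port.")
    else:
        print("Port 9377 is free.")
```

Whether either tool exposes a setting to change its port is not covered in this batch, so that should be confirmed in each project's documentation.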