Batch 12 — Memory + Context Engineering (Compaction, Knowledge Graphs, Pruning)
Roster (10)
| slug | stars | distribution | cli_binary | local_ui | memory_type | compression | tier |
|---|---|---|---|---|---|---|---|
| basic-memory | 3087 | mcp-server | bm/basic-memory (15 cmds) | yes (React, cloud) | sqlite+hybrid | none | A |
| claude-self-reflect | 214 | claude-plugin | csr-engine | no | sqlite+vector | none (quality focus) | A |
| claude-supermemory | 2584 | claude-plugin | none | no | cloud-API (proprietary) | none | A |
| cognilayer | 28 | standalone-repo | cognilayer (TUI) | no | sqlite | session compaction | A |
| symdex | 190 | mcp-server | symdex (10 cmds) | no | sqlite | none (retrieval only) | A |
| kratos-mcp | 34 | npm-package | none | no | sqlite (FTS only) | none | C (LEGACY) |
| iwe | 1086 | cli-tool | iwe+iwes+iwec (18 cmds) | no | file-based (Markdown) | none | A |
| lean-ctx | 2186 | mcp-server | lean-ctx (20 cmds) | yes (browser, :9377) | hybrid+file-based | 60-99% (heuristic) | A |
| entroly | 398 | claude-plugin | entroly (40+ cmds) | yes (browser, :9377) | hybrid vault/ | 70-95% (knapsack DP+BM25) | A |
| swe-pruner | 282 | standalone-repo | swe-pruner | no | none | 23-54% (neural 0.6B model) | B |
Intra-batch Patterns
1. Divergent Architectures for the Same Problem
All 10 frameworks aim to improve LLM effectiveness via context engineering, but they operate at completely different layers:
- Storage layer (basic-memory, iwe, kratos-mcp): Build a knowledge graph or document store; agent retrieves what it needs.
- Inference preprocessing (swe-pruner, lean-ctx, entroly): Filter or compress context before the LLM call.
- Proxy layer (entroly): Intercept the actual HTTP API call; invisible to the agent.
- Session quality (claude-self-reflect, cognilayer): Improve quality over time via reflection/RL, not raw compression.
- Cloud delegation (claude-supermemory): Offload memory entirely to a hosted API.
- Structural indexing (symdex): Index code structure (AST, imports) for precision retrieval.
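To make the structural-indexing layer concrete, here is a minimal sketch of indexing imports and definitions from a source file with Python's standard `ast` module. It is an illustration only and does not reflect how symdex is implemented; `index_source` and `SAMPLE` are hypothetical names.

```python
import ast


def index_source(source: str) -> dict:
    """Collect imports, functions, and classes found anywhere in `source`."""
    tree = ast.parse(source)
    index = {"imports": [], "functions": [], "classes": []}
    # ast.walk visits every node (not in source order), including nested ones.
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            index["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Relative imports such as "from . import x" have module=None.
            index["imports"].append(node.module or ".")
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            index["functions"].append((node.name, node.lineno))
        elif isinstance(node, ast.ClassDef):
            index["classes"].append((node.name, node.lineno))
    return index


# Hypothetical sample input used only for illustration.
SAMPLE = '''import os
from pathlib import Path

class Store:
    def get(self, key):
        return key

def main():
    pass
'''

if __name__ == "__main__":
    print(index_source(SAMPLE))
```

An index like this lets a retriever answer "where is `Store.get` defined?" with a name and line number instead of returning whole files, which is the precision argument made for this layer.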
2. Three Distinct Compression Philosophies
| Philosophy | Representatives | Mechanism |
|---|---|---|
| Heuristic selection | lean-ctx | BM25 + entropy scoring + token budget |
| Mathematical optimization | entroly | 0/1 knapsack DP on entropy scores |
| Neural classification | swe-pruner | Fine-tuned 0.6B model per-chunk relevance |
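The difference between the first two philosophies can be sketched in a few lines. The example below assumes each context chunk has a token cost and a precomputed relevance score (the scoring itself, whether BM25 or entropy based, is out of scope). It is not the code of lean-ctx or entroly; `greedy_select`, `knapsack_select`, and `CHUNKS` are hypothetical names chosen for illustration.

```python
from typing import List, Tuple

# (name, token_cost, relevance_score); token_cost is assumed to be > 0.
Chunk = Tuple[str, int, float]


def greedy_select(chunks: List[Chunk], budget: int) -> List[str]:
    """Heuristic: take chunks by score-per-token while they still fit."""
    chosen, used = [], 0
    by_density = sorted(chunks, key=lambda c: c[2] / c[1], reverse=True)
    for name, cost, _score in by_density:
        if used + cost <= budget:
            chosen.append(name)
            used += cost
    return chosen


def knapsack_select(chunks: List[Chunk], budget: int) -> List[str]:
    """Exact 0/1 knapsack via dynamic programming over token budgets."""
    n = len(chunks)
    # best[i][b] = max total score using the first i chunks within b tokens.
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i, (_name, cost, score) in enumerate(chunks, start=1):
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]
            if cost <= b:
                best[i][b] = max(best[i][b], best[i - 1][b - cost] + score)
    # Walk back through the table to recover which chunks were taken.
    chosen, b = [], budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            name, cost, _score = chunks[i - 1]
            chosen.append(name)
            b -= cost
    return chosen[::-1]


# Hypothetical chunks: "a" has the best density but crowds out "b" and "c".
CHUNKS = [("a", 6, 7.0), ("b", 5, 5.0), ("c", 5, 5.0)]

if __name__ == "__main__":
    print(greedy_select(CHUNKS, 10))    # picks only "a" (total score 7.0)
    print(knapsack_select(CHUNKS, 10))  # picks "b" and "c" (total score 10.0)
```

With a 10-token budget the greedy pass stops after the densest chunk, while the DP finds the higher-scoring pair. The DP costs O(n x budget) time and memory, which is the usual trade-off against a heuristic pass.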
3. Token Reduction Claims Spread
- swe-pruner: 23-54% (SWE-Bench Verified, peer-reviewed paper)
- entroly: 70-95% (self-measured, verify-claims)
- lean-ctx: 60-99% (benchmark suite, reproducible via benchmarks.md)
- The highest claims (up to 99%) come from frameworks with the least external validation; swe-pruner has the most rigorous evaluation methodology (arXiv paper, SWE-Bench).
4. Cross-Tool Portability Gradient
lean-ctx and swe-pruner have the highest cross_tool_portability (HTTP interfaces, no Claude lock-in). entroly is nominally multi-model via proxy but ships as a Claude plugin first. basic-memory is explicitly positioned as IDE-agnostic but has a freemium wall for team features. iwe is the only framework built as a writing/notes tool that gained agent capabilities rather than the reverse.
Most Interesting Finds
entroly (HTTP proxy-level interception): The only framework in the batch (and likely in the entire corpus) that operates at the API proxy layer. By routing calls through :9377, entroly compresses context and attaches WITNESS hallucination-detection certificates to every response, creating an audit trail at $0/2ms without the agent issuing any special calls. The combination of knapsack DP (Rust), PRISM RL loop, and WITNESS NLI checking (AUROC 0.80 on HaluEval-QA) in a single package is architecturally novel. The Python+Rust hybrid (PyO3) with hot paths in native Rust is the most sophisticated runtime in the batch.
lean-ctx's LEAN-CTX.md constraint: The most aggressive tool-override rule found in the batch, "CRITICAL: NEVER use native Claude Code Read/Grep/Shell tools", forces the agent to route all file access through lean-ctx's 62 MCP tools, ensuring every context fragment passes through the compression layer. This is the clearest example in the batch of a framework using its prompt file as an architectural enforcement mechanism.
swe-pruner (neural vs. heuristic gap): The only framework with a peer-reviewed paper and SWE-Bench numbers. The 14.84x compression claim on LongCodeQA is the highest single-benchmark ratio in the batch, but the CUDA requirement and absent license make it research-prototype-only in practice.
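The batch mixes two units: percentage token reduction (for example 23-54%) and compression ratio (14.84x). The sketch below converts between them, assuming the ratio means original tokens divided by compressed tokens; the source does not state which definition each project uses, and the function names are hypothetical.

```python
def reduction_from_ratio(ratio: float) -> float:
    """Fraction of tokens removed, given a compression ratio of `ratio`x."""
    if ratio < 1:
        raise ValueError("a compression ratio below 1x means expansion")
    return 1.0 - 1.0 / ratio


def ratio_from_reduction(reduction: float) -> float:
    """Compression ratio implied by removing `reduction` of the tokens."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


if __name__ == "__main__":
    # Under the stated assumption, 14.84x is about a 93.3% reduction.
    print(f"{reduction_from_ratio(14.84):.1%}")
    # A 54% reduction corresponds to roughly 2.17x.
    print(f"{ratio_from_reduction(0.54):.2f}x")
```

Under that assumption, the LongCodeQA figure sits in the same range as the 70-95% and 60-99% claims, while the SWE-Bench Verified range of 23-54% corresponds to roughly 1.3x to 2.2x. The benchmarks differ, so the numbers are not directly comparable.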
Items Written as Tier C
- kratos-mcp: README explicitly states "This repo is now legacy. Please use ceorkm/kratos-cli." Last meaningful commit 2024. No MCP registration, no active maintainer. Written as a full 11-file report (code still works) but marked tier: C (LEGACY).
- swe-pruner: Tier B. Has a paper and reproducible benchmarks, but no license file, no version tags, a CUDA-only requirement, and no native Claude integration. Demoted from A.
Cross-References Discovered
- lean-ctx and entroly both serve browser dashboards on port :9377, so the two will conflict over the port if both run simultaneously on the same machine.
- symdex could feed swe-pruner: symdex retrieves structurally relevant code chunks; swe-pruner prunes those chunks by neural relevance. They are naturally complementary preprocessing stages.
- kratos-mcp's successor is ceorkm/kratos-cli (not in this batch; flagged for a future batch if evaluated).
- claude-supermemory is built on the Supermemory.ai API (mem0.ai-adjacent space); if Supermemory.ai is discontinued, the plugin stops working. It is a pure cloud dependency, unlike every other framework in the batch.
- basic-memory's AGPL-3.0 license means a system that bundles it may itself have to be distributed under the AGPL, a licensing risk unique in the batch; all others are MIT, Apache-2.0, or unlicensed.
- cognilayer v4.3.0 claims 21 multi-agent orchestration tools (more than any other framework here) and 5 lifecycle hooks including PreCompact — the most complete lifecycle coverage in the batch alongside ccmemory (seed).
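The shared :9377 default noted in the first cross-reference can be detected before launching a second dashboard. The following is a minimal sketch using only the standard `socket` module; `port_is_free` is a hypothetical helper, and how either tool is configured to use a different port is not covered by the source.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            # bind() fails with OSError if another process holds the port.
            probe.bind((host, port))
        except OSError:
            return False
    return True


if __name__ == "__main__":
    # 9377 is the dashboard port reported for both lean-ctx and entroly.
    state = "free" if port_is_free(9377) else "already in use"
    print(f"port 9377 is {state}")
```

The check is inherently racy (another process can grab the port right after the probe closes), so it is best treated as a diagnostic rather than a guarantee.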