AI-AfterImage — Memory and Context
Storage Architecture
Primary: SQLite at ~/.afterimage/afterimage.db. KnowledgeBase class (kb.py) wraps the backend with an abstraction layer.
Optional: PostgreSQL + pgvector at a configured host. IVFFlat index for native vector search. Scales to millions of entries with concurrent write access.
Backend abstraction: Both backends implement the same interface — SQLiteBackend and SyncPostgreSQLBackend. The hook caches the backend instance at module level to reuse connection pools across invocations.
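The shared-interface and module-level caching idea can be illustrated with a minimal sketch. The Backend protocol, its method names, InMemoryBackend, and get_backend are hypothetical stand-ins for illustration, not the actual SQLiteBackend or SyncPostgreSQLBackend API.

```python
from typing import Optional, Protocol


class Backend(Protocol):
    """Hypothetical shared interface; the real method names may differ."""

    def store(self, file_path: str, new_code: str) -> None: ...

    def count(self) -> int: ...


class InMemoryBackend:
    """Stand-in for SQLiteBackend / SyncPostgreSQLBackend in this sketch."""

    def __init__(self) -> None:
        self._rows: list[tuple[str, str]] = []

    def store(self, file_path: str, new_code: str) -> None:
        self._rows.append((file_path, new_code))

    def count(self) -> int:
        return len(self._rows)


# Module-level cache: the backend is created once and reused by every
# later call made within the same process.
_backend: Optional[Backend] = None


def get_backend() -> Backend:
    global _backend
    if _backend is None:
        _backend = InMemoryBackend()
    return _backend
```

Because callers only depend on the protocol, swapping the concrete class inside get_backend would not change any calling code.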
Knowledge Base Schema
Each stored entry contains:
file_path: path to the code file
new_code: code content (after the Write/Edit)
old_code: previous content (for Edit operations)
context: empty string (not used currently)
session_id: Claude Code session ID (from CLAUDE_SESSION_ID env var)
embedding: vector embedding (optional, if EmbeddingGenerator is available)
timestamp: when stored
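As a rough illustration, the fields above could map onto a SQLite table like the one below. The table name code_memory, the column types, and the helper functions open_kb and store_entry are assumptions made for this sketch; the actual schema in kb.py may differ.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical DDL mirroring the documented fields.
SCHEMA = """
CREATE TABLE IF NOT EXISTS code_memory (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    file_path  TEXT NOT NULL,
    new_code   TEXT NOT NULL,
    old_code   TEXT,
    context    TEXT NOT NULL DEFAULT '',
    session_id TEXT,
    embedding  BLOB,
    timestamp  TEXT NOT NULL
)
"""


def open_kb(path=":memory:"):
    """Open (or create) the store and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def store_entry(conn, file_path, new_code, old_code=None,
                session_id=None, embedding=None):
    """Insert one entry; context is left at its empty-string default."""
    ts = datetime.now(timezone.utc).isoformat()
    cur = conn.execute(
        "INSERT INTO code_memory "
        "(file_path, new_code, old_code, session_id, embedding, timestamp) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (file_path, new_code, old_code, session_id, embedding, ts),
    )
    conn.commit()
    return cur.lastrowid
```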
Embedding Model
~90 MB local model, downloaded by afterimage setup. Generates embeddings for hybrid search. Works offline after download. Embedding dimension: 384.
Hybrid Search
search.py — keyword + semantic search:
- BM25-style keyword match on code content
- Cosine similarity on stored embeddings (if available)
- Combined score for ranking
PostgreSQL backend uses native pgvector IVFFlat index; SQLite uses brute-force cosine similarity.
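A minimal sketch of the hybrid ranking idea, using brute-force cosine similarity as on the SQLite path. The tokenizer, the BM25 parameters (k1, b), the max-normalization of keyword scores, and the 50/50 weighting are all assumptions for illustration; search.py may combine scores differently.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split code into lowercase identifier-like tokens."""
    return re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each document for the query."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n if n else 0.0
    df = Counter(term for toks in tokenized for term in set(toks))
    query_terms = set(tokenize(query))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a, b):
    """Brute-force cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def hybrid_rank(query, query_vec, docs, doc_vecs, keyword_weight=0.5):
    """Return document indices ordered by combined score, best first."""
    kw = bm25_scores(query, docs)
    top = max(kw, default=0.0)
    scored = []
    for i, vec in enumerate(doc_vecs):
        kw_norm = kw[i] / top if top > 0 else 0.0
        # Entries without an embedding contribute no semantic score.
        sem = cosine(query_vec, vec) if vec is not None and query_vec is not None else 0.0
        scored.append((keyword_weight * kw_norm + (1 - keyword_weight) * sem, i))
    return [i for _, i in sorted(scored, key=lambda s: (-s[0], s[1]))]
```

The brute-force loop is linear in the number of stored entries, which is the trade-off the SQLite backend accepts in exchange for having no index to maintain.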
Churn Tracking Storage
afterimage/churn/ — separate tracking store. Records per-function edits with timestamps and session IDs. Used to compute tier classification (Gold/Silver/Bronze/Red) and detect repetitive function edits within 24h windows.
Function detection: AST-based for Python (tree-sitter/ast module); regex-based for JavaScript, TypeScript, Go, Rust, and C.
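The Python path of this tracking can be sketched with the standard ast module. The function names, the (function_name, timestamp) record shape, and the threshold of 3 edits are hypothetical; tier classification is not modeled here because its rules are not described above.

```python
import ast
from datetime import timedelta


def functions_touching(source, changed_lines):
    """Names of Python functions whose line span overlaps the changed lines."""
    tree = ast.parse(source)
    hits = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if any(node.lineno <= ln <= node.end_lineno for ln in changed_lines):
                hits.append(node.name)
    return hits


def repetitive_edits(edits, now, window=timedelta(hours=24), threshold=3):
    """Functions edited at least `threshold` times inside the window.

    edits: iterable of (function_name, timestamp) pairs.
    threshold: hypothetical cutoff chosen for this sketch.
    """
    counts = {}
    for name, ts in edits:
        if now - ts <= window:
            counts[name] = counts.get(name, 0) + 1
    return {name: n for name, n in counts.items() if n >= threshold}
```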
Seen-Writes State
~/.afterimage/.seen_writes is a plain text file holding one content hash per line, capped at the last 100 entries. It implements the deny-once-allow-retry pattern without persistent database overhead. Entries have no TTL: they stay in the file until it is explicitly cleared or they roll out of the 100-entry window.
Three keying modes:
file: resets denial per session (same file → re-denies next session)
content: keyed on file + content hash (each unique change shown once)
session_file: keyed on session + file (denied once per session per file)
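A minimal sketch of the deny-once-allow-retry pattern over a rolling text file. The key derivation per mode (what gets hashed, and the use of SHA-256) and the function names are assumptions for illustration, and how file mode resets between sessions is not modeled here.

```python
import hashlib
from pathlib import Path

MAX_ENTRIES = 100  # rolling window size described above


def make_key(mode, file_path, content="", session_id=""):
    """Derive the hash for one write; the hashed fields are an assumption."""
    if mode == "file":
        raw = file_path
    elif mode == "content":
        raw = f"{file_path}\x00{content}"
    elif mode == "session_file":
        raw = f"{session_id}\x00{file_path}"
    else:
        raise ValueError(f"unknown keying mode: {mode}")
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def should_deny(state_file: Path, key: str) -> bool:
    """Deny the first time a key is seen, allow the retry.

    The key is recorded on denial, so an identical retry passes.
    Only the newest MAX_ENTRIES keys are kept in the file.
    """
    seen = state_file.read_text().splitlines() if state_file.exists() else []
    if key in seen:
        return False
    seen.append(key)
    state_file.write_text("\n".join(seen[-MAX_ENTRIES:]) + "\n")
    return True
```

One consequence of the rolling window is that a key evicted after 100 newer entries is treated as unseen again, so the same write would be denied once more.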
Context Injection Scope
Injected context shows at most 3 past files, with 400-character code previews (truncated with "... (truncated)" if longer). Deduplication by file path (same file shown once). Semantic chunking mode can group multiple results from the same file.
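The limits above can be sketched as a small selection function. The function name and the (file_path, code) input shape are hypothetical, and the semantic chunking mode is not modeled.

```python
MAX_FILES = 3
PREVIEW_CHARS = 400
TRUNCATION_SUFFIX = "... (truncated)"


def build_context(results):
    """Pick previews to inject.

    results: list of (file_path, code) pairs, already sorted best-first.
    Returns at most MAX_FILES (file_path, preview) pairs, one per file.
    """
    seen = set()
    blocks = []
    for file_path, code in results:
        if file_path in seen:
            continue  # deduplicate: keep only the best hit per file
        seen.add(file_path)
        if len(code) > PREVIEW_CHARS:
            preview = code[:PREVIEW_CHARS] + TRUNCATION_SUFFIX
        else:
            preview = code
        blocks.append((file_path, preview))
        if len(blocks) == MAX_FILES:
            break
    return blocks
```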
Persistence Scope
Global (cross-project, cross-session): the SQLite KB at ~/.afterimage/afterimage.db is a single global store. All code written across all projects and sessions accumulates in one place. Search runs across the entire history unless filtered by --path.
This is a deliberate choice: the goal is cross-project pattern recall ("how did I solve this last week in the other codebase?"), not project-isolated memory.
Codex Transcript Extraction
extract.py handles Codex-specific transcript formats:
session_meta JSON blocks for session ID extraction
response_item envelope format with function_call + custom_tool_call
apply_patch unified diff format for file additions and updates
This extends the memory to cover both Claude Code and Codex CLI transcript histories.
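As an illustration of the apply_patch handling, the sketch below pulls added and updated files out of a patch body. It assumes the patch uses "*** Add File: " and "*** Update File: " section markers with "+" and "-" prefixed lines; the function name and the returned dictionary shape are hypothetical and do not reflect the actual extract.py code.

```python
ADD_MARKER = "*** Add File: "
UPDATE_MARKER = "*** Update File: "


def parse_apply_patch(patch_text):
    """Collect per-file added/removed lines from an apply_patch body."""
    files = []
    current = None
    for line in patch_text.splitlines():
        if line.startswith(ADD_MARKER):
            current = {"op": "add", "path": line[len(ADD_MARKER):].strip(),
                       "added": [], "removed": []}
            files.append(current)
        elif line.startswith(UPDATE_MARKER):
            current = {"op": "update", "path": line[len(UPDATE_MARKER):].strip(),
                       "added": [], "removed": []}
            files.append(current)
        elif line.startswith("*** "):
            # Any other marker (begin/end of patch, etc.) closes the
            # current file section in this simplified sketch.
            current = None
        elif current is not None:
            if line.startswith("+"):
                current["added"].append(line[1:])
            elif line.startswith("-"):
                current["removed"].append(line[1:])
            # Context lines and "@@" hunk headers are skipped.
    return files
```

For an added file, the joined "added" lines would correspond to new_code; for an update, "removed" and "added" give a rough old/new pair for the touched hunks only, not the full file contents.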