codesight

codesight · Houseofmvps/codesight · ★ 1.1k · last commit 2026-05-18

Primitive shape 13 total

MCP tools 13

Summary

codesight — Summary

codesight is a universal AI context generator that compiles a codebase into a compact, token-efficient summary file (CODESIGHT.md) using AST-precision analysis and pattern detection across 30+ frameworks and 14 languages. Its primary value proposition is reducing per-conversation token spend by 7–130x by replacing raw file exploration with pre-compiled, structured maps of routes, schemas, components, middleware, and dependency graphs. The tool runs entirely locally with zero dependencies and zero API calls, and completes in ~200ms via a single npx codesight invocation.

Beyond the base context map, codesight v1.6.2+ adds a wiki mode that generates a persistent .codesight/wiki/ knowledge base organized by domain (auth, database, payments, UI), so AI sessions can start from a 200-token index rather than the full 5K-token context map. An MCP server mode exposes 13 tools directly to Claude Code, Cursor, and any MCP-compatible client, and a --init flag generates platform-specific config files (CLAUDE.md, .cursorrules, codex.md, AGENTS.md) automatically.

Unlike seed frameworks that focus on agent behavior rules or orchestration, codesight is a static analysis tool that occupies the "input preparation" layer before any AI session begins. It is most similar to agent-os in its "write files that agents read" philosophy, but it extracts content with AST precision rather than relying on manually written templates.

Overview

codesight — Overview

Origin

Created by Kailesk Khumar (founder of HouseofMVPs and Kailxlabs), codesight emerged from a practical observation: AI assistants waste thousands of tokens every conversation figuring out project structure. The description on the npm page is direct: "Universal AI context generator. Saves thousands of tokens per conversation in Claude Code, Cursor, Copilot, Codex, and more."

Philosophy

The core philosophy is context engineering as static analysis. Rather than instructing agents how to behave, codesight pre-processes the codebase into the minimum viable representation needed for AI comprehension. TypeScript projects get full AST precision; other languages use battle-tested regex detection. The wiki mode extends this with Karpathy's LLM wiki pattern — "compiled from AST, not an LLM. Zero API calls. 200ms."

Stated Goals

From the README:

"Your AI assistant wastes thousands of tokens every conversation just figuring out your project. codesight fixes that in one command."

"Instead of loading the full 5K token context map every conversation, your AI reads one targeted article."

"The key difference from general-purpose wiki tools: codesight already knows your routes, schema, blast radius, and middleware from AST — no LLM needed to extract code structure."

Scope

The tool is explicitly a pre-session context packager, not a workflow framework. It has no opinions about how the agent should write code, plan features, or create PRs. Its boundary ends at handing off a .codesight/CODESIGHT.md (or wiki article) to the agent at session start.

Token Savings Claims

Measured on real OSS projects (v1.6.4):

SaaS A: 46,020 tokens → 550 tokens with wiki (83.7x reduction)
SaaS B: 26,130 tokens → 440 tokens (59.4x)
Average combined reduction: 91x
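
The per-project factors above are the raw token count divided by the compiled token count. The sketch below recomputes them from the two listed rows; the `Benchmark` shape and the function name are illustrative, not part of codesight. The 91x average cannot be derived from these two rows alone, so it presumably covers measurements not listed here.

```typescript
// Token counts copied from the two benchmark rows above.
interface Benchmark {
  name: string;
  rawTokens: number;      // tokens spent exploring raw files
  compiledTokens: number; // tokens spent reading the compiled wiki context
}

const benchmarks: Benchmark[] = [
  { name: "SaaS A", rawTokens: 46020, compiledTokens: 550 },
  { name: "SaaS B", rawTokens: 26130, compiledTokens: 440 },
];

// Reduction factor = raw / compiled, rounded to one decimal place.
function reductionFactor(b: Benchmark): number {
  return Math.round((b.rawTokens / b.compiledTokens) * 10) / 10;
}

for (const b of benchmarks) {
  console.log(`${b.name}: ${reductionFactor(b)}x`);
}
// SaaS A: 83.7x
// SaaS B: 59.4x
```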

Related Projects by Same Author

ultraship: 39 expert skills for Claude Code
claude-rank: SEO/GEO/AEO plugin for Claude Code

Architecture

codesight — Architecture

Distribution

Type: npm package (npx / global install)
Binary: codesight (mapped from dist/index.js in package.json bin)
Version analyzed: 1.14.0
Runtime required: Node.js >= 18
Dependencies: Zero runtime dependencies (0 deps in package.json)
Language: TypeScript (compiled to dist/)

Install Methods

# One-shot, no install
npx codesight

# Global install
npm install -g codesight

# MCP server mode
npx codesight --mcp

Directory Tree

codesight/
├── src/
│   ├── index.ts              # CLI entry point
│   ├── core.ts               # Main scan orchestration
│   ├── scanner.ts            # File collection + .codesightignore
│   ├── config.ts             # Config loading
│   ├── formatter.ts          # CODESIGHT.md output
│   ├── mcp-server.ts         # MCP JSON-RPC server (13 tools)
│   ├── telemetry.ts          # Anonymous usage metrics
│   ├── types.ts              # ScanResult types
│   ├── ast/                  # TypeScript AST analysis
│   ├── detectors/            # Language/framework detectors
│   │   ├── routes.ts
│   │   ├── schema.ts
│   │   ├── components.ts
│   │   ├── libs.ts
│   │   ├── config.ts
│   │   ├── middleware.ts
│   │   ├── graph.ts          # Dependency graph
│   │   ├── contracts.ts      # Route contract enrichment
│   │   ├── tokens.ts         # Token count estimation
│   │   ├── blast-radius.ts   # Import graph analysis
│   │   ├── graphql.ts
│   │   └── events.ts
│   ├── generators/
│   │   ├── wiki.ts           # Wiki knowledge base generation
│   │   ├── ai-config.ts      # CLAUDE.md / .cursorrules / AGENTS.md
│   │   └── html-report.ts    # Interactive HTML report
│   ├── plugins/              # Plugin system
│   └── monorepo/             # Monorepo support
├── .codesight/               # Output artifacts
│   ├── CODESIGHT.md          # Primary context map
│   ├── wiki/                 # Wiki knowledge base (--wiki)
│   │   ├── index.md
│   │   ├── overview.md
│   │   ├── auth.md
│   │   ├── database.md
│   │   └── ...
│   ├── config.md
│   ├── routes.md
│   └── ...
├── tests/
├── eval/                     # Benchmark data
└── package.json

Target AI Tools

Claude Code (primary)
Cursor
GitHub Copilot
OpenAI Codex
Windsurf
Cline
Aider
Any MCP-compatible client

Language/Framework Coverage

14 programming languages (TypeScript at AST level; others at regex level)
30+ framework detectors
13 ORM parsers

Output Artifacts

File	Format	Purpose
`.codesight/CODESIGHT.md`	Markdown	Full context map
`.codesight/wiki/index.md`	Markdown	Domain article catalog
`.codesight/wiki/*.md`	Markdown	Per-domain articles
`.codesight/KNOWLEDGE.md`	Markdown	Decision/notes/ADR map
`CLAUDE.md`	Markdown	Claude-specific config
`.cursorrules`	Plain text	Cursor-specific config
`AGENTS.md`	Markdown	Multi-tool agent config

Components

codesight — Components

CLI Flags (Subcommands)

Flag	Purpose
`npx codesight` (default)	Scan project, generate CODESIGHT.md
`--wiki`	Generate wiki knowledge base (.codesight/wiki/)
`--init`	Generate CLAUDE.md, .cursorrules, codex.md, AGENTS.md
`--open`	Open interactive HTML report in browser
`--mcp`	Start as MCP server (13 tools) for Claude Code / Cursor
`--blast <file>`	Show blast radius for a specific file
`--profile <tool>`	Generate optimized config for a specific AI tool
`--benchmark`	Show detailed token savings breakdown
`--mode knowledge`	Map .md notes/ADRs to KNOWLEDGE.md
`--mode knowledge <dir>`	Map a specific vault (Obsidian, etc.)
`--watch`	Watch mode — regenerate on file changes

MCP Server Tools (13 total)

Tool	Purpose
`codesight_scan`	Full codebase scan, returns structured ScanResult
`codesight_get_routes`	Extract HTTP/GraphQL/gRPC/WebSocket routes
`codesight_get_schemas`	Extract database schemas and ORM models
`codesight_get_components`	Extract UI components with props
`codesight_get_libs`	Extract library/dependency usage
`codesight_get_middleware`	Extract middleware/interceptors
`codesight_get_graph`	Dependency graph edges
`codesight_get_blast_radius`	Impact analysis for a file
`codesight_get_tokens`	Token count for context map sections
`codesight_get_wiki_index`	Get wiki article catalog (~200 tokens)
`codesight_get_wiki_article`	Read one wiki article by name
`codesight_lint_wiki`	Check for orphan articles, stale content
`codesight_get_knowledge`	Get KNOWLEDGE.md content

Detectors

Detector	Purpose
routes.ts	HTTP routes (Express, Fastify, Next.js, etc.)
schema.ts	DB schemas (Prisma, Drizzle, SQLAlchemy, etc.)
components.ts	UI components (React, Vue, Svelte)
libs.ts	Library detection
config.ts	Framework config files
middleware.ts	Auth/rate-limit/logging middleware
graph.ts	Import dependency graph
contracts.ts	Route input/output types from AST
blast-radius.ts	Reverse dependency traversal
graphql.ts	GraphQL/gRPC/WebSocket routes
events.ts	Event emitters/handlers
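
The blast-radius detector is described above as a reverse dependency traversal. A minimal sketch of that idea follows, assuming the import graph is available as a plain map from each file to the files it imports. The `ImportGraph` type, the function names, and the sample paths are hypothetical and are not taken from codesight's source.

```typescript
// Import graph: each file maps to the files it imports (hypothetical sample shape).
type ImportGraph = Record<string, string[]>;

// Invert the graph so each file maps to the files that import it.
function invert(graph: ImportGraph): Map<string, string[]> {
  const importers = new Map<string, string[]>();
  for (const [file, imports] of Object.entries(graph)) {
    for (const dep of imports) {
      const list = importers.get(dep) ?? [];
      list.push(file);
      importers.set(dep, list);
    }
  }
  return importers;
}

// Breadth-first walk over reverse edges: every file that transitively imports `target`.
function blastRadius(graph: ImportGraph, target: string): string[] {
  const importers = invert(graph);
  const seen = new Set<string>([target]);
  const queue: string[] = [target];
  const affected: string[] = [];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const parent of importers.get(current) ?? []) {
      if (!seen.has(parent)) {
        seen.add(parent);
        affected.push(parent);
        queue.push(parent);
      }
    }
  }
  return affected;
}

const sample: ImportGraph = {
  "src/lib/db.ts": [],
  "src/lib/users.ts": ["src/lib/db.ts"],
  "src/app/api/users/route.ts": ["src/lib/users.ts"],
  "src/lib/jwt.ts": [],
};

console.log(blastRadius(sample, "src/lib/db.ts"));
// [ 'src/lib/users.ts', 'src/app/api/users/route.ts' ]
```

The `seen` set keeps the walk finite when the import graph contains cycles.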

Generators

Generator	Output
wiki.ts	.codesight/wiki/*.md domain articles
ai-config.ts	CLAUDE.md, .cursorrules, AGENTS.md, codex.md
html-report.ts	Interactive browser report

Config

.codesight.json (optional) — project-level overrides for ignore patterns, max depth, custom detectors.

Prompts

codesight — Prompts

codesight does not ship AI prompt files — it ships generated context files (CODESIGHT.md, wiki articles, KNOWLEDGE.md) that are read by AI assistants as passive context. The "prompting technique" is therefore structural: pre-compilation of codebase metadata into a token-minimal format.

Excerpt 1 — Sample CODESIGHT.md Output Structure (from eval directory)

The generated CODESIGHT.md follows a deterministic schema:

# Project: my-saas-app

**Stack:** Next.js 14 · TypeScript · PostgreSQL (Prisma) · Tailwind

## Routes (23)

### Auth (4)
| Method | Path | Auth | Input | Output |
|--------|------|------|-------|--------|
| POST | /api/auth/login | none | { email, password } | { token, user } |
| POST | /api/auth/register | none | { email, password, name } | { user } |
| GET  | /api/auth/me | bearer | — | { user } |
| POST | /api/auth/logout | bearer | — | { ok } |

## Schemas (3)

### User
| Field | Type | Nullable | Relations |
|-------|------|----------|-----------|
| id | String (uuid) | false | posts, sessions |
| email | String | false | — |
| createdAt | DateTime | false | — |

Technique: Tabular compression. Each route or schema is compressed from ~150 tokens (raw file read) to ~15 tokens (one table row), a 7–10x reduction at the type-information level.
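
As a rough illustration of tabular compression, the sketch below renders route records into the same Markdown table layout as the excerpt. The `RouteInfo` shape and `routesToTable` are assumptions made for this example; codesight's actual ScanResult types and formatter may look different.

```typescript
// Hypothetical record shape; codesight's real ScanResult types may differ.
interface RouteInfo {
  method: string;
  path: string;
  auth: string;
  input: string;
  output: string;
}

// Render routes as a Markdown table: one header, one separator, one row per route.
function routesToTable(routes: RouteInfo[]): string {
  const header = "| Method | Path | Auth | Input | Output |";
  const separator = "|--------|------|------|-------|--------|";
  const rows = routes.map(
    (r) => `| ${r.method} | ${r.path} | ${r.auth} | ${r.input} | ${r.output} |`
  );
  return [header, separator, ...rows].join("\n");
}

const table = routesToTable([
  { method: "POST", path: "/api/auth/login", auth: "none", input: "{ email, password }", output: "{ token, user }" },
  { method: "GET", path: "/api/auth/me", auth: "bearer", input: "-", output: "{ user }" },
]);
console.log(table);
```

Each route costs one short line regardless of how long its handler file is, which is where the saving comes from.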

Excerpt 2 — Wiki Article Format (auth.md)

# Auth — my-saas-app

## Routes
- POST /api/auth/login — no auth required — returns token + user
- POST /api/auth/register — no auth required
- GET /api/auth/me — requires bearer token

## Middleware
- `withAuth` (src/middleware/auth.ts) — validates JWT, attaches req.user

## Session
- JWT stored in httpOnly cookie
- 7-day expiry, refresh via /api/auth/refresh

## Key Files
- src/middleware/auth.ts — auth enforcement
- src/lib/jwt.ts — token creation/validation
- prisma/schema.prisma (User model)

Technique: Narrative + selective inclusion. Domain article includes only auth-relevant routes, middleware, and files — not the full codebase map. At ~300 tokens vs. ~5K for CODESIGHT.md, this is targeted retrieval.

Excerpt 3 — AI Config Init Output (CLAUDE.md generated content)

# Project Context

**Stack**: Next.js 14, TypeScript, PostgreSQL (Prisma), Tailwind CSS
**Test runner**: Vitest
**Package manager**: pnpm

## Key paths
- Routes: src/app/api/, src/pages/api/
- Components: src/components/
- DB schema: prisma/schema.prisma
- Auth: src/middleware/auth.ts, src/lib/jwt.ts

## Context Files
- `.codesight/CODESIGHT.md` — full codebase map (3,936 tokens)
- `.codesight/wiki/index.md` — wiki catalog (200 tokens)

## Usage
Start each session by reading `.codesight/wiki/index.md`, then load the
relevant domain article. Only read CODESIGHT.md when you need full context.

Technique: Structured onboarding. Tells the agent where context files live and how to use them efficiently — minimizing "figuring out" cost at session start.

Uniqueness

codesight — Uniqueness

Differences from Seed Frameworks

codesight is most similar to agent-os in that both produce files that agents read passively, but they differ fundamentally in what those files contain. agent-os ships curated markdown templates (standards/, profiles/) that encode developer philosophy and workflow conventions; codesight generates machine-extracted structural maps (routes, schemas, dependency graphs) from AST analysis of live source code. codesight also resembles a pre-session counterpart to taskmaster-ai's analyze-complexity, since both reduce the token cost of "figuring out the codebase", but codesight is entirely static (zero LLM calls) while taskmaster-ai uses Claude to analyze its own tasks. Unlike superpowers, BMAD-METHOD, or spec-driver, codesight ships no behavioral skills, workflow instructions, or enforcement mechanisms; it has no opinion about how the agent should code, only about what it should know about the codebase before coding.

Positioning

Primary differentiator: AST-precision extraction. TypeScript projects get full type information, route contracts, and import graphs from actual AST traversal — not grep patterns or regex.
Secondary differentiator: Three-tier context budget management (full map → wiki → targeted article), making codesight useful even in severely token-constrained sessions.
Target user: Developers who have already chosen their AI tool and framework and want to reduce per-session token spend without changing their workflow.

Observable Failure Modes

Non-TypeScript projects get degraded accuracy: Only TypeScript benefits from AST precision; Python, Go, Ruby, etc. use regex detection which may miss routes or schemas in unconventional structures.
Wiki staleness: If neither --wiki nor --watch is run after code changes, the wiki articles become stale. log.md records when the wiki was last regenerated, but there is no auto-invalidation.
MCP tool cold start: When started as an MCP server, codesight re-scans on the first call, so large monorepos may see noticeable first-call latency.
Framework detector miss: 30+ detectors cover popular frameworks, but bespoke or niche frameworks will produce sparse context maps.
Zero behavioral guardrails: codesight generates config files but has no enforcement mechanism — if the agent ignores CLAUDE.md, there is no fallback.

Workflow

codesight — Workflow

Execution Pattern

codesight is a one-shot pre-session tool — not a continuous agent loop. It runs before an AI session begins, producing artifacts the agent reads passively.

Primary Workflow

Phase 1: Project Scan (one-time or on-change)

npx codesight

Collects files respecting .codesightignore
Detects project language/framework
Runs all detectors in parallel (routes, schemas, components, libs, config, middleware, graph, GraphQL, gRPC, WebSocket, events)
Enriches route contracts from AST
Computes token statistics
Writes .codesight/CODESIGHT.md

Artifact produced: .codesight/CODESIGHT.md (~3K–5K tokens)

Phase 2: Wiki Generation (optional, persistent)

npx codesight --wiki

Organizes scan results into domain-specific articles
Writes .codesight/wiki/index.md + per-domain articles
Append-only log.md tracks every wiki operation

Artifact produced: .codesight/wiki/*.md (~200-token index + ~300-token articles)

Phase 3: Config Generation (one-time setup)

npx codesight --init

Generates CLAUDE.md, .cursorrules, codex.md, AGENTS.md
Tailored output per AI tool

Artifact produced: Multiple AI tool config files

Phase 4: AI Session

Agent reads .codesight/CODESIGHT.md or wiki articles at session start. No ongoing codesight involvement.

Phase-to-Artifact Map

Phase	Artifact
Scan	`.codesight/CODESIGHT.md`
Wiki	`.codesight/wiki/index.md`, `.codesight/wiki/*.md`
Knowledge	`.codesight/KNOWLEDGE.md`
Init	`CLAUDE.md`, `.cursorrules`, `AGENTS.md`, `codex.md`
HTML	Browser report (ephemeral)

Approval Gates

None — fully automated, no user confirmation steps.

CI Integration

npx codesight --mode knowledge  # in CI alongside codesight scan

Both artifacts stay current on every push.

Incremental / Watch

npx codesight --watch  # regenerate on file changes (debounced)
npx codesight --hook   # regenerate on every git commit

Memory Context

codesight — Memory & Context

State Storage

codesight uses file-based project-scoped storage in the .codesight/ directory.

Persistent Artifacts

File	Format	Persistence	Updated
`.codesight/CODESIGHT.md`	Markdown	Project-level	On every `npx codesight` run
`.codesight/wiki/*.md`	Markdown	Project-level	On `--wiki` run
`.codesight/wiki/log.md`	Append-only log	Project-level	On every wiki operation
`.codesight/KNOWLEDGE.md`	Markdown	Project-level	On `--mode knowledge` run
`.codesight/config.md`	Markdown	Project-level	On scan

No Session State

codesight maintains no session-level state — it does not track what an AI assistant did or read. Each scan produces a fresh output file.

Context Compaction Handling

The wiki mode was explicitly designed for context compaction scenarios:

wiki/index.md (~200 tokens) is the session-start entry point
Individual articles (~300 tokens each) are loaded on demand
The full CODESIGHT.md (~3K–5K tokens) is a fallback when targeted lookup fails

This three-tier approach means that even when a Claude Code session compacts its context, the agent can re-establish orientation by re-reading wiki/index.md rather than the full context map.
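
The three-tier budget can be pictured as a small loading policy. The sketch below is a hypothetical agent-side helper, not something codesight ships: it always tries the index first, adds one domain article if a domain is known, and falls back to the full map only when targeted lookup has failed and the budget allows it. Token costs are the approximate figures quoted above.

```typescript
// Approximate per-tier token costs, taken from the figures quoted above.
const INDEX_TOKENS = 200;
const ARTICLE_TOKENS = 300;
const FULL_MAP_TOKENS = 5000;

interface ContextPlan {
  files: string[];
  estimatedTokens: number;
}

// Hypothetical loading policy: add each tier only if it still fits in the budget.
function planContext(budgetTokens: number, domain?: string, lookupFailed = false): ContextPlan {
  const files: string[] = [];
  let estimatedTokens = 0;
  const tryAdd = (file: string, cost: number): void => {
    if (estimatedTokens + cost <= budgetTokens) {
      files.push(file);
      estimatedTokens += cost;
    }
  };
  tryAdd(".codesight/wiki/index.md", INDEX_TOKENS);                                  // tier 1: entry point
  if (domain !== undefined) tryAdd(`.codesight/wiki/${domain}.md`, ARTICLE_TOKENS);  // tier 2: one article
  if (lookupFailed) tryAdd(".codesight/CODESIGHT.md", FULL_MAP_TOKENS);              // tier 3: fallback
  return { files, estimatedTokens };
}

console.log(planContext(1000, "auth"));       // index + auth article, 500 tokens
console.log(planContext(8000, "auth", true)); // all three tiers, 5500 tokens
```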

Cross-Session Handoff

Wiki files are meant to be committed to git, which enables cross-machine and cross-session context sharing. Any new session (on any machine) can then start with codebase knowledge from the wiki files without re-scanning.

Cache Behavior

Results are cached in memory during a single scan run
No cross-run disk cache — every npx codesight re-scans from scratch
--watch mode uses debounced file-change detection to minimize re-scans

.codesightignore

Users can create .codesightignore (gitignore-style) to exclude paths from scanning.

Orchestration

codesight — Orchestration

Multi-Agent Pattern

codesight is a single-shot tool with no orchestration layer. It does not spawn agents, coordinate workers, or sequence AI tasks. It runs once, writes files, and exits.

Orchestration Pattern

none — codesight produces context artifacts consumed by external agents; it does not orchestrate agents itself.

Execution Mode

one-shot — invoked per scan, produces output, terminates. Exception: --watch mode runs as a continuous file watcher (not a continuous agent loop).

Internal Parallelism

The scan itself uses Promise.all across all detectors (routes, schemas, components, libs, config, middleware, graph, graphql, grpc, websocket, events) — parallel static analysis, not multi-agent orchestration.
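
A minimal sketch of that fan-out pattern is shown below. The `Detector` signature and the three toy detectors are assumptions for illustration only; the real detectors in src/detectors/ do far more than filter file names.

```typescript
// Hypothetical detector signature: takes the collected file list, resolves to findings.
type Detector = (files: string[]) => Promise<string[]>;

// Toy detectors standing in for the real ones.
const detectors: Record<string, Detector> = {
  routes: async (files) => files.filter((f) => f.includes("/api/")),
  schemas: async (files) => files.filter((f) => f.endsWith(".prisma")),
  components: async (files) => files.filter((f) => f.endsWith(".tsx")),
};

// Start every detector first, then wait for all of them together.
async function runDetectors(files: string[]): Promise<Record<string, string[]>> {
  const pairs = await Promise.all(
    Object.entries(detectors).map(
      async ([name, detect]) => [name, await detect(files)] as const
    )
  );
  const results: Record<string, string[]> = {};
  for (const [name, found] of pairs) {
    results[name] = found;
  }
  return results;
}

const sampleFiles = [
  "src/app/api/users/route.ts",
  "prisma/schema.prisma",
  "src/components/Button.tsx",
];
runDetectors(sampleFiles).then((result) => console.log(result));
```

Promise.all rejects as soon as any detector rejects, so a scanner that should survive a single failing detector would need Promise.allSettled or per-detector error handling instead.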

Isolation Mechanism

none — codesight reads the filesystem and writes to .codesight/. No sandboxing.

Multi-Model Usage

None — codesight makes zero LLM API calls. All analysis is static (AST + regex).

Subagent Definition Format

None.

Consensus Mechanism

None.

Prompt Chaining

No — codesight output is passive context, not a chained prompt input.

MCP Mode as Integration Point

When running as an MCP server (--mcp), codesight exposes 13 tools that AI agents can call during a session. This is the closest to "orchestration" codesight gets — a tool server that agents query rather than a standalone pipeline.
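
For orientation, this is roughly what a client sends when it calls one of those tools. MCP uses JSON-RPC 2.0, and tool invocations go through the `tools/call` method with a tool name and an arguments object. The tool names come from the MCP Server Tools table; the `article` argument key is an assumption, since the input schema of each tool is not documented here.

```typescript
// JSON-RPC 2.0 request envelope for MCP's tools/call method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown> = {}
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Tool names are from the table; the "article" argument key is hypothetical.
const requests: ToolCallRequest[] = [
  buildToolCall(1, "codesight_get_wiki_index"),
  buildToolCall(2, "codesight_get_wiki_article", { article: "auth" }),
];

// Over the stdio transport, each message is written as one line of JSON.
for (const request of requests) {
  console.log(JSON.stringify(request));
}
```

A real client performs the MCP initialize handshake before sending any tools/call request; MCP-compatible clients such as Claude Code or Cursor handle that automatically.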

Cross-Tool Portability

High — codesight generates standard markdown files (CODESIGHT.md, wiki/*.md, KNOWLEDGE.md) readable by any AI tool that accepts file context. The MCP server further extends compatibility to any MCP-protocol client.

UI CLI Surface

codesight — UI & CLI Surface

CLI Binary

Binary name: codesight
Is thin wrapper: No — it is the runtime (TypeScript compiled to Node.js)
Subcommands: Flag-based (not subcommand-based). Key flags: --wiki, --init, --open, --mcp, --blast, --profile, --benchmark, --mode knowledge, --watch

npx codesight                          # Default: scan + generate CODESIGHT.md
npx codesight --wiki                   # Wiki mode
npx codesight --init                   # Generate AI tool configs
npx codesight --open                   # Open HTML report in browser
npx codesight --mcp                    # Start MCP server
npx codesight --blast src/lib/db.ts    # Blast radius analysis
npx codesight --profile claude-code    # Profile-specific config
npx codesight --benchmark              # Token savings breakdown
npx codesight --mode knowledge         # Knowledge map from .md files

Local UI

Exists: Yes — interactive HTML report
Type: Browser-rendered HTML (not a persistent server)
Access: npx codesight --open opens the HTML in the default browser
Features: Codebase visualization, token savings stats, framework detection results

MCP Server

Exists: Yes
Type: stdio MCP server
Tool count: 13 tools
Start command: npx codesight --mcp

IDE Integration

None native — the --init flag generates AI tool config files (CLAUDE.md, .cursorrules, AGENTS.md, codex.md) that configure the tools, but codesight itself has no IDE plugin.

Observability

--benchmark flag prints detailed token savings breakdown
Token stats included in every scan output (output tokens, estimated exploration tokens saved)
No audit log of agent actions (codesight doesn't observe the agent)

GitHub Actions

Not shipped as an action — designed for manual or CI invocation:

npx codesight
npx codesight --mode knowledge

Related frameworks

same archetype · same primary tool · same memory type

Taskmaster AI ★ 27k

A3 MCP-anchored

Converts a PRD into a dependency-ordered JSON task graph that AI coding agents execute one task at a time, eliminating context…

ccmemory ★ 1

A3 MCP-anchored

Accumulates decisions, corrections, and failed approaches from Claude Code sessions into a queryable Neo4j graph so each new…

Pimzino spec-workflow-mcp ★ 4.2k

A3 MCP-anchored

MCP server providing spec-driven development workflow with dashboard-backed approval gates, implementation logging, and VSCode…

MCP Shrimp Task Manager ★ 2.1k

A3 MCP-anchored

Convert natural language requests into structured AI development tasks with chain-of-thought enforcement, reflection gates, and…

Bernstein ★ 460

A3 MCP-anchored

Govern parallel CLI coding agents with a deterministic Python scheduler, HMAC-chained audit trail, and compliance-ready signed…

LeanSpec ★ 252

A3 MCP-anchored

Provides a unified spec CLI and MCP server over any existing spec backend (markdown, GitHub Issues, ADO), making spec-driven…

Distribution

Type: npm-package
License: MIT
Install: one-liner
Version: 1.14.0

Surfaces

CLI binary: codesight
CLI subcmds: 0
Local UI: other
Tech stack: HTML report (static, browser-rendered)

Components

Commands: 0
Skills: 0
Subagents: 0
Hooks: 0
MCP servers: 1
MCP tools: 13
Scripts: 0
Templates: 0

Workflow

Phases: 5
Approval gates: 0
Spec format: none
Spec storage: none
Delta or full: none

Orchestration

Multi-agent: No
Pattern: none
Max concurrent: 0
Isolation: none
Consensus: none
Prompt chaining: No

Multi-model

Multi-model: No
BYOK: No
Modal: text

Execution

Mode: one-shot
Crash recovery: No
Compaction: Yes
Session handoff: Yes
Streaming: No

Memory

Type: file-based
Persistence: project
Search: none
State files: 4 files

Quality

TDD: No
TDD mechanism: none
Self-review: none

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: Yes
Audit format: structured-md
Replay: No

Tools

Primary: claude-code
Targets: 7
Portability: high

Signals

Stars: 1.1k
Last commit: 2026-05-18
Contributors: 14
Maintainer: active
Quality score: 2.2/10
