Haft

haft · m0n0x41d/haft · ★ 1.3k · last commit 2026-05-25

Engineering reasoning governor that enforces First Principles Framing — frame, compare under parity, decide with falsifiable contracts, detect stale evidence — before AI agents execute.

Best when: The handle between the tool and the hand. Governance belongs between intention and execution, not after the fact.

Skip if: skipping exploration before deciding, comparing options on different criteria (parity violation)

vs seeds

taskmaster-ai: manages execution tasks,…

Primitive shape 21 total

Commands: 13 · Skills: 1 · MCP tools: 7

Summary

Haft — Summary

Haft is a Go-compiled CLI and MCP server that enforces structured engineering reasoning before and during AI-agent code execution. It implements First Principles Framing (FPF) as an engineering discipline: agents must frame problems, compare options under parity, record decisions as falsifiable contracts, and track when evidence goes stale before committing to implementation.

The framework provides 7 MCP tools (haft_note, haft_problem, haft_solution, haft_decision, haft_commission, haft_refresh, haft_query), 13 slash commands (h-frame, h-char, h-explore, h-compare, h-decide, h-onboard, etc.), a spec-check CLI, and a WorkCommission lifecycle for bounding agent execution. It supports Claude Code and Codex as primary targets, with experimental support for Cursor, Gemini CLI, OpenCode, and JetBrains Air. Haft also ships a Bubbletea TUI and an alpha Tauri desktop app.

Compared to seeds: Haft is closest to taskmaster-ai, since both provide structured reasoning scaffolds and MCP tool interfaces. Haft's pre-execution governance layer (decision contracts with evidence decay, parity enforcement, and formal spec sections) has no analog in any seed framework.

Overview

Haft — Overview

Origin

Repo: m0n0x41d/haft (formerly quint-code). Created December 2025, Go, non-standard license. 1,333 stars, 104 forks. Author: m0n0x41d. Website: quint.codes. Languages: Go (primary), with Bubbletea TUI, Tauri desktop, SQLite.

Philosophy

From README:

"True harness engineering for AI-assisted software delivery."

"Haft is the engineering governor that sits between your intentions and your agents' execution. It enforces the discipline that separates 'we shipped fast' from 'we shipped right': frame the problem before solving it, compare options under parity, record decisions as falsifiable contracts, and know the moment assumptions go stale."

"The handle between the tool and the hand — the part that turns raw capability into formal specification, governed decisions, bounded commissions, and evidence-backed engineering work."

"Not a coding agent. Not a generic documentation tool."

Core workflow: Specify → Think → Run → Govern.

First Principles Framing (FPF)

FPF is Haft's proprietary reasoning discipline. Key concepts:

Problems must be framed with dimensions before solutions are explored
Solutions require diversity checks (no premature convergence)
Comparisons must be done under parity (same criteria for all options)
Decisions are recorded as falsifiable contracts with invariants, claims, evidence
Evidence decay: decision contracts automatically track staleness of their assumptions
WorkCommissions: bounded execution units with allowed actions and evidence requirements
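
The parity rule in the list above can be illustrated with a small check. This is a minimal sketch, not Haft's implementation: the `Option` class, the `parity_violations` function, and the sample scores are all hypothetical names and data chosen for illustration. It treats a comparison as "under parity" only when every option is scored on exactly the framed dimensions.

```python
from dataclasses import dataclass, field


@dataclass
class Option:
    """One candidate solution with a score per comparison dimension (hypothetical shape)."""
    name: str
    scores: dict[str, float] = field(default_factory=dict)


def parity_violations(dimensions: list[str], options: list[Option]) -> dict[str, list[str]]:
    """Return, per option, the dimensions that are missing or extra.

    An empty result means every option was scored on exactly the framed
    dimensions, which is what this sketch calls parity.
    """
    expected = set(dimensions)
    violations: dict[str, list[str]] = {}
    for option in options:
        mismatch = sorted(expected ^ set(option.scores))  # symmetric difference
        if mismatch:
            violations[option.name] = mismatch
    return violations


dimensions = ["security", "latency", "maintainability"]
options = [
    Option("session-cookies", {"security": 4, "latency": 5, "maintainability": 4}),
    Option("jwt", {"security": 3, "latency": 5}),  # maintainability never scored
]
print(parity_violations(dimensions, options))
# prints {'jwt': ['maintainability']}
```

A comparison that silently drops a dimension for one option is the kind of parity violation the framing step is meant to prevent.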

CLAUDE.md Philosophy

Haft's own CLAUDE.md reflects its communication style:

"Be a peer engineer, not a cheerleader: Skip validation theater ('you're absolutely right', 'excellent point')." "Talk like you're pairing with a staff engineer, not pitching to a VP." "No. | That's an interesting approach, however..." (USE vs AVOID table)

Architecture

Haft — Architecture

Distribution

Go binary (haft) — single compiled binary
Install: curl -fsSL https://raw.githubusercontent.com/m0n0x41d/haft/main/install.sh | bash
Binary installs to PATH as haft

Init Per Tool

haft init           # Claude Code (global commands in ~/.claude/commands/)
haft init --local   # Claude Code (local .claude/commands/)
haft init --codex   # Codex CLI + Codex App
haft init --all     # Claude Code + Codex
haft init --cursor  # Experimental Cursor
haft init --gemini  # Experimental Gemini CLI
haft init --opencode # Experimental OpenCode
haft init --air     # Experimental JetBrains Air

Directory Structure (per project after init)

.haft/                   # Project knowledge base (created by init)
  decisions/             # Decision contracts
  evidence/              # Evidence records
  knowledge/             # Domain knowledge
.mcp.json                # Claude Code MCP config (haft serve)
.codex/config.toml       # Codex MCP config

Global Installs (Claude Code)

~/.claude/commands/      # h-frame, h-char, h-explore, h-compare, h-decide,
                         # h-commission, h-note, h-onboard, h-problems,
                         # h-search, h-status, h-verify, h-view
~/.claude/skills/h-reason/  # Aggregated reasoning skill

Internal Architecture

Go monorepo with ~40 internal packages including:

artifact — FPF artifact (notes, problems, solutions, decisions)
workcommission — WorkCommission lifecycle
fpf — First Principles Framing engine
agentcore, agentdriver, agentloop — embedded agent runtime
mcp — MCP server implementation
tui — Bubbletea TUI
embedding — semantic search
graph — relationship graph
lsp — Language Server Protocol integration

Storage

SQLite database for artifact persistence.

Runtime Surfaces

Surface	Description
`haft serve`	MCP server mode (for Claude Code / Codex)
`haft` (bare)	Interactive agent mode (Bubbletea TUI)
`haft harness`	Operator harness CLI (prepare/run/status/result/apply/requeue/cancel)
`haft spec check`	Spec section validator (deterministic L0/L1/L1.5)
Desktop app	Alpha Tauri desktop app (`desktop-tauri/`)

Components

Haft — Components

CLI Binary: `haft`

Top-Level Subcommands

Subcommand	Purpose
`haft` (bare)	Launch interactive agent (Bubbletea TUI)
`haft init`	Initialize project, install commands/MCP config
`haft serve`	Start MCP server for embedded agents
`haft harness`	Operator/runtime harness lifecycle
`haft spec check`	Validate spec sections (deterministic L0/L1/L1.5)
`haft version`	Print version information

Harness Subcommands (haft harness)

From source: prepare, run, status, result, apply, requeue, cancel.

MCP Tools (7)

Tool	Purpose
`haft_note`	Micro-decisions with validation + auto-expiry
`haft_problem`	Frame problems, define comparison dimensions with roles
`haft_solution`	Explore variants with diversity check, compare with parity
`haft_decision`	Decision contract with invariants, claims, evidence, baseline lifecycle
`haft_commission`	WorkCommission create/list/claim lifecycle
`haft_refresh`	Lifecycle management for all artifacts (evidence decay)
`haft_query`	Search, status dashboard, file-to-decision lookup, FPF spec search
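
For orientation, this is roughly what a client sends when it invokes one of these tools. The envelope follows the Model Context Protocol's JSON-RPC `tools/call` request; the argument names passed to `haft_note` are hypothetical, since the tool's real input schema is whatever the server reports via `tools/list`. A real client also performs the MCP initialization handshake first, which this sketch omits.

```python
import json


def tools_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a single line of JSON-RPC 2.0."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # json.dumps emits no raw newlines by default, which suits the
    # newline-delimited stdio transport.
    return json.dumps(message)


# Hypothetical arguments: the field name "title" is an assumption,
# not taken from Haft's documentation.
line = tools_call_request(1, "haft_note", {"title": "Use SQLite WAL mode"})
print(line)
```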

Slash Commands (13, installed globally or locally)

Command	Purpose
`/h-frame`	Frame a problem — define scope, constraints, success criteria
`/h-char`	Characterize — what matters about this problem?
`/h-explore`	Explore solution variants with diversity check
`/h-compare`	Fair comparison under parity
`/h-decide`	Record decision contract with invariants and evidence
`/h-note`	Record micro-decision or assumption
`/h-commission`	Create a WorkCommission for execution
`/h-onboard`	Onboard a new project to Haft (build spec + term map)
`/h-problems`	List open problems and their status
`/h-search`	Search FPF artifact graph
`/h-status`	Dashboard — what's stale, what's decided, what's commissioned
`/h-verify`	Verify implementation against decision contracts
`/h-view`	View a specific artifact

Skill: h-reason

One aggregated skill (~/.claude/skills/h-reason/SKILL.md) that activates the full FPF reasoning pipeline end-to-end.

Spec Sections

Haft uses fenced YAML blocks within markdown (yaml spec-section) as the spec format. haft spec check validates these.

Desktop App (Alpha)

Tauri desktop app in desktop-tauri/ — not production-recommended in v7.

Prompts

Haft — Prompts

Verbatim Excerpt 1: README (FPF Pipeline Description)

### One command: `/h-reason`

Describe your problem. The agent frames it, generates alternatives, 
compares them fairly, and records the decision — all in one command. 
It auto-selects the right depth.

### Or drive each step manually

/h-frame  → /h-char  → /h-explore → /h-compare → /h-decide
  what's      what       genuinely     fair         engineering
  the         matters    different     parity       contract
  problem?

Technique: staged reasoning pipeline — each step has a precise name and semantic contract. The pipeline enforces that alternatives are explored before a decision is made, and parity is enforced in comparison (all options evaluated against the same criteria).

Verbatim Excerpt 2: CLAUDE.md (Communication Style)

## Communication Style

**Be a peer engineer, not a cheerleader:**

| USE | AVOID |
|-----|-------|
| "This won't work because..." | "Great idea, but..." |
| "The issue is..." | "I think maybe..." |
| "No." | "That's an interesting approach, however..." |
| "You're wrong about X, here's why..." | "I see your point, but..." |
| "I don't know" | "I'm not entirely sure but perhaps..." |
| "This is overengineered" | "This is quite comprehensive" |

Technique: anti-sycophancy persona enforcement — Haft's own CLAUDE.md uses a table of explicitly banned phrases and their preferred replacements to calibrate agent communication style. This is the same "Iron Law with rationalization table" pattern seen in superpowers.

Verbatim Excerpt 3: README (Spec Check)

`haft spec check` is intentionally deterministic L0/L1/L1.5 only: 
it parses fenced `yaml spec-section` blocks, checks required structural 
fields, validates known carrier shapes, and verifies that the term-map 
carrier contains parseable term entries. It does not make L2 semantic 
judgments, perform LLM review, prove product correctness, or make L3 
runtime/evidence claims.

Technique: explicit scope limitation. Haft names what spec check does NOT do (L2 semantic judgments, LLM review, L3 runtime/evidence claims). Stating these limits up front is a deliberate design choice that avoids overreach and keeps spec check deterministic and fast.
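
To make the "deterministic, structural only" idea concrete, here is a minimal sketch of a checker in the same spirit. It is not Haft's code: the required keys (`id`, `type`) are an assumption for illustration, and it uses a regular expression rather than a YAML parser. It finds fences whose info string is `yaml spec-section` and reports blocks missing a required top-level key.

```python
import re

FENCE = "`" * 3  # built programmatically so the example nests cleanly inside a fence

# Matches a fenced block whose info string is exactly "yaml spec-section".
SPEC_BLOCK = re.compile(
    rf"^{FENCE}yaml spec-section[ \t]*\n(.*?)^{FENCE}[ \t]*$",
    re.MULTILINE | re.DOTALL,
)

REQUIRED_KEYS = ("id", "type")  # hypothetical required fields


def top_level_keys(block: str) -> set[str]:
    """Collect unindented `key:` names; a purely structural check, no YAML parser."""
    keys = set()
    for line in block.splitlines():
        match = re.match(r"([A-Za-z_][\w-]*):", line)
        if match:
            keys.add(match.group(1))
    return keys


def check_spec_sections(markdown: str) -> list[str]:
    """Return one error string per spec-section block that lacks a required key."""
    errors = []
    for index, match in enumerate(SPEC_BLOCK.finditer(markdown), start=1):
        present = top_level_keys(match.group(1))
        missing = [key for key in REQUIRED_KEYS if key not in present]
        if missing:
            errors.append(f"block {index}: missing {', '.join(missing)}")
    return errors


sample = "\n".join([
    "# Auth design",
    FENCE + "yaml spec-section",
    'id: "problem-auth-design"',
    "type: problem",
    "dimensions:",
    "  - security",
    FENCE,
    "",
    FENCE + "yaml spec-section",
    "type: problem",
    FENCE,
])
print(check_spec_sections(sample))
# prints ['block 2: missing id']
```

The same input always yields the same errors, which is the property the excerpt emphasizes; anything requiring judgment about meaning is out of scope for a check like this.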

Uniqueness

Haft — Uniqueness & Positioning

Differs from Seeds

Haft is most architecturally similar to taskmaster-ai (both provide structured task/decision management as MCP servers with CLI), but the philosophical delta is substantial. taskmaster-ai manages a task list with complexity scoring and expansion prompts for execution. Haft enforces a reasoning discipline (FPF) where problems must be framed, options explored under diversity check, and decisions recorded as falsifiable contracts, all before any execution.

The evidence decay mechanism (knowing when assumptions go stale) has no analog in any of the 11 seeds. The spec section format (fenced YAML blocks in markdown, validated with haft spec check) is proprietary to Haft.

Unlike claude-flow (MCP-anchored hive mind), Haft has no multi-agent orchestration: it governs how a single agent reasons, not how a swarm coordinates. The CLAUDE.md with an explicit anti-sycophancy calibration table is borrowed from superpowers' "Iron Law" pattern but applied to communication style rather than TDD discipline.

Positioning

Signal type: pre-execution reasoning governance
Intervention point: before tool-use (framing + decision recording) + after tool-use (evidence verification)
Unique features: evidence decay, parity enforcement, WorkCommission scope bounding, deterministic spec checker, multi-host init (Claude Code + Codex + Cursor + Gemini + OpenCode + Air)
Target user: engineering teams wanting structured decision records alongside AI-generated code

Observable Failure Modes

FPF discipline is enforced by prompts/skills, not by hooks — the agent can skip the pipeline
Evidence decay depends on the agent running haft_refresh; a lazy agent never detects staleness
Non-standard license (NOASSERTION) is a red flag for enterprise adoption
Alpha TUI and desktop app may be unreliable (explicitly called out in README)
Heavy Go binary + SQLite setup vs. simpler markdown-only frameworks

Relationship to Batch 31

Haft is the only framework in this batch that focuses on pre-execution decision governance rather than runtime enforcement. All other frameworks (Sponsio, DashClaw, clauder, pi-steering-hooks) block/audit tool calls. Haft governs the reasoning that happens before tool calls occur.

Workflow

Haft — Workflow

Phases (FPF Reasoning Pipeline)

Phase	Command	Artifact
Frame	`/h-frame`	Problem statement with dimensions
Characterize	`/h-char`	Characterization of what matters
Explore	`/h-explore`	Solution variants (diversity-checked)
Compare	`/h-compare`	Parity comparison table
Decide	`/h-decide`	Decision contract (invariants, claims, evidence)
Commission	`/h-commission`	WorkCommission with allowed actions
Execute	Agent runs commissioned work	Code + evidence
Verify	`/h-verify`	Evidence checked against decision contracts
Refresh	`haft_refresh`	Stale evidence detected, contracts updated

Or compressed: /h-reason — auto-selects depth, runs full pipeline.
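
The ordering constraint behind the manual pipeline can be sketched as a tiny state tracker. This is an illustration only: `ReasoningSession` and `PIPELINE` are hypothetical names, and the hard failure on a skipped step is an assumption of this sketch, not a description of how Haft itself reacts.

```python
from typing import Optional

# Manual reasoning steps in the order the slash commands are listed.
PIPELINE = ("frame", "char", "explore", "compare", "decide")


class ReasoningSession:
    """Tracks completed steps and rejects any step run out of order (hypothetical)."""

    def __init__(self) -> None:
        self.completed: list[str] = []

    def next_step(self) -> Optional[str]:
        """Return the step that must run next, or None when the pipeline is done."""
        if len(self.completed) < len(PIPELINE):
            return PIPELINE[len(self.completed)]
        return None

    def run(self, step: str) -> None:
        expected = self.next_step()
        if step != expected:
            raise ValueError(f"cannot run {step!r}; next step is {expected!r}")
        self.completed.append(step)


session = ReasoningSession()
session.run("frame")
try:
    session.run("decide")  # skips char, explore, and compare
except ValueError as err:
    print(err)
# prints: cannot run 'decide'; next step is 'char'
```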

Harness Lifecycle

haft harness prepare  →  haft harness run  →  haft harness status
                                        ↓
                              haft harness result
                                        ↓
                              haft harness apply (or requeue)

Approval Gates

Gate	Trigger	Type
Evidence review	Before `/h-decide`	freetext-clarify
Decision invariants	Embedded in contract	file-review
WorkCommission scope	Before commission execution	choice-list

Spec Format

Fenced YAML blocks within markdown:

id: "problem-auth-design"
type: problem
dimensions:
  - security
  - latency
  - maintainability

Evidence Decay

Decision contracts have evidence attached with timestamps. haft_refresh detects which evidence items have gone stale (configurable decay policy). Stale decisions are surfaced by /h-status.

Execution Mode

Interactive (TUI), MCP event-driven (via haft serve), or one-shot CLI (spec check, harness commands).

Memory Context

Haft — Memory & Context

State Storage

SQLite database — Haft's primary artifact store.

Store	Content
SQLite (`.haft/` or project root)	FPF artifacts: problems, solutions, decisions, evidence, notes, commissions
`.haft/decisions/`	Decision contract files
`.haft/evidence/`	Evidence records
`.haft/knowledge/`	Domain knowledge entries

Artifact Lifecycle

All FPF artifacts are persisted to SQLite with:

Creation timestamp
Last-updated timestamp
Evidence links
Staleness metadata (for evidence decay)

Evidence Decay

The haft_refresh MCP tool implements evidence decay:

Decision contracts link to specific evidence items
Evidence items carry timestamps
haft_refresh checks which evidence is stale based on configurable decay policy
Stale decisions surface in /h-status dashboard
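
The four points above amount to a timestamp comparison against a policy. A minimal sketch follows, assuming the simplest possible policy (a single maximum age); the `Evidence` class, the `stale_evidence` function, the 90-day window, and the sample records are all hypothetical, and Haft's actual decay policy may be richer.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Evidence:
    """An evidence item linked to a decision contract (hypothetical shape)."""
    claim: str
    recorded_at: datetime


def stale_evidence(items: list[Evidence], max_age: timedelta, now: datetime) -> list[Evidence]:
    """Return the items older than max_age at the given moment."""
    return [item for item in items if now - item.recorded_at > max_age]


now = datetime(2026, 5, 25, tzinfo=timezone.utc)
items = [
    Evidence("p95 latency under 200 ms", datetime(2026, 1, 10, tzinfo=timezone.utc)),
    Evidence("load test passed", datetime(2026, 5, 20, tzinfo=timezone.utc)),
]
stale = stale_evidence(items, max_age=timedelta(days=90), now=now)
print([item.claim for item in stale])
# prints ['p95 latency under 200 ms']
```

Passing `now` explicitly keeps the check deterministic and easy to test, which matters for anything that feeds a status dashboard.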

Cross-Session Persistence

SQLite persists across sessions. Decisions made in session 1 are available and checkable in session 100. This is project-scoped persistence.

FPF Spec Sections

Markdown files with fenced yaml spec-section blocks serve as the spec format. haft spec check validates these deterministically.

Cross-Tool Handoffs

When switching between Claude Code and Codex, both use the same .haft/ SQLite database: .mcp.json and .codex/config.toml each configure haft serve for the same project.

Compaction

Not explicitly addressed. For long sessions with many artifacts, haft_query provides search across the artifact graph.

Orchestration

Haft — Orchestration

Multi-Agent

Partial — WorkCommissions can be created for bounded execution units that could be handed to different agents, but Haft itself is primarily a single-agent reasoning discipline.

Orchestration Pattern

Sequential (frame → explore → compare → decide → commission → execute → verify).

Isolation Mechanism

None explicitly. WorkCommissions bound scope by declaring allowed_actions and evidence_requirements but do not enforce filesystem or process isolation.
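
A declared scope of this kind can be pictured as a simple membership check. The sketch below reuses the `allowed_actions` and `evidence_requirements` field names from the text; everything else (the `WorkCommission` class shape, the action strings, the `evidence_collected` field) is hypothetical. Like the mechanism described, it only reports whether an action is in scope and does nothing to sandbox the process.

```python
from dataclasses import dataclass, field


@dataclass
class WorkCommission:
    """A bounded execution unit: declared scope only, no sandboxing (hypothetical shape)."""
    allowed_actions: set[str]
    evidence_requirements: set[str]
    evidence_collected: set[str] = field(default_factory=set)

    def permits(self, action: str) -> bool:
        """True when the action falls inside the declared scope."""
        return action in self.allowed_actions

    def missing_evidence(self) -> set[str]:
        """Evidence still owed before the commission can be considered complete."""
        return self.evidence_requirements - self.evidence_collected


commission = WorkCommission(
    allowed_actions={"edit:src/auth", "run:tests"},
    evidence_requirements={"tests-pass", "latency-benchmark"},
)
commission.evidence_collected.add("tests-pass")

print(commission.permits("run:tests"))        # prints True
print(commission.permits("edit:src/billing")) # prints False
print(commission.missing_evidence())          # prints {'latency-benchmark'}
```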

Execution Mode

Three modes:

Interactive — bare haft launches Bubbletea TUI
MCP event-driven — haft serve responds to MCP tool calls from Claude Code / Codex
One-shot — haft spec check, haft harness prepare/run/status/result

Multi-Model

No. Single model via MCP connection. No multi-model routing.

Cross-Tool Portability

Medium — primary support for Claude Code and Codex. Experimental support for Cursor, Gemini CLI, OpenCode, JetBrains Air. Architecture supports multiple hosts via platform-specific install paths.

Consensus

None.

Prompt Chaining

Yes — FPF pipeline is explicit prompt chaining: output of /h-frame feeds into /h-explore; output of /h-compare feeds into /h-decide. Each step's output IS the input context for the next step.

TUI

Bubbletea-based terminal UI for interactive agent mode (the bare haft command).

UI CLI Surface

Haft — UI & CLI Surface

CLI Binary

Exists: yes
Name: haft
Language: Go
Is thin wrapper: no (own full runtime)
Install: curl -fsSL https://raw.githubusercontent.com/m0n0x41d/haft/main/install.sh | bash

Key Subcommands

Subcommand	Description
`haft`	Interactive TUI agent
`haft init [--claude|--codex|…]`	Initialize project, install commands/MCP config
`haft serve`	Start MCP server
`haft harness`	Operator harness lifecycle
`haft spec check [--json]`	Deterministic spec validation
`haft version`	Version info

TUI (Bubbletea)

Exists: yes
Type: terminal-tui
Stack: charm.land/bubbletea v2, charm.land/bubbles v2, charm.land/lipgloss v2
Features: interactive agent loop, commission status, artifact browser

Desktop App (Alpha)

Exists: yes (alpha, not production-recommended)
Type: desktop-app
Stack: Tauri (Rust + WebView)
Location: desktop-tauri/

MCP Server

Exists: yes
Invocation: haft serve or via .mcp.json / .codex/config.toml
Tools: 7 tools (haft_note, haft_problem, haft_solution, haft_decision, haft_commission, haft_refresh, haft_query)
Protocol: stdio

IDE Integration

Via MCP server — any MCP client can connect. Primary: Claude Code (.mcp.json), Codex (.codex/config.toml).
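
For readers unfamiliar with project-level MCP configuration, the sketch below generates a `.mcp.json` in the `mcpServers` shape that Claude Code reads. The server name `haft` and the exact contents are assumptions; the file that `haft init` actually writes may differ, so treat this as an illustration of the format rather than a copy of Haft's output.

```python
import json


def mcp_config(server_name: str = "haft", command: str = "haft", args=("serve",)) -> dict:
    """Build a project-level MCP config that launches a stdio server.

    The server name and arguments are assumptions for illustration.
    """
    return {"mcpServers": {server_name: {"command": command, "args": list(args)}}}


# Print the JSON that would be saved as .mcp.json in the project root.
print(json.dumps(mcp_config(), indent=2))
```

Because the transport is stdio, the host starts the server process itself using the configured command, so nothing needs to be running beforehand.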

Observability

haft_query provides search/status dashboard
/h-status command shows what's stale, decided, commissioned
SQLite persists full artifact history

Related frameworks

same archetype · same primary tool · same memory type

claude-mem (thedotmack) ★ 78k

A8 Cross-runtime harness

Background worker service captures every tool call as an observation, AI-compresses sessions, and auto-injects relevant past…

pi (badlogic/earendil) ★ 55k

A8 Cross-runtime harness

A minimal, hackable, multi-provider terminal coding agent that adapts to your workflows via npm-installable TypeScript Extensions…

Agent Skills (Addy Osmani) ★ 46k

A8 Cross-runtime harness

Encodes senior-engineer software development lifecycle as 23 auto-routed skills and 7 slash commands for any AI coding agent.

wshobson/agents Plugin Marketplace ★ 36k

A8 Cross-runtime harness

Single Markdown source for 83 domain-specialized plugins that auto-generates idiomatic artifacts for five AI coding harnesses.

TabbyML/Tabby ★ 34k

A8 Cross-runtime harness

Self-hosted AI coding assistant server (alternative to GitHub Copilot) with admin dashboard, RAG-based completions, and multi-IDE…

Compound Engineering ★ 17k

A8 Cross-runtime harness

Make each unit of engineering work compound into easier future work via brainstorm→plan→execute→review→learn cycles.

Distribution

Type: cli-tool
License: NOASSERTION
Install: one-liner
Version: v7

Surfaces

CLI binary: haft
CLI subcmds: 6
Local UI: terminal-tui
Tech stack: Bubbletea v2, Bubbles v2, Lipgloss v2 (charm.land)

Components

Commands: 13
Skills: 1
Subagents: 0
Hooks: 0
MCP servers: 1
MCP tools: 7
Scripts: 1
Templates: 0

Workflow

Phases: 9
Approval gates: 2
Spec format: yaml
Spec storage: per-feature-folder
Delta or full: whole-file

Orchestration

Multi-agent: No
Pattern: sequential
Max concurrent: 1
Isolation: none
Consensus: none
Prompt chaining: Yes

Multi-model

Multi-model: No
BYOK: No
Modal: text

Execution

Mode: interactive-loop
Crash recovery: Yes
Compaction: No
Session handoff: Yes
Streaming: No

Memory

Type: sqlite
Persistence: project
Search: full-text
State files: 3 files

Quality

TDD: No
TDD mechanism: none
Validators: 2
Self-review: inline-self

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: Yes
Audit format: sqlite
Replay: Yes

Tools

Primary: claude-code
Targets: 6
Portability: medium

Signals

Stars: 1.3k
Last commit: 2026-05-25
Contributors: 4
Maintainer: active
Quality score: 4.4/10
