
AgentTrace

agenttrace-luoyu · luoyuctl/agenttrace · ★ 49 · last commit 2026-05-24

Primitive shape 1 total
Skills 1
00

Summary

AgentTrace — Summary

AgentTrace is a local-first terminal TUI and report generator (Go, v0.5.4) that reads AI coding-agent session logs from 15+ agent sources (Claude Code, Codex CLI, Gemini CLI, Qwen Code, Cline, Aider, Cursor exports, OpenCode, Kimi CLI, and generic JSONL) and surfaces cost, token, time, tool-failure, latency, health, and anomaly data. It runs entirely on the local machine — no hosted backend, no data upload — and outputs text, JSON, Markdown, and self-contained HTML reports. The TUI ranks sessions by cost, duration, health, failures, or text search, and a --doctor flag auto-detects session directories. CI gate flags (--fail-under-health, --fail-on-critical, --max-tool-fail-rate) let the tool block builds when agent session quality degrades below threshold. A --baseline comparison mode detects regressions in cost/duration/tokens against a prior JSON snapshot.

Differs from seeds: all 11 seeds operate before or during agent sessions as planning/execution layers; AgentTrace operates after sessions as a retrospective forensics and CI-gate tool. It ships a skills/agenttrace-session-audit/SKILL.md that can be loaded by Claude Code or Codex CLI to use the binary from within a session, bridging the two worlds. No seed tracks multi-agent cross-session cost telemetry, session health scores, or baseline regression detection.

01

Overview

AgentTrace — Overview

Origin

AgentTrace (luoyuctl/agenttrace) is released under the MIT license and maintained primarily by one author (luoyuctl), with 1 external fork. The current version is v0.5.4, released 2026-05-24, and it is written in Go (1.25+). The project includes a Chinese README translation (README.zh-CN.md), which suggests that Chinese-speaking developers are part of the primary audience. The repository has 49 stars.

Philosophy

AgentTrace is purpose-built for a specific problem: AI coding agents now behave like small build systems, but you only see the final answer. AgentTrace reads the logs your agents already write and puts cost-heavy or slow sessions first.

Five core questions it answers:

  1. What did my agents spend? — cost, tokens, time across sessions
  2. Why was this task slow? — latency, retries, stalls, large parameters
  3. Did a run regress? — baseline comparison
  4. What should I inspect first? — ranking by health, cost, failures
  5. Can I inspect this privately? — everything local, no data upload

"Local-First" as a Design Principle

"Everything runs locally; prompts, code, and logs do not need to leave your machine."

In a space where most observability tools require cloud backends, AgentTrace's local-first stance is a deliberate architectural decision that makes it deployable in regulated or privacy-sensitive environments.

Manifesto-Style Quotes

From README:

"AI coding agents now behave like small build systems: they call tools, retry, stall, and spend tokens while you only see the final answer."

"agenttrace reads the logs your agents already write and puts cost-heavy or slow sessions first."

"These screenshots were captured from a local run against real session logs. They are not --demo output and not test fixtures."

The last quote is significant: AgentTrace backs its claims with real data from the author's own sessions (1,761 sessions, 9.13B tokens, $5,037.26 estimated cost, 91% average health).

02

Architecture

AgentTrace — Architecture

Distribution

  • Primary: curl -sL install.sh | sh
  • Homebrew: brew install luoyuctl/tap/agenttrace
  • Go install: go install github.com/luoyuctl/agenttrace/cmd/agenttrace@latest
  • Windows: PowerShell install script

Repository Structure

cmd/
  agenttrace/
    main.go           # CLI entry point + flag parsing
internal/
  engine/             # Session parsing and analysis engine
  tui/                # Bubble Tea TUI components
  i18n/               # English + Chinese localization
  
skills/
  agenttrace-session-audit/
    SKILL.md          # Claude Code / Codex skill for using agenttrace
assets/               # Logo, demo GIFs, screenshots
site/                 # Documentation site
testdata/             # Test fixtures
scripts/              # Build/install scripts
homebrew/             # Homebrew formula

Required Runtime

  • Building from source: Go 1.25+
  • Binary install: no runtime required

Supported Log Sources

AgentTrace reads logs from 15+ agent runtimes:

  • Claude Code, Codex CLI, Gemini CLI
  • Qwen Code, Cline, Aider
  • Cursor exports
  • Hermes Agent, OpenCode, OpenClaw
  • Pi, Oh My Pi, Kimi CLI
  • Copilot-style logs
  • Generic JSON/JSONL traces

Auto-detection

agenttrace --doctor detects agent directories automatically. No configuration required for supported runtimes.

Tech Stack

Component      | Technology
---------------|----------------------------
TUI            | Bubble Tea (charmbracelet)
CLI framework  | Go flag package
Report formats | text, json, markdown, html
Languages      | Go 1.25+, i18n: en/zh
03

Components

AgentTrace — Components

CLI Binary: agenttrace

Single binary with flag-based interface:

Flag                                  | Purpose
--------------------------------------|------------------------------------------------
(no args)                             | Launch interactive Bubble Tea TUI
--overview                            | Show global dashboard across all sessions
--latest                              | Analyze most recent session
--compare                             | Compare all sessions
-d <dir>                              | Specify session directory
-f <format>                           | Output format: text, json, markdown, html
-o <file>                             | Save report to file
--search <query>                      | Search session metadata, tools, anomalies
--search-limit <n>                    | Max search results
--doctor                              | Check detected directories, cache state
--demo                                | Use built-in demo sessions
--baseline <file>                     | Compare against baseline JSON
--fail-under-health <n>               | CI gate: exit non-zero if health below n
--fail-on-critical                    | CI gate: exit non-zero if critical sessions exist
--max-tool-fail-rate <n>              | CI gate: exit non-zero if failure rate above n%
--baseline-max-duration-delta-pct <n> | Regression gate: max duration increase %
--baseline-max-cost-delta-pct <n>     | Regression gate: max cost increase %
--baseline-max-token-delta-pct <n>    | Regression gate: max token increase %
--list-models                         | List models with pricing
--update-pricing                      | Download latest pricing from LiteLLM
-m <model>                            | Model for pricing calculations
--lang <en/zh>                        | Report language
--version                             | Show version
--waste                               | Waste analysis for latest session

TUI Views

From README screenshots:

  • Overview — sessions ranked by cost/health, totals
  • Critical sessions — filtered high-priority sessions
  • Session detail — health score, cost breakdown, tool failures, next action
  • Diagnostics — latency stats, context window, large parameter calls

Skill: agenttrace-session-audit

A single skill in skills/agenttrace-session-audit/SKILL.md that can be loaded by Claude Code or Codex CLI to use the agenttrace binary from within an agent session. Includes guardrails:

"Treat prompts, code, and session contents as local/private data. Do not upload logs to external services."

"Do not invent metrics. If a parser cannot infer cost, model, or latency, say which field is missing."

What is Measured

Per session:

  • Model, agent source/runtime
  • Input tokens, output tokens, cache tokens
  • Estimated cost (model pricing lookup via LiteLLM)
  • Wall-clock duration
  • Turn count
  • Tool failure count and rate
  • Latency stats (gaps, retries, stall detection)
  • Context window pressure indicators
  • Large parameter calls
  • Anomaly flags
  • Health score (0-100)
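
Two of the metrics above, estimated cost and tool failure rate, reduce to simple arithmetic once token counts and prices are known. The following is a minimal Python sketch of that arithmetic, not AgentTrace's Go implementation: the `SessionStats` fields, the function names, and the per-million-token price parameters are all hypothetical, and cache tokens are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class SessionStats:
    """Hypothetical per-session record; field names are illustrative only."""
    input_tokens: int
    output_tokens: int
    tool_calls: int
    tool_failures: int


def estimated_cost(s: SessionStats, in_price_per_mtok: float, out_price_per_mtok: float) -> float:
    """Estimate cost in USD from token counts and per-million-token prices."""
    total = s.input_tokens * in_price_per_mtok + s.output_tokens * out_price_per_mtok
    return total / 1_000_000


def tool_failure_rate(s: SessionStats) -> float:
    """Failed tool calls as a percentage of all tool calls (0.0 when none ran)."""
    if s.tool_calls == 0:
        return 0.0
    return 100.0 * s.tool_failures / s.tool_calls
```

Because the estimate is a direct product of token counts and prices, any staleness in the pricing table carries straight through to the reported cost.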
05

Prompts

AgentTrace — Prompts

Note: AgentTrace is a Go CLI tool, not a prompt engineering framework. Its "prompts" are limited to the one skill file it ships for integration with AI coding agents.

Prompt 1: agenttrace-session-audit Skill

Technique: Tool-use skill with explicit guardrails against data fabrication and privacy violations

Verbatim from skills/agenttrace-session-audit/SKILL.md:

---
name: agenttrace-session-audit
description: Audit local AI coding-agent sessions with agenttrace. Use when the user asks 
to inspect Claude Code, Codex CLI, Gemini CLI, Qwen Code, Cline, Aider, Cursor exports, 
Hermes Agent, OpenCode, OpenClaw, Pi, Oh My Pi, Kimi CLI, Copilot-style logs, or generic 
JSON/JSONL traces for cost, tokens, tool failures, latency, anomalies, health, diffs, or CI gates.
license: MIT
metadata:
  short-description: Audit AI agent session health
---

## Guardrails

- Treat prompts, code, and session contents as local/private data. Do not upload logs to external services.
- Do not invent metrics. If a parser cannot infer cost, model, or latency, say which field is missing.
- Do not overwrite user reports unless the user asked for that output path.

Analysis: The guardrails are unusually specific about data privacy and anti-fabrication. "Do not invent metrics" directly addresses a common LLM failure mode in observability tools. The skill also sets an order of preference: use the installed agenttrace binary first, and fall back to go run from source only when the binary is unavailable.
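
The "Do not invent metrics" guardrail can be illustrated with a small check that names the missing fields rather than substituting defaults. This is a minimal Python sketch under stated assumptions: the record keys (`model`, `cost_usd`, `latency_ms`) and both function names are hypothetical and do not reflect AgentTrace's actual schema.

```python
from typing import Any, Mapping

# Hypothetical field names, chosen for illustration only.
REQUIRED_FIELDS = ("model", "cost_usd", "latency_ms")


def missing_fields(record: Mapping[str, Any]) -> list[str]:
    """Return the required fields that are absent or null in a parsed record."""
    return [name for name in REQUIRED_FIELDS if record.get(name) is None]


def describe(record: Mapping[str, Any]) -> str:
    """Summarize a record, naming gaps instead of inventing values."""
    gaps = missing_fields(record)
    if gaps:
        return "cannot infer: " + ", ".join(gaps)
    return f"{record['model']}: ${record['cost_usd']:.2f}, {record['latency_ms']} ms"
```

The point of the design is that a gap becomes visible output rather than a silent zero.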

Prompt 2: Workflow Steps in Skill

Technique: Decision tree with binary availability check + progressive disclosure

## Workflow
1. Prefer the installed agenttrace binary when available on PATH.
2. If not available and in the luoyuctl/agenttrace repository, use go run ./cmd/agenttrace.
3. Start with discovery:
   agenttrace --doctor
   agenttrace --overview
4. For human report: agenttrace --overview -f markdown -o agenttrace-overview.md
5. For CI/automation: agenttrace --overview --fail-under-health 80 --fail-on-critical

Analysis: The binary-availability decision tree makes the skill robust to both installed and development environments. The --fail-under-health 80 CI gate example provides a concrete actionable threshold.
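
Workflow steps 1 and 2 amount to a two-branch decision, sketched here in Python. The function name `resolve_agenttrace` and the repository check (looking for a `cmd/agenttrace` directory) are assumptions made for illustration; the skill itself does not specify how an agent should detect that it is inside the repository.

```python
import shutil
from pathlib import Path
from typing import Optional


def resolve_agenttrace(cwd: Path, on_path: Optional[bool] = None) -> Optional[list[str]]:
    """Pick the command prefix following the skill's workflow steps 1 and 2."""
    if on_path is None:
        # Step 1: prefer an installed binary found on PATH.
        on_path = shutil.which("agenttrace") is not None
    if on_path:
        return ["agenttrace"]
    # Step 2 (assumption): a cmd/agenttrace directory marks a repository checkout.
    if (cwd / "cmd" / "agenttrace").is_dir():
        return ["go", "run", "./cmd/agenttrace"]
    return None  # neither option is available
```

The `on_path` parameter exists only so the decision can be exercised without the binary installed.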

Prompt 3: Report Focus Guidance

## Report Focus
- Lead with the highest-risk sessions and the reason they matter.
- Call out token/cost waste, repeated tool failures, retry loops, long gaps, and low health scores.
- When proposing a CI gate, include the exact agenttrace command and threshold.
- If no sessions are detected, run agenttrace --doctor and report the detected directories.

Analysis: Prioritizes actionability over completeness — lead with risk, not with counts. The "include exact command and threshold" instruction prevents vague LLM recommendations.

09

Uniqueness

AgentTrace — Uniqueness & Positioning

Differs From Seeds

All 11 seeds operate before or during agent sessions as planning/execution layers. AgentTrace is the only framework in the full surveyed set that operates after sessions as a retrospective forensics and CI-gate tool. The closest seed is ccmemory (both persist cross-session agent state) but ccmemory extracts context for the agent; AgentTrace extracts operational telemetry about agents for humans. TraceRoot (batch-mate) is the cloud-based equivalent — AgentTrace is the local-first, zero-backend alternative with a Bubble Tea TUI rather than a web dashboard.

Distinctive Opinion

"AI coding agents now behave like small build systems: they call tools, retry, stall, and spend tokens while you only see the final answer."

AgentTrace bets that local-first forensics is the right operational posture for individual developers and privacy-sensitive teams who cannot or will not send session logs to a cloud backend. The real-session evidence in the README (1,761 sessions, $5,037.26 total cost) demonstrates both the tool's capability and the problem's scale.

The --fail-under-health CI gate transforms AgentTrace from a review tool into an enforcement mechanism — agent session quality becomes a CI signal like test pass rate.

Positioning

  • Target: individual developers and small teams using multiple AI coding agents simultaneously
  • Differentiator vs. TraceRoot: local-first, no infrastructure required, no cloud account
  • Multi-agent support: 15+ agent runtimes in one view, enabling cross-runtime comparison
  • Privacy: prompts and session content never leave the machine

Observable Failure Modes

  1. Pricing accuracy: Cost estimation relies on LiteLLM pricing data, which may lag provider updates, and --update-pricing must be run manually. Stale pricing produces incorrect cost aggregates.
  2. Log format fragility: Supports 15+ agent formats; any breaking change in an agent's log format silently produces empty or incorrect metrics for that source.
  3. Single-author bus risk: 1 contributor, 1 external fork. Maintenance continuity is a concern.
  4. No anomaly learning: Anomaly detection uses heuristics, not ML. It cannot adapt to project-specific "normal" behavior patterns.
  5. TUI-only interactivity: No export to monitoring dashboards (no Grafana integration, no Prometheus metrics). Limited integration surface beyond CI exit codes and static reports.

Cross-References

  • Ships an agenttrace-session-audit skill that connects to Claude Code and Codex CLI agent sessions
  • Reads Claude Code session files natively (~/.claude/sessions/)
  • Complementary to TraceRoot: TraceRoot for cloud/production, AgentTrace for local/dev
  • ROADMAP.md suggests future CI/CD integrations (not yet shipped)
04

Workflow

AgentTrace — Workflow

Daily Workflow

AgentTrace is used as a retrospective analysis tool after agent sessions:

# Start TUI for interactive exploration
agenttrace

# Global dashboard across all sessions
agenttrace --overview

# Analyze the most recent session, then run waste analysis on it
agenttrace --latest
agenttrace --latest --waste

# Search session metadata, tools, and anomalies for "billing"
agenttrace --search billing

CI Integration Workflow

# Fail CI if health drops below 80, a critical session exists, or the tool failure rate exceeds 15%
agenttrace --overview \
  --fail-under-health 80 \
  --fail-on-critical \
  --max-tool-fail-rate 15

# Save baseline
agenttrace --overview -f json -o agenttrace-baseline.json

# Compare against baseline in next CI run
agenttrace --overview -f json \
  --baseline agenttrace-baseline.json \
  -o agenttrace-overview.json
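
The regression gates (--baseline-max-duration-delta-pct, --baseline-max-cost-delta-pct, --baseline-max-token-delta-pct) are described as maximum percentage increases over the baseline. A minimal Python sketch of that kind of check follows; the metric keys (`cost`, `duration`, `tokens`) and function names are hypothetical, since the structure of the baseline JSON is not documented here.

```python
def delta_pct(baseline: float, current: float) -> float:
    """Percentage change from baseline to current (positive means an increase)."""
    if baseline == 0:
        return 0.0 if current == 0 else float("inf")
    return 100.0 * (current - baseline) / baseline


def regressions(baseline: dict, current: dict, max_delta_pct: dict) -> list[str]:
    """Return the metrics whose increase exceeds the allowed percentage."""
    return [
        metric
        for metric, limit in max_delta_pct.items()
        if delta_pct(baseline[metric], current[metric]) > limit
    ]
```

For example, a baseline cost of 10.0 and a current cost of 12.5 is a 25% increase, which would trip a 20% cost limit while leaving a 5% duration increase under a 10% limit untouched.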

Report Generation Workflow

# Machine-readable for automation
agenttrace --overview -f json

# Human-readable markdown
agenttrace --overview -f markdown -o report.md

# Self-contained HTML for sharing/CI artifacts
agenttrace --overview -f html -o agenttrace-report.html

Approval Gates

None — AgentTrace is passive. The CI gate flags (--fail-under-health, etc.) are exit-code-based signals for external CI systems, not interactive approval gates.

Phase-to-Artifact Map

Phase               | Artifact
--------------------|---------------------------------------------
Session runs        | Agent log files (written by agent)
Auto-detection      | Detected agent directories (doctor output)
Analysis            | Session data objects (cost, tokens, health)
Report              | text/json/markdown/html output
CI gate             | Exit code (0 = pass, non-zero = fail)
Baseline comparison | Regression delta report
06

Memory Context

AgentTrace — Memory & Context

State Storage

AgentTrace has minimal persistent state:

  • Session log cache: Auto-detected agent directories are cached for faster subsequent runs (.agenttrace/ or platform-appropriate path — --doctor shows cache status)
  • Pricing database: Model pricing data, updateable via --update-pricing from LiteLLM
  • Baseline snapshots: User-saved JSON files for regression comparison (agenttrace --overview -f json -o baseline.json)

What AgentTrace Reads

It reads the log files that coding agents already write:

  • Claude Code: ~/.claude/sessions/ (JSONL)
  • Codex CLI: ~/.codex/ (JSONL)
  • Gemini CLI: similar structure
  • Generic JSONL traces from any compatible format

AgentTrace does not modify these log files — it is read-only.

No Cross-Session Learning

AgentTrace does not learn from sessions or update a corpus. It reads, analyzes, and reports. The baseline comparison feature is purely statistical, not corpus-based.

Privacy Architecture

All processing happens locally:

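  • No API calls to external services during analysis (the only network operation described is the optional --update-pricing download from LiteLLM)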
  • No telemetry or usage reporting
  • Prompts and session content analyzed in-process, not transmitted

Persistence

Data          | Location            | When Written
--------------|---------------------|---------------------------------------------
Pricing data  | Local cache         | --update-pricing
Baseline JSON | User-specified path | --overview -f json -o file
Reports       | User-specified path | -o <file>
Session logs  | Agent-native paths  | Written by agents (read-only by agenttrace)

Compaction

Not applicable. AgentTrace only reads session logs and does not manage context windows.

07

Orchestration

AgentTrace — Orchestration

Multi-Agent Pattern

None — AgentTrace is a single-process analysis tool. It does not orchestrate agents; it analyzes their session logs post-hoc.

Execution Mode

One-shot — invoked per analysis run. The TUI mode is interactive but not a daemon.

Internal Architecture

AgentTrace runs a local analysis pipeline:

  1. Discovery: Auto-detect agent log directories (--doctor)
  2. Parse: Read JSON/JSONL session files from all detected sources
  3. Analyze: Compute cost, tokens, health scores, latency, tool failure rates, anomalies
  4. Rank: Sort sessions by selected metric
  5. Report: Output in requested format (text/json/markdown/html)
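
The parse, analyze, rank, and report stages above can be sketched as four small functions. This is an illustrative Python outline, not the Go engine: discovery is skipped (the directory is passed in explicitly), each `*.jsonl` file is treated as one session, and the event keys `tokens` and `tool_error` are hypothetical.

```python
import json
from pathlib import Path


def parse_sessions(directory: Path) -> list[dict]:
    """Read every *.jsonl file as one session, one JSON event per line."""
    sessions = []
    for path in sorted(directory.glob("*.jsonl")):
        lines = path.read_text().splitlines()
        events = [json.loads(line) for line in lines if line.strip()]
        sessions.append({"name": path.stem, "events": events})
    return sessions


def analyze(session: dict) -> dict:
    """Aggregate hypothetical per-event fields into session totals."""
    events = session["events"]
    return {
        "name": session["name"],
        "tokens": sum(e.get("tokens", 0) for e in events),
        "tool_failures": sum(1 for e in events if e.get("tool_error")),
    }


def rank(stats: list[dict], key: str) -> list[dict]:
    """Sort sessions so the largest value of the chosen metric comes first."""
    return sorted(stats, key=lambda s: s[key], reverse=True)


def report(stats: list[dict]) -> str:
    """Render the ranked sessions as JSON."""
    return json.dumps(stats, indent=2)
```

Keeping each stage a pure function of the previous stage's output matches the one-shot execution model: nothing persists between runs except the files on disk.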

No Coordination

AgentTrace has no sub-agents, no parallel execution patterns, and no consensus mechanisms. It is a single Go binary reading local files.

CI Integration Mode

In CI mode (--fail-under-health, --fail-on-critical, --max-tool-fail-rate), AgentTrace acts as an exit-code gate:

  • Exit 0 = healthy
  • Exit non-zero = gate failed

This enables CI pipelines to catch agent session quality regressions.
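
An exit-code gate of this kind can be sketched as a function that collects failure messages and maps them to 0 or 1. The Python below is a hedged illustration only: the session keys (`health`, `critical`, `tool_fail_rate`) are hypothetical, and two aggregation choices are assumptions rather than documented behavior, namely that the health gate applies to the mean score and that the tool failure gate applies to the worst session.

```python
import sys
from typing import Optional


def gate_failures(
    sessions: list[dict],
    fail_under_health: Optional[float] = None,
    fail_on_critical: bool = False,
    max_tool_fail_rate: Optional[float] = None,
) -> list[str]:
    """Return one message per failed gate; an empty list means all gates pass."""
    failures: list[str] = []
    if not sessions:
        return failures
    # Assumption: the health gate is applied to the mean health score.
    avg_health = sum(s["health"] for s in sessions) / len(sessions)
    if fail_under_health is not None and avg_health < fail_under_health:
        failures.append(f"health {avg_health:.1f} < {fail_under_health}")
    if fail_on_critical and any(s["critical"] for s in sessions):
        failures.append("critical session present")
    # Assumption: the tool failure gate is applied to the worst session.
    worst_rate = max(s["tool_fail_rate"] for s in sessions)
    if max_tool_fail_rate is not None and worst_rate > max_tool_fail_rate:
        failures.append(f"tool failure rate {worst_rate:.1f}% > {max_tool_fail_rate}%")
    return failures


def exit_code(failures: list[str]) -> int:
    """Map gate results to a process exit code: 0 passes, 1 fails."""
    for message in failures:
        print(message, file=sys.stderr)
    return 1 if failures else 0
```

Writing the reason for each failure to stderr keeps stdout free for the report itself, so a CI job can both archive the report and show why the gate tripped.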

Skill-as-Agent Interface

The agenttrace-session-audit skill allows AgentTrace to be invoked by an AI coding agent (Claude Code, Codex) rather than only by humans. In this mode, an agent can:

  1. Run agenttrace --doctor to discover sessions
  2. Run agenttrace --overview -f json to get structured data
  3. Analyze the output and surface findings as part of its own session

This creates a meta-observability loop: an agent can audit its own (or peer agents') session history.
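
From the agent's side, the three steps above could look like the following Python sketch. The commands and flags are the ones listed in this section; everything else is an assumption: the function name `audit`, the injectable `run` parameter, and the expectation that `-f json` without `-o` writes the report to stdout. The schema of the returned JSON is deliberately not assumed.

```python
import json
import subprocess
from typing import Any, Callable


def audit(run: Callable[..., Any] = subprocess.run) -> Any:
    """Run discovery, then return the parsed JSON overview."""
    # Step 1: discovery. check=True raises if the command exits non-zero.
    run(["agenttrace", "--doctor"], capture_output=True, text=True, check=True)
    # Step 2: structured overview, assumed to arrive on stdout.
    result = run(
        ["agenttrace", "--overview", "-f", "json"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Step 3: hand the parsed data to the caller for analysis.
    return json.loads(result.stdout)
```

Passing a stand-in for `run` lets the flow be exercised without the binary installed.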

Multi-Model

Not applicable — AgentTrace does not invoke LLMs. It is a pure data analysis tool.

08

UI / CLI Surface

AgentTrace — UI / CLI Surface

CLI Binary: agenttrace

  • Name: agenttrace
  • Is thin wrapper: No — standalone Go CLI, own analysis runtime
  • Install: curl install.sh | sh / brew install luoyuctl/tap/agenttrace / go install

Terminal TUI

AgentTrace ships a full Bubble Tea TUI:

agenttrace    # launches TUI with no args

TUI views (from README screenshots):

  • Overview: sessions table with cost, tokens, health score, errors, source
  • Critical sessions: filtered list of high-risk sessions
  • Session detail: full breakdown — model, cost, turns, health, tool failures, next action recommendation
  • Diagnostics: latency timeline, context window pressure, large parameter calls, retry loops

The TUI is keyboard-navigable and shows real-time data from local log directories.

Output Formats

Format   | Flag        | Use case
---------|-------------|-------------------------------------------------
text     | (default)   | Terminal review
json     | -f json     | Machine-readable, CI integration
markdown | -f markdown | Human-shareable reports
html     | -f html     | Self-contained report for artifacts/issue links

Internationalization

--lang en   # English (default)
--lang zh   # Chinese (简体中文)

CI Gate Interface

agenttrace --overview \
  --fail-under-health 80 \
  --fail-on-critical \
  --max-tool-fail-rate 15

Exit codes:

  • 0 — healthy, within thresholds
  • non-zero — threshold exceeded (see stderr for which)

IDE Integration

None — terminal-only. The agenttrace-session-audit skill bridges terminal usage into AI coding agent sessions.

Observability

AgentTrace is itself the observability layer; it does not observe itself.

Demo Mode

--demo — uses built-in demo sessions for exploration without real log files.
