HexAgent

hexagent · UnicomAI/hexagent · ★ 122 · last commit 2026-05-19

Gives any LLM a computer via a runtime-computer isolation protocol — the harness never shares its keys or config with the agent.

Best when: The agent should never see its own harness; Computer protocol isolation is not optional, it is the foundational design constraint.

Skip if: Conflating agent runtime with the computer it controls, Clever over obvious code

vs seeds

claude-flow: in that subagents are defined as code-class constructs (AgentDefinition Python objects) rather than persona-md files, an…

Primitive shape

No installable primitives

Summary

HexAgent is an open-source Python agent harness library from UnicomAI that explicitly separates the agent's runtime from the "computer" it operates on, preventing the agent from reading its own API keys, source code, or harness configuration. Rather than assembling an agent from building blocks, HexAgent provides a batteries-included operating environment: 12+ built-in tools, 3-phase automatic context compaction, a pluggable Computer protocol (local shell, local VM via Lima/WSL, or cloud sandbox via E2B), native MCP support across stdio/SSE/HTTP, and subagent orchestration supporting both foreground (blocking) and background (parallel) child agents. A single create_agent() call produces an agent that can run as a CLI coding agent, chatbot, Cowork-style desktop assistant, or fully autonomous headless agent. HexAgent explicitly positions itself against LangChain Deep Agents and the Claude Agent SDK, claiming the key differentiator is the runtime-computer isolation boundary. Compared to seeds, HexAgent most resembles claude-flow's MCP-anchored approach (own runtime, subagents as code-classes) but abandons the slash-command/skill-md surface entirely — the API is pure Python and there are no CLAUDE.md-injected behavioral rules; the whole framework is a composable library.

differs_from_seeds: Closest to claude-flow in that it ships a full runtime with subagents defined as code-level constructs (AgentDefinition Python objects), not persona-md files. Unlike claude-flow's 305-tool MCP server, HexAgent's MCP support is a client (consume any MCP server); there is no bundled MCP server to install. Unlike superpowers or spec-driver (skills-only, Claude Code context injection), HexAgent has no CLAUDE.md primitives whatsoever — the harness is a Python library consumed programmatically, not a Claude Code plugin.

Overview

Origin

HexAgent is developed under UnicomAI (GitHub: UnicomAI/hexagent). The README cites inspiration from Adam Wolff's QCon 2025 talk on Claude Code's architecture and a growing consensus around the "harness" vs "framework" distinction articulated in an Inngest blog post ("your agent needs a harness, not a framework"). The project targets Python 3.11+, is at pre-experimental status (0.0.x), and uses uv for dependency management.

Philosophy

The manifesto centers on a single phrase: "Give the agent a computer."

Key philosophical positions (verbatim from README):

"Unlike every other agent framework, HexAgent separates the agent runtime from the computer it operates on. Your agent gets a sandboxed machine; your runtime keeps its API keys, config, and source code private."

"A framework gives you building blocks and says 'assemble your own agent.' A harness gives the agent a fully equipped runtime — tools, context management, safety, execution environments — so you focus on what the agent does, not how it executes."

"Composable, not magical — Small modules with explicit I/O. No hidden state. Every piece is testable and replaceable."

The AGENTS.md (developer instructions) adds:

"Testability: Every module must be testable in isolation without complex mocks."

"Idempotency: Operations should be safely repeatable. Retries must not cause unintended side effects."

Product Types Supported

From one harness:

CLI Coding Agent — terminal-native, analogous to Claude Code / Gemini CLI
Chatbot — conversational assistant with tool use
Cowork — desktop agent working on local files (analogous to Claude Cowork)
Autonomous Agent — headless, long-running

Taxonomy Position

HexAgent explicitly uses the LangChain taxonomy (Framework / Runtime / Harness) and places itself in the Harness tier alongside Claude Code and OpenHands — not the Framework tier (LangChain, CrewAI) or Runtime tier (LangGraph, Temporal).

Architecture

Distribution

Type: Python library (pip install hexagent)
License: MIT
Runtime: Python 3.11+, uv recommended
Repository structure: monorepo with libs/hexagent (core) and libs/hexagent_demo (demo app)

Directory Tree (core library)

libs/hexagent/hexagent/
├── computer/       # Computer protocol — LocalNativeComputer, LocalVM, RemoteE2BComputer
├── harness/        # Runtime augmentation: environment, permissions, skills, reminders
├── tools/          # Built-in tools: CLI (Bash, Read, Write, Edit, Glob, Grep), web, subagents, skills, todos
├── prompts/        # Composable system prompt from 35+ Markdown fragments with variable substitution
├── mcp/            # MCP client (stdio, SSE, HTTP)
├── langchain/      # LangChain/LangGraph integration (isolated — zero leakage into core)
└── types.py        # Framework-agnostic types: ToolResult, AgentContext, CLIResult

Computer Protocol (key differentiator)

Three implementations:

Computer	Environment	Use case
`LocalNativeComputer`	Host shell	Development, trusted agents
`LocalVM`	Lima (macOS) / WSL (Windows)	Security-sensitive work
`RemoteE2BComputer`	E2B cloud sandbox	Production, multi-tenant, CI/CD

The Computer protocol is pluggable — implement your own for Docker, Kubernetes pods, or any remote target.
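
A minimal sketch of what a pluggable Computer backend could look like, assuming the protocol boils down to a single `run` method. The names `Computer`, `CommandResult`, `SubprocessComputer`, and `execute` are hypothetical and are not taken from HexAgent's source. The sketch only illustrates the idea that harness code depends on an interface, while the child process receives an allow-listed environment instead of the harness's own.

```python
import os
import subprocess
from dataclasses import dataclass
from typing import Protocol, Sequence

# Variables a child process usually needs to start; everything else is withheld.
_PASSTHROUGH = ("PATH", "SYSTEMROOT", "LD_LIBRARY_PATH")


@dataclass
class CommandResult:
    """Hypothetical result type (not HexAgent's actual CLIResult)."""
    stdout: str
    stderr: str
    exit_code: int


class Computer(Protocol):
    """Hypothetical minimal interface for an execution target."""

    def run(self, argv: Sequence[str], timeout: float = 30.0) -> CommandResult: ...


class SubprocessComputer:
    """Runs commands on the host, passing only an allow-listed environment."""

    def __init__(self, passthrough: Sequence[str] = _PASSTHROUGH) -> None:
        # Secrets held in the harness's os.environ are not copied over.
        self._env = {k: os.environ[k] for k in passthrough if k in os.environ}

    def run(self, argv: Sequence[str], timeout: float = 30.0) -> CommandResult:
        proc = subprocess.run(
            list(argv), capture_output=True, text=True,
            timeout=timeout, env=self._env,
        )
        return CommandResult(proc.stdout, proc.stderr, proc.returncode)


def execute(computer: Computer, argv: Sequence[str]) -> CommandResult:
    """Harness-side code sees only the protocol, never the backend type."""
    return computer.run(argv)
```

A Docker or Kubernetes backend would, under the same assumption, implement `run` by forwarding the command to the container or pod instead of calling `subprocess.run`.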

Middleware Pipeline

Pre-model hooks in order:

Context compaction (3-phase automatic)
Permission gating (multi-layer safety validation before tool execution)
Skill injection (filesystem-based SKILL.md extensions, lazy loading)
Image adaptation
Dynamic <system-reminder> injection
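
The ordering above can be illustrated with a small hook chain. This is a sketch under stated assumptions: requests are plain dicts, each hook returns a new dict, and the hook names (`compact_context`, `inject_reminder`, `run_pipeline`) are invented for illustration rather than copied from HexAgent. Only two of the five hooks are modelled.

```python
from typing import Callable

Request = dict
Hook = Callable[[Request], Request]


def compact_context(req: Request) -> Request:
    """Placeholder compaction: keep only the most recent messages."""
    limit = max(1, req.get("max_messages", 4))
    return {**req, "messages": req["messages"][-limit:]}


def inject_reminder(req: Request) -> Request:
    """Append each configured reminder wrapped in a <system-reminder> tag."""
    reminders = [
        {"role": "system", "content": f"<system-reminder>{text}</system-reminder>"}
        for text in req.get("reminders", [])
    ]
    return {**req, "messages": [*req["messages"], *reminders]}


def run_pipeline(hooks: list[Hook], req: Request) -> Request:
    """Apply hooks in list order; each hook sees the previous hook's output."""
    for hook in hooks:
        req = hook(req)
    return req
```

Running compaction before reminder injection matters in this sketch: a reminder appended first could be trimmed away by the compaction step.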

Target AI Tools

Any LLM via OpenAI-compatible endpoint (Anthropic, OpenAI, DeepSeek, Qwen, Llama, Mistral)
LangChain/LangGraph integration isolated in hexagent.langchain module
Observability: LangSmith, Braintrust

Config Files

AGENTS.md — developer/agent instructions (read by Claude Code, Codex, etc.)
CLAUDE.md — project instructions
pyproject.toml — Python package config
libs/hexagent/hexagent/ — library source (no YAML/TOML config file for the harness itself; configured programmatically)

Components

Built-in Tools (12+)

From hexagent/tools/__init__.py:

Tool	Purpose
`BashTool`	Execute bash commands on the Computer
`ReadTool`	Read file contents with line numbers
`WriteTool`	Create or overwrite files
`EditTool`	Perform string replacements in files
`GlobTool`	Find files by pattern
`GrepTool`	Search for patterns in files
`WebSearchTool`	Search the web (Tavily/Brave backend)
`WebFetchTool`	Fetch and extract content from web pages (Jina/Firecrawl backend)
`SkillTool`	Invoke specialized skills by name
`AgentTool`	Spawn subagents (foreground)
`TaskOutputTool`	Return structured output from a subagent
`TaskStopTool`	Terminate a subagent
`TodoWriteTool`	Manage structured task lists
`PresentToUserTool`	Surface output directly to the user (Cowork mode)

Skills System

Discovery: Filesystem-based — looks for directories containing SKILL.md with YAML frontmatter
Loading: Lazy (on-demand, not all at once)
Format: SKILL.md with name, description, whenToUse frontmatter + markdown instructions
Scripts: Optional scripts/ directory for pre-built scripts referenced by SKILL.md
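
A rough sketch of filesystem discovery with lazy loading, assuming flat `key: value` frontmatter. `parse_frontmatter`, `SkillEntry`, and `discover_skills` are hypothetical names, and the hand-rolled parser is a stand-in for a real YAML parser; HexAgent's own catalog may work differently.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional


def parse_frontmatter(text: str) -> dict:
    """Parse flat `key: value` pairs between the leading '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


@dataclass
class SkillEntry:
    path: Path
    meta: dict
    _body: Optional[str] = field(default=None, repr=False)

    def body(self) -> str:
        """Read the Markdown instructions on first use, then cache them."""
        if self._body is None:
            text = self.path.read_text(encoding="utf-8")
            parts = text.split("---", 2)
            has_frontmatter = text.startswith("---") and len(parts) == 3
            self._body = (parts[2] if has_frontmatter else text).strip()
        return self._body


def discover_skills(root: Path) -> dict:
    """Index every SKILL.md under root by name; only metadata is retained."""
    catalog = {}
    for path in sorted(root.rglob("SKILL.md")):
        meta = parse_frontmatter(path.read_text(encoding="utf-8"))
        catalog[meta.get("name", path.parent.name)] = SkillEntry(path, meta)
    return catalog
```

The index holds only frontmatter, so the prompt can list skill names and descriptions cheaply while the full instructions are read only when a skill is invoked.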

Subagent Definition

Subagents are defined as AgentDefinition Python objects at the call site:

agents={
    "researcher": AgentDefinition(
        description="Deep-dives into codebases",
        tools=["Read", "Glob", "Grep", "WebSearch"],
        model="fast",
    ),
}

Not persona-md files — purely code-level definitions.

MCP Client

Protocols supported: stdio, SSE, HTTP
MCP servers added at create_agent() time via mcp_servers dict
HexAgent is an MCP consumer, not a bundled MCP server

Computer Abstraction

LocalNativeComputer — runs tools on the host machine
LocalVM — Lima (macOS) or WSL (Windows) sandboxed VM
RemoteE2BComputer — E2B cloud sandbox for multi-tenant production use

Middleware / Hooks

Permission gating: multi-layer safety rules before every tool call
System reminders: rule-based <system-reminder> injections before model calls
3-phase context compaction: automatic, built into the harness
Image adaptation: preprocesses image inputs for model compatibility

Commands

None. HexAgent has no slash commands and no CLAUDE.md behavioral hooks; the surface is purely a Python API.

Hooks (Claude Code lifecycle)

None. No settings.json hooks, no hooks/hooks.json. HexAgent is not a Claude Code plugin.

Templates

None in the traditional sense. The prompts/ directory contains 35+ Markdown fragments that compose the system prompt programmatically.

Prompts

Prompt Architecture

HexAgent builds its system prompt from 35+ Markdown fragments located in libs/hexagent/hexagent/prompts/. Variable substitution is applied at runtime (environment detection: pwd, git, platform, shell, timezone).

Verbatim Excerpt 1 — AGENTS.md (developer instructions, project-level)

## Architecture Principles

- **Testability:** Every module must be testable in isolation without complex mocks. If you can't write a fast, deterministic test for it, redesign the module—not the test.
- **Composability:** Prefer small, single-purpose units with explicit inputs and outputs over larger multi-purpose ones. No hidden state. Pieces should combine freely and fail locally.
- **Minimal Dependencies:** A change to module A should require understanding only module A. No implicit contracts, no action-at-a-distance, no "you also need to update X, Y, Z."
- **Agent-First:** Design tools and format results for agent ergonomics. Hide infrastructure complexity inside modules.
- **Simplicity:** Favor obvious solutions over clever ones. Code should be readable without context.
- **Idempotency:** Operations should be safely repeatable. Retries must not cause unintended side effects.

Technique: Principle enumeration. Each principle is named and paired with a rationale, with the anti-pattern left implicit. The text targets a coding agent reading AGENTS.md, not an end user. It closely resembles the "Iron Law" style from superpowers, but is expressed as architecture guidelines rather than behavioral rules.

Verbatim Excerpt 2 — API Design (create_agent call pattern)

agent = await create_agent(
    model="anthropic/claude-sonnet-4-20250514",
    computer=LocalNativeComputer(),
    agents={
        "researcher": AgentDefinition(
            description="Deep-dives into codebases",
            tools=["Read", "Glob", "Grep", "WebSearch"],
            model="fast",
        ),
    },
    mcp_servers={
        "github": {"type": "http", "url": "https://mcp.github.com/mcp"},
    },
    search_provider=("tavily", "your-key"),
    fetch_provider=("jina", "your-key"),
)

Technique: The create_agent() factory is the primary "prompt" surface — subagent roles, tool sets, and model assignments are expressed as Python kwargs, not markdown. This is code-as-prompt for harness configuration.

Prompt Composition Pattern

The prompts/ directory uses a fragment-assembly pattern: each fragment is a Markdown file covering one aspect (environment context, tools available, safety rules, etc.). At runtime, the harness assembles them with {variable} substitution for dynamic values (current directory, git status, platform, etc.). This is analogous to how claude-flow builds its system prompts from modular pieces, but entirely in Python without any MCP intermediary.
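
The fragment-assembly pattern can be sketched as follows. The function names, the file-name ordering convention, and the choice to leave unknown placeholders untouched are all assumptions made for illustration; the actual fragment names and substitution rules in HexAgent are not shown in the source.

```python
from pathlib import Path


class _KeepUnknown(dict):
    """Leaves placeholders without a value in place instead of raising."""

    def __missing__(self, key: str) -> str:
        return "{" + key + "}"


def load_fragments(directory: Path) -> list:
    """Read every Markdown fragment, ordered by file name."""
    return [p.read_text(encoding="utf-8") for p in sorted(directory.glob("*.md"))]


def compose_prompt(fragments: list, variables: dict) -> str:
    """Substitute {variable} placeholders and join fragments with blank lines.

    Literal braces inside a fragment would need doubling ({{ and }}),
    as with any str.format-style template.
    """
    values = _KeepUnknown(variables)
    return "\n\n".join(f.format_map(values).strip() for f in fragments)
```

In use, the variables dict would be filled from environment detection (current directory, platform, shell, and so on) just before each composition.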

No User-Facing Skill/Command Prompts

Unlike superpowers, BMAD, or spec-driver, HexAgent ships no CLAUDE.md-loadable skill files for end users. The behavioral "prompts" are baked into the library's Markdown fragments — not customizable without forking.

Uniqueness

differs_from_seeds

HexAgent is closest to claude-flow in that both ship a full agent runtime with subagents defined as code constructs rather than YAML or persona-md files. However, claude-flow bundles a 305-tool MCP server as its primary interface and stores state in SQLite + HNSW vector; HexAgent has no bundled MCP server and stores no persistent state — it is a client library. HexAgent's decisive architectural differentiator versus every seed is the Computer protocol: the abstraction that separates the agent's runtime from the machine it controls, which no seed implements. Compared to superpowers/spec-driver/BMAD (skills-only frameworks injected via CLAUDE.md), HexAgent has zero CLAUDE.md primitives — it is a Python library, not a Claude Code plugin. Compared to agent-os/claude-conductor (markdown scaffold frameworks), HexAgent provides actual execution infrastructure rather than template documents. The isolation story (LocalVM / RemoteE2BComputer) goes further than any seed.

Positioning

HexAgent positions itself as the developer SDK for building agent products (Claude Code clones, Cowork apps, autonomous agents, chatbots) rather than a workflow guide for using Claude Code. It targets Python developers building their own agent applications, not knowledge workers using Claude Code.

Observable Failure Modes

No behavioral defaults: No workflow enforcement, no TDD requirement, no spec-first gate. Entirely up to the developer.
0.0.x pre-experimental: Backward compatibility explicitly disclaimed. APIs can break at any commit.
No cross-session memory: Each invocation is stateless unless the developer explicitly persists and re-injects conversation history.
LangChain dependency under the hood: Despite the "vendor-agnostic core" claim, LangGraph is the execution engine (isolated in hexagent.langchain/ but present).
No audit log: No framework-maintained record of what the agent did or decided.
Demo app unclear: The hexagent_demo exists but there is no clear documentation of its UI surface or how to run it beyond the README mention.

Workflow

Phases

Phase	Description	Artifact
1. Agent creation	`create_agent(model=..., computer=...)` call configures runtime	`Agent` instance
2. System prompt composition	35+ Markdown fragments assembled with variable substitution	Composed system prompt
3. Permission gating setup	Safety rules registered for tool validation	Permission policy
4. Skill discovery	SKILL.md files indexed lazily from filesystem	Skill catalog
5. Task execution	`agent.ainvoke({"messages": [...]})` starts the agent loop	Messages + results
6. Per-tool-call validation	Middleware validates every tool call before execution	Tool results
7. Context compaction	3-phase automatic when context grows large	Compacted context
8. Subagent orchestration (if used)	Foreground (blocking) or background (parallel) child agents spawn	Subagent results

Approval Gates

Permission gating: Multi-layer rule-based validation before every tool call
Human-in-the-loop approval flows: Supported but not enforced by default — configured per-agent
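
One way to picture layered gating with an optional human approval step is shown below. The `Verdict` enum, `ToolCall` type, the two example rules, and `evaluate` are hypothetical; they are not HexAgent's permission API, only a sketch of rule layers evaluated before a tool call runs.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Verdict(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ASK = "ask"  # defer to a human approver


@dataclass(frozen=True)
class ToolCall:
    tool: str
    args: dict


Rule = Callable[[ToolCall], Verdict]


def deny_destructive_bash(call: ToolCall) -> Verdict:
    """Example layer: block an obviously destructive shell command."""
    if call.tool == "Bash" and "rm -rf" in call.args.get("command", ""):
        return Verdict.DENY
    return Verdict.ALLOW


def ask_before_write(call: ToolCall) -> Verdict:
    """Example layer: file modifications require approval."""
    return Verdict.ASK if call.tool in {"Write", "Edit"} else Verdict.ALLOW


def evaluate(
    call: ToolCall,
    layers: list,
    approver: Optional[Callable[[ToolCall], bool]] = None,
) -> bool:
    """Return True only if every layer allows the call.

    A DENY from any layer rejects immediately. An ASK is rejected unless
    an approver callback is configured and returns True.
    """
    for rule in layers:
        verdict = rule(call)
        if verdict is Verdict.DENY:
            return False
        if verdict is Verdict.ASK and (approver is None or not approver(call)):
            return False
    return True
```

Leaving `approver` unset mirrors the "supported but not enforced by default" behavior only loosely: in this sketch an unanswered ASK fails closed, which is a design choice of the example, not a documented HexAgent default.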

Execution Model

Programmatic: await agent.ainvoke(...) runs one asynchronous, one-shot invocation
Async context manager: async with await create_agent(...) as agent: — handles resource cleanup
No built-in REPL, no CLI binary (library-only)
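
The `async with await ...` shape can look unusual, so here is a self-contained stand-in that mimics it. `SketchAgent` and `create_sketch_agent` are hypothetical placeholders and do not call HexAgent. They only show why the factory is awaited first (it is a coroutine) and why the result is then entered as an async context manager (so cleanup runs on exit).

```python
import asyncio


class SketchAgent:
    """Stand-in agent that echoes input and records whether it was closed."""

    def __init__(self) -> None:
        self.closed = False

    async def __aenter__(self) -> "SketchAgent":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # A real harness would release sandboxes, MCP connections, etc. here.
        self.closed = True

    async def ainvoke(self, payload: dict) -> dict:
        last = payload["messages"][-1]["content"]
        reply = {"role": "assistant", "content": f"echo: {last}"}
        return {"messages": [*payload["messages"], reply]}


async def create_sketch_agent() -> SketchAgent:
    """Async factory: setup work may need awaiting before the agent exists."""
    await asyncio.sleep(0)
    return SketchAgent()


async def main():
    async with await create_sketch_agent() as agent:
        result = await agent.ainvoke(
            {"messages": [{"role": "user", "content": "hi"}]}
        )
    return agent, result
```

Calling `asyncio.run(main())` drives one invocation end to end; after the `async with` block exits, `agent.closed` is set even if `ainvoke` raised.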

Context Compaction

3-phase automatic:

Offload completed intermediate results to filesystem
Summarize older context
Truncate to fit context window

The README describes this as an "architectural concern, not an afterthought."

No Workflow Gate Enforced by Default

Unlike superpowers (which enforces brainstorming → spec → plan → TDD) or BMAD (which enforces persona-driven phases), HexAgent imposes no workflow phases. The developer decides the task; the harness executes it.

Memory & Context

Context Compaction

HexAgent implements 3-phase automatic compaction:

Offload: Completed intermediate results written to filesystem
Summarize: Older context summarized by the model
Truncate: Recent messages preserved verbatim; older content replaced with summaries

The README states: "Context is an architectural concern, not an afterthought." Compaction runs automatically when the harness detects context pressure — no user action required.
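
The three phases can be sketched in plain Python. This is an illustration under several assumptions: character counts stand in for tokens, the summarizer is a caller-supplied function rather than a model call, and every name (`offload`, `summarize`, `truncate`, `compact`) is invented here rather than taken from HexAgent.

```python
from pathlib import Path
from typing import Callable

Message = dict  # {"role": ..., "content": ...}


def offload(messages: list, directory: Path, max_chars: int = 200) -> list:
    """Phase 1: write large tool results to disk, leave a short pointer."""
    out = []
    for index, message in enumerate(messages):
        if message["role"] == "tool" and len(message["content"]) > max_chars:
            target = directory / f"result_{index}.txt"
            target.write_text(message["content"], encoding="utf-8")
            message = {**message, "content": f"[offloaded to {target.name}]"}
        out.append(message)
    return out


def summarize(messages: list, keep_recent: int,
              summarizer: Callable[[list], str]) -> list:
    """Phase 2: replace everything but the newest messages with a summary."""
    split = max(0, len(messages) - keep_recent)
    older, recent = messages[:split], messages[split:]
    if not older:
        return list(recent)
    return [{"role": "system", "content": summarizer(older)}, *recent]


def truncate(messages: list, budget_chars: int) -> list:
    """Phase 3: drop the oldest entries until the budget fits (keep >= 1)."""
    kept = list(messages)
    while len(kept) > 1 and sum(len(m["content"]) for m in kept) > budget_chars:
        kept.pop(0)
    return kept


def compact(messages: list, directory: Path,
            summarizer: Callable[[list], str],
            keep_recent: int = 2, budget_chars: int = 2000) -> list:
    """Run the three phases in order."""
    staged = offload(messages, directory)
    staged = summarize(staged, keep_recent, summarizer)
    return truncate(staged, budget_chars)
```

Ordering the phases from least to most lossy means truncation only discards content that has already been offloaded or summarized where possible.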

Memory Type

File-based — no built-in persistent memory store (SQLite, vector DB, Neo4j). The agent's working state lives in the current conversation context plus files written to the Computer's filesystem.

Cross-Session Handoff

No built-in cross-session state. Each create_agent() call starts fresh. Sessions can be continued by the developer by loading prior conversation history into the messages parameter.
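
Since persistence is left to the developer, a simple approach is to store the message list as JSON between runs. The helper names and the file layout below are assumptions for illustration; only the `{"messages": [...]}` payload shape comes from the invocation pattern shown elsewhere in this profile.

```python
import json
from pathlib import Path


def save_history(path: Path, messages: list) -> None:
    """Persist the conversation so a later process can resume it."""
    path.write_text(json.dumps(messages, ensure_ascii=False, indent=2),
                    encoding="utf-8")


def load_history(path: Path) -> list:
    """Return the stored messages, or an empty list for a new session."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


def resume_payload(path: Path, user_input: str) -> dict:
    """Build an invocation payload: prior history plus the new user turn."""
    history = load_history(path)
    return {"messages": [*history, {"role": "user", "content": user_input}]}
```

The payload returned by `resume_payload` is what would be handed to the agent's invoke call; saving the updated message list after each run closes the loop.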

Skill Memory

Skills are discovered at launch from the filesystem. The SkillCatalog protocol supports lazy loading — skills are indexed at startup but content loaded on demand.

State Files

Files written by agent tools to the Computer's filesystem
No framework-maintained state files (no tasks.json, no journal.md)

System Reminders

The <system-reminder> mechanism injects rule-based context before model calls. Rules are defined programmatically at harness configuration time. This is the closest HexAgent comes to persistent behavioral context.

Observability

LangSmith: Full trace integration (spans, token counts, latency)
Braintrust: Alternative trace backend
Both are optional; neither is bundled

Orchestration

Multi-Agent Support

Yes. HexAgent supports subagent orchestration natively.

Spawn mechanisms:

Foreground subagents (AgentTool): block the parent's turn, return a result
Background subagents: run in parallel with the parent

Definition format: Python AgentDefinition objects passed to create_agent(). Not persona-md files. Each definition specifies description, tools list (subset of parent's toolset), and optional model override.

Nesting: Subagents can themselves spawn sub-subagents (depth limit: not stated in README).

Orchestration Pattern

parallel-fan-out — background subagents run concurrently; foreground subagents are sequential (blocking). No formal consensus mechanism.
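
The difference between blocking and background children maps naturally onto asyncio. In this sketch `run_subagent` is a stand-in that sleeps instead of calling a model, and `parent_turn` is a hypothetical parent loop; neither reflects HexAgent's internal scheduling.

```python
import asyncio


async def run_subagent(name: str, task: str, delay: float = 0.01) -> str:
    """Stand-in child agent: waits briefly, then reports completion."""
    await asyncio.sleep(delay)
    return f"{name}: done {task}"


async def parent_turn() -> list:
    # Background children: scheduled now, they run while the parent continues.
    background = [
        asyncio.create_task(run_subagent("indexer", f"shard-{i}"))
        for i in range(3)
    ]
    # Foreground child: the parent waits for this result before moving on.
    foreground = await run_subagent("researcher", "read README")
    # Fan-in: gather returns results in the order the tasks were passed.
    results = await asyncio.gather(*background)
    return [foreground, *results]
```

There is no voting or reconciliation step here; the parent simply receives the list and decides what to do with it, which matches the "no formal consensus mechanism" note.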

Isolation Mechanism

The Computer protocol provides isolation:

LocalNativeComputer: none (tools run directly in the host shell)
LocalVM: Lima (macOS) / WSL (Windows), VM-level isolation
RemoteE2BComputer: E2B cloud sandbox — full container isolation

The harness runtime itself is always isolated from the agent's Computer — the agent cannot read the harness's API keys or source code.

Multi-Model

Yes. The model parameter accepts any string; subagents can use a different model than the parent. The "fast" shorthand selects a cheaper/faster model automatically.

AgentDefinition(
    description="...",
    tools=["Read", "Glob"],
    model="fast",  # cheaper model for subagents
)
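
How a shorthand such as "fast" resolves is not documented in the material reviewed here. One plausible reading is a plain alias table, sketched below. The `resolve_model` function and the alias target `example/small-model` are hypothetical.

```python
from typing import Optional


def resolve_model(requested: Optional[str], parent_model: str,
                  aliases: dict) -> str:
    """Pick the model for a subagent.

    No override means inherit the parent's model; a known alias maps to
    its configured target; any other string is used as given.
    """
    if requested is None:
        return parent_model
    return aliases.get(requested, requested)


# Hypothetical alias table; the real mapping is not stated in the source.
ALIASES = {"fast": "example/small-model"}
```

Under this reading, `resolve_model("fast", parent, ALIASES)` returns the alias target, while an explicit model string passes through unchanged.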

Execution Mode

Event-driven / one-shot per invocation (await agent.ainvoke(...)). No built-in daemon mode, no background loop that survives process exit. Each call is a discrete session.

Consensus Mechanism

None. Subagent results are collected by the parent and synthesized in the parent's next model call.

Prompt Chaining

Yes — each subagent's result is passed back into the parent's conversation history, forming a chain.

Auto-Validators

None built-in. No PostToolUse hooks that auto-run tests/lint. Validation is the developer's responsibility.

UI / CLI Surface

CLI Binary

None. HexAgent is a Python library consumed programmatically. There is no hexagent CLI binary.

The libs/hexagent_demo/ package ships a demo application with Chat and Cowork modes, but it's an example app, not a production CLI tool.

Local UI / Dashboard

None in the core library. The hexagent_demo package ships a demo app (its UI form is unclear from the available files), and no web dashboard is bundled in the core hexagent package.

IDE Integration

None. No VSCode extension, no Cursor plugin, no Claude Code plugin.

How Users Interact

Programmatic Python only:

async with await create_agent(model="...", computer=LocalNativeComputer()) as agent:
    result = await agent.ainvoke({"messages": [{"role": "user", "content": "..."}]})

Observability

LangSmith (optional): full LangChain trace integration — spans, token counts, latency per call
Braintrust (optional): alternative trace backend
No built-in metrics endpoint or dashboard

Install Surface

pip install hexagent

Single-line install. No config wizard, no scaffolding command.

Target Integrations

Any HTTP-based LLM API (OpenAI-compatible endpoint)
MCP servers (consumed, not bundled)
E2B (cloud sandbox provider)
Tavily / Brave (web search)
Jina / Firecrawl (web fetch)
LangSmith / Braintrust (observability)

Related frameworks

same archetype · same primary tool · same memory type

claude-mem (thedotmack) ★ 78k

A8 Cross-runtime harness

Background worker service captures every tool call as an observation, AI-compresses sessions, and auto-injects relevant past…

pi (badlogic/earendil) ★ 55k

A8 Cross-runtime harness

A minimal, hackable, multi-provider terminal coding agent that adapts to your workflows via npm-installable TypeScript Extensions…

Agent Skills (Addy Osmani) ★ 46k

A8 Cross-runtime harness

Encodes senior-engineer software development lifecycle as 23 auto-routed skills and 7 slash commands for any AI coding agent.

wshobson/agents Plugin Marketplace ★ 36k

A8 Cross-runtime harness

Single Markdown source for 83 domain-specialized plugins that auto-generates idiomatic artifacts for five AI coding harnesses.

TabbyML/Tabby ★ 34k

A8 Cross-runtime harness

Self-hosted AI coding assistant server (alternative to GitHub Copilot) with admin dashboard, RAG-based completions, and multi-IDE…

Compound Engineering ★ 17k

A8 Cross-runtime harness

Make each unit of engineering work compound into easier future work via brainstorm→plan→execute→review→learn cycles.

Distribution

Type: standalone-repo
License: MIT
Install: one-liner
Version: 0.0.x (pre-experimental)

Surfaces

CLI binary: No
CLI subcmds: 0
Local UI: No
Tech stack: none (hexagent_demo exists but is an example app, not a bundled dashboard)

Components

Commands: 0
Skills: 0
Subagents: 0
Hooks: 0
MCP servers: 0
MCP tools: 0
Scripts: 2
Templates: 0

Workflow

Phases: 8
Approval gates: 1
Spec format: none
Spec storage: none
Delta or full: none

Orchestration

Multi-agent: Yes
Pattern: parallel-fan-out
Isolation: container
Consensus: none
Prompt chaining: Yes

Multi-model

Multi-model: Yes
BYOK: Yes
Modal: text+vision

Execution

Mode: one-shot
Crash recovery: No
Compaction: Yes
Session handoff: No
Streaming: Yes

Memory

Type: file-based
Persistence: session
Search: none

Quality

TDD: No
TDD mechanism: none
Self-review: none

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: No
Audit format: none
Replay: No

Tools

Primary: python-library
Targets: 4
Portability: high

Signals

Stars: 122
Last commit: 2026-05-19
Maintainer: active
Quality score: 3/10