HexAgent

hexagent · UnicomAI/hexagent · ★ 122 · last commit 2026-05-19

Gives any LLM a computer via a runtime-computer isolation protocol — the harness never shares its keys or config with the agent.

Best when: The agent should never see its own harness; Computer protocol isolation is not optional, it is the foundational design constraint.
Skip if: Conflating agent runtime with the computer it controls, Clever over obvious code
vs seeds
claude-flow: in that subagents are defined as code-class constructs (AgentDefinition Python objects) rather than persona-md files, an…
Primitive shape
No installable primitives
00

Summary

HexAgent — Summary

HexAgent is an open-source Python agent harness library from UnicomAI that explicitly separates the agent's runtime from the "computer" it operates on, preventing the agent from reading its own API keys, source code, or harness configuration. Rather than assembling an agent from building blocks, HexAgent provides a batteries-included operating environment: 12+ built-in tools, 3-phase automatic context compaction, a pluggable Computer protocol (local shell, local VM via Lima/WSL, or cloud sandbox via E2B), native MCP support across stdio/SSE/HTTP, and subagent orchestration supporting both foreground (blocking) and background (parallel) child agents. A single create_agent() call produces an agent that can run as a CLI coding agent, chatbot, Cowork-style desktop assistant, or fully autonomous headless agent. HexAgent explicitly positions itself against LangChain Deep Agents and the Claude Agent SDK, claiming the key differentiator is the runtime-computer isolation boundary. Compared to seeds, HexAgent most resembles claude-flow's MCP-anchored approach (own runtime, subagents as code-classes) but abandons the slash-command/skill-md surface entirely — the API is pure Python and there are no CLAUDE.md-injected behavioral rules; the whole framework is a composable library.

differs_from_seeds: Closest to claude-flow in that it ships a full runtime with subagents defined as code-level constructs (AgentDefinition Python objects), not persona-md files. Unlike claude-flow's 305-tool MCP server, HexAgent's MCP support is a client (consume any MCP server); there is no bundled MCP server to install. Unlike superpowers or spec-driver (skills-only, Claude Code context injection), HexAgent has no CLAUDE.md primitives whatsoever — the harness is a Python library consumed programmatically, not a Claude Code plugin.

01

Overview

HexAgent — Overview

Origin

HexAgent is developed under UnicomAI (GitHub: UnicomAI/hexagent). The README cites inspiration from Adam Wolff's QCon 2025 talk on Claude Code's architecture and a growing consensus around the "harness" vs "framework" distinction articulated in an Inngest blog post ("your agent needs a harness, not a framework"). The project targets Python 3.11+, is at pre-experimental status (0.0.x), and uses uv for dependency management.

Philosophy

The manifesto centers on three words: "Give the agent a computer."

Key philosophical positions (verbatim from README):

"Unlike every other agent framework, HexAgent separates the agent runtime from the computer it operates on. Your agent gets a sandboxed machine; your runtime keeps its API keys, config, and source code private."

"A framework gives you building blocks and says 'assemble your own agent.' A harness gives the agent a fully equipped runtime — tools, context management, safety, execution environments — so you focus on what the agent does, not how it executes."

"Composable, not magical — Small modules with explicit I/O. No hidden state. Every piece is testable and replaceable."

The AGENTS.md (developer instructions) adds:

"Testability: Every module must be testable in isolation without complex mocks." "Idempotency: Operations should be safely repeatable. Retries must not cause unintended side effects."

Product Types Supported

From one harness:

  1. CLI Coding Agent — terminal-native, analogous to Claude Code / Gemini CLI
  2. Chatbot — conversational assistant with tool use
  3. Cowork — desktop agent working on local files (analogous to Claude Cowork)
  4. Autonomous Agent — headless, long-running

Taxonomy Position

HexAgent explicitly uses the LangChain taxonomy (Framework / Runtime / Harness) and places itself in the Harness tier alongside Claude Code and OpenHands — not the Framework tier (LangChain, CrewAI) or Runtime tier (LangGraph, Temporal).

02

Architecture

HexAgent — Architecture

Distribution

  • Type: Python library (pip install hexagent)
  • License: MIT
  • Runtime: Python 3.11+, uv recommended
  • Repository structure: monorepo with libs/hexagent (core) and libs/hexagent_demo (demo app)

Directory Tree (core library)

libs/hexagent/hexagent/
├── computer/       # Computer protocol — LocalNativeComputer, LocalVM, RemoteE2BComputer
├── harness/        # Runtime augmentation: environment, permissions, skills, reminders
├── tools/          # Built-in tools: CLI (Bash, Read, Write, Edit, Glob, Grep), web, subagents, skills, todos
├── prompts/        # Composable system prompt from 35+ Markdown fragments with variable substitution
├── mcp/            # MCP client (stdio, SSE, HTTP)
├── langchain/      # LangChain/LangGraph integration (isolated — zero leakage into core)
└── types.py        # Framework-agnostic types: ToolResult, AgentContext, CLIResult

Computer Protocol (key differentiator)

Three implementations:

| Computer | Environment | Use case |
| --- | --- | --- |
| LocalNativeComputer | Host shell | Development, trusted agents |
| LocalVM | Lima (macOS) / WSL (Windows) | Security-sensitive work |
| RemoteE2BComputer | E2B cloud sandbox | Production, multi-tenant, CI/CD |

The Computer protocol is pluggable — implement your own for Docker, Kubernetes pods, or any remote target.
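
As a rough illustration of what "pluggable" could mean in practice, the sketch below defines a minimal structural interface and two interchangeable backends. The names `Computer`, `CommandResult`, `SubprocessComputer`, `RecordingComputer`, and the `run` signature are assumptions made for this example, not HexAgent's actual API; the real protocol may expose different methods.

```python
import subprocess
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@dataclass(frozen=True)
class CommandResult:
    """Hypothetical result record (not HexAgent's real CLIResult)."""
    exit_code: int
    stdout: str
    stderr: str


@runtime_checkable
class Computer(Protocol):
    """Hypothetical interface: any object with a matching run() qualifies."""

    def run(self, argv: list[str], timeout: float = 30.0) -> CommandResult: ...


class SubprocessComputer:
    """Executes commands directly on the host machine."""

    def run(self, argv: list[str], timeout: float = 30.0) -> CommandResult:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
        return CommandResult(proc.returncode, proc.stdout, proc.stderr)


class RecordingComputer:
    """Test double: records commands instead of executing them."""

    def __init__(self) -> None:
        self.calls: list[list[str]] = []

    def run(self, argv: list[str], timeout: float = 30.0) -> CommandResult:
        self.calls.append(argv)
        return CommandResult(0, "", "")


def run_tool(computer: Computer, argv: list[str]) -> str:
    """A tool depends only on the interface, so backends can be swapped freely."""
    result = computer.run(argv)
    if result.exit_code != 0:
        raise RuntimeError(result.stderr.strip())
    return result.stdout
```

A Docker or Kubernetes backend would, under the same assumption, only need to provide its own `run` method; the tool code would not change.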

Middleware Pipeline

Pre-model hooks in order:

  1. Context compaction (3-phase automatic)
  2. Permission gating (multi-layer safety validation before tool execution)
  3. Skill injection (filesystem-based SKILL.md extensions, lazy loading)
  4. Image adaptation
  5. Dynamic <system-reminder> injection
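
The ordering above can be pictured as a list of functions applied left to right. This is a minimal sketch, assuming each hook takes and returns a request object; `ModelRequest`, `named_hook`, and `run_pre_model_hooks` are hypothetical names, and the placeholder hooks only record that they ran rather than doing real work.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ModelRequest:
    """Hypothetical container for what will be sent to the model."""
    messages: list[dict[str, str]]
    trace: list[str] = field(default_factory=list)


Hook = Callable[[ModelRequest], ModelRequest]


def named_hook(name: str) -> Hook:
    """Build a placeholder hook that only records that it ran."""
    def hook(request: ModelRequest) -> ModelRequest:
        request.trace.append(name)
        return request
    return hook


# Order mirrors the list above; a real hook would transform the request.
PIPELINE: list[Hook] = [
    named_hook("context_compaction"),
    named_hook("permission_gating"),
    named_hook("skill_injection"),
    named_hook("image_adaptation"),
    named_hook("system_reminder_injection"),
]


def run_pre_model_hooks(
    request: ModelRequest, hooks: Optional[list[Hook]] = None
) -> ModelRequest:
    """Apply hooks in order; the output of one feeds the next."""
    for hook in PIPELINE if hooks is None else hooks:
        request = hook(request)
    return request
```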

Target AI Tools

  • Any LLM via OpenAI-compatible endpoint (Anthropic, OpenAI, DeepSeek, Qwen, Llama, Mistral)
  • LangChain/LangGraph integration isolated in hexagent.langchain module
  • Observability: LangSmith, Braintrust

Config Files

  • AGENTS.md — developer/agent instructions (read by Claude Code, Codex, etc.)
  • CLAUDE.md — project instructions
  • pyproject.toml — Python package config
  • libs/hexagent/hexagent/ — library source (no YAML/TOML config file for the harness itself; configured programmatically)
03

Components

HexAgent — Components

Built-in Tools (12+)

From hexagent/tools/__init__.py:

| Tool | Purpose |
| --- | --- |
| BashTool | Execute bash commands on the Computer |
| ReadTool | Read file contents with line numbers |
| WriteTool | Create or overwrite files |
| EditTool | Perform string replacements in files |
| GlobTool | Find files by pattern |
| GrepTool | Search for patterns in files |
| WebSearchTool | Search the web (Tavily/Brave backend) |
| WebFetchTool | Fetch and extract content from web pages (Jina/Firecrawl backend) |
| SkillTool | Invoke specialized skills by name |
| AgentTool | Spawn subagents (foreground) |
| TaskOutputTool | Return structured output from a subagent |
| TaskStopTool | Terminate a subagent |
| TodoWriteTool | Manage structured task lists |
| PresentToUserTool | Surface output directly to the user (Cowork mode) |

Skills System

  • Discovery: Filesystem-based — looks for directories containing SKILL.md with YAML frontmatter
  • Loading: Lazy (on-demand, not all at once)
  • Format: SKILL.md with name, description, whenToUse frontmatter + markdown instructions
  • Scripts: Optional scripts/ directory for pre-built scripts referenced by SKILL.md
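
The discovery and lazy-loading behaviour described above might look roughly like the following. This is a sketch under stated assumptions: `SkillEntry`, `split_frontmatter`, and `discover_skills` are hypothetical names, and the frontmatter parser handles only flat `key: value` lines rather than full YAML.

```python
from dataclasses import dataclass
from pathlib import Path


def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split '---' delimited frontmatter from the body (flat key: value only)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta: dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return meta, "\n".join(lines[i + 1:])
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}, text  # no closing delimiter: treat as having no frontmatter


@dataclass(frozen=True)
class SkillEntry:
    """Index record: only metadata is kept; the body is re-read on demand."""
    name: str
    description: str
    path: Path

    def load_body(self) -> str:
        """Read the Markdown instructions below the frontmatter."""
        _, body = split_frontmatter(self.path.read_text(encoding="utf-8"))
        return body.strip()


def discover_skills(root: Path) -> dict[str, SkillEntry]:
    """Index every directory under root that contains a SKILL.md file."""
    catalog: dict[str, SkillEntry] = {}
    for skill_file in sorted(root.rglob("SKILL.md")):
        meta, _ = split_frontmatter(skill_file.read_text(encoding="utf-8"))
        name = meta.get("name", skill_file.parent.name)
        catalog[name] = SkillEntry(name, meta.get("description", ""), skill_file)
    return catalog
```

Keeping only the name and description in the index means the model can be told which skills exist without paying the context cost of every skill's full instructions.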

Subagent Definition

Subagents are defined as AgentDefinition Python objects at call site:

agents={
    "researcher": AgentDefinition(
        description="Deep-dives into codebases",
        tools=["Read", "Glob", "Grep", "WebSearch"],
        model="fast",
    ),
}

Not persona-md files — purely code-level definitions.

MCP Client

  • Protocols supported: stdio, SSE, HTTP
  • MCP servers added at create_agent() time via mcp_servers dict
  • HexAgent is an MCP consumer, not a bundled MCP server

Computer Abstraction

  • LocalNativeComputer — runs tools on the host machine
  • LocalVM — Lima (macOS) or WSL (Windows) sandboxed VM
  • RemoteE2BComputer — E2B cloud sandbox for multi-tenant production use

Middleware / Hooks

  • Permission gating: multi-layer safety rules before every tool call
  • System reminders: rule-based <system-reminder> injections before model calls
  • 3-phase context compaction: automatic, built into the harness
  • Image adaptation: preprocesses image inputs for model compatibility

Commands

None. HexAgent has no slash-commands, no CLAUDE.md behavioral hooks. The surface is purely Python API.

Hooks (Claude Code lifecycle)

None. No settings.json hooks, no hooks/hooks.json. HexAgent is not a Claude Code plugin.

Templates

None in the traditional sense. The prompts/ directory contains 35+ Markdown fragments that compose the system prompt programmatically.

05

Prompts

HexAgent — Prompts

Prompt Architecture

HexAgent builds its system prompt from 35+ Markdown fragments located in libs/hexagent/hexagent/prompts/. Variable substitution is applied at runtime (environment detection: pwd, git, platform, shell, timezone).

Verbatim Excerpt 1 — AGENTS.md (developer instructions, project-level)

## Architecture Principles

- **Testability:** Every module must be testable in isolation without complex mocks. If you can't write a fast, deterministic test for it, redesign the module—not the test.
- **Composability:** Prefer small, single-purpose units with explicit inputs and outputs over larger multi-purpose ones. No hidden state. Pieces should combine freely and fail locally.
- **Minimal Dependencies:** A change to module A should require understanding only module A. No implicit contracts, no action-at-a-distance, no "you also need to update X, Y, Z."
- **Agent-First:** Design tools and format results for agent ergonomics. Hide infrastructure complexity inside modules.
- **Simplicity:** Favor obvious solutions over clever ones. Code should be readable without context.
- **Idempotency:** Operations should be safely repeatable. Retries must not cause unintended side effects.

Technique: Principle enumeration — each principle named + rationale + anti-pattern implicit. Targeted at a coding agent reading AGENTS.md, not a user. Very similar to "Iron Law" style from superpowers but expressed as architecture guidelines rather than behavioral rules.

Verbatim Excerpt 2 — API Design (create_agent call pattern)

agent = await create_agent(
    model="anthropic/claude-sonnet-4-20250514",
    computer=LocalNativeComputer(),
    agents={
        "researcher": AgentDefinition(
            description="Deep-dives into codebases",
            tools=["Read", "Glob", "Grep", "WebSearch"],
            model="fast",
        ),
    },
    mcp_servers={
        "github": {"type": "http", "url": "https://mcp.github.com/mcp"},
    },
    search_provider=("tavily", "your-key"),
    fetch_provider=("jina", "your-key"),
)

Technique: The create_agent() factory is the primary "prompt" surface — subagent roles, tool sets, and model assignments are expressed as Python kwargs, not markdown. This is code-as-prompt for harness configuration.

Prompt Composition Pattern

The prompts/ directory uses a fragment-assembly pattern: each fragment is a Markdown file covering one aspect (environment context, tools available, safety rules, etc.). At runtime, the harness assembles them with {variable} substitution for dynamic values (current directory, git status, platform, etc.). This is analogous to how claude-flow builds its system prompts from modular pieces, but entirely in Python without any MCP intermediary.
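
A fragment-assembly step of this kind can be sketched in a few lines. The function names and the two detected variables below are assumptions for illustration; the `{name}` placeholder syntax follows the description above, and unknown placeholders are deliberately left untouched so that braces in code samples inside a fragment do not raise errors.

```python
import os
import platform
import re

_PLACEHOLDER = re.compile(r"\{(\w+)\}")


def render_fragment(fragment: str, variables: dict[str, str]) -> str:
    """Replace {name} placeholders; unknown names are left as-is."""
    def substitute(match: re.Match) -> str:
        return variables.get(match.group(1), match.group(0))
    return _PLACEHOLDER.sub(substitute, fragment)


def compose_system_prompt(fragments: list[str], variables: dict[str, str]) -> str:
    """Render each fragment and join them with blank lines, preserving order."""
    rendered = (render_fragment(f, variables).strip() for f in fragments)
    return "\n\n".join(part for part in rendered if part)


def detect_environment() -> dict[str, str]:
    """Collect a small subset of the runtime values a harness might detect."""
    return {"cwd": os.getcwd(), "platform": platform.system()}
```

In use, the fragments would be read from Markdown files and passed in order, for example `compose_system_prompt(fragments, detect_environment())`.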

No User-Facing Skill/Command Prompts

Unlike superpowers, BMAD, or spec-driver, HexAgent ships no CLAUDE.md-loadable skill files for end users. The behavioral "prompts" are baked into the library's Markdown fragments — not customizable without forking.

09

Uniqueness

HexAgent — Uniqueness

differs_from_seeds

HexAgent is closest to claude-flow in that both ship a full agent runtime with subagents defined as code constructs rather than YAML or persona-md files. However, claude-flow bundles a 305-tool MCP server as its primary interface and stores state in SQLite + HNSW vector; HexAgent has no bundled MCP server and stores no persistent state — it is a client library. HexAgent's decisive architectural differentiator versus every seed is the Computer protocol: the abstraction that separates the agent's runtime from the machine it controls, which no seed implements. Compared to superpowers/spec-driver/BMAD (skills-only frameworks injected via CLAUDE.md), HexAgent has zero CLAUDE.md primitives — it is a Python library, not a Claude Code plugin. Compared to agent-os/claude-conductor (markdown scaffold frameworks), HexAgent provides actual execution infrastructure rather than template documents. The isolation story (LocalVM / RemoteE2BComputer) goes further than any seed.

Positioning

HexAgent positions itself as the developer SDK for building agent products (Claude Code clones, Cowork apps, autonomous agents, chatbots) rather than a workflow guide for using Claude Code. It targets Python developers building their own agent applications, not knowledge workers using Claude Code.

Observable Failure Modes

  • No behavioral defaults: No workflow enforcement, no TDD requirement, no spec-first gate. Entirely up to the developer.
  • 0.0.x pre-experimental: Backward compatibility explicitly disclaimed. APIs can break at any commit.
  • No cross-session memory: Each invocation is stateless unless the developer explicitly persists and re-injects conversation history.
  • LangChain dependency under the hood: Despite the "vendor-agnostic core" claim, LangGraph is the execution engine (isolated in hexagent.langchain/ but present).
  • No audit log: No framework-maintained record of what the agent did or decided.
  • Demo app unclear: The hexagent_demo exists but there is no clear documentation of its UI surface or how to run it beyond the README mention.
04

Workflow

HexAgent — Workflow

Phases

| Phase | Description | Artifact |
| --- | --- | --- |
| 1. Agent creation | create_agent(model=..., computer=...) call configures runtime | Agent instance |
| 2. System prompt composition | 35+ Markdown fragments assembled with variable substitution | Composed system prompt |
| 3. Permission gating setup | Safety rules registered for tool validation | Permission policy |
| 4. Skill discovery | SKILL.md files indexed lazily from filesystem | Skill catalog |
| 5. Task execution | agent.ainvoke({"messages": [...]}) starts the agent loop | Messages + results |
| 6. Per-tool-call validation | Middleware validates every tool call before execution | Tool results |
| 7. Context compaction | 3-phase automatic when context grows large | Compacted context |
| 8. Subagent orchestration (if used) | Foreground (blocking) or background (parallel) child agents spawn | Subagent results |

Approval Gates

  • Permission gating: Multi-layer rule-based validation before every tool call
  • Human-in-the-loop approval flows: Supported but not enforced by default — configured per-agent
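
One way to read "multi-layer rule-based validation" is as an ordered list of rules where the first rule with an opinion decides. The sketch below is an assumption about the shape of such a gate, not HexAgent's permission API: `Decision`, `ToolCall`, the three example rules, and `evaluate` are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ASK = "ask"  # defer to a human approval flow


@dataclass(frozen=True)
class ToolCall:
    tool: str
    args: dict


# A rule returns a Decision, or None when it has no opinion about the call.
Rule = Callable[[ToolCall], Optional[Decision]]


def deny_destructive_shell(call: ToolCall) -> Optional[Decision]:
    command = call.args.get("command", "")
    if call.tool == "Bash" and "rm -rf" in command:
        return Decision.DENY
    return None


def ask_before_writes(call: ToolCall) -> Optional[Decision]:
    return Decision.ASK if call.tool in {"Write", "Edit"} else None


def allow_read_only(call: ToolCall) -> Optional[Decision]:
    return Decision.ALLOW if call.tool in {"Read", "Glob", "Grep"} else None


def evaluate(
    call: ToolCall, layers: list[Rule], default: Decision = Decision.ASK
) -> Decision:
    """Walk the layers in order; the first rule with an opinion decides."""
    for rule in layers:
        decision = rule(call)
        if decision is not None:
            return decision
    return default
```

Defaulting to `ASK` when no rule matches is a conservative choice for this sketch; a real harness might default to deny or allow depending on the deployment.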

Execution Model

  • Programmatic: await agent.ainvoke(...), an asynchronous one-shot call that must run inside an event loop
  • Async context manager: async with await create_agent(...) as agent: — handles resource cleanup
  • No built-in REPL, no CLI binary (library-only)

Context Compaction

3-phase automatic:

  1. Offload completed intermediate results to filesystem
  2. Summarize older context
  3. Truncate to fit context window

The README describes this as an "architectural concern, not an afterthought."

No Workflow Gate Enforced by Default

Unlike superpowers (which enforces brainstorming → spec → plan → TDD) or BMAD (which enforces persona-driven phases), HexAgent imposes no workflow phases. The developer decides the task; the harness executes it.

06

Memory Context

HexAgent — Memory & Context

Context Compaction

HexAgent implements 3-phase automatic compaction:

  1. Offload: Completed intermediate results written to filesystem
  2. Summarize: Older context summarized by the model
  3. Truncate: Recent messages preserved verbatim; older content replaced with summaries

The README states: "Context is an architectural concern, not an afterthought." Compaction runs automatically when the harness detects context pressure — no user action required.
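
To make the three phases concrete, here is a minimal sketch that runs them in order and stops as soon as the context fits a budget. Everything in it is an assumption for illustration: the function names, the four-characters-per-token estimate, the thresholds, and the placeholder summarizer (a real harness would ask the model to summarize).

```python
from pathlib import Path

Message = dict[str, str]


def estimate_tokens(messages: list[Message]) -> int:
    """Crude size estimate: roughly four characters per token."""
    return sum(len(m["content"]) for m in messages) // 4


def offload(messages: list[Message], out_dir: Path, max_chars: int = 200) -> list[Message]:
    """Phase 1: move oversized tool results to files, leaving a pointer behind."""
    compacted = []
    for i, m in enumerate(messages):
        if m["role"] == "tool" and len(m["content"]) > max_chars:
            path = out_dir / f"tool_result_{i}.txt"
            path.write_text(m["content"], encoding="utf-8")
            m = {"role": "tool", "content": f"[offloaded to {path.name}]"}
        compacted.append(m)
    return compacted


def summarize(messages: list[Message], keep_recent: int, summarizer) -> list[Message]:
    """Phase 2: replace all but the most recent messages with one summary."""
    if keep_recent < 1 or len(messages) <= keep_recent:
        return messages
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    return [{"role": "system", "content": summarizer(older)}] + recent


def truncate(messages: list[Message], budget: int) -> list[Message]:
    """Phase 3: drop the oldest messages until the estimate fits the budget."""
    while len(messages) > 1 and estimate_tokens(messages) > budget:
        messages = messages[1:]
    return messages


def compact(messages, out_dir: Path, budget: int, keep_recent: int = 4, summarizer=None):
    """Run the phases in order, stopping as soon as the context fits."""
    summarizer = summarizer or (lambda older: f"[summary of {len(older)} earlier messages]")
    phases = (
        lambda ms: offload(ms, out_dir),
        lambda ms: summarize(ms, keep_recent, summarizer),
        lambda ms: truncate(ms, budget),
    )
    for phase in phases:
        if estimate_tokens(messages) <= budget:
            break
        messages = phase(messages)
    return messages
```

Ordering the phases from least to most lossy means a large tool result is moved to disk, where it can still be re-read, before any conversation content is summarized or dropped.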

Memory Type

File-based — no built-in persistent memory store (SQLite, vector DB, Neo4j). The agent's working state lives in the current conversation context plus files written to the Computer's filesystem.

Cross-Session Handoff

No built-in cross-session state. Each create_agent() call starts fresh. Sessions can be continued by the developer by loading prior conversation history into the messages parameter.
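
Since session continuity is left to the developer, a minimal approach is to persist the message list as JSON and prepend it to the next request. The helper names below are hypothetical; only the `{"messages": [...]}` payload shape and the role/content message format come from the examples in this document.

```python
import json
from pathlib import Path

Message = dict[str, str]


def save_history(path: Path, messages: list[Message]) -> None:
    """Write the conversation as JSON via a temp file, then swap it into place."""
    tmp = path.with_suffix(path.suffix + ".tmp")
    tmp.write_text(json.dumps(messages, ensure_ascii=False, indent=2), encoding="utf-8")
    tmp.replace(path)


def load_history(path: Path) -> list[Message]:
    """Return the saved conversation, or an empty list for a first session."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


def next_request(path: Path, user_input: str) -> dict[str, list[Message]]:
    """Build the payload for the next call: prior history plus the new user turn."""
    messages = load_history(path) + [{"role": "user", "content": user_input}]
    return {"messages": messages}
```

The returned dictionary could then be passed to `agent.ainvoke(...)`, and the resulting messages saved again with `save_history` once the call completes.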

Skill Memory

Skills are discovered at launch from the filesystem. The SkillCatalog protocol supports lazy loading — skills are indexed at startup but content loaded on demand.

State Files

  • Files written by agent tools to the Computer's filesystem
  • No framework-maintained state files (no tasks.json, no journal.md)

System Reminders

The <system-reminder> mechanism injects rule-based context before model calls. Rules are defined programmatically at harness configuration time. This is the closest HexAgent comes to persistent behavioral context.
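
A rule-based injector of this kind could be as simple as a list of predicate and text pairs evaluated before each model call. This is a sketch, assuming reminders are appended as extra messages; `ReminderRule`, `inject_reminders`, and the choice of the "user" role for the injected message are assumptions, not confirmed HexAgent behaviour.

```python
from dataclasses import dataclass
from typing import Callable

Message = dict[str, str]


@dataclass(frozen=True)
class ReminderRule:
    """Hypothetical rule: when the predicate holds, inject the text."""
    applies: Callable[[list[Message]], bool]
    text: str


def inject_reminders(messages: list[Message], rules: list[ReminderRule]) -> list[Message]:
    """Return a new list with one wrapped reminder per matching rule appended."""
    reminders = [
        {"role": "user", "content": f"<system-reminder>{rule.text}</system-reminder>"}
        for rule in rules
        if rule.applies(messages)
    ]
    return messages + reminders


# Example rule: nudge the agent once the conversation grows long.
LONG_CONVERSATION = ReminderRule(
    applies=lambda messages: len(messages) >= 3,
    text="Review the todo list before continuing.",
)
```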

Observability

  • LangSmith: Full trace integration (spans, token counts, latency)
  • Braintrust: Alternative trace backend
  • Both are optional; neither is bundled
07

Orchestration

HexAgent — Orchestration

Multi-Agent Support

Yes. HexAgent supports subagent orchestration natively.

Spawn mechanisms:

  • Foreground subagents (AgentTool): block the parent's turn, return a result
  • Background subagents: run in parallel with the parent

Definition format: Python AgentDefinition objects passed to create_agent(). Not persona-md files. Each definition specifies description, tools list (subset of parent's toolset), and optional model override.
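
The subset constraint and the optional model override can be expressed as a small validation step. The sketch below uses a hypothetical stand-in class, `SubagentSpec`, rather than the real `AgentDefinition`; whether HexAgent raises an error, silently filters tools, or behaves differently is not stated in the source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SubagentSpec:
    """Hypothetical stand-in mirroring the three fields described above."""
    description: str
    tools: tuple[str, ...]
    model: Optional[str] = None


def resolve_subagent(spec: SubagentSpec, parent_tools: set[str], parent_model: str) -> dict:
    """Check the subset rule and fall back to the parent's model if none is set."""
    extra = sorted(set(spec.tools) - parent_tools)
    if extra:
        raise ValueError(f"tools not available to parent: {extra}")
    return {
        "description": spec.description,
        "tools": list(spec.tools),
        "model": spec.model or parent_model,
    }
```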

Nesting: Subagents can themselves spawn sub-subagents (depth limit: not stated in README).

Orchestration Pattern

parallel-fan-out — background subagents run concurrently; foreground subagents are sequential (blocking). No formal consensus mechanism.
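
The difference between the two spawn modes maps naturally onto asyncio primitives. In this sketch, `run_subagent` is a placeholder that sleeps instead of calling a model, and `orchestrate` is a hypothetical parent loop; it illustrates the fan-out pattern in general, not HexAgent's internal scheduler.

```python
import asyncio


async def run_subagent(name: str, delay: float) -> str:
    """Placeholder for a child agent; sleeps instead of calling a model."""
    await asyncio.sleep(delay)
    return f"{name}: done"


async def orchestrate() -> list[str]:
    # Background: start children as tasks so they run concurrently.
    background = [
        asyncio.create_task(run_subagent(f"bg-{i}", 0.05)) for i in range(3)
    ]
    # Foreground: the parent waits for this child before doing anything else.
    foreground = await run_subagent("fg", 0.01)
    # Collect background results; gather preserves submission order.
    results = await asyncio.gather(*background)
    return [foreground, *results]
```

Running `asyncio.run(orchestrate())` returns the foreground result first, followed by the background results in the order the tasks were created.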

Isolation Mechanism

The Computer protocol provides isolation:

  • LocalNativeComputer: none (runs in host shell, same process risk)
  • LocalVM: Lima (macOS) / WSL (Windows), VM-level isolation
  • RemoteE2BComputer: E2B cloud sandbox — full container isolation

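By design, the harness runtime is kept separate from the agent's Computer, so the agent cannot read the harness's API keys or source code. With LocalNativeComputer that separation is not backed by a sandbox, since tools run in the host shell.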
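
One small piece of such a boundary, keeping secrets out of the environment that agent-run commands inherit, can be shown with the standard library alone. This is a sketch of the general idea, not HexAgent's mechanism: `ALLOWED_ENV`, `scrubbed_env`, and `run_isolated` are hypothetical, and the approach covers environment variables only, not filesystem or network access.

```python
import os
import subprocess

# Variables the child may inherit; everything else, API keys included, is dropped.
ALLOWED_ENV = ("PATH", "HOME", "LANG", "SYSTEMROOT", "LD_LIBRARY_PATH")


def scrubbed_env(allowed: tuple[str, ...] = ALLOWED_ENV) -> dict[str, str]:
    """Build an allowlisted copy of the current environment."""
    return {key: os.environ[key] for key in allowed if key in os.environ}


def run_isolated(argv: list[str], timeout: float = 30.0) -> subprocess.CompletedProcess:
    """Run a command that cannot see the parent's non-allowlisted variables."""
    return subprocess.run(
        argv,
        env=scrubbed_env(),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
```

An allowlist is used rather than a denylist so that a newly added secret is excluded by default. Stronger guarantees, such as hiding the harness's source files, need a VM or remote sandbox of the kind listed above.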

Multi-Model

Yes. The model parameter accepts any string; subagents can use a different model than the parent. The "fast" shorthand selects a cheaper/faster model automatically.

AgentDefinition(
    description="...",
    tools=["Read", "Glob"],
    model="fast",  # cheaper model for subagents
)

Execution Mode

Event-driven / one-shot per invocation (await agent.ainvoke(...)). No built-in daemon mode, no background loop that survives process exit. Each call is a discrete session.

Consensus Mechanism

None. Subagent results are collected by the parent and synthesized in the parent's next model call.

Prompt Chaining

Yes — each subagent's result is passed back into the parent's conversation history, forming a chain.

Auto-Validators

None built-in. No PostToolUse hooks that auto-run tests/lint. Validation is the developer's responsibility.

08

Ui Cli Surface

HexAgent — UI / CLI Surface

CLI Binary

None. HexAgent is a Python library consumed programmatically. There is no hexagent CLI binary.

The libs/hexagent_demo/ package ships a demo application with Chat and Cowork modes, but it's an example app, not a production CLI tool.

Local UI / Dashboard

None in the core library. The hexagent_demo package ships a demo app (its UI surface is not clear from the available files), but no web dashboard is bundled in the core hexagent package.

IDE Integration

None. No VSCode extension, no Cursor plugin, no Claude Code plugin.

How Users Interact

Programmatic Python only:

async with await create_agent(model="...", computer=LocalNativeComputer()) as agent:
    result = await agent.ainvoke({"messages": [{"role": "user", "content": "..."}]})

Observability

  • LangSmith (optional): full LangChain trace integration — spans, token counts, latency per call
  • Braintrust (optional): alternative trace backend
  • No built-in metrics endpoint or dashboard

Install Surface

pip install hexagent

Single-line install. No config wizard, no scaffolding command.

Target Integrations

  • Any HTTP-based LLM API (OpenAI-compatible endpoint)
  • MCP servers (consumed, not bundled)
  • E2B (cloud sandbox provider)
  • Tavily / Brave (web search)
  • Jina / Firecrawl (web fetch)
  • LangSmith / Braintrust (observability)
