
Batch 28 Notes — Harness Frameworks (LangChain DeepAgents + Multi-Runtime Portability)

10 frameworks in this batch

Roster

| # | Slug | Stars | Language | Type | Distinctive Feature |
|---|------|-------|----------|------|---------------------|
| 1 | deepagents-langchain | 23,351 | Python | pip-package | LangGraph middleware stack (Filesystem/SubAgent/Memory), BASE_AGENT_PROMPT verbatim |
| 2 | deepagentsjs | 1,262 | TypeScript | npm-package | TypeScript generics (InferSubagentByName<T,Name>), ACP sibling |
| 3 | oh-my-agent | 1,020 | TypeScript | npm-package | .agents/ → 27+ AI tool projection, Charter Preflight block, 10 agents + 40+ skills |
| 4 | chorus | 915 | TypeScript | standalone-repo | AI-DLC "Reversed Conversation", 50+ MCP tools, AGPL-3.0, port 8637 |
| 5 | flue | 3,731 | TypeScript | npm-package | Agent-as-server (Cloudflare Durable Objects), valibot typed results, port 3583 |
| 6 | water | 288 | TypeScript | npm-package | Fluent flow API with 9 composition patterns, FallbackChain, PromptLibrary |
| 7 | omnicore-agent | 241 | Python | pip-package | Parallel tool batching, BM25 retrieval, signature-loop detection, OmniServe FastAPI |
| 8 | hankweave | 123 | TypeScript | npm-package | Production ops runtime, single-threaded codon execution, git checkpointing |
| 9 | pydantic-ai-harness | 354 | Python | pip-package | Official Pydantic AI capability pack, CodeMode sandbox (run_code via pydantic-monty) |
| 10 | ai-toolkit-uniswap | 37 | TypeScript | standalone-repo | Corporate Claude Code plugin pack, 27 agents + 46 skills, PostToolUse lint gate |

Language Ecosystem Segregation

Python Ecosystem (3 frameworks)

deepagents-langchain — The star of this batch by adoption (23,351 stars). Its LangGraph-based middleware stack makes it the most architecturally sophisticated Python harness in the batch: FilesystemMiddleware, SubAgentMiddleware, and MemoryMiddleware compose around an agent loop with tool result injection. BASE_AGENT_PROMPT is a ~500-token template that sets agent identity, task handling, and memory retrieval rules. The deepagents init / dev / deploy CLI lowers the barrier for beginners.

omnicore-agent — Niche (241 stars) but technically interesting: parallel tool batching (tool calls fired concurrently and deduplicated), BM25 retrieval for past observations, and a signature-loop detection algorithm that hashes (tool_name, args) tuples to detect stuck loops. OmniServe FastAPI server exposes agents as HTTP endpoints. Uses LiteLLM for model routing.
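The signature-loop idea described for omnicore-agent can be illustrated with a minimal sketch. This is not omnicore-agent's code: the names `call_signature` and `LoopDetector`, the sliding window, and the repeat threshold are all hypothetical choices made for illustration. The only part taken from the description above is hashing (tool_name, args) and treating repeated signatures as a stuck loop.

```python
import hashlib
import json
from collections import deque


def call_signature(tool_name: str, args: dict) -> str:
    """Hash a (tool_name, args) pair into a stable signature."""
    # sort_keys makes the hash independent of argument order.
    payload = json.dumps([tool_name, args], sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class LoopDetector:
    """Flags a likely stuck loop when one signature repeats within a recent window.

    Hypothetical helper; window and max_repeats are illustrative defaults.
    """

    def __init__(self, window: int = 8, max_repeats: int = 3) -> None:
        self.recent = deque(maxlen=window)  # oldest signatures fall off automatically
        self.max_repeats = max_repeats

    def record(self, tool_name: str, args: dict) -> bool:
        """Record a call; return True if it looks like a stuck loop."""
        sig = call_signature(tool_name, args)
        self.recent.append(sig)
        return self.recent.count(sig) >= self.max_repeats


detector = LoopDetector(window=8, max_repeats=3)
# The same call three times in a row trips the detector on the third attempt.
results = [detector.record("read_file", {"path": "a.txt"}) for _ in range(3)]
```

A caller would typically stop the agent loop or inject a corrective message when `record` returns True; which reaction omnicore-agent takes is not stated in these notes.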

pydantic-ai-harness — The "official" Python entry: maintained by Pydantic, Inc. Currently ships nearly empty (only CodeMode) but the capability graduation pipeline (harness → core) and the 30+ capability roadmap in active PRs make it the one to watch. The run_code / pydantic-monty sandbox — collapsing N tool calls into 1 sandboxed Python execution — is a unique primitive not seen elsewhere in this batch.

TypeScript Ecosystem (6 frameworks)

deepagentsjs — Python deepagents' TypeScript sibling. Strong TypeScript generics (InferSubagentByName<T,Name>) for compile-time type safety of sub-agent contracts. The ACP (Agent Communication Protocol) sibling package suggests a multi-framework interop ambition. No CLI binary — pure library.

oh-my-agent — Most opinionated TypeScript harness. The .agents/ directory is a canonical persona-to-27-tools projection engine: one backend-engineer.md persona → Claude Code agent, Cursor rule, Codex config, Gemini persona, etc. The "Charter Preflight" block (hardcoded safety/identity constraints in every agent) is a governance primitive not seen in other frameworks.

chorus — The outlier: AGPL-3.0, self-hosted web app with 50+ MCP tools, port 8637, 5×3 permission matrix (filesystem/network/process × read/write/execute). The AI-DLC "Reversed Conversation" — model generates the conversation structure, not just the responses — is the most novel architectural idea in this batch. On-session-start shell script for git/npm bootstrapping is the most complete lifecycle hook example.

flue — Cloudflare-native: Durable Objects for persistent agent state across requests, Hono HTTP framework, valibot for typed tool results. The agent-as-server pattern (agents live at HTTP endpoints, not in process) is architecturally distinct from every other framework in this batch. Port 3583 for local development.

water — Fluent flow builder for Pydantic AI and LangChain. The 9 composition patterns (chain, parallel, conditional, fallback, map, reduce, loop, branch, merge) with FallbackChain as a first-class primitive are the most complete flow composition API in this batch. PromptLibrary for reusable prompt templates is a clean separation of concerns.
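To make the fallback pattern concrete, here is a minimal sketch of a fallback chain in Python. It does not reproduce water's FallbackChain API (water is a TypeScript package and its signatures are not shown in these notes); `fallback_chain`, `FallbackExhausted`, and the two stand-in steps are hypothetical names chosen for illustration.

```python
from typing import Callable, List, Sequence

Step = Callable[[str], str]


class FallbackExhausted(Exception):
    """Raised when every step in the chain has failed."""


def fallback_chain(steps: Sequence[Step]) -> Step:
    """Return a callable that tries each step in order until one succeeds."""

    def run(prompt: str) -> str:
        errors: List[Exception] = []
        for step in steps:
            try:
                return step(prompt)
            except Exception as exc:  # any failure moves on to the next step
                errors.append(exc)
        raise FallbackExhausted(f"all {len(steps)} steps failed: {errors!r}")

    return run


def flaky_primary(prompt: str) -> str:
    # Stand-in for a primary model call that fails.
    raise TimeoutError("primary model timed out")


def stable_backup(prompt: str) -> str:
    # Stand-in for a cheaper or more reliable backup model.
    return f"backup answered: {prompt}"


ask = fallback_chain([flaky_primary, stable_backup])
answer = ask("hello")
```

Treating the chain itself as a step (it has the same signature as its members) is what lets fallback nest inside the other composition patterns.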

hankweave — Production ops runtime (not a dev-time framework). The single-threaded codon execution constraint, git snapshot at every codon boundary, sentinel (parallel LLM observer) monitoring, and the ability to orchestrate OTHER agent harnesses (Claude Code, Codex, Gemini) as subprocess drivers are all category-unique. The release/alpha default branch signals early stage.
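The combination of strictly sequential codons and a snapshot at every boundary can be sketched as follows. This is an assumption-laden illustration, not hankweave's implementation: `run_codons` and the `checkpoint` callback are hypothetical, and the callback here stores an in-memory copy where hankweave is described as taking a git snapshot.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, object]
Codon = Tuple[str, Callable[[State], State]]  # (name, step function); hypothetical shape


def run_codons(
    codons: List[Codon],
    state: State,
    checkpoint: Callable[[str, State], None],
) -> State:
    """Run codons one at a time, checkpointing after each boundary."""
    for name, step in codons:
        state = step(state)            # exactly one codon runs at any moment
        checkpoint(name, dict(state))  # snapshot a copy so later steps cannot mutate it
    return state


snapshots: List[Tuple[str, State]] = []

final = run_codons(
    codons=[
        ("plan", lambda s: {**s, "plan": "write tests"}),
        ("implement", lambda s: {**s, "done": True}),
    ],
    state={},
    # Stand-in for a git commit at the codon boundary.
    checkpoint=lambda name, snap: snapshots.append((name, snap)),
)
```

Because every boundary has a snapshot, a failed run can in principle resume from the last completed codon rather than from the start.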

Polyglot / Organizational (1 framework)

ai-toolkit-uniswap — The most organization-specific entry: Uniswap Labs' internal Claude Code plugin pack. Architecture is generalizable (Nx monorepo + plugin marketplace + hooks); content is not (Uniswap Protocol domain knowledge). The PostToolUse lint gate on every Write|Edit is the tightest quality gate in this batch.


Intra-Batch Patterns

Pattern 1: Middleware/Capability Stack

Multiple frameworks use a composable layering model:

  • deepagents-langchain: FilesystemMiddleware, SubAgentMiddleware, MemoryMiddleware (Python, runtime composition)
  • pydantic-ai-harness: capabilities=[CodeMode(), Filesystem(), Shell(), Memory()] (Python, constructor composition)
  • oh-my-agent: middlewares: [filesystem, sub-agent, memory] in YAML (TypeScript, config composition)

The pattern is convergent: capability stacking is the dominant architecture for extensible agent harnesses.
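A generic sketch of capability stacking, independent of any one framework above: each middleware wraps the next handler, and the stack is composed around a core agent function. The names `with_filesystem`, `with_memory`, `compose`, and `core_agent` are hypothetical and do not correspond to the real middleware classes listed above.

```python
from typing import Callable, Dict, List

Request = Dict[str, object]
Response = Dict[str, object]
Handler = Callable[[Request], Response]
Middleware = Callable[[Handler], Handler]


def with_filesystem(next_handler: Handler) -> Handler:
    """Hypothetical capability: adds a file tool before the agent runs."""
    def handler(request: Request) -> Response:
        tools = list(request.get("tools", [])) + ["read_file"]
        return next_handler({**request, "tools": tools})
    return handler


def with_memory(next_handler: Handler) -> Handler:
    """Hypothetical capability: post-processes the agent's response."""
    def handler(request: Request) -> Response:
        response = next_handler(request)
        return {**response, "remembered": True}
    return handler


def compose(middlewares: List[Middleware], core: Handler) -> Handler:
    """Wrap core so the first middleware in the list is the outermost layer."""
    handler = core
    for middleware in reversed(middlewares):
        handler = middleware(handler)
    return handler


def core_agent(request: Request) -> Response:
    # Stand-in for the model call at the center of the stack.
    return {"answer": f"used tools: {request['tools']}"}


agent = compose([with_filesystem, with_memory], core_agent)
result = agent({"tools": []})
```

The same shape covers all three composition styles noted above; only where the list of layers is declared (runtime code, constructor argument, or YAML) differs.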

Pattern 2: Parallel Tool Execution

Three frameworks independently arrived at the same insight — sequential tool calls waste round-trips:

  • pydantic-ai-harness: CodeMode's run_code with asyncio.gather inside sandbox
  • omnicore-agent: parallel tool batch execution with deduplication
  • deepagentsjs: concurrent tool calls in the agent loop (Promise.all-style, the TypeScript counterpart of asyncio.gather)
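The shared insight can be sketched with the standard library alone: deduplicate identical calls, run the unique ones concurrently with asyncio.gather, then fan the results back out in the original order. `run_batch` and the `lookup` tool are hypothetical; none of the three frameworks is claimed to work exactly this way.

```python
import asyncio
import json
from typing import Any, Awaitable, Callable, Dict, List, Tuple

Call = Tuple[str, Dict[str, Any]]  # (tool_name, keyword arguments)
invocations = 0  # counts how often the demo tool actually runs


async def lookup(city: str) -> str:
    """Stand-in for a slow tool such as an HTTP request."""
    global invocations
    invocations += 1
    await asyncio.sleep(0)
    return city.upper()


def _key(call: Call) -> Tuple[str, str]:
    name, args = call
    return (name, json.dumps(args, sort_keys=True))


async def run_batch(
    calls: List[Call],
    tools: Dict[str, Callable[..., Awaitable[Any]]],
) -> List[Any]:
    """Run a batch of tool calls concurrently, executing duplicates only once."""
    unique: Dict[Tuple[str, str], Call] = {}
    for call in calls:
        unique.setdefault(_key(call), call)  # keep the first occurrence of each call
    outputs = await asyncio.gather(
        *(tools[name](**args) for name, args in unique.values())
    )
    by_key = dict(zip(unique.keys(), outputs))
    return [by_key[_key(call)] for call in calls]  # restore the original order


batch = [("lookup", {"city": "oslo"}), ("lookup", {"city": "rome"}), ("lookup", {"city": "oslo"})]
results = asyncio.run(run_batch(batch, {"lookup": lookup}))
```

Deduplicating by a serialized argument key assumes tool calls are idempotent within a batch, which is a design assumption of this sketch rather than something the notes state.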

Pattern 3: Typed Contracts

TypeScript frameworks increasingly use compile-time types for agent contracts:

  • deepagentsjs: InferSubagentByName<T,Name> generics
  • flue: valibot schemas for tool results
  • water: typed flow steps

Python counterparts use Pydantic models (deepagents-langchain, pydantic-ai-harness).

Pattern 4: Model Pinning

Two frameworks explicitly pin models; a third supports multiple providers without pinning:

  • hankweave: per-codon model override
  • ai-toolkit-uniswap: model: claude-opus-4-7 in planner-agent frontmatter
  • chorus: multi-provider support but no pinning

Most frameworks leave model selection to the user; pinning is the exception, not the rule.

Pattern 5: Quality Gates via Hooks

Two frameworks use hooks as mandatory quality gates (not optional observability):

  • ai-toolkit-uniswap: PostToolUse Write|Edit → lint (required, blocks on failure)
  • chorus: SessionStart → git/npm setup (initialization gate)
  • deepagents-langchain (for contrast): no hooks; its gates live in middleware instead
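What separates a gate from observability is that a failing hook stops the run. A minimal sketch of that behavior, assuming a hypothetical `ToolEvent` record and a toy lint rule; this is not the Claude Code hook API and not ai-toolkit-uniswap's configuration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class ToolEvent:
    """Hypothetical record of a completed tool call."""
    tool: str
    path: str
    content: str


class GateFailed(Exception):
    """Raised when a mandatory post-tool hook reports problems."""


Hook = Callable[[ToolEvent], List[str]]  # a hook returns a list of problems


def no_trailing_whitespace(event: ToolEvent) -> List[str]:
    """Toy lint rule standing in for a real linter."""
    return [
        f"{event.path}:{lineno}: trailing whitespace"
        for lineno, line in enumerate(event.content.splitlines(), start=1)
        if line != line.rstrip()
    ]


def run_post_tool_hooks(
    event: ToolEvent,
    hooks: Sequence[Hook],
    gated_tools: Sequence[str] = ("Write", "Edit"),
) -> None:
    """Run hooks only for gated tools; block (raise) if any hook reports a problem."""
    if event.tool not in gated_tools:
        return
    problems = [problem for hook in hooks for problem in hook(event)]
    if problems:
        raise GateFailed("\n".join(problems))


clean = ToolEvent(tool="Write", path="a.py", content="x = 1\n")
dirty = ToolEvent(tool="Edit", path="b.py", content="y = 2   \n")
run_post_tool_hooks(clean, [no_trailing_whitespace])  # passes silently
```

An observability hook would log the problems and return; raising instead is what makes the check mandatory.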

Most Interesting Find

hankweave's sentinels — the most sophisticated monitoring primitive in all 33 batches encountered so far. Sentinels are parallel LLM-powered observers that tap the hankweave event stream without interrupting execution. They monitor for stuck states, detect anomalies, and can trigger recovery actions — all asynchronously. No other framework in this batch (or in seeds) implements LLM-powered execution monitoring as a first-class primitive. Combined with the single-threaded codon constraint and git snapshot at every boundary, hankweave represents a fundamentally different philosophy: treat long-horizon agentic execution as a production operations problem, not a developer productivity problem.
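The observer relationship can be sketched with an asyncio queue: the main run publishes events without waiting, and a separate task watches the stream. Everything here is hypothetical, including the names `sentinel` and `run_with_sentinel`. In particular, the stuck-state check is a simple repeat counter standing in for the LLM-powered judgment that hankweave's sentinels are described as using.

```python
import asyncio
from typing import List, Optional


async def sentinel(events: "asyncio.Queue[Optional[str]]", alerts: List[str], max_repeats: int = 3) -> None:
    """Watch the event stream and flag runs of identical events."""
    last: Optional[str] = None
    streak = 0
    while True:
        event = await events.get()
        if event is None:  # sentinel value: the run has finished
            return
        streak = streak + 1 if event == last else 1
        last = event
        if streak == max_repeats:
            # A real sentinel would ask a model to assess the situation here.
            alerts.append(f"possible stuck state: {event!r} x{streak}")


async def run_with_sentinel(steps: List[str]) -> List[str]:
    events: "asyncio.Queue[Optional[str]]" = asyncio.Queue()
    alerts: List[str] = []
    watcher = asyncio.create_task(sentinel(events, alerts))
    for step in steps:
        events.put_nowait(step)  # publishing never waits on the observer
        await asyncio.sleep(0)   # stand-in for the step's real work
    events.put_nowait(None)
    await watcher
    return alerts


alerts = asyncio.run(run_with_sentinel(["plan", "edit", "test", "test", "test", "done"]))
```

The unbounded queue is what keeps the observer from applying backpressure to execution; a recovery action would be a separate channel from the sentinel back to the runtime, which this sketch omits.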

Runner-up: pydantic-ai-harness's capability graduation pipeline. The explicit two-tier model (harness = incubator, core = graduated capabilities) is a governance mechanism for a growing ecosystem. Web search, tool search, and thinking already graduated. CodeMode, filesystem, shell, memory are next in line. No other framework in this batch (or among seeds) has an explicit capability graduation mechanism.


Tier C (Low Signal, Low Stars)

  • water (288 stars): fluent flow API is clean but the library is thin — 9 composition patterns with minimal implementation depth. No CLI, no UI, no hooks. May not survive as a standalone project; the patterns would be better absorbed into a larger framework.
  • omnicore-agent (241 stars): technically interesting (loop detection, BM25 retrieval) but low adoption and no organizational backing. The OmniServe FastAPI pattern is niche.
  • ai-toolkit-uniswap (37 stars): intentionally internal-use. Generalizable architecture but no external adoption path.
  • hankweave (123 stars): important ideas but release/alpha default branch and Bun dependency limit audience.

Cross-References

  • deepagents-langchain ↔ deepagentsjs: Python/TypeScript sibling pair from the same ecosystem. JavaScript developers can use deepagentsjs for type-safe sub-agent contracts; Python developers use deepagents-langchain for the LangGraph middleware stack. Cross-reference when cataloging LangGraph-based frameworks.

  • pydantic-ai-harness ↔ water: both target Pydantic AI as the agent runtime. water provides flow composition primitives (chain, parallel, fallback); pydantic-ai-harness provides capability packs (CodeMode, Filesystem, Shell). They are complementary, not competing.

  • oh-my-agent ↔ ai-toolkit-uniswap: both define agent personas as markdown files in a dedicated agents directory. oh-my-agent projects one persona from .agents/ to 27+ tools; ai-toolkit-uniswap uses .claude/agents/ for Claude Code-specific agent markdown. The .agents/ directory is becoming a cross-framework convention, so cross-reference with batch-29 (oh-my-codex family) and seed agent-os.

  • hankweave ↔ chorus: both are self-hosted runtimes that orchestrate Claude Code (and other tools) as subprocess drivers. hankweave: single-threaded codon execution, no human-in-loop. chorus: interactive web UI, AGPL-3.0, 50+ MCP tools. They sit at opposite ends of the "when to involve humans" spectrum.

  • flue ↔ chorus: both expose agents over HTTP with persistent state (Durable Objects vs. session state). flue is Cloudflare-native, lightweight; chorus is self-hosted, full-featured. Cross-reference when cataloging agent-as-server frameworks.

  • pydantic-ai-harness graduation pipeline: capabilities that graduate from harness to pydantic-ai core should be cross-referenced with the core pydantic-ai catalog entry (if one exists). The [temporal] and [dbos] extras suggest future cross-reference with durable execution frameworks (DBOS, Temporal).