Skip to content
/

OmniCoreAgent

omnicore-agent · omnirexflora-labs/omnicoreagent · ★ 241 · last commit 2026-05-25

Primitive shape
No installable primitives
00

Summary

OmniCoreAgent — Summary

OmniCoreAgent is an open Python agent harness (pip: omnicoreagent) that positions itself as an "application-facing harness layer" providing the runtime boundary around LLM models. Its distinctive features are parallel tool batches (the agent batches independent tool calls and gets one structured observation), BM25-based tool retrieval for large toolsets, signature-based loop detection, MCP client integration, a workspace file system with optional cloud storage (S3, R2), dynamic subagents, background tasks (durable scheduled/manual with run history and leases), and REST/SSE serving via OmniServe. The framework uses LiteLLM for model routing, making it compatible with any LiteLLM-supported provider. It also supports OpenTelemetry, LangSmith, and Opik for observability. Modular extras: [redis], [postgres], [mongodb], [s3], [serve], [tokenizer], [otel]. Version is dynamic (git-tagged).

Compared to seeds, OmniCoreAgent most closely resembles claude-flow in bundling parallel tool execution, memory, subagents, and production serving in one package, but differs: it is a pure Python pip package with no MCP server of its own (it is an MCP client), no LangGraph dependency, and uses LiteLLM for model abstraction rather than LangChain.

01

Overview

OmniCoreAgent — Overview

Origin

Developed by Abiola Adeshina / Omnirexflora Labs. Published as omnicoreagent on PyPI. The tagline is "The Open Production Agent Harness for Python."

Core Philosophy

From the README:

"An LLM is not an agent by itself. The model provides intelligence; the harness gives that intelligence a working environment."

The framework explicitly defines four harness boundaries:

Layer What It Owns
Agent harness Model loop, prompt contract, tools, observations, memory, context, workspace, guardrails, events, subagents
Serving boundary OmniServe REST/SSE APIs, request lifecycle, readiness, auth, rate limits, metrics
Background boundary Durable scheduled/manual task execution with task state, run history, leases, retries, and workspace output
External tool boundary MCP server tools and local Python tools through one runtime surface

Parallel Tool Batches (Key Differentiator)

From the README:

"The usual tool loop looks like this: LLM -> call tool A -> wait -> result -> LLM -> call tool B -> wait -> result. OmniCoreAgent lets the model request independent tools together: LLM -> [tool A + tool B + tool C in parallel] -> one structured observation -> LLM."

Structured Observations

Raw tool output is reformatted before being added to the message thread. Failed tools appear alongside successful ones in the same observation, rather than collapsing the step.

02

Architecture

OmniCoreAgent — Architecture

Distribution

  • Type: pip package (omnicoreagent on PyPI)
  • Version analyzed: dynamic (git-tagged via uv-dynamic-versioning)
  • Install: pip install omnicoreagent
  • Optional extras: [redis], [postgres], [mongodb], [s3], [serve], [tokenizer], [otel], [langsmith], [opik]
  • Required runtime: Python >=3.10

Repository Layout

src/omnicoreagent/          # Main Python package
├── __init__.py
├── core/                   # Core agent loop, prompt contract, tool runner
├── background/             # Durable background tasks, run history, leases
├── governance/             # Guardrails, policy enforcement
├── mcp_clients_connection/ # MCP client integration
├── serve/                  # OmniServe REST/SSE API layer (FastAPI)
└── workflows/              # Workflow orchestration

engineering/                # Internal engineering docs
cookbook/                   # Usage examples (progressive complexity)
docker/                     # Docker configuration

Key Dependencies

  • litellm>=1.83.14 — LLM model routing (supports any LiteLLM provider)
  • mcp[cli]>=1.27.0 — MCP client support
  • pydantic>=2.12.5 — data validation
  • rich>=15.0.0 — terminal output formatting
  • click>=8.1.8 — CLI framework
  • Optional: redis, sqlalchemy+psycopg2, motor+pymongo, boto3, fastapi+uvicorn, tiktoken, opentelemetry-*

Config Files

  • Environment variables: LLM_API_KEY, LLM_PROVIDER, LLM_MODEL (via litellm)
  • No project-level config file required; configuration is programmatic

Deployment Options

  • Standalone Python process
  • Docker (via docker/ directory)
  • OmniServe (built-in FastAPI server)

Target Tools

LLM-agnostic via LiteLLM (OpenAI, Anthropic, Google, Azure, Cohere, local models). MCP-compatible for tool integration.

03

Components

OmniCoreAgent — Components

Core Agent

Component Purpose
OmniCoreAgent(name, system_instruction, model_config) Main agent class
agent.run(prompt, session_id) Execute agent with a prompt
agent.cleanup() Release resources
Parallel tool batcher Runs independent tool calls concurrently
Structured observation formatter Formats tool results before LLM sees them
BM25 tool retrieval Finds relevant tools from large toolsets
Signature loop detector Detects and breaks repetitive action patterns

Memory + Context

Component Purpose
Session memory Per-session message history
Context management Token limit enforcement, context windowing
Tool output offloading Write large tool results to workspace, replace with reference
Cross-session recall Persistent memory across sessions

Workspace

Component Purpose
Local workspace files Read/write/list files in agent workspace
Cloud storage backends S3, R2 (via [s3] extra)
Artifact management Store large tool results as workspace files

MCP Client

Component Purpose
mcp_clients_connection/ Connect to MCP servers as a client
Local Python tools Expose Python functions as tools
Unified tool surface MCP tools + local tools through one interface

Subagents

Component Purpose
Dynamic subagents Spawn specialized child agents at runtime
Shared workspace output Parent and child share workspace filesystem

Background Tasks

Component Purpose
Durable task store Persist task definitions
Run history Track all past runs
Task leases Prevent concurrent duplicate execution
Retries + cancellation Automatic retry on failure; manual cancellation
Scheduled tasks Cron-style or manual trigger

OmniServe (REST/SSE API)

Component Purpose
FastAPI server Expose agents as HTTP endpoints
SSE streaming Stream agent responses in real time
Auth API key or custom auth
Rate limiting Per-client request rate limits
Readiness probe /health endpoint
Metrics Request count, latency, error rate

Observability

  • OpenTelemetry (OTLP export) — [otel] extra
  • LangSmith integration — [langsmith] extra
  • Opik integration — [opik] extra
05

Prompts

OmniCoreAgent — Prompts

Excerpt 1: Quickstart (from README)

agent = OmniCoreAgent(
    name="assistant",
    system_instruction="You are a helpful assistant.",
    model_config={"provider": "openai", "model": "gpt-4o"},
)

result = await agent.run(
    "Research the top 3 open-source agent runtimes and summarize them.",
    session_id="quickstart",
)

Prompting technique: Simple string system instruction + string user prompt. The harness manages the message loop, tool calls, and structured observations internally. The system instruction is a single string; no template variables or chaining required for the basic case.

Excerpt 2: LiteLLM Model Config

model_config={"provider": "openai", "model": "gpt-4o"}
# or
model_config={"provider": "anthropic", "model": "claude-sonnet-4-20250514"}
# or any LiteLLM-supported provider

Prompting technique: Model routing is config-driven via LiteLLM's provider abstraction. The same system instruction and user prompt are sent to any supported model without code changes.

Prompting Architecture

  • System instruction: Python string passed to OmniCoreAgent(system_instruction=...)
  • User prompt: Python string passed to agent.run(prompt, ...)
  • No static markdown prompt files are shipped
  • Structured observations: tool results are reformatted by the harness before the LLM sees them (prompt-level context engineering happens automatically)
  • The harness manages the tool-call contract, parser, resolver, parallel runner, and result formatter — prompting is sandwiched by harness infrastructure
09

Uniqueness

OmniCoreAgent — Uniqueness

Differs from Seeds

OmniCoreAgent is most similar to claude-flow in its production-first ambitions: parallel tool execution, memory, subagents, and HTTP serving in one package. The key deltas: OmniCoreAgent is a pip package (not npm), uses LiteLLM rather than being Claude-specific, acts as an MCP client (not server), and adds signature-based loop detection and BM25 tool retrieval — production features not present in claude-flow's 305-tool MCP server approach. Unlike taskmaster-ai (task management with JSON files) or deepagents-langchain (LangGraph-specific), OmniCoreAgent is LLM-framework-agnostic via LiteLLM.

Distinctive Position

  1. Parallel tool batches with structured observations — the most technically rigorous handling of parallel tool calls in this batch: one model request → N parallel tool calls → one structured observation (failed tools shown alongside successful ones)
  2. Signature loop detection — algorithmic detection of repetitive action patterns, not just prompt-based warnings
  3. BM25 tool retrieval — scales to large toolsets without context exhaustion
  4. Four-boundary architecture documented explicitly: agent harness, serving boundary, background boundary, external tool boundary
  5. Background task system with leases and run history — production durable task execution not seen in other Python harnesses in this batch

Explicit Anti-Patterns

Stated in README: "Most demos stop at 'LLM plus tool loop.' Production agents fail in the layer around that loop: slow sequential tool calls, noisy observations, repeated actions, context exhaustion, unsafe tool output, missing workspace state, uninspectable background work, and weak serving boundaries."

Observable Failure Modes

  • Dynamic versioning (git-tagged): no stable semver pinning
  • litellm>=1.83.14 is a very current version floor — potential breakage on LiteLLM updates
  • Stars (241) suggest early adoption; production readiness claims may outrun community validation
  • No built-in git automation or filesystem isolation

Inspired By

Production agent reliability patterns, LiteLLM, MCP protocol.

04

Workflow

OmniCoreAgent — Workflow

Basic Execution

from omnicoreagent import OmniCoreAgent

agent = OmniCoreAgent(
    name="assistant",
    system_instruction="You are a helpful assistant.",
    model_config={"provider": "openai", "model": "gpt-4o"},
)

result = await agent.run(
    "Research the top 3 open-source agent runtimes.",
    session_id="quickstart",
)
print(result["response"])
await agent.cleanup()

Parallel Tool Execution Loop

LLM reasons → identifies independent tool calls
→ tool batcher runs them concurrently
→ structured observation combines all results (including failures)
→ LLM reasons from the complete observation
→ repeat until done

This differs from the standard sequential tool loop where each call is a separate LLM round-trip.

Phases

Phase Description Artifact
Agent initialization OmniCoreAgent(...) — configure model, tools, memory Agent object
Run agent.run(prompt, session_id) — execute agent loop Response dict
Tool batching Parallel tool execution per reasoning step Structured observation
Session memory Message history maintained per session_id In-memory or persistent
Background task Durable task with run history and lease Task run record
OmniServe Expose agent via FastAPI HTTP/SSE API endpoint

Approval Gates

Not documented in the README. The governance/ module may include human-in-the-loop controls.

Progressive Complexity Pattern

The cookbook is organized as progressive complexity tiers:

  1. Hello world
  2. Agent with local tools
  3. Agent with MCP tools
  4. Context + memory management
  5. Tool output offloading
  6. Real applications
  7. Workflows
  8. OmniServe
06

Memory Context

OmniCoreAgent — Memory and Context

Session Memory

  • Per-session message history maintained by session_id
  • In-memory by default; Redis/PostgreSQL/MongoDB persistence via optional extras
  • context_management handles token limits (windowing + summarization)

Tool Output Offloading

  • Large tool outputs are written to workspace files rather than added to the message thread
  • The message thread receives a file reference instead of the raw output
  • Prevents context exhaustion from large API responses or file contents

Workspace

  • Local workspace filesystem: agent can read/write/list files
  • Cloud storage: S3 or R2 backends (via [s3] extra)
  • Shared between parent agent and dynamic subagents

Cross-Session Memory

  • Persistent memory via Redis, PostgreSQL, or MongoDB backends
  • Recall across sessions: the agent can query previous session outputs

BM25 Tool Retrieval

  • For large toolsets, BM25 retrieval surfaces relevant tools based on the current task
  • Prevents context exhaustion from loading all tool descriptions into the prompt

Context Management Strategy

  1. Token limit enforcement (via [tokenizer] extra with tiktoken)
  2. Context windowing: oldest messages may be dropped or summarized
  3. Tool output offloading: large results → workspace file
  4. BM25 tool discovery: only relevant tools loaded into context
07

Orchestration

OmniCoreAgent — Orchestration

Multi-Agent Pattern

OmniCoreAgent supports dynamic subagent spawning:

  • Parent agent can spawn specialized child agents at runtime
  • Child agents share the parent's workspace output
  • Pattern: hierarchical — parent delegates to children

Parallel Tool Batching

The core execution innovation: the model requests multiple independent tools simultaneously, the harness executes them in parallel, and returns a single structured observation. This is not multi-agent coordination; it is single-agent parallel tool execution.

Multi-Model

Via LiteLLM: any model_config can specify any LiteLLM-supported provider. Different agent instances can use different models. No config-file routing system.

Background Tasks

The background task system provides durable work isolation:

  • Tasks run in a separate execution context
  • State machine with run history, leases, retries
  • Cancellable and schedulable
  • Output written to workspace

Execution Modes

  • Interactive: agent.run() returns when done
  • Background: durable scheduled/manual task execution
  • OmniServe: event-driven HTTP/SSE serving

Isolation Mechanism

Workspace isolation: each session/run writes to a scoped workspace. No container isolation or git worktrees.

Signature Loop Detection

Monitors the sequence of tool calls and detects repeated patterns (signatures) indicating the agent is stuck in a loop. Breaks out automatically.

08

Ui Cli Surface

OmniCoreAgent — UI/CLI Surface

CLI Binary

click>=8.1.8 is a dependency, suggesting a CLI exists, but no binary is explicitly declared in pyproject.toml's [project.scripts] section. The CLI interface is likely via python -m omnicoreagent or a click-based entry point in src/.

Local UI

None documented. OmniCoreAgent is a library + OmniServe API server; no web dashboard.

OmniServe (REST/SSE API)

  • FastAPI-based agent serving (via [serve] extra: fastapi, uvicorn)
  • Features: SSE streaming, API key auth, rate limiting, readiness probe, metrics
  • This is an HTTP API server for production deployment, not a monitoring UI

AI Tool Docs Integration

The README includes a dedicated section "Use docs with AI tools" with links for:

  • Ask AI (query the hosted docs)
  • /llms.txt (for LLM-native documentation access)
  • Hosted docs MCP (for AI coding tool integration)
  • Cursor, VS Code, ChatGPT, Claude, Perplexity guides

This suggests the framework actively targets AI coding tool integration for its own documentation.

Cross-Tool Portability

high — LiteLLM backend supports any provider. Not tied to any AI coding tool.

Related frameworks

same archetype · same primary tool · same memory type

claude-mem (thedotmack) ★ 78k

Background worker service captures every tool call as an observation, AI-compresses sessions, and auto-injects relevant past…

pi (badlogic/earendil) ★ 55k

A minimal, hackable, multi-provider terminal coding agent that adapts to your workflows via npm-installable TypeScript Extensions…

Agent Skills (Addy Osmani) ★ 46k

Encodes senior-engineer software development lifecycle as 23 auto-routed skills and 7 slash commands for any AI coding agent.

wshobson/agents Plugin Marketplace ★ 36k

Single Markdown source for 83 domain-specialized plugins that auto-generates idiomatic artifacts for five AI coding harnesses.

TabbyML/Tabby ★ 34k

Self-hosted AI coding assistant server (alternative to GitHub Copilot) with admin dashboard, RAG-based completions, and multi-IDE…

Compound Engineering ★ 17k

Make each unit of engineering work compound into easier future work via brainstorm→plan→execute→review→learn cycles.