Skip to content
/

mcp-agent

mcp-agent-lastmile · lastmile-ai/mcp-agent · ★ 8.3k · last commit 2026-01-25

Primitive shape
No installable primitives
00

Summary

mcp-agent (lastmile) — Summary

mcp-agent by LastMile AI is a Python framework (v0.2.6, 8.3k stars, Apache-2.0) that treats MCP as the complete substrate for building agents — not just a tool protocol. Its thesis is that "MCP is all you need to build agents, and simple patterns are more robust than complex architectures." The framework fully implements MCP (tools, resources, prompts, notifications, OAuth, sampling, elicitation, roots) and implements every pattern from Anthropic's "Building Effective Agents" paper as composable AugmentedLLM workflow types: orchestrator, parallel, swarm, evaluator-optimizer, router, deep_orchestrator. The CLI (mcp-agent) exposes init, run, deploy, and describe subcommands; uvx mcp-agent init scaffolds a project in under 2 minutes. Durable execution via Temporal is a first-class feature — swapping from in-process to Temporal requires no API changes. Agents can be exposed as MCP servers themselves, enabling composition. OpenTelemetry tracing is built in. Compared to the seeds, mcp-agent is closest to ccmemory (both MCP-centric) but inverted: ccmemory ships an MCP server for memory storage, while mcp-agent builds full agent workflows on top of MCP as the only protocol — a radically MCP-pure architecture that no seed takes to this degree.

01

Overview

mcp-agent (lastmile) — Overview

Origin

Built by Sarmad Qadri at LastMile AI. The framework's vision statement:

"mcp-agent's vision is that MCP is all you need to build agents, and that simple patterns are more robust than complex architectures for shipping high-quality agents."

Philosophy

The framework makes a bold architectural bet: rather than adding MCP as a tool-calling layer on top of a custom agent framework, everything — tool execution, agent coordination, resource access, prompt management — goes through the MCP wire protocol. This means agents are composable as MCP servers, enabling nested agent topologies where each layer speaks MCP.

Key design principles:

  1. Full MCP compliance: all MCP features (sampling, elicitation, OAuth, roots) are implemented, not just tool calling
  2. Pattern-based composition: the six patterns from Anthropic's "Building Effective Agents" paper are first-class workflow types
  3. Durable by default: Temporal integration is opt-in but requires zero API changes — local and durable agents have identical code
  4. Agents as MCP servers: any agent can be exposed as an MCP server for consumption by other agents or clients

The six composable patterns

From workflows/ subdirectory:

  1. Orchestrator — LLM-directed task decomposition → delegate to sub-agents → synthesize
  2. Parallel — fan-out same task to N agents, collect results
  3. Router — classify intent → route to specialist agent
  4. Evaluator-Optimizer — generator + critic loop until quality threshold
  5. Swarm — peer-to-peer agent handoffs (no central coordinator)
  6. Deep Orchestrator — recursive orchestration (orchestrators orchestrating orchestrators)

Key quote

"Altogether, this is the simplest and easiest way to build robust agent applications."

02

Architecture

mcp-agent (lastmile) — Architecture

Distribution

  • Type: pip-package
  • PyPI name: mcp-agent
  • Version: 0.2.6
  • Install: pip install mcp-agent or uv add mcp-agent
  • Required runtime: Python ≥ 3.10
  • License: Apache-2.0

CLI binaries (from pyproject.toml scripts)

Binary Entry point Purpose
mcp-agent mcp_agent.cli.main_bootstrap:run Primary CLI
silsila mcp_agent.cli.main:run Alternative CLI entry
mcp-cloud mcp_agent.cli.cloud.main:run Cloud deployment CLI
mcpc mcp_agent.cli.cloud.main:run Alias for mcp-cloud

Source layout

src/mcp_agent/
  app.py              # MCPApp — application container
  agents/agent.py     # Agent class
  workflows/          # Six pattern implementations
    llm/              # AugmentedLLM base + provider adapters
    orchestrator/     # Orchestrator workflow
    parallel/         # Parallel fan-out
    router/           # Intent router
    swarm/            # Peer-to-peer handoffs
    evaluator_optimizer/ # Generator + critic loop
    deep_orchestrator/   # Recursive orchestration
  cli/                # CLI implementation
  mcp/                # Full MCP client/server implementation
  executor/           # Temporal + in-process executors
  tracing/            # OpenTelemetry
  human_input/        # Human-in-the-loop signals
  elicitation/        # MCP elicitation support
  oauth/              # OAuth 2.0 for MCP servers
  server/             # FastMCP-compatible server API
  config.py           # YAML-based configuration

Key dependencies

mcp>=1.20.0
pydantic>=2.10.4
pydantic-settings>=2.7.0
typer>=0.15.3
temporalio[opentelemetry]>=1.10.0  # optional
anthropic>=0.48.0                  # optional
openai  / google-genai             # optional

Configuration

YAML-based: mcp_agent.config.yaml (server definitions) and mcp_agent.secrets.yaml (API keys).

Target AI tools

OpenAI, Anthropic, Google Gemini, AWS Bedrock, Ollama, Azure OpenAI, custom providers via AugmentedLLM protocol.

03

Components

mcp-agent (lastmile) — Components

Core classes

Name Purpose
MCPApp Application container; manages server connections, config, context
Agent Agent unit; binds to MCP servers by name; attaches an LLM
AugmentedLLM Abstract base for LLM+MCP integration; generic over message types
Memory[MessageParamT] In-memory message history; generic over provider message types

Workflow pattern classes

Class Pattern Description
Orchestrator orchestrator LLM-directed task decomp → delegate to agents → synthesize
ParallelLLM parallel Fan-out same task to N agents; collect results
Router router Classify intent → route to specialist
EvaluatorOptimizer evaluator-optimizer Generator + critic loop
Swarm swarm Peer-to-peer agent handoffs
DeepOrchestrator deep-orchestrator Orchestrators orchestrating orchestrators

LLM provider adapters

Class Provider
OpenAIAugmentedLLM OpenAI
AnthropicAugmentedLLM Anthropic
GoogleAugmentedLLM Google Gemini
BedrockAugmentedLLM AWS Bedrock
AzureOpenAIAugmentedLLM Azure OpenAI
OllamaAugmentedLLM Ollama

CLI subcommands

Command Purpose
mcp-agent init Scaffold new project
mcp-agent run <app> Run an agent app
mcp-agent deploy <app> Deploy to cloud
mcp-cloud / mcpc Cloud management CLI

Protocols and interfaces

Protocol Purpose
GetFullPlanPrompt Protocol for orchestrator plan prompt
GetIterativePlanPrompt Protocol for iterative plan generation
GetTaskPrompt Protocol for individual task prompts

MCP feature support

Tools ✅, Resources ✅, Prompts ✅, Notifications ✅, OAuth ✅, Sampling ✅, Elicitation ✅, Roots ✅

05

Prompts

mcp-agent (lastmile) — Prompts

Verbatim: Orchestrator plan prompt templates (from orchestrator_prompts.py)

The orchestrator has three prompt templates:

  • FULL_PLAN_PROMPT_TEMPLATE — generates a complete upfront plan
  • ITERATIVE_PLAN_PROMPT_TEMPLATE — generates the next step in an iterative plan
  • SYNTHESIZE_PLAN_PROMPT_TEMPLATE — synthesizes all step results into a final answer
# From orchestrator.py (inferred structure)
class GetFullPlanPrompt(Protocol):
    def __call__(
        self, objective: str, plan_result: PlanResult, agents: List[Agent]
    ) -> str:
        """Get the full plan prompt for the given objective, plan result, and agents"""
        ...

class GetIterativePlanPrompt(Protocol):
    def __call__(
        self, objective: str, plan_result: PlanResult, agents: List[Agent]
    ) -> str:
        """Get the iterative plan prompt for the given objective, plan result, and agents"""
        ...

Technique: Protocol-based prompt injection — users can override the plan prompt functions by providing their own GetFullPlanPrompt implementation. This is dependency injection at the prompt level.

Verbatim: Task prompt template

class GetTaskPrompt(Protocol):
    def __call__(self, objective: str, task: str, context: str) -> str:
        """Get the task prompt for the given objective, task, and context"""
        ...

Technique: Three-part prompt injection (objective + task + context). The context is the accumulated output from prior steps — classic prompt chaining.

AugmentedLLM message history pattern

class Memory(BaseModel, Generic[MessageParamT]):
    # Per-provider typed message history
    # MessageParamT varies: OpenAI ChatCompletionMessageParam,
    #   Anthropic MessageParam, etc.

Technique: Generic typed memory. Each provider gets its own message type; no loss-of-fidelity conversion between provider formats.

LLMS.txt (repo root)

LLMS.txt in the repo root maps to docs.mcp-agent.com/llms.txt — a machine-readable sitemap for AI coding assistants to navigate the documentation.

09

Uniqueness

mcp-agent (lastmile) — Uniqueness

Differs from seeds

This framework occupies a unique position in the entire catalog: it is the most MCP-pure architecture. While ccmemory (seed) ships an MCP server for memory and taskmaster-ai bundles an MCP server for task management, mcp-agent uses MCP as the only protocol for all agent operations — tool calls, resource reads, prompt fetching, agent-to-agent communication, and even human-in-the-loop signals. The Swarm pattern (peer-to-peer agent handoffs with no central coordinator) is the only swarm implementation in the entire batch and has no equivalent in any seed. Temporal durable execution with zero API change is another unique property — no other framework in this batch or the seeds provides transparent local↔distributed switching.

Positioning

mcp-agent sits between "MCP library" (like langchain-mcp-adapters, which just converts MCP tools) and "full agent platform" (like Agno or Hive). It is the only framework in this batch that exposes agents as MCP servers, enabling agent composition at the protocol level.

Distinctive features

  1. Swarm pattern: peer-to-peer agent handoffs — no central coordinator — unique in the batch
  2. Agents as MCP servers: composition at the protocol level rather than at the code level
  3. Temporal durable execution: zero API change to go from local to durable execution
  4. Full MCP spec compliance: sampling, elicitation, OAuth, roots — not just tool calling
  5. Six composable patterns: all patterns from Anthropic's "Building Effective Agents" paper in one library
  6. Generic typed memory: Memory[MessageParamT] preserves provider-native message types without conversion

Observable failure modes

  1. No built-in persistence: memory is session-scoped by default; all history lost on process restart (unless using Temporal)
  2. Last commit 2026-01-25: framework may be in a low-maintenance period; mcp>=1.20.0 requirement may lag newer MCP spec revisions
  3. No built-in guardrails: no PII, injection, or safety gates in the framework itself
  4. No web UI: purely code-and-CLI; higher barrier for non-developers
  5. Context growth: unbounded message history within a session
04

Workflow

mcp-agent (lastmile) — Workflow

Minimal agent workflow

app = MCPApp(name="hello_world")

async def main():
    async with app.run():
        agent = Agent(
            name="finder",
            instruction="Use filesystem and fetch to answer questions.",
            server_names=["filesystem", "fetch"],
        )
        async with agent:
            llm = await agent.attach_llm(OpenAIAugmentedLLM)
            answer = await llm.generate_str("Summarize README.md")
            print(answer)
Phase Artifact
1. Create MCPApp Application context
2. Define Agent with server_names Agent bound to MCP servers
3. Attach LLM provider AugmentedLLM instance
4. Call generate_str / generate LLM response with tool calls

Orchestrator pattern

Phase Artifact
1. Define N specialist agents Agent list
2. Create Orchestrator with agents + LLM Orchestrator instance
3. Call orchestrator.run(objective) Plan object
4. Orchestrator decomposes → delegates → monitors Step results
5. Synthesize final output PlanResult

Evaluator-Optimizer loop

Phase Artifact
1. Define generator agent + evaluator agent Two agents
2. Create EvaluatorOptimizer Loop controller
3. Run with quality threshold Iterations until threshold
4. Return best result Final output

Temporal durable execution

# Same agent code — just swap the executor
app = MCPApp(name="durable_app", executor=TemporalExecutor())

No API changes. Temporal handles pause/resume/retry transparently.

Approval gates

  • human_input module: signal-based pause for human confirmation
  • MCP elicitation protocol: structured interactive prompts to users

Configuration

mcp_agent.config.yaml:

mcp:
  servers:
    filesystem:
      command: "npx"
      args: ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
    fetch:
      command: "uvx"
      args: ["mcp-server-fetch"]
06

Memory Context

mcp-agent (lastmile) — Memory & Context

Memory architecture

Memory[MessageParamT] — generic in-memory message history, typed to the provider's message format.

class Memory(BaseModel, Generic[MessageParamT]):
    # Stores the full conversation history
    # No persistence by default — session-scoped

Session-scoped (no persistence by default)

Message history lives in the AugmentedLLM instance for the duration of the agent context manager. When async with agent: exits, history is discarded.

Temporal durable execution (persistence opt-in)

When using TemporalExecutor, workflow state is persisted by Temporal's workflow history. Interrupted workflows are replayed from last checkpoint — full persistence without code changes.

MCP resources as context

Agents access MCP resources via ReadResourceResult. Resources are loaded on-demand per tool call, not pre-loaded into context.

Cross-session handoff

No native cross-session persistence in the base framework. Temporal provides this when opted in.

Context compaction

Not explicitly handled. Message history grows unbounded within a session; users are responsible for trimming.

Configuration files

  • mcp_agent.config.yaml — MCP server definitions, model settings
  • mcp_agent.secrets.yaml — API keys (gitignored)

These files persist across runs; they are the only project-level artifacts the framework creates.

07

Orchestration

mcp-agent (lastmile) — Orchestration

Multi-agent

Yes. Six composable multi-agent patterns.

Orchestration patterns

Pattern Class Topology
orchestrator Orchestrator hierarchical
parallel ParallelLLM parallel-fan-out
router Router sequential
evaluator-optimizer EvaluatorOptimizer sequential (loop)
swarm Swarm swarm (peer-to-peer)
deep_orchestrator DeepOrchestrator task-decomposition-tree

The Swarm pattern is unique: agents pass context to each other without a central coordinator, making routing decisions themselves based on their capabilities.

Max concurrent agents

Unknown; async Python means N agents can be in flight simultaneously; no hard cap.

Isolation mechanism

None. Agents run in the same Python process. MCP servers are isolated (subprocess).

Multi-model

Yes. Each agent can attach a different AugmentedLLM provider:

agent1_llm = await agent1.attach_llm(OpenAIAugmentedLLM)
agent2_llm = await agent2.attach_llm(AnthropicAugmentedLLM)

Execution mode

one-shot (local) or background-daemon (Temporal). The same code runs in both modes.

Crash recovery

Yes — via Temporal. In-process mode has no crash recovery.

Context compaction

Not built-in.

Consensus

None (no raft/byzantine mechanisms).

Prompt chaining

Yes — orchestrator captures step outputs and feeds them as context to the next step's prompt (explicit chaining via TASK_PROMPT_TEMPLATE).

Agents as MCP servers

Any mcp-agent agent can be exposed as an MCP server (AgentMCPServer), enabling other MCP clients (including other mcp-agent agents) to use it as a tool. This is the composition primitive.

Streaming output

Yes — AugmentedLLM supports streaming generation events via StreamEvent / StreamEventType.

08

Ui Cli Surface

mcp-agent (lastmile) — UI & CLI Surface

CLI binary

mcp-agent — primary CLI (installed as package script).

Key subcommands:

  • mcp-agent init — scaffold a new project (interactive)
  • mcp-agent run <app> — run an agent application
  • mcp-agent deploy <app> — deploy to LastMile cloud
  • mcp-agent describe — describe available agents/workflows

mcp-cloud / mcpc — cloud management CLI (separate entry point).

Quick start:

uvx mcp-agent init
uv init && uv add "mcp-agent[openai]"
uv run main.py

Local UI

None. mcp-agent is a library + CLI; no web dashboard or TUI.

Rich terminal output

Uses rich library for formatted terminal output during agent runs. The prompt-toolkit library provides interactive input in the CLI.

IDE integration

  • LLMS.txt in repo root maps to full documentation for AI coding assistants
  • docs.mcp-agent.com/mcp exposes documentation as an MCP server
  • docs.mcp-agent.com/llms-full.txt — full docs in LLM-readable format

Observability

  • OpenTelemetry — built-in via opentelemetry-instrumentation-anthropic and opentelemetry-instrumentation-openai
  • tracing/ module — semconv constants + telemetry utilities
  • Span tracing — each agent run, tool call, and LLM call emits OTEL spans with GEN_AI_* semantic conventions

Cloud deployment

LastMile cloud (mcp-cloud deploy) deploys agents as MCP servers. Used to build ChatGPT apps and other MCP-native applications.

Related frameworks

same archetype · same primary tool · same memory type

Claude-Flow / Ruflo ★ 55k

Eliminates single-agent context limits and sequential bottlenecks by orchestrating fault-tolerant swarms of specialized AI agents…

Hermes Agent (NousResearch) ★ 168k

Self-improving personal AI agent with closed learning loop, 7 terminal backends, and messaging gateway — not tied to any AI…

OpenCode ★ 165k

Terminal-first AI coding agent with multi-model routing, native desktop app, and a typed .opencode/ configuration system for…

OpenHands ★ 75k

Open-source AI software development platform (open-source Devin alternative) with Docker sandbox isolation, 77.6% SWE-bench…

DeerFlow ★ 70k

Long-horizon superagent that researches, codes, and creates by orchestrating parallel sub-agents with isolated contexts in Docker…

oh-my-openagent (omo) ★ 60k

Multi-provider AI agent orchestration for OpenCode: escape vendor lock-in by routing Sisyphus (Claude/Kimi/GLM) and Hephaestus…