ContextForge

contextforge · IBM/mcp-context-forge · ★ 3.8k · last commit 2026-05-26

Primitive shape 4 total
Commands 4
00

Summary

ContextForge — Summary

ContextForge is an IBM-backed open-source MCP registry, API gateway, and proxy that federates any number of MCP servers, A2A agents, REST APIs, and gRPC services behind a single endpoint. It delivers centralized governance, discovery, and observability via a FastAPI-based server with an HTMX Admin UI, installable from PyPI as mcp-contextforge-gateway or as a Docker/OCI container image. Its plugin system of 40+ plugins (content moderation, PII guardian, circuit-breaker, TOON compression, deny-filter, etc.) makes it the most feature-dense MCP gateway in the corpus. OpenTelemetry tracing supports Phoenix (LLM-specific), Jaeger, Zipkin, and any OTLP backend, so traces can feed an existing production observability stack. Auth includes JWT (RS256/HS256), OAuth tokens, RBAC with team scoping, and rate-limiting with retries.

Compared to seeds: unlike taskmaster-ai (single MCP bundler) or claude-flow (MCP-anchored workflow engine), ContextForge is a production-grade API gateway first. It does not generate code or orchestrate LLM tasks; it governs and proxies existing tool/agent infrastructure. It is closer in spirit to an enterprise service mesh for AI and differs from all 11 seeds in being deployment infrastructure rather than a coding methodology.

01

Overview

ContextForge — Overview

Origin

Developed by IBM (Mihai Criveti et al.) and first published in 2025. Actively maintained, with 3,770+ stars and 670+ forks as of 2026-05-26. Published on PyPI as mcp-contextforge-gateway under the Apache-2.0 license.

Philosophy

"An open source registry and proxy that federates tools, agents, and APIs into one clean endpoint for your AI clients. It provides centralized governance, discovery, and observability across your AI infrastructure."

The core philosophy is that MCP's decentralized model creates operational chaos at enterprise scale. ContextForge inserts a control plane between AI clients and the fragmented tool landscape: one endpoint, one auth surface, one observability pipe.

Design Principles

  • Federation first: any number of MCP servers, REST APIs, gRPC services, and A2A agents can be registered as virtual servers exposing unified MCP-compatible tools.
  • Plugin-extensibility: the 40+ plugin system (cpex package) lets operators inject content moderation, PII filtering, SQL sanitization, TOON compression, rate-limiting, audit trails, and more without code changes.
  • Transport agnosticism: HTTP, JSON-RPC, WebSocket, SSE, stdio, and streamable-HTTP all supported simultaneously.
  • Airgapped-capable: the Admin UI bundles HTMX 2.0.3 locally so it operates without CDN dependencies.
  • Vendor-agnostic observability: OpenTelemetry traces with multiple backends; zero overhead when disabled.

Key Manifesto Quote

From README: "ContextForge is an open source registry and proxy that federates any Model Context Protocol (MCP) server, A2A server, or REST/gRPC API, providing centralized governance, discovery, and observability. It optimizes agent and tool calling, and supports plugins."

02

Architecture

ContextForge — Architecture

Distribution

  • PyPI: pip install mcp-contextforge-gateway / uvx --from mcp-contextforge-gateway mcpgateway
  • Docker/OCI: ghcr.io/ibm/mcp-context-forge
  • Kubernetes: Helm charts in charts/ with Redis-backed federation
  • Ansible: ansible/ playbooks for deployment automation

Install Complexity

Multi-step. Requires Python ≥ 3.11, .env configuration (from .env.example), and optional Redis for multi-cluster federation.

CLI Binary

mcpgateway: a thin wrapper around Uvicorn that serves the FastAPI app. The entry point is registered in pyproject.toml as mcpgateway.cli:main. Default bind address: 127.0.0.1:4444.

Also ships: mcpgateway.translate (stdio-to-SSE bridge), mcpgateway.wrapper (stdio MCP client wrapper), mcpgateway.utils.create_jwt_token (JWT generation utility).

Admin UI

HTMX 2.0.3 + Alpine.js web dashboard bundled inside the Python package. Available at http://localhost:4444/admin when MCPGATEWAY_UI_ENABLED=true. Features: real-time log viewer with filtering/search/export, tool catalog management, virtual server configuration, user management.

Directory Structure

mcpgateway/         # Core FastAPI application
  main.py           # Application entry point
  config.py         # Pydantic Settings configuration
  db.py             # SQLAlchemy ORM models
  schemas.py        # Pydantic validation schemas
  services/         # Business logic (50+ services)
  routers/          # HTTP endpoints (19 routers)
  middleware/       # Cross-cutting concerns (16 middleware)
  transports/       # SSE, WebSocket, stdio, streamable-HTTP
  plugins/          # Plugin integration via cpex package
  alembic/          # Database migrations
  admin_ui/         # HTMX+Alpine.js front-end bundle
plugins/            # 40+ plugin implementations (Python)
plugins_rust/       # Rust plugins for perf-critical paths
a2a-agents/         # A2A agent implementations
mcp-servers/        # MCP server templates
crates/             # Rust crates (mcp_runtime, wrapper)
charts/             # Helm charts
ansible/            # Infrastructure automation
tests/              # 7,000+ test suite

Required Runtime

  • Python ≥ 3.11
  • SQLAlchemy (SQLite default; Postgres supported)
  • Redis (optional, for multi-cluster federation)
  • OpenTelemetry SDK (optional, for tracing)

Target AI Tools

Any MCP-compatible client (Claude Code, Claude Desktop, Cursor, etc.). Also exposes A2A endpoints for OpenAI and Anthropic agent protocols.

03

Components

ContextForge — Components

CLI Commands / Binaries

| Name | Purpose |
|------|---------|
| mcpgateway | Main server CLI; starts FastAPI/Uvicorn on port 4444 |
| mcpgateway.translate | Bridges stdio MCP servers to SSE/HTTP endpoints |
| mcpgateway.wrapper | stdio MCP client wrapper for use in MCP Inspector |
| mcpgateway.utils.create_jwt_token | Generates JWT tokens for API access |
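
The source does not show how the token utility works internally. As a rough illustration of what an HS256-signed token carrying a `teams` claim looks like, here is a stdlib-only sketch. The helper names `make_hs256_token` and `verify_hs256_token`, the secret, and the claim values are all hypothetical; this is not the `mcpgateway.utils.create_jwt_token` implementation, and it omits expiry handling.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT compact form requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _sign(signing_input: str, secret: str) -> str:
    """HMAC-SHA256 over the 'header.payload' string."""
    digest = hmac.new(secret.encode(), signing_input.encode("ascii"), hashlib.sha256).digest()
    return _b64url(digest)


def make_hs256_token(claims: dict, secret: str) -> str:
    """Build a compact HS256 JWT (hypothetical helper, illustration only)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return f"{signing_input}.{_sign(signing_input, secret)}"


def verify_hs256_token(token: str, secret: str) -> dict:
    """Check the signature and return the claims; raise ValueError on mismatch."""
    signing_input, _, signature = token.rpartition(".")
    if not hmac.compare_digest(signature, _sign(signing_input, secret)):
        raise ValueError("bad signature")
    payload = signing_input.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


token = make_hs256_token({"sub": "admin@example.com", "teams": ["t1"]}, "dev-secret")
claims = verify_hs256_token(token, "dev-secret")
```

A production deployment would use the shipped utility or a maintained JWT library rather than hand-rolled signing.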

Core Modules (19 routers)

Routers include: tools, prompts, resources, servers (virtual), auth, health, metrics, admin, federation, llm-providers, and transport-specific endpoints.

Plugin System (40+ plugins)

Each plugin is a directory in plugins/ implementing the cpex plugin interface. Key plugins:

| Plugin | Purpose |
|--------|---------|
| ai_artifacts_normalizer | Normalizes AI-generated artifact formats |
| argument_normalizer | Normalizes tool call arguments |
| cached_tool_result | Caches tool results to reduce redundant calls |
| circuit_breaker | Prevents cascading failures on upstream tools |
| citation_validator | Validates citations in LLM outputs |
| code_formatter | Auto-formats code in tool outputs |
| code_safety_linter | Safety linting on code generation outputs |
| content_moderation | Content moderation for tool inputs/outputs |
| deny_filter | Blocks specific tool calls or content patterns |
| file_type_allowlist | Restricts file operations to allowed types |
| harmful_content_detector | Detects harmful content in LLM responses |
| header_filter | Filters/modifies HTTP headers |
| header_injector | Injects headers into upstream requests |
| html_to_markdown | Converts HTML tool outputs to Markdown |
| json_repair | Repairs malformed JSON from tool calls |
| jwt_claims_extraction | Extracts JWT claims for RBAC decisions |
| license_header_injector | Injects license headers into code outputs |
| markdown_cleaner | Cleans Markdown formatting from outputs |
| output_length_guard | Enforces maximum output length limits |
| privacy_notice_injector | Injects privacy notices into responses |
| regex_filter | Regex-based content filtering |
| resource_filter | Filters accessible resources |
| response_cache_by_prompt | Caches responses keyed by prompt hash |
| robots_license_guard | Enforces robots.txt and license constraints |
| safe_html_sanitizer | Sanitizes HTML in tool outputs |
| schema_guard | Validates tool inputs against JSON schema |
| span_attribute_customizer | Customizes OTel span attributes |
| sparc_static_validator | SPARC methodology static validation |
| sql_sanitizer | Sanitizes SQL queries before execution |
| summarizer | Summarizes lengthy tool outputs |
| timezone_translator | Translates timezone representations |
| tools_telemetry_exporter | Exports tool usage telemetry |
| toon_encoder | TOON (Token-Oriented Object Notation) encoding to reduce token usage |
| unified_pdp | Unified Policy Decision Point |
| vault | HashiCorp Vault integration for secrets |
| virus_total_checker | VirusTotal scanning for file uploads |
| watchdog | Monitors agent/tool health |
| webhook_notification | Sends webhook notifications on events |
| altk_json_processor | ALTK JSON processing |

Services (50+)

Business logic services covering: tool registration, server federation, prompt management, resource management, authentication/JWT, RBAC, rate limiting, caching, health checks, audit logging.

Middleware (16 layers)

Request logging, authentication, CORS, rate limiting, error handling, OpenTelemetry instrumentation.

Transports

SSE (with configurable keepalive), WebSocket, stdio, streamable-HTTP, JSON-RPC over HTTP.

Observability

mcpgateway/observability.py — vendor-agnostic OpenTelemetry instrumentation. Supports: Phoenix (Arize), Jaeger, Zipkin, Tempo, DataDog, New Relic, any OTLP backend. Zero overhead when disabled.

05

Prompts

ContextForge — Prompts

ContextForge is a gateway/proxy infrastructure tool, not an LLM prompting framework. It ships prompt-related infrastructure (a Prompts Registry with Jinja2 templates, multimodal support, and versioning) but does not include agent-instruction prompt files in the traditional sense.

AGENTS.md (Developer Instructions)

The AGENTS.md file provides instructions for AI coding assistants working on the ContextForge codebase itself. This is the closest thing to a "prompt file":

## Project Overview

ContextForge is an open source registry and proxy that federates MCP, A2A, and REST/gRPC APIs
with centralized governance, discovery, and observability. It federates tools, agents, and APIs,
optimizes agent and tool calling, and supports plugins, auth/RBAC, rate-limiting, virtual servers,
multi-transport protocols, and an optional Admin UI.

## Authentication & RBAC Overview

ContextForge implements a **two-layer security model**:

1. **Token Scoping (Layer 1)**: Controls what resources a user CAN SEE (data filtering)
2. **RBAC (Layer 2)**: Controls what actions a user CAN DO (permission checks)

### Token Scoping Quick Reference

**API / legacy tokens** — JWT `teams` claim is the sole authority:

| JWT `teams` State | `is_admin: true` | `is_admin: false` |
|-------------------|------------------|-------------------|
| Key MISSING       | PUBLIC-ONLY []   | PUBLIC-ONLY []    |
| `teams: null`     | ADMIN BYPASS     | PUBLIC-ONLY []    |
| `teams: []`       | PUBLIC-ONLY []   | PUBLIC-ONLY []    |
| `teams: ["t1"]`   | Team + Public    | Team + Public     |

Prompting technique: Architectural reference document with decision table. Designed to prevent AI assistants from misimplementing the RBAC model by showing all state combinations explicitly.
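
The decision table above can be read as a small pure function. The sketch below encodes the eight cells directly. `resolve_team_scope` is a hypothetical name, and the return convention (`None` for admin bypass, a list of team IDs otherwise, empty list for public-only) is an assumption made for illustration, not the project's actual code.

```python
from typing import List, Optional

_MISSING = object()  # sentinel: tells an absent key apart from `teams: null`


def resolve_team_scope(claims: dict) -> Optional[List[str]]:
    """Map JWT claims to visibility, following the table row by row.

    Returns None for ADMIN BYPASS (no team filtering), otherwise the list of
    team IDs visible in addition to public resources ([] = public-only).
    """
    teams = claims.get("teams", _MISSING)
    is_admin = claims.get("is_admin", False) is True

    if teams is _MISSING:
        return []  # key missing: public-only, even for admins
    if teams is None:
        return None if is_admin else []  # explicit null: bypass only for admins
    return list(teams)  # [] stays public-only; ["t1"] means team + public
```

The detail worth noticing is that a missing key and an explicit `null` are treated differently, which is why the sketch uses a sentinel instead of relying on `claims.get("teams")` returning `None` for both.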

CLAUDE.md (Developer Agent Instructions)

AGENTS.md

(.claude/CLAUDE.md is a symlink to the root AGENTS.md, so all AI assistants share one source of truth.)

Prompts Registry (Runtime Feature)

ContextForge ships a PromptService that manages Jinja2 prompt templates for use by MCP clients. Prompts are versioned, support rollback, and can include multimodal content. This is infrastructure for storing and serving prompts to LLMs — not the framework's own agent instructions.

Plugin AGENTS.md (Plugin Developer Instructions)

# plugins/AGENTS.md

Plugin framework overview. Plugins are Python packages installed via `pip install <plugin-name>`.
Each plugin implements:
- `pre_call_hook(request) -> request`: Transform/filter before upstream
- `post_call_hook(response) -> response`: Transform/filter after upstream
- `metadata()`: Name, version, description, configuration schema

Prompting technique: Contract-first specification with interface requirements — tells AI assistants exactly what a plugin must implement without prose explanation.
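
The three-method contract quoted above could be expressed as a structural type plus one toy plugin. This is a sketch under assumptions: `GatewayPlugin`, `OutputLengthGuardDemo`, and the dict-shaped request/response are hypothetical, and the real cpex interface may use different signatures or types.

```python
from typing import Any, Dict, Protocol


class GatewayPlugin(Protocol):
    """Structural type mirroring the three methods listed above (hypothetical)."""

    def pre_call_hook(self, request: Dict[str, Any]) -> Dict[str, Any]: ...

    def post_call_hook(self, response: Dict[str, Any]) -> Dict[str, Any]: ...

    def metadata(self) -> Dict[str, Any]: ...


class OutputLengthGuardDemo:
    """Toy plugin: passes requests through and truncates response text."""

    def __init__(self, max_chars: int = 20) -> None:
        self.max_chars = max_chars

    def pre_call_hook(self, request: Dict[str, Any]) -> Dict[str, Any]:
        return request  # nothing to change before the upstream call

    def post_call_hook(self, response: Dict[str, Any]) -> Dict[str, Any]:
        text = str(response.get("text", ""))
        return {**response, "text": text[: self.max_chars]}

    def metadata(self) -> Dict[str, Any]:
        return {
            "name": "output_length_guard_demo",
            "version": "0.0.1",
            "description": "Truncates response text to max_chars",
            "config_schema": {"max_chars": "int"},
        }


plugin: GatewayPlugin = OutputLengthGuardDemo(max_chars=5)
result = plugin.post_call_hook({"text": "hello world", "tool": "echo"})
```

Using a `Protocol` keeps the example close to the quoted contract: any class with these three methods satisfies it, with no base class required.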

09

Uniqueness

ContextForge — Uniqueness & Positioning

Differs from Seeds

ContextForge is the only framework in this corpus that operates as a production enterprise API gateway for MCP/A2A infrastructure. Closest seed is taskmaster-ai (also MCP-anchored) but taskmaster bundles one MCP server for task management — ContextForge federates N upstream MCP servers behind a single authenticated endpoint with RBAC, rate-limiting, audit logging, and 40+ plugins. Unlike claude-flow which uses MCP as an execution substrate for agent coordination, ContextForge uses MCP as a protocol standard to unify disparate tool backends. Unlike all 11 seeds, ContextForge is not a coding methodology or agent harness — it is deployment infrastructure that any AI client uses passively.

Unique Characteristics

  1. 40+ plugin pipeline: No other framework in the corpus ships anything close to this scope of content-moderation, safety, PII, compression, and observability plugins for MCP tool calls.
  2. TOON compression: the toon_encoder plugin re-encodes tool responses in TOON (Token-Oriented Object Notation), automatically reducing their token cost.
  3. gRPC-to-MCP translation: Automatic service discovery via gRPC reflection, exposing gRPC services as MCP tools.
  4. Two-layer RBAC: Token scoping (what resources you can see) + permission RBAC (what actions you can take) — enterprise-grade access control not seen in other MCP servers.
  5. Airgapped Admin UI: HTMX+Alpine.js bundled locally, no CDN dependencies — suitable for air-gapped enterprise environments.
  6. IBM backing: Corporate maintenance, security scanning (Snyk, detect-secrets, whitesource), and 7,000-test suite imply production readiness.
  7. Multi-transport simultaneous support: SSE, WebSocket, stdio, HTTP, streamable-HTTP all exposed at once.

Positioning

Enterprise MCP control plane. Solves the problem that MCP servers are decentralized and unmanaged in team environments. ContextForge is the "Nginx/Kong for MCP" — insert it between AI clients and your tool ecosystem.

Observable Failure Modes

  • Complexity cliff: The 40+ plugin system, 50+ services, and multi-transport architecture make ContextForge significantly harder to operate than simpler alternatives. Misconfigured plugins could silently corrupt tool outputs.
  • Single point of failure: Without HA/Redis federation, the gateway is a SPOF for all connected AI tools.
  • Plugin ordering sensitivity: The sequential plugin pipeline means ordering matters; documentation on plugin interaction contracts is sparse.
  • Python-first distribution: the gateway ships as a PyPI package and container image, with no standalone native (Rust/Go) gateway binary, so Python startup latency may be an issue in serverless environments.

Cross-References

  • Competes with: Agentgateway, Plano (both are AI proxy/gateway frameworks targeting MCP)
  • Inspired by: Kong, Nginx, AWS API Gateway (classical API gateway patterns applied to MCP)

04

Workflow

ContextForge — Workflow

ContextForge is an infrastructure gateway, not a coding-workflow methodology. Its operational workflow is deployment and configuration, not agent task execution.

Deployment Workflow

| Phase | Artifact |
|-------|----------|
| Install | pip install mcp-contextforge-gateway or docker pull ghcr.io/ibm/mcp-context-forge |
| Configure | .env file from .env.example: set JWT secret, DB URL, Redis URL, admin credentials |
| Start server | mcpgateway --host 0.0.0.0 --port 4444 |
| Register servers | POST to /api/servers to register MCP/REST/gRPC backends |
| Create virtual servers | Bundle registered tools into virtual MCP servers via Admin UI or API |
| Configure plugins | plugins/config.yaml to enable/chain content moderation, PII, circuit-breaker, etc. |
| Connect clients | Point MCP clients at http://localhost:4444/mcp with Bearer token auth |
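
For the final "Connect clients" phase, a client request might look like the sketch below, which builds (but does not send) a JSON-RPC call to the `/mcp` endpoint with a Bearer token. `build_mcp_request` is a hypothetical helper and the token is a placeholder. `tools/list` is a standard MCP method, and the `Accept` header follows the MCP streamable-HTTP convention, but whether this gateway requires it is an assumption.

```python
import json
import urllib.request


def build_mcp_request(
    base_url: str, token: str, method: str, params: dict, request_id: int = 1
) -> urllib.request.Request:
    """Assemble a JSON-RPC 2.0 POST for the gateway's /mcp endpoint."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/mcp",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


req = build_mcp_request("http://localhost:4444", "<your-jwt>", "tools/list", {})
payload = json.loads(req.data)
# urllib.request.urlopen(req) would send it; omitted so the example runs offline.
```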

Gateway Request Flow

  1. Client sends MCP JSON-RPC request with Bearer token
  2. JWT middleware validates token, extracts team scopes (RBAC Layer 1 — token scoping)
  3. RBAC middleware checks action permissions (Layer 2 — permission checks)
  4. Rate-limiting middleware checks per-user/team limits
  5. Plugin pipeline executes (pre-call plugins: schema guard, deny filter, PII, etc.)
  6. Request proxied to registered upstream server/tool
  7. Plugin pipeline on response (post-call: content moderation, output length guard, TOON encoder, etc.)
  8. OpenTelemetry spans emitted
  9. Audit log entry written to DB
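
The ordering of the steps above implies that a rejection at an early stage prevents the upstream call entirely. A minimal sketch of that short-circuit behaviour follows. All names (`Rejected`, `validate_token`, `check_permission`, `call_upstream`, `handle`) and the hard-coded checks are hypothetical stand-ins; rate limiting, plugins, tracing, and audit logging are left out for brevity.

```python
from typing import Callable, Dict, List

Request = Dict[str, object]
Stage = Callable[[Request], Request]


class Rejected(Exception):
    """Raised by any stage to stop processing before the upstream call."""


def validate_token(req: Request) -> Request:
    # Stand-in for JWT validation and team-scope extraction (step 2).
    if req.get("token") != "valid-token":
        raise Rejected("401 invalid token")
    return {**req, "teams": ["t1"]}


def check_permission(req: Request) -> Request:
    # Stand-in for the RBAC permission check (step 3).
    if req.get("method") == "admin/delete":
        raise Rejected("403 forbidden")
    return req


def call_upstream(req: Request) -> Request:
    # Stand-in for proxying to the registered upstream (step 6).
    return {**req, "result": f"handled {req['method']}"}


def handle(req: Request, stages: List[Stage]) -> Request:
    """Run stages in order, recording which ones completed."""
    trace: List[str] = []
    try:
        for stage in stages:
            req = stage(req)
            trace.append(stage.__name__)
    except Rejected as exc:
        return {"error": str(exc), "trace": trace}
    return {**req, "trace": trace}


STAGES = [validate_token, check_permission, call_upstream]
ok = handle({"token": "valid-token", "method": "tools/call"}, STAGES)
denied = handle({"token": "valid-token", "method": "admin/delete"}, STAGES)
```

In the denied case the trace contains only `validate_token`, showing that the upstream stage never ran.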

Approval Gates

None — ContextForge is an infrastructure layer. All decisions (allow/deny) are policy-enforced automatically via plugins and RBAC, not via interactive human gates.

Admin UI Workflow

  • Real-time log viewer: filter, search, export request/response logs
  • Tool catalog management: register, update, disable tools
  • Virtual server builder: bundle tools into virtual MCP servers
  • User/team management: create tokens with team scopes
  • Health dashboard: per-server health status

Kubernetes/HA Workflow

Redis-backed federation allows multiple ContextForge instances to share state. Helm chart in charts/ enables Kubernetes deployment with replica sets.

06

Memory Context

ContextForge — Memory & Context

State Storage

Primary: SQLAlchemy ORM with SQLite (default) or PostgreSQL. Stores: registered servers, tools, prompts, resources, virtual server definitions, user accounts, team memberships, audit records, plugin configurations.

Caching: Redis (optional) for multi-cluster federation and shared caching. Also mcpgateway/cache/ directory for local caching layers.

Schema Management: Alembic migrations in mcpgateway/alembic/ for database schema evolution.

State Files

  • mcpgateway.db — SQLite database (default location, configurable via DATABASE_URL)
  • .env — runtime configuration
  • plugins/config.yaml — plugin pipeline configuration
  • plugins/plugin_parity_config.yaml — plugin parity testing configuration

Session / Conversation State

ContextForge does not manage conversation state itself — it proxies requests to upstream MCP servers that may maintain their own session state. The gateway is stateless at the request level but maintains persistent registry state.

Memory Persistence

  • Registry data: persistent across restarts (SQLite/Postgres)
  • Audit logs: persistent in database
  • Rate limit counters: in-memory (Redis-backed in HA mode)
  • OpenTelemetry spans: exported to external backends (Jaeger, Phoenix, etc.) — not stored locally
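
In-memory rate-limit counters have a practical consequence: they reset on restart and are not shared between instances unless backed by Redis. The sketch below shows one simple way such a counter could work, a fixed-window limiter keyed by user. The source does not say which algorithm the gateway uses, so the fixed-window choice and the `FixedWindowLimiter` class are assumptions for illustration.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Per-key request counter held in process memory (illustrative only)."""

    def __init__(
        self,
        limit: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the demo below is deterministic
        # Old windows are never evicted here; a real limiter would expire them.
        self._counts: Dict[Tuple[str, int], int] = defaultdict(int)

    def allow(self, key: str) -> bool:
        """Return True and count the request if the key is under its limit."""
        bucket = int(self.clock() // self.window)
        slot = (key, bucket)
        if self._counts[slot] >= self.limit:
            return False
        self._counts[slot] += 1
        return True


now = [0.0]  # fake clock value, advanced manually
limiter = FixedWindowLimiter(limit=2, window_seconds=60, clock=lambda: now[0])
first_window = [limiter.allow("alice") for _ in range(3)]
now[0] = 61.0  # move into the next window
next_window = limiter.allow("alice")
```

Because the counts live in a plain dict, a second gateway process would keep its own independent counts, which is the gap the Redis-backed HA mode is described as closing.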

Context Compaction

Not applicable. ContextForge is infrastructure, not an LLM orchestrator.

Cross-Session Handoff

Not applicable at the agent level. The registry and tool catalog persist indefinitely, so any new AI client connecting gets the same tool landscape as previous clients.

Prompt Registry (State)

The PromptService stores Jinja2 templates in the database with:

  • Version history and rollback capability
  • Multimodal content support
  • Tag-based categorization
  • Team-scoped access control

07

Orchestration

ContextForge — Orchestration

Multi-Agent Support

ContextForge does not orchestrate agents. It provides the infrastructure that multi-agent systems connect to. An A2A (Agent-to-Agent) gateway module enables agents to discover and communicate with other agents registered in the system.

Execution Mode

Event-driven (background daemon). The mcpgateway process runs continuously as an HTTP server, responding to MCP JSON-RPC requests from connected clients.

Orchestration Pattern

None — ContextForge is a proxy/gateway. It does not sequence or parallelize agent tasks. Upstream LLM clients or orchestrators make sequencing decisions; ContextForge handles routing, auth, and policy enforcement on each individual request.

Multi-Model Routing

Not directly. ContextForge includes llm_provider_configs.py and llm_schemas.py, which suggests some LLM provider routing capability, but the primary value is MCP tool routing, not LLM routing (unlike Plano).

Isolation Mechanism

None at the tool execution level. Each tool call is proxied to its registered upstream server; isolation is the upstream server's responsibility.

A2A Gateway

The Agent-to-Agent (A2A) module enables:

  • OpenAI-compatible agent routing
  • Anthropic agent routing
  • External AI agent integration
  • Agent capability discovery

Federation / HA

Redis-backed multi-cluster federation allows multiple ContextForge instances to share the tool registry and distribute load. This is the closest to a distributed orchestration pattern — but it is registry federation, not workflow orchestration.

Plugin Pipeline as Sequential Orchestration

The plugin pipeline (pre-call → upstream → post-call) is a sequential transformation chain applied to every request. Plugins run in configured order within a single request lifecycle.
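
Because the chain is sequential, the configured order can change the result. The sketch below uses two toy transforms (an email redactor and a truncator, both hypothetical and unrelated to the shipped plugins) to show the same input producing different outputs depending on order.

```python
import re
from typing import Callable, List

Transform = Callable[[str], str]


def redact_email(text: str) -> str:
    """Replace anything that looks like a complete email address."""
    return re.sub(r"[\w.+-]+@[\w-]+\.[\w.-]+", "[redacted]", text)


def truncate(limit: int) -> Transform:
    """Return a transform that keeps only the first `limit` characters."""
    return lambda text: text[:limit]


def run_chain(text: str, chain: List[Transform]) -> str:
    """Apply each transform in order, feeding each output into the next."""
    for step in chain:
        text = step(text)
    return text


raw = "contact: ada@example.com for details"
redact_then_truncate = run_chain(raw, [redact_email, truncate(19)])
truncate_then_redact = run_chain(raw, [truncate(19), redact_email])
```

Redacting first yields `contact: [redacted]`, while truncating first cuts the address to `ada@exampl`, which no longer matches the pattern and passes through unredacted. This is the kind of interaction that makes plugin ordering worth testing explicitly.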

Consensus Mechanism

None. Registry state is single-writer with Redis cache invalidation for HA.

08

Ui Cli Surface

ContextForge — UI/CLI Surface

CLI Binary

  • Name: mcpgateway
  • Type: Thin Uvicorn wrapper (not a standalone runtime)
  • Entry point: mcpgateway.cli:main
  • Default: mcpgateway --host 127.0.0.1 --port 4444

Also ships:

  • python -m mcpgateway.translate — stdio-to-SSE/HTTP bridge
  • python -m mcpgateway.wrapper — stdio MCP client
  • python -m mcpgateway.utils.create_jwt_token — JWT generation

Subcommands: None (flags are passed through to Uvicorn: --reload, --workers, --ssl-keyfile, etc.)

Local Web Dashboard

  • Exists: Yes
  • Type: Web dashboard (Admin UI)
  • Port: 4444 (configurable via MCG_PORT)
  • URL: http://localhost:4444/admin
  • Tech Stack: HTMX 2.0.3 + Alpine.js (bundled, airgapped-capable)

Features:

  • Real-time log viewer with filtering, search, and export
  • Tool catalog management (CRUD for registered tools/servers)
  • Virtual server builder (bundle tools into virtual MCP servers)
  • User and team management (JWT token issuance with scope)
  • Health dashboard with per-upstream-server status
  • gRPC-to-MCP translation configuration
  • Plugin pipeline management

IDE Integration

None directly. Integrates with any MCP-compatible IDE through the MCP protocol.

Observability Stack

  • OpenTelemetry: Full OTel SDK integration in mcpgateway/observability.py
  • Supported backends: Phoenix (Arize), Jaeger, Zipkin, Tempo, DataDog, New Relic, any OTLP
  • Traces: Per-tool-call spans with LLM-specific metrics (token usage, cost, model performance)
  • Metrics: Prometheus-compatible health endpoints
  • Logs: Structured JSON logs with configurable sampling rate and skip patterns
  • Audit: governance_audit_log-style DB records (via JdbcAuditSink pattern)

API Surface

  • MCP JSON-RPC: POST /mcp (MCP 2025-11-25 compliant)
  • SSE transport: GET /sse, POST /messages
  • Admin REST API: 19 routers covering all CRUD operations
  • Health: GET /health
  • Metrics: GET /metrics
  • A2A: A2A protocol endpoints for agent-to-agent routing

Development UX

  • make dev — dev server with autoreload on :8000
  • make serve — production gunicorn on :4444
  • make autoflake isort black pre-commit — code quality pipeline
  • make detect-secrets-scan — secret detection baseline management
  • 7,000+ test suite with Makefile targets for individual test categories
