ContextForge

contextforge · IBM/mcp-context-forge · ★ 3.8k · last commit 2026-05-26

Summary

ContextForge — Summary

ContextForge is an IBM-backed open-source MCP registry, API gateway, and proxy that federates any number of MCP servers, A2A agents, REST APIs, and gRPC services behind a single endpoint. It provides centralized governance, discovery, and observability through a FastAPI-based server with an HTMX Admin UI, and is installable from PyPI as mcp-contextforge-gateway or as a Docker/container image. Its plugin architecture, with 40+ plugins (content moderation, PII guardian, circuit breaker, TOON compression, deny filter, and others), makes it the most feature-dense MCP gateway in the corpus. OpenTelemetry tracing supports Phoenix (LLM-specific), Jaeger, Zipkin, and any OTLP backend, which makes it suitable for production observability. Auth covers JWT (RS256/HS256), OAuth tokens, RBAC with team scoping, and rate limiting with retries.

Compared to seeds: unlike taskmaster-ai (single MCP bundler) or claude-flow (MCP-anchored workflow engine), ContextForge is a production-grade API gateway first. It does not generate code or orchestrate LLM tasks; it governs and proxies existing tool/agent infrastructure. It is closer in spirit to an enterprise service mesh for AI, and differs from all 11 seeds in being deployment infrastructure rather than a coding methodology.

Overview

ContextForge — Overview

Origin

Developed by IBM (Mihai Criveti et al.), first published in 2025, and actively maintained, with 3,770+ stars and 670+ forks as of 2026-05-26. Published on PyPI as mcp-contextforge-gateway. Apache-2.0 licensed.

Philosophy

"An open source registry and proxy that federates tools, agents, and APIs into one clean endpoint for your AI clients. It provides centralized governance, discovery, and observability across your AI infrastructure."

The core philosophy is that MCP's decentralized model creates operational chaos at enterprise scale. ContextForge inserts a control plane between AI clients and the fragmented tool landscape: one endpoint, one auth surface, one observability pipe.

Design Principles

Federation first: any number of MCP servers, REST APIs, gRPC services, and A2A agents can be registered as virtual servers exposing unified MCP-compatible tools.
Plugin extensibility: the plugin system (cpex package, 40+ plugins) lets operators inject content moderation, PII filtering, SQL sanitization, TOON compression, rate limiting, audit trails, and more without code changes.
Transport agnosticism: HTTP, JSON-RPC, WebSocket, SSE, stdio, and streamable-HTTP are all supported simultaneously.
Airgapped-capable: the Admin UI bundles HTMX 2.0.3 locally so it operates without CDN dependencies.
Vendor-agnostic observability: OpenTelemetry traces with multiple backends; zero overhead when disabled.

Key Manifesto Quote

From README: "ContextForge is an open source registry and proxy that federates any Model Context Protocol (MCP) server, A2A server, or REST/gRPC API, providing centralized governance, discovery, and observability. It optimizes agent and tool calling, and supports plugins."

Architecture

ContextForge — Architecture

Distribution

PyPI: pip install mcp-contextforge-gateway / uvx --from mcp-contextforge-gateway mcpgateway
Docker/OCI: ghcr.io/ibm/mcp-context-forge
Kubernetes: Helm charts in charts/ with Redis-backed federation
Ansible: ansible/ playbooks for deployment automation

Install Complexity

Multi-step. Requires Python ≥ 3.11, .env configuration (from .env.example), and optional Redis for multi-cluster federation.

CLI Binary

mcpgateway: a thin wrapper around Uvicorn that serves the FastAPI app. The entry point is registered in pyproject.toml as mcpgateway.cli:main. Default bind address: 127.0.0.1:4444.

Also ships: mcpgateway.translate (stdio-to-SSE bridge), mcpgateway.wrapper (stdio MCP client wrapper), mcpgateway.utils.create_jwt_token (JWT generation utility).

Admin UI

HTMX 2.0.3 + Alpine.js web dashboard bundled inside the Python package. Available at http://localhost:4444/admin when MCPGATEWAY_UI_ENABLED=true. Features: real-time log viewer with filtering/search/export, tool catalog management, virtual server configuration, user management.

Directory Structure

mcpgateway/         # Core FastAPI application
  main.py           # Application entry point
  config.py         # Pydantic Settings configuration
  db.py             # SQLAlchemy ORM models
  schemas.py        # Pydantic validation schemas
  services/         # Business logic (50+ services)
  routers/          # HTTP endpoints (19 routers)
  middleware/       # Cross-cutting concerns (16 middleware)
  transports/       # SSE, WebSocket, stdio, streamable-HTTP
  plugins/          # Plugin integration via cpex package
  alembic/          # Database migrations
  admin_ui/         # HTMX+Alpine.js front-end bundle
plugins/            # 40+ plugin implementations (Python)
plugins_rust/       # Rust plugins for perf-critical paths
a2a-agents/         # A2A agent implementations
mcp-servers/        # MCP server templates
crates/             # Rust crates (mcp_runtime, wrapper)
charts/             # Helm charts
ansible/            # Infrastructure automation
tests/              # 7,000+ test suite

Required Runtime

Python ≥ 3.11
SQLAlchemy (SQLite default; Postgres supported)
Redis (optional, for multi-cluster federation)
OpenTelemetry SDK (optional, for tracing)

Target AI Tools

Any MCP-compatible client (Claude Code, Claude Desktop, Cursor, etc.). Also exposes A2A endpoints for OpenAI and Anthropic agent protocols.

Components

ContextForge — Components

CLI Commands / Binaries

Name	Purpose
`mcpgateway`	Main server CLI — starts FastAPI/Uvicorn on port 4444
`mcpgateway.translate`	Bridges stdio MCP servers to SSE/HTTP endpoints
`mcpgateway.wrapper`	stdio MCP client wrapper for use in MCP Inspector
`mcpgateway.utils.create_jwt_token`	Generates JWT tokens for API access
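
To make the token utility's role concrete, here is a minimal standard-library sketch of what an HS256-signed JWT looks like. It is not the code inside mcpgateway.utils.create_jwt_token; the secret, the `sub` value, and the helper names `sign_hs256` and `verify_hs256` are hypothetical. The `teams` and `is_admin` claim names are taken from the AGENTS.md excerpt quoted in the Prompts section.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url(data: bytes) -> str:
    """Base64url-encode without '=' padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: str) -> str:
    """Build a compact HS256 JWT (header.payload.signature) from a claims dict."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    digest = hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(digest)}"


def verify_hs256(token: str, secret: str) -> dict:
    """Check the signature and return the claims. Expiry is NOT checked here."""
    signing_input, _, signature = token.rpartition(".")
    digest = hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(signature, b64url(digest)):
        raise ValueError("signature mismatch")
    payload_b64 = signing_input.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


# Hypothetical claims: one team, non-admin, valid for one hour.
claims = {
    "sub": "admin@example.com",
    "teams": ["t1"],
    "is_admin": False,
    "exp": int(time.time()) + 3600,
}
token = sign_hs256(claims, "example-secret")
```

In practice the gateway's own utility (or a maintained JWT library) should issue tokens; the sketch only shows why the shared secret and the claim set matter.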

Core Modules (19 routers)

Routers include: tools, prompts, resources, servers (virtual), auth, health, metrics, admin, federation, llm-providers, and transport-specific endpoints.

Plugin System (40+ plugins)

Each plugin is a directory in plugins/ implementing the cpex plugin interface. Key plugins:

Plugin	Purpose
`ai_artifacts_normalizer`	Normalizes AI-generated artifact formats
`argument_normalizer`	Normalizes tool call arguments
`cached_tool_result`	Caches tool results to reduce redundant calls
`circuit_breaker`	Prevents cascading failures on upstream tools
`citation_validator`	Validates citations in LLM outputs
`code_formatter`	Auto-formats code in tool outputs
`code_safety_linter`	Safety linting on code generation outputs
`content_moderation`	Content moderation for tool inputs/outputs
`deny_filter`	Blocks specific tool calls or content patterns
`file_type_allowlist`	Restricts file operations to allowed types
`harmful_content_detector`	Detects harmful content in LLM responses
`header_filter`	Filters/modifies HTTP headers
`header_injector`	Injects headers into upstream requests
`html_to_markdown`	Converts HTML tool outputs to Markdown
`json_repair`	Repairs malformed JSON from tool calls
`jwt_claims_extraction`	Extracts JWT claims for RBAC decisions
`license_header_injector`	Injects license headers into code outputs
`markdown_cleaner`	Cleans Markdown formatting from outputs
`output_length_guard`	Enforces maximum output length limits
`privacy_notice_injector`	Injects privacy notices into responses
`regex_filter`	Regex-based content filtering
`resource_filter`	Filters accessible resources
`response_cache_by_prompt`	Caches responses keyed by prompt hash
`robots_license_guard`	Enforces robots.txt and license constraints
`safe_html_sanitizer`	Sanitizes HTML in tool outputs
`schema_guard`	Validates tool inputs against JSON schema
`span_attribute_customizer`	Customizes OTel span attributes
`sparc_static_validator`	SPARC methodology static validation
`sql_sanitizer`	Sanitizes SQL queries before execution
`summarizer`	Summarizes lengthy tool outputs
`timezone_translator`	Translates timezone representations
`tools_telemetry_exporter`	Exports tool usage telemetry
`toon_encoder`	TOON (Token-Oriented Object Notation) encoding to reduce token usage
`unified_pdp`	Unified Policy Decision Point
`vault`	HashiCorp Vault integration for secrets
`virus_total_checker`	VirusTotal scanning for file uploads
`watchdog`	Monitors agent/tool health
`webhook_notification`	Sends webhook notifications on events
`altk_json_processor`	ALTK JSON processing

Services (50+)

Business logic services covering: tool registration, server federation, prompt management, resource management, authentication/JWT, RBAC, rate limiting, caching, health checks, audit logging.

Middleware (16 layers)

Request logging, authentication, CORS, rate limiting, error handling, OpenTelemetry instrumentation.

Transports

SSE (with configurable keepalive), WebSocket, stdio, streamable-HTTP, JSON-RPC over HTTP.

Observability

mcpgateway/observability.py — vendor-agnostic OpenTelemetry instrumentation. Supports: Phoenix (Arize), Jaeger, Zipkin, Tempo, DataDog, New Relic, any OTLP backend. Zero overhead when disabled.

Prompts

ContextForge — Prompts

ContextForge is a gateway/proxy infrastructure tool, not an LLM prompting framework. It ships prompt-related infrastructure (a Prompts Registry with Jinja2 templates, multimodal support, and versioning) but does not include agent-instruction prompt files in the traditional sense.

AGENTS.md (Developer Instructions)

The AGENTS.md file provides instructions for AI coding assistants working on the ContextForge codebase itself. This is the closest thing to a "prompt file":

## Project Overview

ContextForge is an open source registry and proxy that federates MCP, A2A, and REST/gRPC APIs
with centralized governance, discovery, and observability. It federates tools, agents, and APIs,
optimizes agent and tool calling, and supports plugins, auth/RBAC, rate-limiting, virtual servers,
multi-transport protocols, and an optional Admin UI.

## Authentication & RBAC Overview

ContextForge implements a **two-layer security model**:

1. **Token Scoping (Layer 1)**: Controls what resources a user CAN SEE (data filtering)
2. **RBAC (Layer 2)**: Controls what actions a user CAN DO (permission checks)

### Token Scoping Quick Reference

**API / legacy tokens** — JWT `teams` claim is the sole authority:

| JWT `teams` State | `is_admin: true` | `is_admin: false` |
|-------------------|------------------|-------------------|
| Key MISSING       | PUBLIC-ONLY []   | PUBLIC-ONLY []    |
| `teams: null`     | ADMIN BYPASS     | PUBLIC-ONLY []    |
| `teams: []`       | PUBLIC-ONLY []   | PUBLIC-ONLY []    |
| `teams: ["t1"]`   | Team + Public    | Team + Public     |

Prompting technique: Architectural reference document with decision table. Designed to prevent AI assistants from misimplementing the RBAC model by showing all state combinations explicitly.
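
The decision table above can be read as a small pure function. The following is a sketch under assumptions, not ContextForge's actual implementation: the function name `resolve_scope` and the two sentinel strings are hypothetical, and only the `teams` and `is_admin` claim names come from the excerpt.

```python
PUBLIC_ONLY = "public-only"    # hypothetical sentinel: caller sees public resources only
ADMIN_BYPASS = "admin-bypass"  # hypothetical sentinel: caller bypasses team filtering


def resolve_scope(payload: dict):
    """Map a decoded JWT payload to a visibility scope, row by row from the table.

    Returns ADMIN_BYPASS, PUBLIC_ONLY, or a list of team IDs
    (meaning "those teams plus public resources").
    """
    # Row 1: key missing -> public-only, even for admins.
    if "teams" not in payload:
        return PUBLIC_ONLY
    teams = payload["teams"]
    # Row 2: explicit null -> admin bypass only when is_admin is true.
    if teams is None:
        return ADMIN_BYPASS if payload.get("is_admin") is True else PUBLIC_ONLY
    # Row 3: empty list -> public-only, even for admins.
    if not teams:
        return PUBLIC_ONLY
    # Row 4: non-empty list -> team + public, regardless of is_admin.
    return list(teams)
```

Note the asymmetry the table is designed to highlight: a missing `teams` key and an explicit `null` are different states, and only the latter can ever grant an admin bypass.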

CLAUDE.md (Developer Agent Instructions)

AGENTS.md

(.claude/CLAUDE.md is a symlink to the root AGENTS.md, so all AI assistants share one source of truth.)

Prompts Registry (Runtime Feature)

ContextForge ships a PromptService that manages Jinja2 prompt templates for use by MCP clients. Prompts are versioned, support rollback, and can include multimodal content. This is infrastructure for storing and serving prompts to LLMs — not the framework's own agent instructions.

Plugin AGENTS.md (Plugin Developer Instructions)

# plugins/AGENTS.md

Plugin framework overview. Plugins are Python packages installed via `pip install <plugin-name>`.
Each plugin implements:
- `pre_call_hook(request) -> request`: Transform/filter before upstream
- `post_call_hook(response) -> response`: Transform/filter after upstream
- `metadata()`: Name, version, description, configuration schema

Prompting technique: Contract-first specification with interface requirements — tells AI assistants exactly what a plugin must implement without prose explanation.
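
A plugin that satisfies the three-method contract quoted above might look like the following sketch. It is illustrative only: the class name `OutputTruncator`, the dict-shaped request and response, and the metadata keys are assumptions, and the real cpex interface may use different types and signatures.

```python
class OutputTruncator:
    """Hypothetical plugin: caps the length of the 'text' field in a response."""

    def __init__(self, max_chars: int = 20):
        self.max_chars = max_chars

    def metadata(self) -> dict:
        # Name, version, description, and configuration schema, per the contract.
        return {
            "name": "output_truncator",
            "version": "0.1.0",
            "description": "Truncates overly long tool output",
            "config_schema": {"max_chars": "int"},
        }

    def pre_call_hook(self, request: dict) -> dict:
        # Nothing to do before the upstream call; pass the request through.
        return request

    def post_call_hook(self, response: dict) -> dict:
        # Return a modified copy rather than mutating the caller's dict.
        text = response.get("text", "")
        if len(text) <= self.max_chars:
            return response
        return {**response, "text": text[: self.max_chars], "truncated": True}


plugin = OutputTruncator(max_chars=5)
short = plugin.post_call_hook({"text": "hi"})
long_ = plugin.post_call_hook({"text": "hello world"})
```

Returning a new dict from each hook keeps plugins composable: a later plugin never observes a half-modified response from an earlier one.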

Uniqueness

ContextForge — Uniqueness & Positioning

Differs from Seeds

ContextForge is the only framework in this corpus that operates as a production enterprise API gateway for MCP/A2A infrastructure. The closest seed is taskmaster-ai (also MCP-anchored), but taskmaster bundles one MCP server for task management, whereas ContextForge federates N upstream MCP servers behind a single authenticated endpoint with RBAC, rate limiting, audit logging, and 40+ plugins. Unlike claude-flow, which uses MCP as an execution substrate for agent coordination, ContextForge uses MCP as a protocol standard to unify disparate tool backends. Unlike all 11 seeds, ContextForge is not a coding methodology or agent harness; it is deployment infrastructure that any AI client uses passively.

Unique Characteristics

40+ plugin pipeline: No other framework in the corpus ships anything close to this scope of content-moderation, safety, PII, compression, and observability plugins for MCP tool calls.
TOON compression: the toon_encoder plugin re-encodes tool responses in TOON (Token-Oriented Object Notation), automatically reducing their token cost.
gRPC-to-MCP translation: Automatic service discovery via gRPC reflection, exposing gRPC services as MCP tools.
Two-layer RBAC: Token scoping (what resources you can see) + permission RBAC (what actions you can take) — enterprise-grade access control not seen in other MCP servers.
Airgapped Admin UI: HTMX+Alpine.js bundled locally, no CDN dependencies — suitable for air-gapped enterprise environments.
IBM backing: Corporate maintenance, security scanning (Snyk, detect-secrets, WhiteSource), and a 7,000+ test suite suggest production readiness.
Multi-transport simultaneous support: SSE, WebSocket, stdio, HTTP, streamable-HTTP all exposed at once.

Positioning

Enterprise MCP control plane. Solves the problem that MCP servers are decentralized and unmanaged in team environments. ContextForge is the "Nginx/Kong for MCP" — insert it between AI clients and your tool ecosystem.

Observable Failure Modes

Complexity cliff: The 40+ plugin system, 50+ services, and multi-transport architecture make ContextForge significantly harder to operate than simpler alternatives. Misconfigured plugins could silently corrupt tool outputs.
Single point of failure: Without HA/Redis federation, the gateway is a SPOF for all connected AI tools.
Plugin ordering sensitivity: The sequential plugin pipeline means ordering matters; documentation on plugin interaction contracts is sparse.
Python-only runtime: The gateway ships as a PyPI package and a container image, with no native Rust/Go binary, so Python startup latency may be an issue in serverless environments.

Cross-References

Competes with: Agentgateway, Plano (all are AI proxy/gateway frameworks targeting MCP)
Inspired by: Kong, Nginx, AWS API Gateway (classical API gateway patterns applied to MCP)

Workflow

ContextForge — Workflow

ContextForge is an infrastructure gateway, not a coding-workflow methodology. Its operational workflow is deployment and configuration, not agent task execution.

Deployment Workflow

Phase	Artifact
Install	`pip install mcp-contextforge-gateway` or `docker pull ghcr.io/ibm/mcp-context-forge`
Configure	`.env` file from `.env.example` — set JWT secret, DB URL, Redis URL, admin credentials
Start server	`mcpgateway --host 0.0.0.0 --port 4444`
Register servers	POST to `/api/servers` to register MCP/REST/gRPC backends
Create virtual servers	Bundle registered tools into virtual MCP servers via Admin UI or API
Configure plugins	`plugins/config.yaml` to enable/chain content moderation, PII, circuit-breaker, etc.
Connect clients	Point MCP clients at `http://localhost:4444/mcp` with Bearer token auth
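
The final phase, connecting a client, comes down to an HTTP POST carrying a JSON-RPC 2.0 body and a Bearer token. The sketch below only builds such a request with the standard library and does not send it. The helper name `build_mcp_request` and the placeholder token are hypothetical; `tools/list` is a standard MCP method, and a real session would normally begin with an `initialize` exchange first.

```python
import json
import urllib.request
from typing import Optional


def build_mcp_request(
    token: str,
    method: str,
    params: Optional[dict] = None,
    request_id: int = 1,
    url: str = "http://localhost:4444/mcp",  # endpoint from the table above
) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC 2.0 POST with Bearer authentication."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        body["params"] = params
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            # Streamable-HTTP servers may answer with JSON or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


req = build_mcp_request("<your-jwt>", "tools/list")
# To actually send it against a running gateway:
#     with urllib.request.urlopen(req) as resp: print(resp.read())
```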

Gateway Request Flow

Client sends MCP JSON-RPC request with Bearer token
JWT middleware validates the token and extracts team scopes (Layer 1: token scoping)
RBAC middleware checks action permissions (Layer 2: permission checks)
Rate-limiting middleware checks per-user/team limits
Plugin pipeline executes (pre-call plugins: schema guard, deny filter, PII, etc.)
Request proxied to registered upstream server/tool
Plugin pipeline on response (post-call: content moderation, output length guard, TOON encoder, etc.)
OpenTelemetry spans emitted
Audit log entry written to DB
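
The gating part of this flow (token validation, RBAC, rate limiting, pre-call plugins) behaves like an ordered chain in which the first failing stage rejects the request. The sketch below models only that short-circuit behavior. Every name in it is hypothetical, including the `tools.execute` permission string, the limit of 100 calls, and the toy deny rule; none of it is taken from the ContextForge codebase.

```python
from typing import Callable

# A stage is a name plus a predicate that returns True when the request may continue.
Stage = tuple[str, Callable[[dict], bool]]


def run_gateway_stages(request: dict, stages: list[Stage]) -> str:
    """Run stages in order; report the first one that rejects, else 'proxied'."""
    for name, allows in stages:
        if not allows(request):
            return f"rejected:{name}"
    return "proxied"


stages: list[Stage] = [
    ("jwt", lambda r: bool(r.get("token"))),
    ("rbac", lambda r: r.get("action") in r.get("permissions", ())),
    ("rate_limit", lambda r: r.get("calls_this_window", 0) < 100),
    # Stand-in for a deny-filter style pre-call plugin.
    ("pre_call_plugins", lambda r: "DROP TABLE" not in r.get("arguments", "")),
]

ok_request = {
    "token": "abc",
    "action": "tools.execute",
    "permissions": ["tools.execute"],
    "calls_this_window": 3,
    "arguments": "SELECT 1",
}
outcome = run_gateway_stages(ok_request, stages)
```

Because evaluation stops at the first rejection, cheap checks placed early (such as token presence) shield the more expensive stages behind them.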

Approval Gates

None — ContextForge is an infrastructure layer. All decisions (allow/deny) are policy-enforced automatically via plugins and RBAC, not via interactive human gates.

Admin UI Workflow

Real-time log viewer: filter, search, export request/response logs
Tool catalog management: register, update, disable tools
Virtual server builder: bundle tools into virtual MCP servers
User/team management: create tokens with team scopes
Health dashboard: per-server health status

Kubernetes/HA Workflow

Redis-backed federation allows multiple ContextForge instances to share state. Helm chart in charts/ enables Kubernetes deployment with replica sets.

Memory Context

ContextForge — Memory & Context

State Storage

Primary: SQLAlchemy ORM with SQLite (default) or PostgreSQL. Stores: registered servers, tools, prompts, resources, virtual server definitions, user accounts, team memberships, audit records, plugin configurations.

Caching: Redis (optional) for multi-cluster federation and shared caching. A mcpgateway/cache/ directory also provides local caching layers.

Schema Management: Alembic migrations in mcpgateway/alembic/ for database schema evolution.

State Files

mcpgateway.db — SQLite database (default location, configurable via DATABASE_URL)
.env — runtime configuration
plugins/config.yaml — plugin pipeline configuration
plugins/plugin_parity_config.yaml — plugin parity testing configuration

Session / Conversation State

ContextForge does not manage conversation state itself — it proxies requests to upstream MCP servers that may maintain their own session state. The gateway is stateless at the request level but maintains persistent registry state.

Memory Persistence

Registry data: persistent across restarts (SQLite/Postgres)
Audit logs: persistent in database
Rate limit counters: in-memory (Redis-backed in HA mode)
OpenTelemetry spans: exported to external backends (Jaeger, Phoenix, etc.) — not stored locally

Context Compaction

Not applicable. ContextForge is infrastructure, not an LLM orchestrator.

Cross-Session Handoff

Not applicable at the agent level. The registry and tool catalog persist indefinitely, so any new AI client connecting gets the same tool landscape as previous clients.

Prompt Registry (State)

The PromptService stores Jinja2 templates in the database with:

Version history and rollback capability
Multimodal content support
Tag-based categorization
Team-scoped access control
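
Version history with rollback can be pictured as an append-only list of template sources plus a pointer to the active one. The sketch below is a hypothetical in-memory model, not the PromptService schema: the class name `PromptRecord` and its methods are assumptions, templates are kept as plain strings, and Jinja2 rendering, tags, and team scoping are left out.

```python
from dataclasses import dataclass, field


@dataclass
class PromptRecord:
    """Hypothetical in-memory model of one versioned prompt."""

    name: str
    versions: list[str] = field(default_factory=list)  # append-only history
    active: int = -1  # index of the active version; -1 means none published

    def publish(self, template: str) -> int:
        """Append a new version, make it active, and return its 1-based number."""
        self.versions.append(template)
        self.active = len(self.versions) - 1
        return self.active + 1

    def rollback(self, version: int) -> None:
        """Re-activate an earlier version without deleting any history."""
        if not 1 <= version <= len(self.versions):
            raise ValueError(f"no such version: {version}")
        self.active = version - 1

    def current(self) -> str:
        if self.active < 0:
            raise LookupError("no version published yet")
        return self.versions[self.active]


prompt = PromptRecord("greeting")
prompt.publish("Hello {{ name }}")
prompt.publish("Hi {{ name }}, welcome back")
prompt.rollback(1)
```

Keeping history append-only means a rollback is itself reversible: version 2 is still there after rolling back to version 1.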

Orchestration

ContextForge — Orchestration

Multi-Agent Support

ContextForge does not orchestrate agents. It provides the infrastructure that multi-agent systems connect to. An A2A (Agent-to-Agent) gateway module enables agents to discover and communicate with other agents registered in the system.

Execution Mode

Event-driven (background daemon). The mcpgateway process runs continuously as an HTTP server, responding to MCP JSON-RPC requests from connected clients.

Orchestration Pattern

None — ContextForge is a proxy/gateway. It does not sequence or parallelize agent tasks. Upstream LLM clients or orchestrators make sequencing decisions; ContextForge handles routing, auth, and policy enforcement on each individual request.

Multi-Model Routing

Not directly. ContextForge includes llm_provider_configs.py and llm_schemas.py, which suggests some LLM provider routing capability, but its primary value is MCP tool routing, not LLM routing (unlike Plano).

Isolation Mechanism

None at the tool execution level. Each tool call is proxied to its registered upstream server; isolation is the upstream server's responsibility.

A2A Gateway

The Agent-to-Agent (A2A) module enables:

OpenAI-compatible agent routing
Anthropic agent routing
External AI agent integration
Agent capability discovery

Federation / HA

Redis-backed multi-cluster federation allows multiple ContextForge instances to share the tool registry and distribute load. This is the closest thing to a distributed orchestration pattern, but it is registry federation, not workflow orchestration.

Plugin Pipeline as Sequential Orchestration

The plugin pipeline (pre-call → upstream → post-call) is a sequential transformation chain applied to every request. Plugins run in configured order within a single request lifecycle.
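
A small sketch shows why the configured order matters in such a chain. The hook functions, the echo-style upstream, and the helper `run_pipeline` are all hypothetical stand-ins that operate on plain strings; they are not ContextForge plugins.

```python
from typing import Callable

Hook = Callable[[str], str]


def run_pipeline(text: str, pre: list[Hook], upstream: Hook, post: list[Hook]) -> str:
    """Apply pre-call hooks in order, call upstream, then apply post-call hooks in order."""
    for hook in pre:
        text = hook(text)
    result = upstream(text)
    for hook in post:
        result = hook(result)
    return result


def strip_input(s: str) -> str:
    return s.strip()


def fake_upstream(s: str) -> str:
    return "result: " + s  # stand-in for the proxied tool call


def truncate_10(s: str) -> str:
    return s[:10]  # stand-in for an output length guard


def add_notice(s: str) -> str:
    return s + " [n]"  # stand-in for a notice injector


guard_then_notice = run_pipeline(" abc ", [strip_input], fake_upstream, [truncate_10, add_notice])
notice_then_guard = run_pipeline(" abc ", [strip_input], fake_upstream, [add_notice, truncate_10])
```

With the length guard first, the notice survives; with the notice first, the guard cuts it off. The same two plugins produce different outputs purely because of their order.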

Consensus Mechanism

None. Registry state is single-writer with Redis cache invalidation for HA.

UI/CLI Surface

ContextForge — UI/CLI Surface

CLI Binary

Name: mcpgateway
Type: Thin Uvicorn wrapper (not a standalone runtime)
Entry point: mcpgateway.cli:main
Default: mcpgateway --host 127.0.0.1 --port 4444

Also ships:

python -m mcpgateway.translate — stdio-to-SSE/HTTP bridge
python -m mcpgateway.wrapper — stdio MCP client
python -m mcpgateway.utils.create_jwt_token — JWT generation

Subcommands: None (flags are passed through to Uvicorn: --reload, --workers, --ssl-keyfile, etc.)

Local Web Dashboard

Exists: Yes
Type: Web dashboard (Admin UI)
Port: 4444 (configurable via MCG_PORT)
URL: http://localhost:4444/admin
Tech Stack: HTMX 2.0.3 + Alpine.js (bundled, airgapped-capable)

Features:

Real-time log viewer with filtering, search, and export
Tool catalog management (CRUD for registered tools/servers)
Virtual server builder (bundle tools into virtual MCP servers)
User and team management (JWT token issuance with scope)
Health dashboard with per-upstream-server status
gRPC-to-MCP translation configuration
Plugin pipeline management

IDE Integration

None directly. Integrates with any MCP-compatible IDE through the MCP protocol.

Observability Stack

OpenTelemetry: Full OTel SDK integration in mcpgateway/observability.py
Supported backends: Phoenix (Arize), Jaeger, Zipkin, Tempo, DataDog, New Relic, any OTLP
Traces: Per-tool-call spans with LLM-specific metrics (token usage, cost, model performance)
Metrics: Prometheus-compatible health endpoints
Logs: Structured JSON logs with configurable sampling rate and skip patterns
Audit: governance_audit_log-style DB records (via JdbcAuditSink pattern)

API Surface

MCP JSON-RPC: POST /mcp (MCP 2025-11-25 compliant)
SSE transport: GET /sse, POST /messages
Admin REST API: 19 routers covering all CRUD operations
Health: GET /health
Metrics: GET /metrics
A2A: A2A protocol endpoints for agent-to-agent routing

Development UX

make dev — dev server with autoreload on :8000
make serve — production gunicorn on :4444
make autoflake isort black pre-commit — code quality pipeline
make detect-secrets-scan — secret detection baseline management
7,000+ test suite with Makefile targets for individual test categories

Distribution

Type: mcp-server
License: Apache-2.0
Install: multi-step
Version: unknown (PyPI: mcp-contextforge-gateway, commit 2026-05-26)

Surfaces

CLI binary: mcpgateway
CLI subcmds: 0
Local UI: web-dashboard
UI port: 4444
Tech stack: HTMX 2.0.3 + Alpine.js (bundled, airgapped)

Components

Commands: 4
Skills: 0
Subagents: 0
Hooks: 0
MCP servers: 1
MCP tools: 0
Scripts: 5
Templates: 3

Workflow

Phases: 7
Approval gates: 0
Spec format: none
Spec storage: none
Delta or full: none

Orchestration

Multi-agent: No
Pattern: none
Isolation: none
Consensus: none
Prompt chaining: No

Multi-model

Multi-model: No
BYOK: Yes
Modal: text

Execution

Mode: background-daemon
Crash recovery: No
Compaction: No
Session handoff: Yes
Streaming: Yes

Memory

Type: sqlite
Persistence: global
Search: none
State files: 3 files

Quality

TDD: Optional
TDD mechanism: none
Validators: 3
Self-review: none

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: Yes
Audit format: sqlite
Replay: No

Tools

Primary: any-mcp-client
Targets: 4
Portability: high

Signals

Stars: 3.8k
Last commit: 2026-05-26
Contributors: 30
Maintainer: active
Quality score: 3.2/10