Skip to content
/

Archestra

archestra · archestra-ai/archestra · ★ 3.7k · last commit 2026-05-26

Primitive shape 5 total
Commands 3 Subagents 1 Hooks 1
00

Summary

Archestra — Summary

Archestra is a CNCF/Linux Foundation-hosted enterprise AI platform providing a private MCP registry, Kubernetes-native MCP orchestrator, RAG knowledge base, dual-LLM security sub-agents, cost monitoring/optimization, and full observability (metrics, traces, logs). It runs as a Docker Compose or Helm-deployed service pair (Next.js frontend on port 3000, Fastify backend on port 9000) and positions itself as the "ChatGPT-like interface + MCP governance layer" for organizations. Its non-probabilistic security engine prevents prompt injection and data exfiltration by isolating dangerous tool responses through dual-LLM sub-agents. Cost optimizer can reduce token spend up to 96% by automatically switching to cheaper models for simpler tasks.

Compared to seeds: unlike ccmemory (a single Neo4j-backed MCP memory server) or taskmaster-ai (a task-management MCP bundler), Archestra is a full enterprise AI platform — it manages the entire lifecycle of MCP servers across an organization including registration, governance, Kubernetes deployment, credential management, and cost control. The closest analogy is ContextForge (also an MCP gateway) but Archestra adds a chat UI, RAG knowledge base, RBAC with custom roles, and Kubernetes operator — making it closer to an internal AI cloud platform than a pure proxy.

01

Overview

Archestra — Overview

Origin

Developed by archestra-ai, CNCF and Linux Foundation member project. TypeScript/Node.js stack. AGPL-3.0 licensed. 3,745 stars, 879 forks as of 2026-05-26. Active development.

Philosophy

"Simplify AI usage in your company, providing user-friendly MCP toolbox, observability and control built on a strong security foundation."

Archestra's core argument: MCP in organizations creates three risks — MCP chaos (servers scattered on individual machines), data exfiltration (prompt injection via tool outputs), and runaway costs. All three require a centralized platform response, not individual tool hygiene.

Target Audiences

  • Platform teams: Centralize MCP servers, manage credentials, prevent data exfiltration
  • Developers: Deploy MCP servers org-wide, build agents without worrying about security
  • Management: 1-click MCP adoption for technical and non-technical users, cost visibility

Key Security Concept: Lethal Trifecta

Archestra explicitly models the "Lethal Trifecta" (coined by Simon Willison) — the combination of: MCP tool with access to sensitive data + prompt injection vector + exfiltration capability. The dual-LLM sub-agent architecture isolates dangerous tool responses so the main agent never sees potentially injected instructions.

Manifesto Quotes

"Mitigate MCP chaos, move MCP servers from individual machines to a centralized orchestrator"

"Non-probabilistic security to prevent data exfiltration"

"Reduce AI costs up to 96%"

Performance

Self-reported: 45ms at P95 for tool routing latency. Terraform provider and Helm chart available for production deployment.

02

Architecture

Archestra — Architecture

Distribution

Docker Compose (quickstart) and Helm chart (production). The platform is a single Docker image containing both frontend and backend.

docker pull archestra/platform:latest
docker run -p 9000:9000 -p 3000:3000 \
  -e ARCHESTRA_QUICKSTART=true \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v archestra-postgres-data:/var/lib/postgresql/data \
  -v archestra-app-data:/app/data \
  archestra/platform

Services

Port Service
3000 Next.js frontend (chat, MCP registry, settings, observability)
9000 Fastify backend API
9050 Prometheus metrics endpoint
3002 Grafana (manual start via Tilt)
3200 Tempo (distributed tracing)
9090 Prometheus storage

Directory Structure

platform/
  backend/        # Fastify API server (TypeScript)
  frontend/       # Next.js frontend (TypeScript)
  e2e-tests/      # Playwright e2e tests
  benchmarks/     # Performance benchmarks
.claude/
  commands/       # fix-pr.md, review-pr.md, resolve-conflicts
  skills/         # Developer AI skills
  settings.json   # PostToolUse hook for Biome formatting
.mcp.json         # MCP server config for development

Tech Stack

  • Frontend: Next.js + React + Shadcn/UI + Tailwind
  • Backend: Fastify + TypeScript + Drizzle ORM
  • Database: PostgreSQL (embedded in Docker image)
  • Observability: Prometheus + Grafana + Tempo
  • Kubernetes: Tilt for dev, Helm for production
  • IaC: Terraform provider (archestra-ai/terraform-provider-archestra)

Required Runtime

  • Docker (with volume mounts for PostgreSQL data and app data)
  • Kubernetes + Helm (for production)

Target AI Tools

Any MCP-compatible AI client. The platform exposes MCP gateways per-profile for connecting Claude Code, Claude Desktop, Cursor, etc. to the centralized MCP registry.

MCP Gateway Endpoint

GET/POST http://localhost:9000/v1/mcp/:profileId — profile-scoped MCP endpoint (stateless JSON-RPC mode). Authentication via Bearer archestra_token.

03

Components

Archestra — Components

Core Platform Features (Frontend Routes)

Route Feature
/chat ChatGPT-like chat with MCP tools and private prompt registry
/tools Unified tools management with server-side pagination
/mcp/registry Private MCP registry — install and manage MCP servers
/mcp/registry/installation-requests Pending MCP server installation approval queue
/mcp/logs MCP tool call logs with audit trail
/llm/logs LLM proxy request logs
/llm/cost/statistics Token usage analytics with time series charts
/llm/cost/limits Per-profile token usage limits
/llm/cost/token-price Model pricing configuration
/llm/cost/optimization-rules Cost optimization policies (model switching rules)
/settings LLM & MCP gateway config, dual-LLM settings, account, members, teams
/settings/roles Admin: custom RBAC role management
/settings/appearance Admin: theme, logo, fonts customization

Backend APIs

Endpoint Purpose
GET/POST /v1/mcp/:profileId Profile-scoped MCP gateway (JSON-RPC stateless mode)
POST /mcp_proxy/:id MCP proxy to Kubernetes pods
GET /api/mcp_server/:id/logs Container logs (streaming)
POST /api/mcp_server/:id/restart Restart MCP server pod
GET /api/mcp-tool-calls Paginated MCP tool call audit logs
GET /api/profile-tools Profile-tool relationship management
GET /metrics Prometheus metrics

Security Sub-Agent (Dual LLM)

The dual-LLM architecture runs dangerous tool responses through a separate "sanitizer" LLM before the main agent sees them. Prevents prompt injection from malicious tool outputs.

Knowledge Base (RAG)

Built-in retrieval-augmented knowledge base — no external vector database required. Accessible through the chat interface with citations.

Cost Optimizer

Dynamic model switching rules that automatically route to cheaper models for simpler tasks. Self-reported up to 96% cost reduction.

Claude Code Developer Tools

.claude/commands/:

  • fix-pr.md — Fix failing PR checks
  • review-pr.md — Review a PR
  • resolve-conflicts — Resolve merge conflicts

.claude/settings.json:

  • PostToolUse hook (Write|Edit|MultiEdit matcher): runs Biome formatter automatically on platform/** file edits

RBAC

Custom role management with enterprise edition features (config.enterpriseFeatures.core). Free vs enterprise tier separation via .ee.ts file naming convention.

05

Prompts

Archestra — Prompts

Archestra is a platform product, not a prompting framework. The prompt-like artifacts are developer Claude Code instructions.

CLAUDE.md (Developer Agent Instructions — platform/)

## Important Rules

1. **Use pnpm** for package management
2. **Use Tilt for development** - `tilt up` to start the full environment
3. **Use shadcn/ui components** - Add with `npx shadcn@latest add <component>`
4. **Documentation Updates** - For any feature or system changes, audit `../docs/pages`
5. **Always Add Tests** - When working on any feature, ALWAYS add or modify appropriate
   test cases (unit tests, integration tests, or e2e tests under `platform/e2e-tests/tests`)
6. **Enterprise Edition Imports** - NEVER directly import from `.ee.ts` files unless the
   importing file is itself an `.ee.ts` file. Use runtime conditional logic with
   `config.enterpriseFeatures.core` checks instead to avoid bundling enterprise code
   into free builds
7. **No Auto Commits** - Never commit or push changes without explicit user approval.
   Always ask before running git commit or git push
8. **No Database Modifications Without Approval** - NEVER run INSERT, UPDATE, DELETE,
   or any data-modifying SQL queries without explicit user approval.
9. **NEVER MENTION REAL CUSTOMER NAMES OR IDENTIFIERS ANYWHERE IN CODE, COMMENTS,
   TESTS, DOCS, COMMITS, OR PR TEXT!!!!!!!!!!**
10. **Never copy anything from Sentry into code** — Sentry is for diagnosing; describe
    bugs in neutral terms.

Prompting technique: Absolute prohibition list ("Iron Laws") with business-justification inline. The all-caps rule #9 signals highest severity. This is a persona-instruction pattern preventing security/compliance failures.

.claude/commands/review-pr.md (Slash Command)

This command provides context for PR review. Pattern: structured workflow with pre-defined review criteria. The exact content is not publicly accessible but the file exists as a Claude Code command.

.claude/settings.json (PostToolUse Hook)

{
  "hooks": {
    "PostToolUse": [{
      "matcher": "Write|Edit|MultiEdit",
      "hooks": [{
        "type": "command",
        "command": "jq -r '.tool_input.file_path // .tool_response.filePath // empty' | { read -r f; case \"$f\" in \"$CLAUDE_PROJECT_DIR\"/platform/*) cd \"$CLAUDE_PROJECT_DIR/platform\" && pnpm exec biome check --write --files-ignore-unknown=true --no-errors-on-unmatched \"$f\" ;; esac; } || true"
      }]
    }]
  }
}

Prompting technique: Not a prompt — this is an automation hook that runs Biome formatter automatically after every file edit within platform/. Enforces code formatting standards without requiring developer discipline. The || true prevents hook failures from blocking work.

09

Uniqueness

Archestra — Uniqueness & Positioning

Differs from Seeds

Archestra is an enterprise AI platform occupying a different tier from all 11 seeds. The seeds are agent-harness/methodology tools (superpowers, spec-kit, BMAD-METHOD) or MCP utility servers (ccmemory, taskmaster-ai). Archestra is closer to an internal cloud platform for AI: it governs the entire lifecycle of MCP servers across an organization. The most comparable seed is ccmemory (MCP-based memory server) or taskmaster-ai (MCP task server) but Archestra operates at a completely different scope — it doesn't provide a single tool, it provides the infrastructure that hosts and governs many tools. ContextForge (same batch) is the closest peer, but Archestra adds a full chat UI, RAG knowledge base, Kubernetes-native orchestration, and explicitly models the "Lethal Trifecta" security risk.

Unique Characteristics

  1. Lethal Trifecta security model: Explicitly models and defends against the three-part vulnerability pattern (sensitive data access + prompt injection + exfiltration channel) — the most security-aware framework in the corpus.
  2. Dual-LLM sub-agent: Architecturally isolated sanitizer LLM between tool output and main agent — prevents prompt injection at the protocol level, not just the prompt level.
  3. Kubernetes-native MCP orchestration: MCP servers run as K8s pods with lifecycle management, restart capabilities, and log streaming — no other framework in corpus manages MCP server infrastructure at this level.
  4. 96% cost reduction claim: Dynamic model switching based on task complexity — the most aggressive cost optimization claim in the corpus.
  5. ChatGPT-like UI bundled: Platform ships its own chat interface — it is both the infrastructure and the end-user UI.
  6. CNCF/Linux Foundation affiliation: Only framework in this batch with formal CNCF/LF membership, implying enterprise governance standards.
  7. Non-probabilistic security: Claims deterministic (non-LLM-probabilistic) enforcement of data access boundaries.

Positioning

"Enterprises adopting MCP need more than a protocol — they need a governance platform." Archestra is what you deploy before you let your developers point Claude Code at your internal tools.

Observable Failure Modes

  • AGPL-3.0 license: Forces open-sourcing of derived works — a significant adoption barrier for enterprise customers building on top of the platform.
  • Complexity: Running a full PostgreSQL + Next.js + Fastify + Tempo + Prometheus + Grafana + Kubernetes stack for MCP management is operationally heavy.
  • Dual-LLM cost: The security sub-agent doubles LLM calls for tool interactions — the "non-probabilistic security" comes with a token cost premium.
  • Vendor lock-in: Archestra's Terraform provider and Helm chart create operational dependency on the platform.

Cross-References

  • Competes with: ContextForge (IBM), Agentgateway (LF), Plano (Katanemo) — all MCP gateway/proxy frameworks
  • CNCF/LF member: same foundation as Agentgateway
  • Security model: Lethal Trifecta documented by Simon Willison, acknowledged in README
04

Workflow

Archestra — Workflow

Platform Administrator Workflow

Phase Artifact
Deploy Docker Compose or Helm chart
Configure LLM providers Settings → LLM Gateway (API keys, model list)
Add MCP servers Registry → Add server (self-hosted Docker or remote URL)
Configure profiles Define profile-specific tool sets and cost limits
Set RBAC roles Settings → Roles (admin-only)
Configure cost limits Cost → Limits (per-profile token budgets)
Enable dual-LLM Settings → Dual LLM (security sub-agent activation)

Developer Workflow

Phase Artifact
Connect to platform Configure AI client with /v1/mcp/:profileId endpoint + token
Browse registry /mcp/registry — discover available MCP servers
Request installation Submit MCP server installation requests (approval workflow)
Use chat /chat — ChatGPT-like interface with MCP tools

MCP Server Lifecycle

  1. Developer submits MCP server to registry
  2. Platform admin reviews installation request (/mcp/registry/installation-requests)
  3. Admin approves → platform deploys server to Kubernetes pod
  4. Server appears in tool catalog, available to authorized profiles

Tool Call Audit Flow

  1. AI client → MCP gateway (:9000/v1/mcp/:profileId)
  2. Fastify backend authenticates, routes to correct MCP server pod
  3. Tool executes in Kubernetes-managed container
  4. Response optionally processed by dual-LLM sanitizer
  5. Audit record written to mcp-tool-calls table
  6. Observable in /mcp/logs

Approval Gates

  • MCP server installation requests require admin approval
  • Enterprise edition features require RBAC role assignment
  • Cost limits auto-block requests over per-profile token budgets

Cost Optimization Flow

  1. Incoming LLM request evaluated against optimization rules
  2. Rule matches → model switched to cheaper alternative
  3. Request proxied via LLM gateway with substituted model
  4. Token usage tracked against per-profile limits
06

Memory Context

Archestra — Memory & Context

State Storage

Database: PostgreSQL (embedded in Docker image via volume mount at /var/lib/postgresql/data)

Schema Management: Drizzle ORM with migrations (pnpm db:migrate, pnpm db:generate)

Key tables:

  • mcp-tool-calls — audit log of every tool call with timing, request, response
  • llm-proxy-logs — LLM request/response logs
  • profiles, profile-tools — tool access configuration per profile
  • cost statistics tables — token usage time series
  • RBAC tables (enterprise edition)

Application Data

Persistent volume at /app/data in Docker. Stores: knowledge base documents, session data, uploaded files.

Observability State

  • Prometheus (:9090): metrics time series
  • Tempo (:3200): distributed traces (OpenTelemetry)
  • Grafana (:3002): dashboards over Prometheus/Tempo

All three are included in the platform (not external dependencies) and persist in Docker volumes.

Session / Conversation State

Chat sessions (/chat) are stored server-side. The knowledge base with RAG-indexed documents persists across sessions. Context variables (cost limits, model preferences) are profile-scoped.

Memory Persistence Level

Global — the PostgreSQL database and Prometheus/Tempo stores persist indefinitely, not tied to individual AI client sessions.

Cross-Session Handoff

Yes — the platform is a persistent service. Conversations, knowledge base, and tool configurations survive across AI client restarts.

Cost State

Per-profile token usage counters track spending in real time. Budget enforcement blocks requests automatically when limits are reached. Historical usage is stored in time series for analytics at /llm/cost/statistics.

Developer Tools Memory

.claude/ directory provides Claude Code with developer-level instructions (CLAUDE.md, commands). No persistent memory mechanism for the development Claude instance.

07

Orchestration

Archestra — Orchestration

Multi-Agent Support

Yes — the dual-LLM architecture uses a dedicated security sub-agent to sanitize tool responses. This is a fixed 2-agent topology, not a general multi-agent orchestration system.

Orchestration Pattern

Hierarchical: The main agent calls tools → the dual-LLM sub-agent inspects potentially dangerous responses → cleaned response returns to main agent. Not user-configurable; it is a fixed security pipeline.

Execution Mode

Background daemon — the platform runs as a persistent service pair (Fastify backend + Next.js frontend) with Kubernetes managing MCP server pod lifecycles.

Isolation Mechanism

Container — MCP servers run in Kubernetes pods. Each registered MCP server gets its own container with resource limits. The mcp_proxy endpoint routes requests to the correct pod.

Multi-Model Routing

Yes. The cost optimizer can automatically switch to cheaper models based on configured rules. The LLM gateway provides a unified OpenAI-compatible API that abstracts provider differences. Multiple providers (OpenAI, Anthropic, etc.) can be configured.

Model Role Mapping

  • Main agent: configured provider + model per profile
  • Security sub-agent (dual-LLM): separate model designated for sanitization
  • Cost optimizer: routing rules can specify fallback/cheaper models per task type

Kubernetes-Native Orchestration

MCP server lifecycle management:

  • Deployment: admin-approved servers deployed as Kubernetes pods
  • Health: pod status visible via /api/mcp_server/:id/logs
  • Restart: /api/mcp_server/:id/restart
  • State management: Tilt for development, Helm for production

Consensus Mechanism

None. Single-writer PostgreSQL database. No distributed consensus.

Prompt Chaining

The dual-LLM flow is a form of prompt chaining: tool response → sanitizer prompt → sanitized response → main agent context. Not exposed as a user-programmable construct.

08

Ui Cli Surface

Archestra — UI/CLI Surface

CLI Binary

None shipped for end-users. Development uses standard pnpm scripts and tilt for environment management. No archestra CLI binary.

Local Web Dashboard

Exists: Yes — this is a core product feature Type: Full web application (not just a dashboard) Ports: Frontend: 3000, Backend API: 9000 Tech Stack: Next.js + React + Shadcn/UI + Tailwind CSS

Features by area:

Chat Interface (/chat)

  • ChatGPT-like conversational UI with MCP tool access
  • Private prompt registry for org-wide prompt sharing
  • Citations from RAG knowledge base

MCP Registry (/mcp/registry)

  • Browse, install, and manage MCP servers (self-hosted + third-party)
  • Installation request queue with admin approval workflow
  • Server status, logs, and restart controls

Observability

  • /mcp/logs — MCP tool call audit trail (paginated)
  • /llm/logs — LLM proxy request logs
  • Grafana dashboards at :3002 (Prometheus + Tempo)

Cost Management

  • /llm/cost/statistics — time-series token usage charts
  • /llm/cost/limits — per-profile token budget enforcement
  • /llm/cost/token-price — model pricing configuration
  • /llm/cost/optimization-rules — automatic model switching policies

Settings

  • LLM gateway configuration (providers, models, API keys)
  • MCP gateway configuration
  • Dual-LLM security sub-agent toggle
  • Custom RBAC roles (admin-only, enterprise edition)
  • Theme, logo, font customization (admin-only)

Development Tools

  • Tilt UI at :10350 — dev environment orchestration dashboard
  • Drizzle Studio at local.drizzle.studio — database browser
  • Playwright e2e tests targeting Chromium, WebKit, Firefox

IDE Integration

Claude Code: .claude/commands/ (3 commands) + settings.json PostToolUse hook for auto-formatting.

API Surface

  • MCP gateway: GET/POST /v1/mcp/:profileId (profile-scoped MCP endpoint)
  • MCP proxy: POST /mcp_proxy/:id (Kubernetes pod routing)
  • Audit API: GET /api/mcp-tool-calls (paginated)
  • Admin API: full CRUD for servers, profiles, users, roles
  • Metrics: GET /metrics (Prometheus format, port 9050)

Related frameworks

same archetype · same primary tool · same memory type

claude-mem (thedotmack) ★ 78k

Background worker service captures every tool call as an observation, AI-compresses sessions, and auto-injects relevant past…

pi (badlogic/earendil) ★ 55k

A minimal, hackable, multi-provider terminal coding agent that adapts to your workflows via npm-installable TypeScript Extensions…

Agent Skills (Addy Osmani) ★ 46k

Encodes senior-engineer software development lifecycle as 23 auto-routed skills and 7 slash commands for any AI coding agent.

wshobson/agents Plugin Marketplace ★ 36k

Single Markdown source for 83 domain-specialized plugins that auto-generates idiomatic artifacts for five AI coding harnesses.

TabbyML/Tabby ★ 34k

Self-hosted AI coding assistant server (alternative to GitHub Copilot) with admin dashboard, RAG-based completions, and multi-IDE…

Compound Engineering ★ 17k

Make each unit of engineering work compound into easier future work via brainstorm→plan→execute→review→learn cycles.