Archestra

archestra · archestra-ai/archestra · ★ 3.7k · last commit 2026-05-26

Primitive shape 5 total

Commands 3 Subagents 1 Hooks 1

Summary

Archestra — Summary

Archestra is a CNCF/Linux Foundation-hosted enterprise AI platform providing a private MCP registry, a Kubernetes-native MCP orchestrator, a RAG knowledge base, dual-LLM security sub-agents, cost monitoring and optimization, and full observability (metrics, traces, logs). It runs as a Docker Compose or Helm-deployed service pair (Next.js frontend on port 3000, Fastify backend on port 9000) and positions itself as the "ChatGPT-like interface + MCP governance layer" for organizations. Its security engine, which the project describes as non-probabilistic, aims to prevent prompt injection and data exfiltration by isolating dangerous tool responses through dual-LLM sub-agents. The cost optimizer is claimed to reduce token spend by up to 96% by automatically switching to cheaper models for simpler tasks.

Compared to seeds: unlike ccmemory (a single Neo4j-backed MCP memory server) or taskmaster-ai (a task-management MCP bundler), Archestra is a full enterprise AI platform. It manages the entire lifecycle of MCP servers across an organization, including registration, governance, Kubernetes deployment, credential management, and cost control. The closest analogy is ContextForge (also an MCP gateway), but Archestra adds a chat UI, a RAG knowledge base, RBAC with custom roles, and a Kubernetes operator, which makes it closer to an internal AI cloud platform than a pure proxy.

Overview

Archestra — Overview

Origin

Developed by archestra-ai; listed as a CNCF and Linux Foundation member project. TypeScript/Node.js stack. AGPL-3.0 licensed. 3,745 stars and 879 forks as of 2026-05-26. Under active development.

Philosophy

"Simplify AI usage in your company, providing user-friendly MCP toolbox, observability and control built on a strong security foundation."

Archestra's core argument: MCP in organizations creates three risks — MCP chaos (servers scattered on individual machines), data exfiltration (prompt injection via tool outputs), and runaway costs. All three require a centralized platform response, not individual tool hygiene.

Target Audiences

Platform teams: Centralize MCP servers, manage credentials, prevent data exfiltration
Developers: Deploy MCP servers org-wide, build agents without worrying about security
Management: 1-click MCP adoption for technical and non-technical users, cost visibility

Key Security Concept: Lethal Trifecta

Archestra explicitly models the "Lethal Trifecta" (coined by Simon Willison): the combination of an MCP tool with access to sensitive data, a prompt injection vector, and an exfiltration capability. The dual-LLM sub-agent architecture isolates dangerous tool responses so the main agent never sees potentially injected instructions.

Manifesto Quotes

"Mitigate MCP chaos, move MCP servers from individual machines to a centralized orchestrator"

"Non-probabilistic security to prevent data exfiltration"

"Reduce AI costs up to 96%"

Performance

Self-reported tool routing latency: 45 ms at P95. A Terraform provider and a Helm chart are available for production deployment.

Architecture

Archestra — Architecture

Distribution

Docker Compose (quickstart) and Helm chart (production). The platform is a single Docker image containing both frontend and backend.

docker pull archestra/platform:latest
docker run -p 9000:9000 -p 3000:3000 \
  -e ARCHESTRA_QUICKSTART=true \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v archestra-postgres-data:/var/lib/postgresql/data \
  -v archestra-app-data:/app/data \
  archestra/platform

Services

Port	Service
3000	Next.js frontend (chat, MCP registry, settings, observability)
9000	Fastify backend API
9050	Prometheus metrics endpoint
3002	Grafana (manual start via Tilt)
3200	Tempo (distributed tracing)
9090	Prometheus storage

Directory Structure

platform/
  backend/        # Fastify API server (TypeScript)
  frontend/       # Next.js frontend (TypeScript)
  e2e-tests/      # Playwright e2e tests
  benchmarks/     # Performance benchmarks
.claude/
  commands/       # fix-pr.md, review-pr.md, resolve-conflicts
  skills/         # Developer AI skills
  settings.json   # PostToolUse hook for Biome formatting
.mcp.json         # MCP server config for development

Tech Stack

Frontend: Next.js + React + Shadcn/UI + Tailwind
Backend: Fastify + TypeScript + Drizzle ORM
Database: PostgreSQL (embedded in Docker image)
Observability: Prometheus + Grafana + Tempo
Kubernetes: Tilt for dev, Helm for production
IaC: Terraform provider (archestra-ai/terraform-provider-archestra)

Required Runtime

Docker (with volume mounts for PostgreSQL data and app data)
Kubernetes + Helm (for production)

Target AI Tools

Any MCP-compatible AI client. The platform exposes MCP gateways per-profile for connecting Claude Code, Claude Desktop, Cursor, etc. to the centralized MCP registry.

MCP Gateway Endpoint

GET/POST http://localhost:9000/v1/mcp/:profileId — profile-scoped MCP endpoint (stateless JSON-RPC mode). Authentication via Bearer archestra_token.
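As a rough illustration of what a client sends to this endpoint, the sketch below assembles a JSON-RPC `tools/list` call. Only the URL shape and the Bearer scheme come from the description above; the helper name `buildToolsListRequest`, the `McpHttpRequest` type, and the assumption that the gateway accepts `tools/list` without a prior `initialize` handshake (plausible in stateless mode) are illustrative and not taken from Archestra's code.

```typescript
// Hypothetical request shape; not an Archestra type.
interface McpHttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds a JSON-RPC 2.0 `tools/list` call against the profile-scoped gateway URL.
function buildToolsListRequest(
  baseUrl: string,
  profileId: string,
  token: string,
  id: number = 1,
): McpHttpRequest {
  // Drop trailing slashes so the path is joined exactly once.
  const root = baseUrl.replace(/\/+$/, "");
  return {
    url: `${root}/v1/mcp/${encodeURIComponent(profileId)}`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // MCP's HTTP transport lets the server answer with JSON or an event stream.
      Accept: "application/json, text/event-stream",
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify({ jsonrpc: "2.0", id, method: "tools/list" }),
  };
}
```

The returned object can be passed to `fetch(req.url, req)`. Keeping the builder free of network calls makes it easy to inspect what would be sent before pointing it at a real profile.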

Components

Archestra — Components

Core Platform Features (Frontend Routes)

Route	Feature
`/chat`	ChatGPT-like chat with MCP tools and private prompt registry
`/tools`	Unified tools management with server-side pagination
`/mcp/registry`	Private MCP registry — install and manage MCP servers
`/mcp/registry/installation-requests`	Pending MCP server installation approval queue
`/mcp/logs`	MCP tool call logs with audit trail
`/llm/logs`	LLM proxy request logs
`/llm/cost/statistics`	Token usage analytics with time series charts
`/llm/cost/limits`	Per-profile token usage limits
`/llm/cost/token-price`	Model pricing configuration
`/llm/cost/optimization-rules`	Cost optimization policies (model switching rules)
`/settings`	LLM & MCP gateway config, dual-LLM settings, account, members, teams
`/settings/roles`	Admin: custom RBAC role management
`/settings/appearance`	Admin: theme, logo, fonts customization

Backend APIs

Endpoint	Purpose
`GET/POST /v1/mcp/:profileId`	Profile-scoped MCP gateway (JSON-RPC stateless mode)
`POST /mcp_proxy/:id`	MCP proxy to Kubernetes pods
`GET /api/mcp_server/:id/logs`	Container logs (streaming)
`POST /api/mcp_server/:id/restart`	Restart MCP server pod
`GET /api/mcp-tool-calls`	Paginated MCP tool call audit logs
`GET /api/profile-tools`	Profile-tool relationship management
`GET /metrics`	Prometheus metrics

Security Sub-Agent (Dual LLM)

The dual-LLM architecture runs dangerous tool responses through a separate "sanitizer" LLM before the main agent sees them. Prevents prompt injection from malicious tool outputs.
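The source does not spell out how the sanitizer constrains what reaches the main agent, so the following is a minimal sketch of the general dual-LLM pattern rather than Archestra's implementation. It assumes the isolated model may only answer with an index into a fixed list of options; the names `QuarantinedModel` and `askAboutToolOutput` are hypothetical.

```typescript
// Stand-in for the isolated model: it sees the raw tool output but may only
// reply with an index into the supplied options.
type QuarantinedModel = (rawToolOutput: string, question: string, options: string[]) => number;

interface SafeAnswer {
  question: string;
  answer: string;
}

// Returns one of the caller-supplied option strings, never text from the tool.
function askAboutToolOutput(
  rawToolOutput: string,
  question: string,
  options: string[],
  quarantined: QuarantinedModel,
): SafeAnswer {
  const index = quarantined(rawToolOutput, question, options);
  // Deterministic gate: a non-integer or out-of-range reply yields no answer.
  const answer = Number.isInteger(index) ? options[index] : undefined;
  if (answer === undefined) {
    throw new Error("quarantined model returned an invalid option index");
  }
  return { question, answer };
}
```

Under this assumption, injected instructions in a tool response cannot reach the main agent verbatim, because the only strings that cross the boundary are ones the caller wrote in advance.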

Knowledge Base (RAG)

Built-in retrieval-augmented knowledge base — no external vector database required. Accessible through the chat interface with citations.

Cost Optimizer

Dynamic model switching rules that automatically route to cheaper models for simpler tasks. Self-reported up to 96% cost reduction.

Claude Code Developer Tools

.claude/commands/:

fix-pr.md — Fix failing PR checks
review-pr.md — Review a PR
resolve-conflicts — Resolve merge conflicts

.claude/settings.json:

PostToolUse hook (Write|Edit|MultiEdit matcher): runs Biome formatter automatically on platform/** file edits

RBAC

Custom role management with enterprise edition features (config.enterpriseFeatures.core). Free vs enterprise tier separation via .ee.ts file naming convention.
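The tier separation can be pictured as a runtime switch on the `config.enterpriseFeatures.core` flag. The sketch below is an assumption-laden illustration: `AppConfig`, `PermissionCheck`, `builtInCheck`, and `selectPermissionCheck` are hypothetical names, and the enterprise loader is injected as a plain function so the example runs on its own.

```typescript
interface AppConfig {
  enterpriseFeatures: { core: boolean };
}

type PermissionCheck = (role: string, action: string) => boolean;

// Assumed free-tier behavior for this sketch: only the built-in admin role passes.
const builtInCheck: PermissionCheck = (role) => role === "admin";

// Picks the permission check at runtime. The enterprise implementation is only
// loaded when the flag is set, so free-tier code never references it directly.
function selectPermissionCheck(
  config: AppConfig,
  loadEnterpriseCheck: () => PermissionCheck,
): PermissionCheck {
  return config.enterpriseFeatures.core ? loadEnterpriseCheck() : builtInCheck;
}
```

In a real codebase the loader would more likely be a dynamic import of an `.ee.ts` module, which would make the selection asynchronous; it is kept synchronous here for brevity.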

Prompts

Archestra — Prompts

Archestra is a platform product, not a prompting framework. The prompt-like artifacts are developer Claude Code instructions.

CLAUDE.md (Developer Agent Instructions — platform/)

## Important Rules

1. **Use pnpm** for package management
2. **Use Tilt for development** - `tilt up` to start the full environment
3. **Use shadcn/ui components** - Add with `npx shadcn@latest add <component>`
4. **Documentation Updates** - For any feature or system changes, audit `../docs/pages`
5. **Always Add Tests** - When working on any feature, ALWAYS add or modify appropriate
   test cases (unit tests, integration tests, or e2e tests under `platform/e2e-tests/tests`)
6. **Enterprise Edition Imports** - NEVER directly import from `.ee.ts` files unless the
   importing file is itself an `.ee.ts` file. Use runtime conditional logic with
   `config.enterpriseFeatures.core` checks instead to avoid bundling enterprise code
   into free builds
7. **No Auto Commits** - Never commit or push changes without explicit user approval.
   Always ask before running git commit or git push
8. **No Database Modifications Without Approval** - NEVER run INSERT, UPDATE, DELETE,
   or any data-modifying SQL queries without explicit user approval.
9. **NEVER MENTION REAL CUSTOMER NAMES OR IDENTIFIERS ANYWHERE IN CODE, COMMENTS,
   TESTS, DOCS, COMMITS, OR PR TEXT!!!!!!!!!!**
10. **Never copy anything from Sentry into code** — Sentry is for diagnosing; describe
    bugs in neutral terms.

Prompting technique: Absolute prohibition list ("Iron Laws") with business-justification inline. The all-caps rule #9 signals highest severity. This is a persona-instruction pattern preventing security/compliance failures.

.claude/commands/review-pr.md (Slash Command)

This command provides context for PR review. Pattern: structured workflow with pre-defined review criteria. The exact content is not publicly accessible but the file exists as a Claude Code command.

.claude/settings.json (PostToolUse Hook)

{
  "hooks": {
    "PostToolUse": [{
      "matcher": "Write|Edit|MultiEdit",
      "hooks": [{
        "type": "command",
        "command": "jq -r '.tool_input.file_path // .tool_response.filePath // empty' | { read -r f; case \"$f\" in \"$CLAUDE_PROJECT_DIR\"/platform/*) cd \"$CLAUDE_PROJECT_DIR/platform\" && pnpm exec biome check --write --files-ignore-unknown=true --no-errors-on-unmatched \"$f\" ;; esac; } || true"
      }]
    }]
  }
}

Prompting technique: Not a prompt. This is an automation hook that runs biome check --write (Biome's formatter, plus its linter and import-sorting fixes where they are safe to apply) after every Write, Edit, or MultiEdit on a file under platform/. It enforces code formatting standards without requiring developer discipline. The trailing || true keeps a failing check from blocking work.
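The shell one-liner in the hook packs two decisions into a single pipeline: which path to read from the hook payload, and whether that path lives under platform/. The sketch below restates that logic in TypeScript for readability only; the hook itself is the shell command shown, and `HookPayload`, `extractFilePath`, and `shouldFormat` are illustrative names.

```typescript
// Only the two payload fields the hook's jq filter reads.
interface HookPayload {
  tool_input?: { file_path?: string | null };
  tool_response?: { filePath?: string | null };
}

// Approximates jq's `.tool_input.file_path // .tool_response.filePath // empty`
// for string fields: take the first value that is present.
function extractFilePath(payload: HookPayload): string | undefined {
  return payload.tool_input?.file_path ?? payload.tool_response?.filePath ?? undefined;
}

// Mirrors the shell `case` pattern "$CLAUDE_PROJECT_DIR"/platform/*, which is a
// plain prefix match on the edited path.
function shouldFormat(filePath: string | undefined, projectDir: string): boolean {
  return filePath !== undefined && filePath.startsWith(`${projectDir}/platform/`);
}
```

Files outside platform/ fall through the `case` without a match, so the formatter is never invoked for them.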

Uniqueness

Archestra — Uniqueness & Positioning

Differs from Seeds

Archestra is an enterprise AI platform occupying a different tier from all 11 seeds. The seeds are agent-harness/methodology tools (superpowers, spec-kit, BMAD-METHOD) or MCP utility servers (ccmemory, taskmaster-ai). Archestra is closer to an internal cloud platform for AI: it governs the entire lifecycle of MCP servers across an organization. The most comparable seeds are ccmemory (an MCP-based memory server) and taskmaster-ai (an MCP task server), but Archestra operates at a completely different scope: rather than providing a single tool, it provides the infrastructure that hosts and governs many tools. ContextForge (same batch) is the closest peer, but Archestra adds a full chat UI, a RAG knowledge base, and Kubernetes-native orchestration, and it explicitly models the "Lethal Trifecta" security risk.

Unique Characteristics

Lethal Trifecta security model: Explicitly models and defends against the three-part vulnerability pattern (sensitive data access + prompt injection + exfiltration channel) — the most security-aware framework in the corpus.
Dual-LLM sub-agent: Architecturally isolated sanitizer LLM between tool output and main agent — prevents prompt injection at the protocol level, not just the prompt level.
Kubernetes-native MCP orchestration: MCP servers run as K8s pods with lifecycle management, restart capabilities, and log streaming — no other framework in corpus manages MCP server infrastructure at this level.
96% cost reduction claim: Dynamic model switching based on task complexity — the most aggressive cost optimization claim in the corpus.
ChatGPT-like UI bundled: Platform ships its own chat interface — it is both the infrastructure and the end-user UI.
CNCF/Linux Foundation affiliation: Only framework in this batch with formal CNCF/LF membership, implying enterprise governance standards.
Non-probabilistic security: Claims deterministic (non-LLM-probabilistic) enforcement of data access boundaries.

Positioning

"Enterprises adopting MCP need more than a protocol — they need a governance platform." Archestra is what you deploy before you let your developers point Claude Code at your internal tools.

Observable Failure Modes

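AGPL-3.0 license: Requires anyone who modifies the platform and offers it to users over a network to release the modified source under the same license, which is a significant adoption barrier for enterprise customers building on top of the platform.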
Complexity: Running a full PostgreSQL + Next.js + Fastify + Tempo + Prometheus + Grafana + Kubernetes stack for MCP management is operationally heavy.
Dual-LLM cost: The security sub-agent doubles LLM calls for tool interactions — the "non-probabilistic security" comes with a token cost premium.
Vendor lock-in: Archestra's Terraform provider and Helm chart create operational dependency on the platform.

Cross-References

Competes with: ContextForge (IBM), Agentgateway (LF), Plano (Katanemo) — all MCP gateway/proxy frameworks
CNCF/LF member: same foundation as Agentgateway
Security model: Lethal Trifecta documented by Simon Willison, acknowledged in README

Workflow

Archestra — Workflow

Platform Administrator Workflow

Phase	Artifact
Deploy	Docker Compose or Helm chart
Configure LLM providers	Settings → LLM Gateway (API keys, model list)
Add MCP servers	Registry → Add server (self-hosted Docker or remote URL)
Configure profiles	Define profile-specific tool sets and cost limits
Set RBAC roles	Settings → Roles (admin-only)
Configure cost limits	Cost → Limits (per-profile token budgets)
Enable dual-LLM	Settings → Dual LLM (security sub-agent activation)

Developer Workflow

Phase	Artifact
Connect to platform	Configure AI client with `/v1/mcp/:profileId` endpoint + token
Browse registry	`/mcp/registry` — discover available MCP servers
Request installation	Submit MCP server installation requests (approval workflow)
Use chat	`/chat` — ChatGPT-like interface with MCP tools

MCP Server Lifecycle

Developer submits MCP server to registry
Platform admin reviews installation request (/mcp/registry/installation-requests)
Admin approves → platform deploys server to Kubernetes pod
Server appears in tool catalog, available to authorized profiles
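The lifecycle above reads as a small state machine. The sketch below is one way to write it down, under assumptions: the state and event names are invented for illustration, and the `rejected` state is not mentioned in the source but is included as the natural counterpart to approval.

```typescript
type ServerState = "submitted" | "pending_review" | "approved" | "deployed" | "rejected";
type LifecycleEvent = "queue" | "approve" | "reject" | "deploy";

// Allowed transitions; any event missing for a state is invalid there.
const transitions: Record<ServerState, Partial<Record<LifecycleEvent, ServerState>>> = {
  submitted: { queue: "pending_review" },
  pending_review: { approve: "approved", reject: "rejected" },
  approved: { deploy: "deployed" },
  deployed: {},
  rejected: {},
};

// Returns the next state, or throws when the event is not allowed.
function advance(state: ServerState, event: LifecycleEvent): ServerState {
  const next = transitions[state][event];
  if (next === undefined) {
    throw new Error(`cannot ${event} a server in state ${state}`);
  }
  return next;
}
```

Encoding the approval step as the only path from `pending_review` to `approved` makes the admin gate explicit: there is no transition that deploys a server straight from submission.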

Tool Call Audit Flow

AI client → MCP gateway (:9000/v1/mcp/:profileId)
Fastify backend authenticates, routes to correct MCP server pod
Tool executes in Kubernetes-managed container
Response optionally processed by dual-LLM sanitizer
Audit record written to mcp-tool-calls table
Observable in /mcp/logs
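The audit flow can be sketched as a wrapper around tool execution. This is an illustration under stated assumptions, not the backend's code: the `ToolCallAudit` fields are guesses based on the "timing, request, response" description of the audit table, the logged response is assumed to be the one returned to the agent, and execution is synchronous to keep the example short.

```typescript
// Assumed record shape; the real table schema is not shown in the source.
interface ToolCallAudit {
  profileId: string;
  toolName: string;
  request: unknown;
  response: unknown;
  sanitized: boolean;
  durationMs: number;
}

// Runs a tool, optionally sanitizes its response, and appends an audit record.
function callWithAudit(
  profileId: string,
  toolName: string,
  request: unknown,
  execute: (request: unknown) => unknown,
  sanitize: ((response: unknown) => unknown) | undefined,
  sink: ToolCallAudit[],
  now: () => number = Date.now,
): unknown {
  const startedAt = now();
  const raw = execute(request);
  // The sanitizer step is optional, matching the flow described above.
  const response = sanitize ? sanitize(raw) : raw;
  sink.push({
    profileId,
    toolName,
    request,
    response,
    sanitized: sanitize !== undefined,
    durationMs: now() - startedAt,
  });
  return response;
}
```

Injecting the clock through `now` keeps the duration measurement testable without real delays.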

Approval Gates

MCP server installation requests require admin approval
Enterprise edition features require RBAC role assignment
Cost limits auto-block requests over per-profile token budgets

Cost Optimization Flow

Incoming LLM request evaluated against optimization rules
Rule matches → model switched to cheaper alternative
Request proxied via LLM gateway with substituted model
Token usage tracked against per-profile limits
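The first three steps of this flow amount to a first-match rule lookup. The source does not say what criteria the optimization rules support, so the `maxInputTokens` and `usesTools` conditions below are assumptions, as are the type names and the placeholder model names used in the usage note.

```typescript
// Hypothetical rule shape: the matching criteria are assumed for illustration.
interface OptimizationRule {
  maxInputTokens: number;
  usesTools?: boolean; // when omitted, the rule ignores tool usage
  targetModel: string;
}

interface LlmRequest {
  model: string;
  inputTokens: number;
  usesTools: boolean;
}

// Returns the request with its model swapped if a rule matches; first match wins.
function applyRules(request: LlmRequest, rules: OptimizationRule[]): LlmRequest {
  const match = rules.find(
    (rule) =>
      request.inputTokens <= rule.maxInputTokens &&
      (rule.usesTools === undefined || rule.usesTools === request.usesTools),
  );
  return match ? { ...request, model: match.targetModel } : request;
}
```

For example, a single rule `{ maxInputTokens: 1000, usesTools: false, targetModel: "small-model" }` would reroute a short, tool-free request away from "large-model" while leaving long requests untouched.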

Memory Context

Archestra — Memory & Context

State Storage

Database: PostgreSQL (embedded in Docker image via volume mount at /var/lib/postgresql/data)

Schema Management: Drizzle ORM with migrations (pnpm db:migrate, pnpm db:generate)

Key tables:

mcp-tool-calls — audit log of every tool call with timing, request, response
llm-proxy-logs — LLM request/response logs
profiles, profile-tools — tool access configuration per profile
cost statistics tables — token usage time series
RBAC tables (enterprise edition)

Application Data

Persistent volume at /app/data in Docker. Stores: knowledge base documents, session data, uploaded files.

Observability State

Prometheus (:9090): metrics time series
Tempo (:3200): distributed traces (OpenTelemetry)
Grafana (:3002): dashboards over Prometheus/Tempo

All three are included in the platform (not external dependencies) and persist in Docker volumes.

Session / Conversation State

Chat sessions (/chat) are stored server-side. The knowledge base with RAG-indexed documents persists across sessions. Context variables (cost limits, model preferences) are profile-scoped.

Memory Persistence Level

Global — the PostgreSQL database and Prometheus/Tempo stores persist indefinitely, not tied to individual AI client sessions.

Cross-Session Handoff

Yes — the platform is a persistent service. Conversations, knowledge base, and tool configurations survive across AI client restarts.

Cost State

Per-profile token usage counters track spending in real time. Budget enforcement blocks requests automatically when limits are reached. Historical usage is stored in time series for analytics at /llm/cost/statistics.
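Budget enforcement of this kind reduces to a counter comparison before each request and an update after it. The sketch below assumes a single token limit per profile and pre-request estimates; `ProfileBudget`, `checkBudget`, and `recordUsage` are hypothetical names, and the real platform may track limits differently (for example per time window).

```typescript
interface ProfileBudget {
  limitTokens: number;
  usedTokens: number;
}

type BudgetDecision =
  | { allowed: true; remaining: number }
  | { allowed: false; reason: string };

// Blocks a request whose estimated tokens would exceed what is left in the budget.
function checkBudget(budget: ProfileBudget, requestedTokens: number): BudgetDecision {
  const remaining = budget.limitTokens - budget.usedTokens;
  if (requestedTokens > remaining) {
    return { allowed: false, reason: `needs ${requestedTokens} tokens, ${remaining} left` };
  }
  return { allowed: true, remaining: remaining - requestedTokens };
}

// Returns a new budget with the actual usage added; the input is not mutated.
function recordUsage(budget: ProfileBudget, actualTokens: number): ProfileBudget {
  return { ...budget, usedTokens: budget.usedTokens + actualTokens };
}
```

Checking before the call and recording after it means a request can overshoot slightly if the estimate is low; a stricter design would reserve tokens up front.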

Developer Tools Memory

.claude/ directory provides Claude Code with developer-level instructions (CLAUDE.md, commands). No persistent memory mechanism for the development Claude instance.

Orchestration

Archestra — Orchestration

Multi-Agent Support

Yes — the dual-LLM architecture uses a dedicated security sub-agent to sanitize tool responses. This is a fixed 2-agent topology, not a general multi-agent orchestration system.

Orchestration Pattern

Hierarchical: The main agent calls tools → the dual-LLM sub-agent inspects potentially dangerous responses → cleaned response returns to main agent. Not user-configurable; it is a fixed security pipeline.

Execution Mode

Background daemon — the platform runs as a persistent service pair (Fastify backend + Next.js frontend) with Kubernetes managing MCP server pod lifecycles.

Isolation Mechanism

Container — MCP servers run in Kubernetes pods. Each registered MCP server gets its own container with resource limits. The mcp_proxy endpoint routes requests to the correct pod.

Multi-Model Routing

Yes. The cost optimizer can automatically switch to cheaper models based on configured rules. The LLM gateway provides a unified OpenAI-compatible API that abstracts provider differences. Multiple providers (OpenAI, Anthropic, etc.) can be configured.

Model Role Mapping

Main agent: configured provider + model per profile
Security sub-agent (dual-LLM): separate model designated for sanitization
Cost optimizer: routing rules can specify fallback/cheaper models per task type

Kubernetes-Native Orchestration

MCP server lifecycle management:

Deployment: admin-approved servers deployed as Kubernetes pods
Health: pod status visible via /api/mcp_server/:id/logs
Restart: /api/mcp_server/:id/restart
State management: Tilt for development, Helm for production

Consensus Mechanism

None. Single-writer PostgreSQL database. No distributed consensus.

Prompt Chaining

The dual-LLM flow is a form of prompt chaining: tool response → sanitizer prompt → sanitized response → main agent context. Not exposed as a user-programmable construct.

UI/CLI Surface

Archestra — UI/CLI Surface

CLI Binary

None shipped for end-users. Development uses standard pnpm scripts and tilt for environment management. No archestra CLI binary.

Local Web Dashboard

Exists: Yes (this is a core product feature)
Type: Full web application (not just a dashboard)
Ports: Frontend: 3000, Backend API: 9000
Tech Stack: Next.js + React + Shadcn/UI + Tailwind CSS

Features by area:

Chat Interface (`/chat`)

ChatGPT-like conversational UI with MCP tool access
Private prompt registry for org-wide prompt sharing
Citations from RAG knowledge base

MCP Registry (`/mcp/registry`)

Browse, install, and manage MCP servers (self-hosted + third-party)
Installation request queue with admin approval workflow
Server status, logs, and restart controls

Observability

/mcp/logs — MCP tool call audit trail (paginated)
/llm/logs — LLM proxy request logs
Grafana dashboards at :3002 (Prometheus + Tempo)

Cost Management

/llm/cost/statistics — time-series token usage charts
/llm/cost/limits — per-profile token budget enforcement
/llm/cost/token-price — model pricing configuration
/llm/cost/optimization-rules — automatic model switching policies

Settings

LLM gateway configuration (providers, models, API keys)
MCP gateway configuration
Dual-LLM security sub-agent toggle
Custom RBAC roles (admin-only, enterprise edition)
Theme, logo, font customization (admin-only)

Development Tools

Tilt UI at :10350 — dev environment orchestration dashboard
Drizzle Studio at local.drizzle.studio — database browser
Playwright e2e tests targeting Chromium, WebKit, Firefox

IDE Integration

Claude Code: .claude/commands/ (3 commands) + settings.json PostToolUse hook for auto-formatting.

API Surface

MCP gateway: GET/POST /v1/mcp/:profileId (profile-scoped MCP endpoint)
MCP proxy: POST /mcp_proxy/:id (Kubernetes pod routing)
Audit API: GET /api/mcp-tool-calls (paginated)
Admin API: full CRUD for servers, profiles, users, roles
Metrics: GET /metrics (Prometheus format, port 9050)

Related frameworks

same archetype · same primary tool · same memory type

claude-mem (thedotmack) ★ 78k

A8 Cross-runtime harness

Background worker service captures every tool call as an observation, AI-compresses sessions, and auto-injects relevant past…

pi (badlogic/earendil) ★ 55k

A8 Cross-runtime harness

A minimal, hackable, multi-provider terminal coding agent that adapts to your workflows via npm-installable TypeScript Extensions…

Agent Skills (Addy Osmani) ★ 46k

A8 Cross-runtime harness

Encodes senior-engineer software development lifecycle as 23 auto-routed skills and 7 slash commands for any AI coding agent.

wshobson/agents Plugin Marketplace ★ 36k

A8 Cross-runtime harness

Single Markdown source for 83 domain-specialized plugins that auto-generates idiomatic artifacts for five AI coding harnesses.

TabbyML/Tabby ★ 34k

A8 Cross-runtime harness

Self-hosted AI coding assistant server (alternative to GitHub Copilot) with admin dashboard, RAG-based completions, and multi-IDE…

Compound Engineering ★ 17k

A8 Cross-runtime harness

Make each unit of engineering work compound into easier future work via brainstorm→plan→execute→review→learn cycles.

Distribution

Type: docker-image
License: AGPL-3.0
Install: container
Version: unknown (commit 2026-05-26, AGPL-3.0)

Surfaces

CLI binary: No
CLI subcmds: 0
Local UI: web-dashboard
UI port: 3000
Tech stack: Next.js + React + Shadcn/UI + Tailwind CSS; backend Fastify on :9000

Components

Commands: 3
Skills: 0
Subagents: 1
Hooks: 1
MCP servers: 0
MCP tools: 0
Scripts: 5
Templates: 0

Workflow

Phases: 7
Approval gates: 1
Spec format: none
Spec storage: none
Delta or full: none

Orchestration

Multi-agent: Yes
Pattern: hierarchical
Isolation: container
Consensus: none
Prompt chaining: Yes

Multi-model

Multi-model: Yes
BYOK: Yes
Modal: text

Execution

Mode: background-daemon
Crash recovery: Yes
Compaction: No
Session handoff: Yes
Streaming: Yes

Memory

Type: hybrid
Persistence: global
Search: vector
State files: 4 files

Quality

TDD: Optional
TDD mechanism: none
Validators: 3
Self-review: none

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: Yes
Audit format: proprietary
Replay: No

Tools

Primary: any-mcp-client
Targets: 4
Portability: high

Signals

Stars: 3.7k
Last commit: 2026-05-26
Contributors: 30
Maintainer: active
Quality score: 6.5/10
