Vibe Check MCP

vibe-check-mcp · PV-Bhat/vibe-check-mcp-server · ★ 486 · last commit 2026-05-24

Primitive shape 5 total

MCP tools 5

Summary

Vibe Check MCP — Summary

Vibe Check MCP is an MCP server that acts as an AI meta-mentor for coding agents, interrupting pattern inertia with Chain-Pattern Interrupts (CPI) to prevent Reasoning Lock-In (RLI). It invokes a second LLM (Gemini, OpenAI, Anthropic, or OpenRouter) to give metacognitive feedback to the primary agent before irreversible actions or when assumptions need challenging.

The tool exposes 5 MCP tools: vibe_check (challenge assumptions, prevent tunnel vision), vibe_learn (log mistakes/fixes for future reflection), plus update_constitution, reset_constitution, and check_constitution (all three for session rule management). A research paper claims that agents calling Vibe Check improved their success rate (+27%) and reduced harmful actions (-41%).

Final maintenance release: v2.8.0. The project is no longer actively maintained; community forks are welcome under MIT. Featured on PulseMCP "Most Popular (This Week)" and listed in Anthropic's official MCP repo.

Compared to seeds: no direct equivalent in the 11 seeds. Closest to an adversarial-subagent review pattern in superpowers, but vibe-check-mcp operates as an MCP tool injected into any agent's system prompt, not as a skill file. It is the only tool in the corpus that externally benchmarks and improves agent metacognition rather than task execution.

Overview

Vibe Check MCP — Overview

Origin

Built by Pruthvi Bhat (pruthvibhat.com) at the MURST initiative (murst.org). Released under the MIT license. Final maintenance release: v2.8.0. Archived and unmaintained as of this analysis.

Philosophy

Large language models can confidently follow flawed plans. Without an external nudge, they spiral into overengineering or misalignment. Vibe Check provides that nudge through short reflective pauses that interrupt "Pattern Inertia" and "Reasoning Lock-In."

"KISS overzealous agents goodbye. Plug & play agent oversight tool."

"Think of it as a rubber-duck debugger for LLMs — a quick sanity check before your agent goes down the wrong path."

The system prompt for the second LLM (the meta-mentor) defines 4 thinking modes:

What's going on here? What's the nature of the problem?
What does the agent need to hear right now: patterns, loops, unspoken assumptions?
Technical guidance or alignment question?
If plan looks accurate: reminder, best practices, or soft go-ahead?

Research Backing

From ResearchGate publication on "Chain-Pattern Interrupt (CPI)":

Agents calling Vibe Check improved success rate +27%
Harmful actions reduced: -41%

CPI (Chain-Pattern Interrupt)

CPI is the underlying framework that consumes vibe_check signals:

Phase-aware prompts that challenge assumptions
Enforces intervention policy before agent resumes
Separate CPI repo: https://github.com/PV-Bhat/cpi

Explicit Antipatterns

Over-engineering without external reflection
Reasoning Lock-In (confident but flawed plan execution)
Skipping reflection before irreversible actions
"Vibe coding" without alignment checks

Architecture

Vibe Check MCP — Architecture

Distribution

Type: MCP server (npm package)
License: MIT
Install complexity: one-liner (npx)

Install Commands

# STDIO transport
npx -y @pv-bhat/vibe-check-mcp start --stdio

# HTTP transport
npx -y @pv-bhat/vibe-check-mcp start --http --port 2091

# MCP client config (Claude Desktop, Cursor, etc.)
{
  "mcpServers": {
    "vibe-check-mcp": {
      "command": "npx",
      "args": ["-y", "@pv-bhat/vibe-check-mcp", "start", "--stdio"]
    }
  }
}

Directory Layout

src/
├── index.ts                  # Express + MCP server, STDIO + HTTP transports
├── tools/
│   ├── vibeCheck.ts          # vibe_check tool handler
│   ├── vibeLearn.ts          # vibe_learn tool handler
│   ├── vibeDistil.ts         # (internal)
│   └── constitution.ts       # update/reset/check constitution tools
├── utils/
│   ├── llm.ts               # Multi-provider LLM dispatcher (Gemini/OpenAI/Anthropic/OpenRouter)
│   ├── state.ts             # Session history management
│   ├── storage.ts           # vibe_learn storage
│   └── ...

docs/
├── architecture.md
├── integrations/cpi.md       # CPI integration guide
├── agent-prompting.md
└── ...

scripts/                      # Docker automation
Dockerfile

Transport Options

STDIO: standard MCP protocol over stdin/stdout
Streamable HTTP: HTTP at configurable port (default 2091)

Required Runtime

Node.js >= 20
API key for at least one provider: GEMINI_API_KEY, OPENAI_API_KEY, ANTHROPIC_API_KEY, or OpenRouter key

Target AI Tools

Any MCP-compatible client:

Claude Desktop
Cursor
Windsurf
Any MCP-aware agent

Components

Vibe Check MCP — Components

MCP Tools (5 tools)

Tool	Purpose
`vibe_check`	Challenge assumptions and prevent tunnel vision. Primary tool — call before major actions or when plan may be flawed.
`vibe_learn`	Capture mistakes, preferences, and successes for future reflection.
`update_constitution`	Set/merge session rules the CPI layer will enforce.
`reset_constitution`	Clear session rules.
`check_constitution`	Inspect effective rules for the current session.

vibe_check Input Schema

interface VibeCheckInput {
  goal: string;             // Current objective
  plan: string;             // Current plan
  userPrompt?: string;      // Original user request (MUST include for safety)
  progress?: string;        // Work done so far
  uncertainties?: string[]; // Known uncertainties
  taskContext?: string;     // Additional context
  sessionId?: string;       // For history continuity across calls
  modelOverride?: {
    provider?: string;
    model?: string;
  };
}
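
As a rough illustration of how a client might assemble this payload, the sketch below restates the interface and adds a hypothetical checkInput helper. The helper is not part of the server; it only flags the omissions that the schema comments warn about (a missing userPrompt, a missing sessionId). The example values are invented.

```typescript
// Restated from the schema above so the sketch is self-contained.
interface VibeCheckInput {
  goal: string;
  plan: string;
  userPrompt?: string;
  progress?: string;
  uncertainties?: string[];
  taskContext?: string;
  sessionId?: string;
  modelOverride?: { provider?: string; model?: string };
}

// Hypothetical client-side helper (not part of vibe-check-mcp).
// Returns human-readable warnings; an empty array means the payload looks complete.
function checkInput(input: VibeCheckInput): string[] {
  const warnings: string[] = [];
  if (input.goal.trim() === "") warnings.push("goal is empty");
  if (input.plan.trim() === "") warnings.push("plan is empty");
  // The schema marks userPrompt optional but says it MUST be included for safety.
  if (!input.userPrompt || input.userPrompt.trim() === "") {
    warnings.push("userPrompt is missing; the plan cannot be compared with the original request");
  }
  if (!input.sessionId) {
    warnings.push("sessionId is missing; history will not carry across calls");
  }
  return warnings;
}

// Invented example payload.
const example: VibeCheckInput = {
  goal: "Add pagination to the /orders endpoint",
  plan: "1. Add limit/offset params. 2. Update the SQL query. 3. Add tests.",
  userPrompt: "Please paginate the orders list; keep the response shape backward compatible.",
  uncertainties: ["Do existing clients depend on the full list?"],
  sessionId: "orders-pagination-1",
};

console.log(checkInput(example)); // complete payload, so no warnings
console.log(checkInput({ goal: "Refactor auth", plan: "" }));
```

A check like this would run on the agent side before the MCP call, since the server accepts a payload without userPrompt.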

vibe_check Output

Returns { questions: string }: metacognitive questions generated by the second LLM under the meta-mentor system prompt.

Multi-Provider LLM Support

Provider	Env Var
Gemini (default)	`GEMINI_API_KEY`
OpenAI	`OPENAI_API_KEY`
Anthropic	`ANTHROPIC_API_KEY`
OpenRouter	OpenRouter key
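
The table can be read as a lookup from provider to credential. The sketch below shows one way such a lookup might work; it is not taken from src/utils/llm.ts. Assumptions: the name OPENROUTER_API_KEY (the table only says "OpenRouter key"), the fall-through order when no Gemini key is set, and the resolveProvider helper itself. The environment is passed in as a plain object so the sketch does not depend on Node typings.

```typescript
type Provider = "gemini" | "openai" | "anthropic" | "openrouter";

// Env var per provider. OPENROUTER_API_KEY is an ASSUMED name;
// the source table does not spell out the OpenRouter variable.
const KEY_VARS: Record<Provider, string> = {
  gemini: "GEMINI_API_KEY",
  openai: "OPENAI_API_KEY",
  anthropic: "ANTHROPIC_API_KEY",
  openrouter: "OPENROUTER_API_KEY",
};

// Order tried when no override is given. Gemini is first because it is the
// documented default; the rest of the order is an assumption.
const PREFERENCE: Provider[] = ["gemini", "openai", "anthropic", "openrouter"];

type Env = Record<string, string | undefined>;

// Hypothetical resolver: honors a per-call override when its key is present,
// otherwise picks the first provider that has a key configured.
function resolveProvider(env: Env, override?: string): Provider {
  const hasKey = (p: Provider): boolean => (env[KEY_VARS[p]] ?? "") !== "";
  if (override !== undefined) {
    const wanted = PREFERENCE.find((p) => p === override);
    if (wanted === undefined) throw new Error(`unknown provider: ${override}`);
    if (!hasKey(wanted)) throw new Error(`${KEY_VARS[wanted]} is not set`);
    return wanted;
  }
  const first = PREFERENCE.find((p) => hasKey(p));
  if (first === undefined) throw new Error("no provider API key is configured");
  return first;
}

console.log(resolveProvider({ OPENAI_API_KEY: "sk-test" })); // openai
console.log(resolveProvider({ GEMINI_API_KEY: "g", ANTHROPIC_API_KEY: "a" }, "anthropic")); // anthropic
```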

Constitution (Session Rules)

The constitution is a per-session rule set that the CPI layer enforces. Can be set programmatically via update_constitution, giving agents a way to declare their own alignment constraints.
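
A minimal sketch of per-session rule management, assuming rules are plain strings keyed by sessionId and that "merge" means appending rules not already present. The in-memory Map and the three function names are illustrative stand-ins for the update_constitution, reset_constitution, and check_constitution tools, not the server's actual code.

```typescript
// Rules are kept per sessionId; a plain Map stands in for the server's storage.
const constitutions = new Map<string, string[]>();

// Return a copy so callers cannot mutate the stored rules.
function checkConstitution(sessionId: string): string[] {
  return [...(constitutions.get(sessionId) ?? [])];
}

// Merge new rules into the session's set, skipping exact duplicates.
function updateConstitution(sessionId: string, rules: string[]): string[] {
  const current = constitutions.get(sessionId) ?? [];
  for (const rule of rules) {
    if (current.indexOf(rule) === -1) current.push(rule);
  }
  constitutions.set(sessionId, current);
  return checkConstitution(sessionId);
}

// Drop every rule for the session.
function resetConstitution(sessionId: string): void {
  constitutions.delete(sessionId);
}

updateConstitution("s1", ["Never push to main", "Ask before deleting files"]);
updateConstitution("s1", ["Ask before deleting files", "Prefer small diffs"]);
console.log(checkConstitution("s1")); // three rules; the duplicate is merged
resetConstitution("s1");
console.log(checkConstitution("s1")); // empty after reset
```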

History Continuity

When sessionId is supplied, prior advice is summarized and injected into subsequent vibe_check calls — maintaining context across multiple calls in a session.
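
The continuity mechanism can be sketched as a per-session log plus a summarizer. Everything below is hypothetical: the Interaction shape, the three-item window, and the summary wording are assumptions chosen for illustration, and the real summarization in src/utils/state.ts may work differently.

```typescript
// Hypothetical record of one vibe_check exchange.
interface Interaction {
  plan: string;
  advice: string;
}

// In-memory session log keyed by sessionId.
const sessionHistory = new Map<string, Interaction[]>();

function recordInteraction(sessionId: string, item: Interaction): void {
  const items = sessionHistory.get(sessionId) ?? [];
  items.push(item);
  sessionHistory.set(sessionId, items);
}

// Builds a short text block from the most recent interactions.
// Returns "" for sessions with no history, so nothing is injected.
function summarizeHistory(sessionId: string, maxItems: number = 3): string {
  const items = (sessionHistory.get(sessionId) ?? []).slice(-maxItems);
  if (items.length === 0) return "";
  const lines = items.map(
    (it, i) => `${i + 1}. Plan: ${it.plan} | Advice: ${it.advice}`,
  );
  return `Previous advice in this session:\n${lines.join("\n")}`;
}

recordInteraction("demo", { plan: "Rewrite the parser", advice: "Is a rewrite needed, or a patch?" });
recordInteraction("demo", { plan: "Patch the tokenizer", advice: "Which tests cover this path?" });
console.log(summarizeHistory("demo"));
```

Capping the window keeps the injected context small; the cost is that advice older than the window is no longer visible to the mentor.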

Prompts

Vibe Check MCP — Prompt Excerpts

Excerpt 1: Meta-Mentor System Prompt (from src/utils/llm.ts)

Technique: Role-play as "experienced feedback provider" with 4-step decision tree

const systemPrompt = `You are a meta-mentor. You're an experienced feedback provider that 
specializes in understanding intent, dysfunctional patterns in AI agents, and in responding 
in ways that further the goal. You need to carefully reason and process the information provided, 
to determine your output.

Your tone needs to always be a mix of these traits based on the context of which pushes the 
message in the most appropriate affect: Gentle & Validating, Unafraid to push many questions 
but humble enough to step back, Sharp about problems and eager to help about problem-solving 
& giving tips and/or advice, stern and straightforward when spotting patterns & the agent 
being stuck in something that could derail things.

Here's what you need to think about (Do not output the full thought process, only what is 
explicitly requested):
1. What's going on here? What's the nature of the problem is the agent tackling? What's the 
   approach, situation and goal? Is there any prior context that clarifies context further? 
2. What does the agent need to hear right now: Are there any clear patterns, loops, or unspoken 
   assumptions being missed here? Or is the agent doing fine - in which case should I interrupt 
   it or provide soft encouragement and a few questions?
3. In case the issue is technical - I need to provide guidance and help. In case I spot something 
   that's clearly not accounted for/ assumed/ looping/ or otherwise could be out of alignment with 
   the user or agent stated goals - I need to point out what I see gently and ask questions on if 
   the agent agrees. If I don't see/ can't interpret an explicit issue - what intervention would 
   provide valuable feedback here?
4. In case the plan looks to be accurate - based on the context, can I remind the agent of how 
   to continue, what not to forget, or should I soften and step back for the agent to continue?`

Analysis: The system prompt defines a 4-step decision tree for the meta-mentor, with explicit tone guidance that ranges from "Gentle & Validating" to "stern and straightforward" when the agent appears stuck in a pattern. The meta-mentor is instructed to reason through all 4 steps privately and output only what is explicitly requested, which separates internal reasoning from external communication.

Excerpt 2: Agent Prompting Integration (from README)

Technique: Concrete instruction template for system prompt injection

Example snippet:

As an autonomous agent you will:
1. Call vibe_check after planning and before major actions.
2. Provide the full user request and your current plan.
3. Optionally, record resolved issues with vibe_learn.

Analysis: Three numbered rules for system prompt integration. Rule 2 ("full user request") is load-bearing — without the original user intent, the meta-mentor cannot detect alignment drift. The word "full" is essential; truncated user requests prevent accurate vibe-checking.

Excerpt 3: Fallback Questions (from src/tools/vibeCheck.ts)

Technique: Graceful degradation with non-LLM fallback

The code includes a generateFallbackQuestions() function that produces basic reflective questions without any API dependency. This ensures the tool never silently fails — if the external LLM call errors, the agent still receives useful reflection prompts.

Analysis: Falling back on error keeps an API failure from breaking the agent loop. The fallback is minimal (no external LLM) but still provides value, which suggests that metacognitive reflection can be partially served by deterministic question templates.
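
The degradation path might look like the sketch below. It is not the code from src/tools/vibeCheck.ts: the question templates, the MentorCall type, and the vibeCheckWithFallback wrapper are all assumptions, and the real generateFallbackQuestions() may produce different text.

```typescript
// Hypothetical deterministic questions; no network or API key involved.
function fallbackQuestions(goal: string, plan: string): string {
  return [
    `Does this plan directly serve the goal "${goal}"?`,
    "Is there a simpler approach that would meet the same goal?",
    "Which assumption in the plan would be most costly if it were wrong?",
    `What in "${plan}" is irreversible, and has the user approved it?`,
  ].join("\n");
}

// Stand-in for the call to the second LLM.
type MentorCall = (goal: string, plan: string) => Promise<string>;

// Try the external mentor first; on any error, return the deterministic
// questions instead of propagating the failure to the agent.
async function vibeCheckWithFallback(
  callMentor: MentorCall,
  goal: string,
  plan: string,
): Promise<{ questions: string }> {
  try {
    return { questions: await callMentor(goal, plan) };
  } catch (err) {
    return { questions: fallbackQuestions(goal, plan) };
  }
}

// Simulated provider outage.
const failing: MentorCall = async () => {
  throw new Error("provider unavailable");
};

vibeCheckWithFallback(failing, "Ship the migration", "Drop the old table")
  .then((result) => console.log(result.questions));
```

Either path returns the same { questions } shape, so the calling agent does not need a separate error branch.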

Uniqueness

Vibe Check MCP — Uniqueness & Positioning

Differs From Seeds

No direct equivalent in the 11 seeds. The closest analogy is the adversarial-subagent self_review_pattern in superpowers, but vibe-check-mcp is fundamentally different: it operates as an MCP tool injected into any agent's system prompt, not as a skill file. It uses a second external LLM as the meta-mentor (not the same model reviewing itself), and it focuses on metacognitive alignment (is the agent on the right path?) rather than code correctness or spec compliance.

Also distinct from heavy3-code-audit (which reviews code via external LLMs) — vibe-check asks "is the agent thinking clearly about this problem?" while heavy3 asks "is this code correct?"

Observable Failure Modes

Maintenance status: Final release v2.8.0. Project unmaintained as of this analysis; community forks only.
External LLM dependency: Requires API key for Gemini/OpenAI/Anthropic/OpenRouter — not free, not local.
Non-blocking: The tool provides questions; agents may ignore them. No enforcement mechanism.
Prompt injection risk: userPrompt parameter is explicitly noted in roadmap as needing sanitization.
Hardcoded prompts: Roadmap notes "Prompt externalization" as a planned improvement — current prompts are in source code, not config files.

Distinctive Opinion

LLMs benefit from having a metacognitive "rubber duck": a second LLM that asks reflective questions rather than generating more code. The CPI (Chain-Pattern Interrupt) theory holds that agents with external reflection checkpoints are measurably more reliable than those without, and that the mechanism is the interruption of "pattern inertia."

Research Backing

ResearchGate publication on CPI: +27% success rate, -41% harmful actions
PulseMCP "Most Popular (This Week)" Oct 2025
Listed in Anthropic's official MCP servers list
5k+ monthly calls on Smithery.ai at peak

Workflow

Vibe Check MCP — Workflow

Integration Pattern

The tool is used reactively — the primary agent calls vibe_check at key moments during its execution. The recommended integration from docs/agent-prompting.md:

As an autonomous agent you will:
1. Call vibe_check after planning and before major actions.
2. Provide the full user request and your current plan.
3. Optionally, record resolved issues with vibe_learn.

Per-Call Flow

Step	Action	Artifact
1. Agent calls `vibe_check`	Sends goal, plan, userPrompt, progress, uncertainties	MCP tool call
2. Server checks history	Retrieves sessionId history summary (if session exists)	History context
3. LLM dispatch	Routes to configured provider (Gemini/OpenAI/Anthropic/OpenRouter)	API call to second LLM
4. Meta-mentor response	Second LLM generates metacognitive questions/feedback	`{ questions: string }`
5. History update	Adds call + response to session history	History file
6. Return to agent	Agent receives questions; decides whether to adjust plan	Agent continues
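
The ordering of the steps in the table can be sketched as a single handler with its dependencies injected. The Deps and Call shapes, the prompt layout, and the handleVibeCheck name are assumptions made for illustration; only the step order comes from the table.

```typescript
// Injected collaborators, so the sketch runs without a real LLM or storage.
interface Deps {
  loadSummary: (sessionId: string) => string;
  callMentor: (prompt: string) => Promise<string>;
  saveTurn: (sessionId: string, plan: string, questions: string) => void;
}

// Step 1: the subset of the tool input that this sketch uses.
interface Call {
  goal: string;
  plan: string;
  userPrompt?: string;
  sessionId?: string;
}

async function handleVibeCheck(call: Call, deps: Deps): Promise<{ questions: string }> {
  // Step 2: look up prior advice only when a session is named.
  const summary = call.sessionId ? deps.loadSummary(call.sessionId) : "";
  // Step 3: assemble the context and dispatch to the second LLM.
  const prompt = [
    summary,
    `Goal: ${call.goal}`,
    `Plan: ${call.plan}`,
    `User request: ${call.userPrompt ?? "(not provided)"}`,
  ].filter((part) => part !== "").join("\n");
  // Step 4: the mentor's reply becomes the questions field.
  const questions = await deps.callMentor(prompt);
  // Step 5: persist the turn so the next call can see it.
  if (call.sessionId) deps.saveTurn(call.sessionId, call.plan, questions);
  // Step 6: hand the questions back; acting on them is up to the agent.
  return { questions };
}

// Stubbed run: the "mentor" just echoes how many context lines it received.
const turnLog: string[] = [];
handleVibeCheck(
  { goal: "Cut build time", plan: "Cache node_modules in CI", sessionId: "ci-1" },
  {
    loadSummary: () => "",
    callMentor: async (prompt) => `Received ${prompt.split("\n").length} context lines.`,
    saveTurn: (id, plan, q) => { turnLog.push(`${id}: ${plan} -> ${q}`); },
  },
).then((result) => console.log(result.questions, turnLog));
```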

Meta-Mentor Decision Tree (Second LLM System Prompt)

What's going on? What's the nature of the problem, approach, and goal?
What does the agent need to hear? Patterns, loops, unspoken assumptions missed? Or doing fine → soft encouragement?
Technical issue? → provide guidance. Alignment issue? → point out and ask questions. No issue? → reminder of best practices.
If plan accurate → remind agent how to continue, what not to forget, or step back.

Approval Gates

None — the tool provides questions/feedback; agent decides how to act. No blocking mechanism.

Fallback

If the external LLM call fails, a fallback generator produces basic reflective questions without API dependency.

Memory Context

Vibe Check MCP — Memory & Context

State Storage

Session-scoped in-memory + file-based. History is tracked per sessionId.

Session History

When a sessionId is supplied to vibe_check, the tool:

Retrieves history summary for that session
Injects it into the second LLM call as context
Adds the new call + response to the session history

This enables context continuity across multiple vibe_check calls within a session.

vibe_learn Storage

vibe_learn logs mistakes, preferences, and successes to a persistent store (via src/utils/storage.ts). These are available for future vibe_check calls as learning context.

Constitution Storage

Per-session rules stored by sessionId. Managed via update_constitution/reset_constitution/check_constitution tools.

Cross-Session Handoff

Partial — sessionId enables continuity within a session. No automatic cross-session persistence beyond vibe_learn logs.

Memory Type

File-based (vibe_learn storage) + in-memory (session history during server lifetime).

Orchestration

Vibe Check MCP — Orchestration

Multi-Agent Pattern

Pattern: none (from the primary agent's perspective). The MCP server invokes a second LLM while handling the tool call, but the primary agent is unaware of the secondary model; it only receives the metacognitive questions.

From the system level: the vibe_check call creates a synchronous LLM-to-LLM consultation:

Primary Agent (Claude/etc.)
  → calls vibe_check(goal, plan, progress, ...)
  → MCP server dispatches to second LLM (Gemini/OpenAI/Anthropic/OpenRouter)
  → second LLM generates metacognitive questions
  → primary agent receives questions, decides next action

Multi-Model

Yes — the secondary LLM is configurable per call via modelOverride:

Default: Gemini
Supported: OpenAI, Anthropic, OpenRouter

Execution Mode

Event-driven — vibe_check is called reactively by the primary agent at key decision points, not on a schedule.

Isolation Mechanism

Process-level only: the MCP server runs as a separate process, and the agent interacts with it over the MCP protocol.

Transport

STDIO (standard MCP)
Streamable HTTP (port 2091 default)

Crash Recovery

If the LLM call fails, fallback questions are returned. The server runs only as long as its Node process is alive.

UI / CLI Surface

Vibe Check MCP — UI / CLI Surface

CLI Binary

Yes — npx -y @pv-bhat/vibe-check-mcp start [--stdio|--http]

Not a thin wrapper: own Express + MCP server runtime
Subcommands: start, install, doctor
Transport flags: --stdio, --http, --port

UI / Dashboard

None. All interaction is via MCP tool calls from the connected client.

IDE Integration

Any MCP-compatible client:

Claude Desktop
Cursor
Windsurf
Custom agent setups

Observability

Docker setup available (scripts/, Dockerfile)
CHANGELOG.md tracks version history
CI workflow: npm audit + source security scan on every PR
No dashboard, no metrics endpoint

Health Check

HTTP mode: curl http://127.0.0.1:2091/health confirms server is live.

Registration Platforms

Listed on multiple MCP directories:

Anthropic official MCP repo
MCP Registry
PulseMCP (Most Popular Oct 2025)
Smithery.ai (5k+ monthly calls)
MSEEP.ai, MCPHub.tools, MCP.so, Glama.json

Related frameworks

same archetype · same primary tool · same memory type

Taskmaster AI ★ 27k

A3 MCP-anchored

Converts a PRD into a dependency-ordered JSON task graph that AI coding agents execute one task at a time, eliminating context…

ccmemory ★ 1

A3 MCP-anchored

Accumulates decisions, corrections, and failed approaches from Claude Code sessions into a queryable Neo4j graph so each new…

Pimzino spec-workflow-mcp ★ 4.2k

A3 MCP-anchored

MCP server providing spec-driven development workflow with dashboard-backed approval gates, implementation logging, and VSCode…

MCP Shrimp Task Manager ★ 2.1k

A3 MCP-anchored

Convert natural language requests into structured AI development tasks with chain-of-thought enforcement, reflection gates, and…

Bernstein ★ 460

A3 MCP-anchored

Govern parallel CLI coding agents with a deterministic Python scheduler, HMAC-chained audit trail, and compliance-ready signed…

LeanSpec ★ 252

A3 MCP-anchored

Provides a unified spec CLI and MCP server over any existing spec backend (markdown, GitHub Issues, ADO), making spec-driven…

Distribution

Type: mcp-server
License: MIT
Install: one-liner
Version: 2.8.0

Surfaces

CLI binary: @pv-bhat/vibe-check-mcp
CLI subcmds: 3
Local UI: No
UI port: 2091
Tech stack: Express.js

Components

Commands: 0
Skills: 0
Subagents: 0
Hooks: 0
MCP servers: 1
MCP tools: 5
Scripts: 2
Templates: 0

Workflow

Phases: 4
Approval gates: 0
Spec format: none
Spec storage: none
Delta or full: none

Orchestration

Multi-agent: No
Pattern: none
Max concurrent: 1
Isolation: process
Consensus: none
Prompt chaining: No

Multi-model

Multi-model: Yes
BYOK: Yes
Modal: text

Execution

Mode: event-driven
Crash recovery: Yes
Compaction: No
Session handoff: Yes
Streaming: No

Memory

Type: file-based
Persistence: session
Search: none
State files: 1 file

Quality

TDD: No
TDD mechanism: none
Validators: 1
Self-review: adversarial-subagent

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: No
Audit format: none
Replay: No

Tools

Primary: claude-desktop
Targets: 4
Portability: high

Signals

Stars: 486
Last commit: 2026-05-24
Maintainer: archived
Quality score: 3.7/10