
openspec-playwright

openspec-playwright · wxhou/openspec-playwright · ★ 4 · last commit 2026-05-26

Extends OpenSpec's spec lifecycle to automated E2E verification by deriving Playwright tests directly from spec files via DOM exploration.

Best when: E2E tests should be derived from the same spec artifacts that drove implementation, not written from scratch after the fact.
Skip if: Running npx playwright test without a file argument (executes all changes), Writing tests before exploring the real DOM for selector verification
vs seeds: spec-kit's hooks run existing tests; this generates the tests from specs.
Primitive shape 14 total
Commands 1 Subagents 3 MCP tools 10
00

Summary

OpenSpec Playwright — Summary

OpenSpec Playwright (openspec-pw) is an npm CLI tool that integrates OpenSpec's spec-driven development lifecycle with Playwright E2E test automation for Claude Code projects. It installs a single /opsx:e2e command that drives a three-agent pipeline: Explorer (reads OpenSpec specs, navigates real DOM, extracts selectors), Planner (generates test-plan.md), and Generator (writes verified Playwright .spec.ts files). A Healer agent using Playwright MCP auto-repairs test failures. The tool ships a vision-check subcommand that uses a local Ollama VLM for screenshot-based layout anomaly detection. With 4 stars and active development (last commit May 2026, v0.3.26), it is the most operationally complete OpenSpec extension in the corpus — it adds actual test execution infrastructure rather than just planning artifacts.

Compared to seeds: closest to openspec (same change lifecycle, same /opsx command namespace), but it adds a full E2E test generation and execution layer that no seed implements. In validation discipline, the openspec-pw doctor and openspec-pw audit commands bring it closer to spec-kit (which has validation hooks) than to any other seed.

01

Overview

OpenSpec Playwright — Overview

Origin

Created by GitHub user wxhou and published to npm as openspec-playwright. Version 0.3.26 as of May 2026; under active development.

Philosophy

From README:

"A setup tool that integrates OpenSpec's spec-driven development with Playwright's three-agent test pipeline for automated E2E verification."

The core insight: OpenSpec generates planning artifacts (specs, tasks) but stops at implementation. OpenSpec Playwright extends the lifecycle into automated E2E verification — converting OpenSpec spec files directly into Playwright test cases, with the DOM as the ground truth for selectors.

Pipeline Architecture

/opsx:e2e <change-name>
│
├── 1. Select change → read openspec/changes/<name>/specs/
├── 2. Detect auth → check specs for login/auth markers
├── 3. Validate env → run seed.spec.ts
├── 4. Explore app → /browse explores real DOM
│   ├── Read app-knowledge.md
│   ├── Extract routes from specs
│   ├── Navigate + snapshot + screenshot
│   ├── Write app-exploration.md
│   └── Extract patterns → update app-knowledge.md
├── 5. Planner → generates test-plan.md
├── 6. Generator → creates <name>.spec.ts
│   └── Verifies selectors in real browser before writing
├── 7. Configure auth → auth.setup.ts
├── 8. Configure playwright.config.ts
├── 9. Execute tests
├── 10. Healer → auto-heals failures via MCP
└── 11. Report → openspec/reports/playwright-e2e-<name>-<timestamp>.md

Two Modes

  • Change mode (/opsx:e2e <name>): reads OpenSpec specs to derive routes and test cases
  • All mode (/opsx:e2e all): full app exploration, generates Page Objects, no OpenSpec change required
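
The two modes can be told apart from the command argument alone. A minimal sketch, assuming a hypothetical `parseE2eTarget` helper (not part of openspec-pw) and that the literal `all` is reserved for all mode:

```typescript
// Hypothetical helper for illustration only; not shipped with openspec-pw.
type E2eTarget =
  | { mode: "all" }                    // full app exploration, no OpenSpec change required
  | { mode: "change"; name: string };  // derive routes and tests from one change's specs

function parseE2eTarget(arg: string): E2eTarget {
  const name = arg.trim();
  if (name === "") {
    throw new Error("expected a change name or 'all'");
  }
  return name === "all" ? { mode: "all" } : { mode: "change", name };
}
```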

Vision Check

Optional VLM-powered layout validation using Ollama local model:

  • Multi-viewport screenshot analysis
  • Baseline diff detection
  • HTML report generation
02

Architecture

OpenSpec Playwright — Architecture

Distribution

  • Type: npm-package (openspec-playwright)
  • Binary: openspec-pw
  • Install: npm install -g openspec-playwright

Runtime Requirements

  • Node.js >= 20
  • Claude Code with .claude/ directory
  • gstack (for exploration + browser QA): git clone https://github.com/garrytan/gstack.git ~/.claude/skills/gstack
  • OpenSpec initialized: openspec init
  • Playwright MCP: claude mcp add playwright npx @playwright/mcp@latest
  • Playwright browsers: npx playwright install --with-deps
  • Ollama (optional, for vision-check)

Directory Structure

openspec-playwright (npm package)
├── bin/
│   └── openspec-pw.js      # CLI entry point
├── src/
│   ├── cli.ts
│   ├── commands/           # CLI command implementations
│   └── ...
└── templates/              # Installed to tests/playwright/
    ├── seed.spec.ts
    ├── auth.setup.ts
    ├── credentials.yaml
    ├── app-knowledge.md
    ├── pages/BasePage.ts
    ├── e2e-command.md      # /opsx:e2e command file
    ├── test-plan.md
    ├── report.md
    └── ...

Project layout after openspec-pw init:
tests/playwright/
├── seed.spec.ts             # Env validation
├── auth.setup.ts            # Session recording
├── global.teardown.ts       # Optional post-test cleanup
├── credentials.yaml         # Test users
├── app-knowledge.md         # Cross-change selector patterns
└── changes/<name>/
    └── <name>.spec.ts

openspec/changes/<name>/specs/playwright/
├── app-exploration.md       # Routes + verified selectors
└── test-plan.md

openspec/reports/
└── playwright-e2e-<name>-<timestamp>.md

Key Dependency: gstack

gstack provides the /browse skill used by the Explorer agent to navigate the real DOM and extract selectors. Without gstack, the exploration phase cannot run.

Authentication

Supports API login (preferred) and UI login (fallback). Multi-user support for role-based tests via credentials.yaml.
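
The "API first, UI as fallback" order can be sketched as below. This is an illustration under assumptions, not the contents of the auth.setup.ts template: `LoginFn` and `loginWithFallback` are hypothetical names, and each login function is assumed to resolve to `true` once a session exists.

```typescript
// Hypothetical types and names; the real auth.setup.ts template may differ.
type LoginFn = () => Promise<boolean>; // resolves true when a session was established

async function loginWithFallback(
  apiLogin: LoginFn,
  uiLogin: LoginFn,
): Promise<"api" | "ui"> {
  try {
    if (await apiLogin()) return "api"; // preferred path: no browser interaction
  } catch {
    // API login threw; fall through to the UI path
  }
  if (await uiLogin()) return "ui";
  throw new Error("both API and UI login failed");
}
```

For role-based tests, a caller would presumably run this once per user listed in credentials.yaml and store one session per role.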

03

Components

OpenSpec Playwright — Components

CLI Binary: openspec-pw

| Subcommand | Purpose |
|------------|---------|
| openspec-pw init | Install E2E command + templates (one-time setup) |
| openspec-pw update | Sync commands + templates from npm |
| openspec-pw run <name> | Execute E2E tests for a change (server lifecycle) |
| openspec-pw migrate | Migrate old test files to new structure |
| openspec-pw audit | Audit tests for orphaned specs and issues |
| openspec-pw doctor | Check all prerequisites |
| openspec-pw vision-check | Analyze screenshots with Ollama VLM |
| openspec-pw explore | Parallel Playwright route exploration |
| openspec-pw uninstall | Remove integration from project |

Claude Code Command (1)

| Command | Purpose |
|---------|---------|
| /opsx:e2e <change-name> | Run full E2E pipeline for a change |

Installed by openspec-pw init to .claude/commands/opsx/CLAUDE.md.

Agents (3 — conceptual roles within /opsx:e2e pipeline)

| Agent role | Tools used | Output |
|------------|------------|--------|
| Explorer | /browse (gstack), Playwright MCP | app-exploration.md, updates app-knowledge.md |
| Planner | | test-plan.md |
| Generator | Playwright MCP (selector verification) | <name>.spec.ts, optional Page Objects |
| Healer | Playwright MCP (browser_snapshot, browser_navigate, browser_run_code) | Repaired .spec.ts |

Templates (13)

| Template | Purpose |
|----------|---------|
| seed.spec.ts | Environment validation test |
| auth.setup.ts | Session recording (login once) |
| credentials.yaml | Test user credentials |
| app-knowledge.md | Cross-change selector patterns |
| pages/BasePage.ts | Page Object base class |
| e2e-command.md | /opsx:e2e command definition |
| test-plan.md | Test case planning template |
| report.md | E2E report template |
| global.teardown.ts | Optional post-test cleanup |
| playwright.config.ts | Playwright configuration |
| app-exploration.md | Route exploration findings |
| e2e-test.ts | Test file template |
| CLAUDE.md | Agent instructions for test generation |

MCP Integration

  • @playwright/mcp: browser automation for the Healer agent; the Explorer and Generator also use it for exploration and selector verification
  • Must be installed separately: claude mcp add playwright npx @playwright/mcp@latest
05

Prompts

OpenSpec Playwright — Prompts

Excerpt 1: /opsx:e2e command input/output specification

From templates/e2e-command.md:

## Input

- **Change name**: `/opsx:e2e <name>` or `/opsx:e2e all`
- **Specs**: `openspec/changes/<name>/specs/*.md` (if change mode)
- **Credentials**: `E2E_USERNAME` + `E2E_PASSWORD` env vars

## Output

- **Test file**: `tests/playwright/changes/<name>/<name>.spec.ts`
- **Page Objects** (all mode): `tests/playwright/pages/<Route>Page.ts`
- **Auth setup**: `tests/playwright/auth.setup.ts` (if auth required)
- **Report**: `openspec/reports/playwright-e2e-<name>-<timestamp>.md`
- **App Bug Registry**: `openspec/reports/app-bug-registry.md`
- **Test plan**: `openspec/changes/<name>/specs/playwright/test-plan.md`

Technique: Explicit I/O contract in the command file — the agent knows exactly which files to read and produce before starting.
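
The output half of that contract can be written down as a pure path-mapping function. The paths are copied from the Output list above; the `outputsFor` helper and the timestamp format are assumptions, since the source does not specify how `<timestamp>` is rendered.

```typescript
// Paths mirror the Output list in e2e-command.md; the helper itself is hypothetical.
interface E2eOutputs {
  testFile: string;
  testPlan: string;
  report: string;
  bugRegistry: string;
}

function outputsFor(change: string, timestamp: string): E2eOutputs {
  return {
    testFile: `tests/playwright/changes/${change}/${change}.spec.ts`,
    testPlan: `openspec/changes/${change}/specs/playwright/test-plan.md`,
    report: `openspec/reports/playwright-e2e-${change}-${timestamp}.md`,
    bugRegistry: "openspec/reports/app-bug-registry.md", // shared across all changes
  };
}
```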


Excerpt 2: Regression isolation rule

From templates/e2e-command.md:

> **⚠️ Full regression is opt-in only.** Default: `openspec-pw run <name>` → one spec file. 
> Do NOT run `npx playwright test` (no file), `--only-changed`, or any command that 
> executes multiple `.spec.ts` files unless the user explicitly requests it. This 
> includes running the same command twice across different changes to simulate regression.

> **Role mapping**: Planner (Step 4–5) → test-plan.md; Generator (Step 6) → `.spec.ts` + 
> Page Objects; Healer (Step 9) → repairs failures via MCP.

Technique: Explicit prohibition with rationale. The "do NOT run" phrasing is an Iron Law-style negative constraint preventing accidental full-suite execution.
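
A wrapper script could enforce the same rule mechanically. The guard below is a hypothetical sketch, not something openspec-pw ships: it inspects only the arguments that would follow `npx playwright test`, and it assumes option values are written as `--flag=value` so that every bare argument is a file.

```typescript
// Hypothetical guard; simplification: treats every argument that does not
// start with "-" as a test file, so "--project chromium" style values are
// not handled.
function runsSingleSpec(args: string[]): boolean {
  const onlyChanged = args.some(
    (a) => a === "--only-changed" || a.startsWith("--only-changed="),
  );
  if (onlyChanged) return false; // explicitly prohibited by the rule above
  const files = args.filter((a) => !a.startsWith("-"));
  return files.length === 1 && files[0].endsWith(".spec.ts");
}
```

With no file argument the guard returns `false`, which corresponds to the bare `npx playwright test` case the rule forbids.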


Excerpt 3: Project dependencies architecture rationale

From templates/e2e-command.md:

| Feature | Project Dependencies | globalSetup/globalTeardown |
|---------|---------------------|---------------------------|
| HTML report visibility | ✅ Shown as project | ❌ Not shown |
| Trace recording | ✅ Full support | ❌ Not supported |
| Playwright fixtures | ✅ Fully supported | ❌ Not supported |
| Browser via fixture | ✅ Automatic | ❌ Manual launch |

Technique: Decision table embedded in the command prompt. The agent can reference this to explain architectural choices to users who ask why auth setup is implemented as a project dependency rather than globalSetup.
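
For readers unfamiliar with the left-hand column, a Playwright configuration using project dependencies might look roughly like the following. This is a sketch under assumptions, not the shipped playwright.config.ts template: the project names and the storageState path are invented for illustration, while `testMatch`, `dependencies`, `teardown`, and `use.storageState` are standard Playwright project options.

```typescript
// Sketch only: project names and the storageState path are assumptions.
const config = {
  testDir: "tests/playwright",
  projects: [
    // Auth runs as an ordinary project, so it shows up in the HTML report
    // and can use fixtures and traces like any other test.
    { name: "setup", testMatch: /auth\.setup\.ts/, teardown: "teardown" },
    { name: "teardown", testMatch: /global\.teardown\.ts/ },
    {
      name: "e2e",
      testMatch: /changes\/.*\.spec\.ts/,
      use: { storageState: "tests/playwright/.auth/user.json" }, // hypothetical path
      dependencies: ["setup"], // auth setup runs before any change spec
    },
  ],
};
// In playwright.config.ts this object would be the default export.
```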

09

Uniqueness

OpenSpec Playwright — Uniqueness

differs_from_seeds

Closest seed: openspec (same change lifecycle, /opsx command namespace). The architectural delta is a complete E2E test generation and execution layer that no seed implements: OpenSpec covers proposal → specs → tasks → apply; this extends the lifecycle to include explore → plan tests → generate tests → execute → heal → report. The Healer agent (auto-repairing failing tests via Playwright MCP browser control) is unique in the entire corpus. The vision-check subcommand, which uses a local Ollama VLM for screenshot-based layout validation, has no parallel anywhere in the seed set. The closest seed for validation discipline is spec-kit, which has PostToolUse hooks that run tests after edits. spec-kit does not generate those tests from spec artifacts, however; generating tests from spec files is the key distinction here.

Positioning

"OpenSpec's missing last mile." OpenSpec produces planning artifacts and stops at implementation. OpenSpec Playwright completes the verification loop by deriving Playwright tests directly from the same spec files that drove the implementation, ensuring the E2E test suite reflects the current feature spec.

Observable Failure Modes

  1. gstack dependency: the Explorer phase requires gstack, which is not an npm package but a git clone with a custom setup script. This external dependency is fragile and may become stale.
  2. Ollama for vision-check: requires a local Ollama installation with a VLM model downloaded. Not suitable for CI without containerized Ollama.
  3. Selector drift: app-knowledge.md accumulates selectors that may become stale as the app evolves. No automatic staleness detection.
  4. Regression scope confusion: the command warns heavily against running the full Playwright test suite, but team members unfamiliar with the tool may run npx playwright test directly, executing every change's tests at once.
  5. Healer loops: if the Healer cannot repair a failure after several MCP calls, there is no documented bail-out condition.
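
For failure mode 5, one possible bail-out is a bounded heal-and-retry loop. The sketch below is an assumption about how such a limit could look, not documented openspec-pw behavior: `runTests` and `heal` are hypothetical stand-ins for the Execute and Healer phases, and the default of three attempts is arbitrary.

```typescript
// Hypothetical sketch: neither the function names nor the attempt limit
// come from openspec-pw.
interface TestResult {
  passed: boolean;
  failures: string[];
}

async function healWithLimit(
  runTests: () => Promise<TestResult>,
  heal: (failures: string[]) => Promise<void>,
  maxAttempts = 3,
): Promise<{ passed: boolean; attempts: number }> {
  let result = await runTests();
  let attempts = 0;
  while (!result.passed && attempts < maxAttempts) {
    await heal(result.failures); // one repair pass over the current failures
    attempts += 1;
    result = await runTests(); // re-run to see whether the repair held
  }
  // If still failing here, the caller would stop and hand off to a human.
  return { passed: result.passed, attempts };
}
```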
04

Workflow

OpenSpec Playwright — Workflow

Prerequisites Phase

| Step | Command | Gate |
|------|---------|------|
| Install CLI | npm install -g openspec-playwright | |
| Install gstack | git clone gstack ~/.claude/skills/gstack && ./setup | |
| Install OpenSpec | openspec init | |
| Initialize E2E | openspec-pw init | |
| Install Playwright MCP | claude mcp add playwright ... | |
| Install browsers | npx playwright install --with-deps | |
| Validate env | npx playwright test tests/playwright/seed.spec.ts | Must pass |
| Configure auth (if needed) | Record login session | |

E2E Pipeline (triggered by /opsx:e2e <change-name>)

| Step | Phase | Artifact |
|------|-------|----------|
| 1 | Select change | Read openspec/changes/<name>/specs/ |
| 2 | Detect auth | Check specs for auth markers |
| 3 | Validate env | Run seed.spec.ts |
| 4 | Explore | app-exploration.md, app-knowledge.md |
| 5 | Plan | test-plan.md |
| 6 | Generate | <name>.spec.ts, optional Page Objects |
| 7 | Configure auth | auth.setup.ts |
| 8 | Configure Playwright | playwright.config.ts |
| 9 | Execute | Test run |
| 10 | Heal (if failures) | Repaired test files via MCP |
| 11 | Report | openspec/reports/playwright-e2e-<name>-<timestamp>.md |

Approval Gates

| Gate | Type |
|------|------|
| seed.spec.ts must pass before /opsx:e2e | auto-validator |
| Selector verification in real browser before writing test | auto-validator |
| Manual verification for role-based auth setup | human-required |

Isolation Rules

From README:

"Default: openspec-pw run <name> → one spec file. Do NOT run npx playwright test (no file) — this executes ALL spec files across ALL changes. Full regression is opt-in only."

App Bug Registry

openspec/reports/app-bug-registry.md — cumulative per-project bug tracking file that persists across changes.

06

Memory Context

OpenSpec Playwright — Memory & Context

State Storage

File-based across two directory trees:

# OpenSpec artifacts (planning context)
openspec/changes/<name>/specs/
openspec/changes/<name>/specs/playwright/
├── app-exploration.md      # This change's routes + selectors
└── test-plan.md

# Test assets (execution context)
tests/playwright/
├── app-knowledge.md         # Cross-change selector patterns (accumulates)
└── changes/<name>/
    └── <name>.spec.ts

# Reports (audit trail)
openspec/reports/
├── playwright-e2e-<name>-<timestamp>.md
└── app-bug-registry.md      # Cumulative across all changes

Cross-Session Knowledge

app-knowledge.md is the only test asset that persists and accumulates across changes (the app-bug-registry.md report is likewise cumulative). It stores:

  • Known selectors per route
  • Authentication patterns
  • Known issues per page

This makes subsequent /opsx:e2e runs faster — the Explorer can reuse known selectors instead of re-navigating.

Change-Level Isolation

Each change's exploration and tests are isolated:

  • app-exploration.md is per-change
  • <name>.spec.ts is per-change
  • Running tests for change A does not execute change B's tests

Persistence

  • Scope: project (per-repository)
  • Cross-session: Yes — all test files and exploration data persist
  • Context compaction: app-knowledge.md grows unbounded; no compaction mechanism
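
Since no compaction or staleness mechanism is described, the following is only a sketch of what one could look like. It assumes a hypothetical in-memory model of app-knowledge.md entries with a `lastVerified` date; the source does not say the file records any such field.

```typescript
// Hypothetical model: app-knowledge.md is Markdown, and the source does not
// describe a schema with verification dates.
interface KnownSelector {
  route: string;
  selector: string;
  lastVerified: Date;
}

function pruneStale(
  entries: KnownSelector[],
  now: Date,
  maxAgeDays: number,
): KnownSelector[] {
  const cutoff = now.getTime() - maxAgeDays * 24 * 60 * 60 * 1000;
  // Keep only selectors that were re-verified within the allowed window.
  return entries.filter((e) => e.lastVerified.getTime() >= cutoff);
}
```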

Session Handoff

If /opsx:e2e is interrupted mid-pipeline, resuming requires re-running the command. No checkpoint resume mechanism within the pipeline.

07

Orchestration

OpenSpec Playwright — Orchestration

Multi-Agent

Yes. Three conceptual agent roles operate within the /opsx:e2e pipeline: Explorer, Planner/Generator (counted here as one role, though listed separately in the table below), and Healer.

Orchestration Pattern

Sequential. Explorer → Planner → Generator → (Healer if failures) → Reporter. No parallel execution within a single change.

Agent Definitions

Agents are not separately spawned processes — they are sequential phases within a single Claude Code session, distinguished by their tool access and task focus.

| Role | Tools | Mode |
|------|-------|------|
| Explorer | /browse (gstack), DOM navigation | Read-only exploration |
| Planner | | Document generation |
| Generator | Playwright MCP (selector verification) | Write .spec.ts |
| Healer | Playwright MCP (full browser control) | Repair failed tests |

Isolation Mechanism

Per-change test isolation (one .spec.ts per change, run via openspec-pw run <name>).

MCP Integration

Playwright MCP (@playwright/mcp) provides:

  • browser_snapshot — accessibility tree snapshot
  • browser_navigate — page navigation
  • browser_run_code — JavaScript execution in browser
  • Additional browser tools for the Healer

Multi-Model

No. Single model (Claude Code).

Execution Mode

One-shot per change invocation. Re-running /opsx:e2e <name> re-runs the full pipeline.

Auto-Validation

The seed test (seed.spec.ts) runs before the pipeline. Selector verification runs in a live browser before writing the test file. These are the two auto-validators.

Crash Recovery

None within pipeline. Re-run /opsx:e2e <name> from the start.

08

UI CLI Surface

OpenSpec Playwright — UI & CLI Surface

Dedicated CLI Binary

Yes.

  • Binary: openspec-pw
  • Package: openspec-playwright on npm
  • Version analyzed: 0.3.26

Subcommands

| Subcommand | Description |
|------------|-------------|
| init | One-time setup: installs command + templates |
| update | Sync CLI + commands + templates from npm |
| run <name> | Execute E2E tests with server lifecycle |
| migrate | Migrate old test files to new structure |
| audit | Audit tests for orphaned specs |
| doctor | Check all prerequisites |
| vision-check | Screenshot layout analysis via Ollama VLM |
| explore | Parallel Playwright route exploration |
| uninstall | Remove integration from project |

Vision Check Feature

Unique capability — local VLM screenshot analysis:

openspec-pw vision-check

Features:

  • Multi-viewport capture (desktop, tablet, mobile)
  • Baseline diff against previous captures
  • HTML report with anomalies highlighted
  • Powered by Ollama (requires local install)

Claude Code Integration

Installs /opsx:e2e command to .claude/commands/opsx/ via openspec-pw init.

Review UI

No web dashboard. Reports are Markdown files at openspec/reports/playwright-e2e-*.md.

Doctor Output

openspec-pw doctor
# Checks: Node.js version, gstack install, OpenSpec install, Playwright MCP, browser install
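
The first of those checks is simple enough to sketch. This is a hypothetical illustration of a "Node.js >= 20" test, not the actual doctor implementation; the function name is invented.

```typescript
// Hypothetical check; not taken from the openspec-pw source.
function meetsNodeRequirement(version: string, minMajor = 20): boolean {
  // Accepts both "v20.11.0" (process.version style) and "20.11.0".
  const major = Number.parseInt(version.replace(/^v/, "").split(".")[0], 10);
  return Number.isInteger(major) && major >= minMajor;
}
```

In a Node.js script this would typically be called as `meetsNodeRequirement(process.version)`.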

Audit Output

openspec-pw audit
# Reports: orphaned tests (specs deleted but tests exist), missing tests, selector issues
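
The orphaned/missing distinction amounts to a two-way set difference over change names. A minimal sketch, assuming a hypothetical `auditChanges` function whose inputs a caller would collect from openspec/changes/ and tests/playwright/changes/; it does not cover the selector checks, and it is not the tool's own implementation.

```typescript
// Hypothetical sketch of the orphan/missing comparison only.
function auditChanges(specChanges: string[], testChanges: string[]) {
  const specs = new Set(specChanges);
  const tests = new Set(testChanges);
  return {
    // Tests that still exist although their specs were deleted.
    orphanedTests: testChanges.filter((name) => !specs.has(name)),
    // Changes that have specs but no generated test yet.
    missingTests: specChanges.filter((name) => !tests.has(name)),
  };
}
```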
