openspec-playwright

openspec-playwright · wxhou/openspec-playwright · ★ 4 · last commit 2026-05-26

Extends OpenSpec's spec lifecycle to automated E2E verification by deriving Playwright tests directly from spec files via DOM exploration.

Best when: E2E tests should be derived from the same spec artifacts that drove implementation, not written from scratch after the fact.

Skip if: you would run npx playwright test without a file argument (this executes every change's tests), or write tests before exploring the real DOM to verify selectors.

vs seeds

spec-kit's hooks run existing tests; this generates the tests from specs.

Primitive shape: 14 total

Commands: 1 · Subagents: 3 · MCP tools: 10

Summary

OpenSpec Playwright — Summary

OpenSpec Playwright (openspec-pw) is an npm CLI tool that integrates OpenSpec's spec-driven development lifecycle with Playwright E2E test automation for Claude Code projects. It installs a single /opsx:e2e command that drives a three-agent pipeline: Explorer (reads OpenSpec specs, navigates the real DOM, extracts selectors), Planner (generates test-plan.md), and Generator (writes verified Playwright .spec.ts files). A Healer agent using Playwright MCP auto-repairs test failures. The tool also ships a vision-check subcommand that uses a local Ollama VLM for screenshot-based layout anomaly detection. It has 4 stars and is under active development (last commit May 2026, v0.3.26). It is the most operationally complete OpenSpec extension in the corpus because it adds actual test execution infrastructure rather than just planning artifacts.

Compared to seeds: closest to openspec (same change lifecycle, same /opsx command namespace), but it adds a full E2E test generation and execution layer that no seed implements. On validation discipline, the openspec-pw doctor and openspec-pw audit commands bring it closer to spec-kit (which has validation hooks) than to any other seed.

Overview

OpenSpec Playwright — Overview

Origin

Created by GitHub user wxhou and published to npm as openspec-playwright. Version analyzed: 0.3.26 (May 2026). Under active development.

Philosophy

From README:

"A setup tool that integrates OpenSpec's spec-driven development with Playwright's three-agent test pipeline for automated E2E verification."

The core insight: OpenSpec generates planning artifacts (specs, tasks) but stops at implementation. OpenSpec Playwright extends the lifecycle into automated E2E verification — converting OpenSpec spec files directly into Playwright test cases, with the DOM as the ground truth for selectors.

Pipeline Architecture

/opsx:e2e <change-name>
│
├── 1. Select change → read openspec/changes/<name>/specs/
├── 2. Detect auth → check specs for login/auth markers
├── 3. Validate env → run seed.spec.ts
├── 4. Explore app → /browse explores real DOM
│   ├── Read app-knowledge.md
│   ├── Extract routes from specs
│   ├── Navigate + snapshot + screenshot
│   ├── Write app-exploration.md
│   └── Extract patterns → update app-knowledge.md
├── 5. Planner → generates test-plan.md
├── 6. Generator → creates <name>.spec.ts
│   └── Verifies selectors in real browser before writing
├── 7. Configure auth → auth.setup.ts
├── 8. Configure playwright.config.ts
├── 9. Execute tests
├── 10. Healer → auto-heals failures via MCP
└── 11. Report → openspec/reports/playwright-e2e-<name>.md

Two Modes

Change mode (/opsx:e2e <name>): reads OpenSpec specs to derive routes and test cases
All mode (/opsx:e2e all): full app exploration, generates Page Objects, no OpenSpec change required
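
The two modes can be sketched as a small argument parser. This is an illustrative TypeScript sketch, not code from the package: the names parseE2eArg and E2eMode are hypothetical, and only the `all` keyword and the openspec/changes/<name>/specs/ path come from the description above.

```typescript
// Hypothetical model of the two /opsx:e2e modes.
type E2eMode =
  | { mode: "all" }
  | { mode: "change"; name: string; specsDir: string };

// Maps the command argument to a mode. "all" selects full-app exploration;
// anything else is treated as an OpenSpec change name.
function parseE2eArg(arg: string): E2eMode {
  const name = arg.trim();
  if (name === "") {
    throw new Error("expected a change name or 'all'");
  }
  if (name === "all") {
    return { mode: "all" };
  }
  return { mode: "change", name, specsDir: `openspec/changes/${name}/specs/` };
}
```

One consequence of this shape is that a change literally named "all" could not be addressed in change mode; whether the real command handles that case is not stated in the source.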

Vision Check

Optional VLM-powered layout validation using a local Ollama model:

Multi-viewport screenshot analysis
Baseline diff detection
HTML report generation

Architecture

OpenSpec Playwright — Architecture

Distribution

Type: npm-package (openspec-playwright)
Binary: openspec-pw
Install: npm install -g openspec-playwright

Runtime Requirements

Node.js >= 20
Claude Code with .claude/ directory
gstack (for exploration + browser QA): git clone https://github.com/garrytan/gstack.git ~/.claude/skills/gstack
OpenSpec initialized: openspec init
Playwright MCP: claude mcp add playwright npx @playwright/mcp@latest
Playwright browsers: npx playwright install --with-deps
Ollama (optional, for vision-check)

Directory Structure

openspec-playwright (npm package)
├── bin/
│   └── openspec-pw.js      # CLI entry point
├── src/
│   ├── cli.ts
│   ├── commands/           # CLI command implementations
│   └── ...
└── templates/              # Installed to tests/playwright/
    ├── seed.spec.ts
    ├── auth.setup.ts
    ├── credentials.yaml
    ├── app-knowledge.md
    ├── pages/BasePage.ts
    ├── e2e-command.md      # /opsx:e2e command file
    ├── test-plan.md
    ├── report.md
    └── ...

Project layout after openspec-pw init:
tests/playwright/
├── seed.spec.ts             # Env validation
├── auth.setup.ts            # Session recording
├── global.teardown.ts       # Optional post-test cleanup
├── credentials.yaml         # Test users
├── app-knowledge.md         # Cross-change selector patterns
└── changes/<name>/
    └── <name>.spec.ts

openspec/changes/<name>/specs/playwright/
├── app-exploration.md       # Routes + verified selectors
└── test-plan.md

openspec/reports/
└── playwright-e2e-<name>-<timestamp>.md

Key Dependency: gstack

gstack provides the /browse skill used by the Explorer agent to navigate the real DOM and extract selectors. Without gstack, the exploration phase cannot run.

Authentication

Supports API login (preferred) and UI login (fallback). Multi-user support for role-based tests via credentials.yaml.
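
The "API login preferred, UI login as fallback" policy can be sketched as follows. This is a minimal sketch under stated assumptions: Credentials, LoginFn, credentialsFromEnv, and loginWithFallback are hypothetical names, the two login callbacks stand in for whatever the real auth.setup.ts does, and the env var names are the ones listed in the command's input contract (E2E_USERNAME, E2E_PASSWORD).

```typescript
// Hypothetical types; the real auth.setup.ts template may look different.
interface Credentials {
  username: string;
  password: string;
}
type LoginFn = (creds: Credentials) => Promise<void>;

// Reads credentials from an env-like object and fails fast when one is missing.
function credentialsFromEnv(env: Record<string, string | undefined>): Credentials {
  const username = env.E2E_USERNAME;
  const password = env.E2E_PASSWORD;
  if (!username || !password) {
    throw new Error("E2E_USERNAME and E2E_PASSWORD must be set");
  }
  return { username, password };
}

// Tries the API login first and falls back to the UI login only if it throws.
// Returns which strategy succeeded; a UI login failure propagates to the caller.
async function loginWithFallback(
  creds: Credentials,
  apiLogin: LoginFn,
  uiLogin: LoginFn,
): Promise<"api" | "ui"> {
  try {
    await apiLogin(creds);
    return "api";
  } catch {
    await uiLogin(creds);
    return "ui";
  }
}
```

Returning the strategy name makes it easy to log which path was taken, which helps when diagnosing why a session recording differs between runs.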

Components

OpenSpec Playwright — Components

CLI Binary: `openspec-pw`

Subcommand	Purpose
`openspec-pw init`	Install E2E command + templates (one-time setup)
`openspec-pw update`	Sync commands + templates from npm
`openspec-pw run <name>`	Execute E2E tests for a change (server lifecycle)
`openspec-pw migrate`	Migrate old test files to new structure
`openspec-pw audit`	Audit tests for orphaned specs and issues
`openspec-pw doctor`	Check all prerequisites
`openspec-pw vision-check`	Analyze screenshots with Ollama VLM
`openspec-pw explore`	Parallel Playwright route exploration
`openspec-pw uninstall`	Remove integration from project

Claude Code Command (1)

Command	Purpose
`/opsx:e2e <change-name>`	Run full E2E pipeline for a change

Installed by openspec-pw init to .claude/commands/opsx/CLAUDE.md.

Agents (3 — conceptual roles within /opsx:e2e pipeline)

Agent Role	Tools Used	Output
Explorer	`/browse` (gstack), Playwright MCP	`app-exploration.md`, updates `app-knowledge.md`
Planner	—	`test-plan.md`
Generator	Playwright MCP (selector verification)	`<name>.spec.ts`, optional Page Objects
Healer	Playwright MCP (`browser_snapshot`, `browser_navigate`, `browser_run_code`)	Repaired `.spec.ts`

Templates (13)

Template	Purpose
`seed.spec.ts`	Environment validation test
`auth.setup.ts`	Session recording (login once)
`credentials.yaml`	Test user credentials
`app-knowledge.md`	Cross-change selector patterns
`pages/BasePage.ts`	Page Object base class
`e2e-command.md`	`/opsx:e2e` command definition
`test-plan.md`	Test case planning template
`report.md`	E2E report template
`global.teardown.ts`	Optional post-test cleanup
`playwright.config.ts`	Playwright configuration
`app-exploration.md`	Route exploration findings
`e2e-test.ts`	Test file template
`CLAUDE.md`	Agent instructions for test generation

MCP Integration

@playwright/mcp: used by the Explorer, Generator, and Healer roles for browser automation
Must be installed separately: claude mcp add playwright npx @playwright/mcp@latest

Prompts

OpenSpec Playwright — Prompts

Excerpt 1: /opsx:e2e command input/output specification

From templates/e2e-command.md:

## Input

- **Change name**: `/opsx:e2e <name>` or `/opsx:e2e all`
- **Specs**: `openspec/changes/<name>/specs/*.md` (if change mode)
- **Credentials**: `E2E_USERNAME` + `E2E_PASSWORD` env vars

## Output

- **Test file**: `tests/playwright/changes/<name>/<name>.spec.ts`
- **Page Objects** (all mode): `tests/playwright/pages/<Route>Page.ts`
- **Auth setup**: `tests/playwright/auth.setup.ts` (if auth required)
- **Report**: `openspec/reports/playwright-e2e-<name>-<timestamp>.md`
- **App Bug Registry**: `openspec/reports/app-bug-registry.md`
- **Test plan**: `openspec/changes/<name>/specs/playwright/test-plan.md`

Technique: Explicit I/O contract in the command file — the agent knows exactly which files to read and produce before starting.
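
The I/O contract above maps cleanly onto a path helper. The sketch below is illustrative only: changeArtifacts and ChangeArtifacts are hypothetical names, and the timestamp is taken as a parameter because its format is not specified in the excerpt.

```typescript
// Hypothetical record of the files named in the command's Input/Output sections.
interface ChangeArtifacts {
  specsDir: string;
  testFile: string;
  testPlan: string;
  report: string;
  bugRegistry: string;
}

// Builds the artifact paths for one change. `timestamp` is supplied by the
// caller because the report's timestamp format is an unknown here.
function changeArtifacts(name: string, timestamp: string): ChangeArtifacts {
  return {
    specsDir: `openspec/changes/${name}/specs`,
    testFile: `tests/playwright/changes/${name}/${name}.spec.ts`,
    testPlan: `openspec/changes/${name}/specs/playwright/test-plan.md`,
    report: `openspec/reports/playwright-e2e-${name}-${timestamp}.md`,
    // The bug registry is shared across changes, so it does not depend on `name`.
    bugRegistry: "openspec/reports/app-bug-registry.md",
  };
}
```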

Excerpt 2: Regression isolation rule

From templates/e2e-command.md:

> **⚠️ Full regression is opt-in only.** Default: `openspec-pw run <name>` → one spec file. 
> Do NOT run `npx playwright test` (no file), `--only-changed`, or any command that 
> executes multiple `.spec.ts` files unless the user explicitly requests it. This 
> includes running the same command twice across different changes to simulate regression.

> **Role mapping**: Planner (Step 4–5) → test-plan.md; Generator (Step 6) → `.spec.ts` + 
> Page Objects; Healer (Step 9) → repairs failures via MCP.

Technique: Explicit prohibition with rationale. The "do NOT run" phrasing is an Iron Law-style negative constraint preventing accidental full-suite execution.
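
A rule like this could also be enforced mechanically. The sketch below is a hypothetical guard, not part of openspec-pw: isFullRegression is an invented name, and the parsing is deliberately simplistic (every token after `test` that does not start with `-` is treated as a file argument, so flags with space-separated values would be miscounted).

```typescript
// Returns true when a Playwright invocation would run more than one spec file,
// i.e. the case the isolation rule prohibits unless explicitly requested.
function isFullRegression(argv: string[]): boolean {
  const testIdx = argv.indexOf("test");
  if (!argv.includes("playwright") || testIdx === -1) {
    return false; // not a `playwright test` invocation at all
  }
  const rest = argv.slice(testIdx + 1);
  if (rest.some((a) => a === "--only-changed" || a.startsWith("--only-changed="))) {
    return true;
  }
  // Simplification: treat every non-flag token as a file or directory argument.
  const [only, ...others] = rest.filter((a) => !a.startsWith("-"));
  return only === undefined || others.length > 0 || !only.endsWith(".spec.ts");
}
```

For example, `npx playwright test` with no file argument is flagged, while the same command followed by a single `.spec.ts` path is allowed.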

Excerpt 3: Project dependencies architecture rationale

From templates/e2e-command.md:

| Feature | Project Dependencies | globalSetup/globalTeardown |
|---------|---------------------|---------------------------|
| HTML report visibility | ✅ Shown as project | ❌ Not shown |
| Trace recording | ✅ Full support | ❌ Not supported |
| Playwright fixtures | ✅ Fully supported | ❌ Not supported |
| Browser via fixture | ✅ Automatic | ❌ Manual launch |

Technique: Decision table embedded in the command prompt. The agent can reference this to explain architectural choices to users who ask why auth setup is implemented as a project dependency rather than globalSetup.
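
To make the project-dependencies choice concrete, the sketch below models the relevant part of a Playwright config with local types so that it runs without Playwright installed. In a real playwright.config.ts the same entries would be passed to defineConfig from @playwright/test, where `testMatch`, `dependencies`, and `use.storageState` are real project options. The project names, the storage-state path, and the runOrder helper are assumptions made for illustration.

```typescript
// Local stand-in for the subset of Playwright's project config used here.
interface ProjectSketch {
  name: string;
  testMatch?: RegExp;
  dependencies?: string[];
  use?: { storageState?: string };
}

const projects: ProjectSketch[] = [
  {
    name: "e2e",
    dependencies: ["setup"], // run only after the auth setup project
    use: { storageState: "playwright/.auth/user.json" }, // hypothetical path
  },
  { name: "setup", testMatch: /auth\.setup\.ts$/ },
];

// Orders projects so that each one comes after everything it depends on.
function runOrder(all: ProjectSketch[]): string[] {
  const byName = new Map(all.map((p) => [p.name, p] as const));
  const ordered: string[] = [];
  const visiting = new Set<string>();
  const visit = (name: string): void => {
    if (ordered.includes(name)) return;
    if (visiting.has(name)) throw new Error(`dependency cycle at ${name}`);
    const project = byName.get(name);
    if (!project) throw new Error(`unknown project ${name}`);
    visiting.add(name);
    for (const dep of project.dependencies ?? []) visit(dep);
    visiting.delete(name);
    ordered.push(name);
  };
  for (const p of all) visit(p.name);
  return ordered;
}
```

Because the setup step is an ordinary project, it appears in reports and traces like any other test, which is the point the decision table makes.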

Uniqueness

OpenSpec Playwright — Uniqueness

differs_from_seeds

Closest seed: openspec (same change lifecycle, /opsx command namespace). The architectural delta is a complete E2E test generation and execution layer that no seed implements: OpenSpec covers proposal → specs → tasks → apply; this extends the lifecycle to include explore → plan tests → generate tests → execute → heal → report. The Healer agent (auto-repairing failing tests via Playwright MCP browser control) is unique in the entire corpus. The vision-check subcommand, which uses a local Ollama VLM for screenshot-based layout validation, has no parallel anywhere in the seed set. The closest seed for validation discipline is spec-kit, whose PostToolUse hooks run tests after edits, but spec-kit does not generate those tests from spec artifacts; generating tests from spec files is the key distinction here.

Positioning

"OpenSpec's missing last mile." OpenSpec produces planning artifacts and stops at implementation. OpenSpec Playwright completes the verification loop by deriving Playwright tests directly from the same spec files that drove the implementation, ensuring the E2E test suite reflects the current feature spec.

Observable Failure Modes

gstack dependency: the Explorer phase requires gstack, which is not an npm package; it is installed via git clone plus a custom setup script. This external dependency is fragile and may become stale.
Ollama for vision-check: requires a local Ollama installation with a VLM downloaded. Not suitable for CI without containerized Ollama.
Selector drift: app-knowledge.md accumulates selectors that may become stale as the app evolves. There is no automatic staleness detection.
Regression scope confusion: the command warns heavily against running the full Playwright test suite, but team members unfamiliar with the tool may run npx playwright test directly, executing all change tests at once.
Healer loops: if the Healer cannot repair a failure after several MCP calls, there is no documented bail-out condition.
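
For the last failure mode, one possible mitigation is an explicit attempt budget. The sketch below is not how openspec-pw behaves (the source documents no bail-out); healWithBudget, runTest, and attemptRepair are hypothetical names, and the default of three attempts is an arbitrary assumption.

```typescript
interface HealResult {
  healed: boolean;
  attempts: number; // number of repair passes that were made
}

// Re-runs the test after each repair pass and gives up once the budget is spent.
// `runTest` resolves to true when the test passes; `attemptRepair` stands in
// for one Healer pass over the failing spec.
async function healWithBudget(
  runTest: () => Promise<boolean>,
  attemptRepair: () => Promise<void>,
  maxAttempts = 3,
): Promise<HealResult> {
  for (let attempts = 0; attempts < maxAttempts; attempts++) {
    if (await runTest()) {
      return { healed: true, attempts };
    }
    await attemptRepair();
  }
  // Budget exhausted: check once more so the final repair is not wasted.
  return { healed: await runTest(), attempts: maxAttempts };
}
```

A failure that survives the budget could then be recorded in the report rather than retried indefinitely.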

Workflow

OpenSpec Playwright — Workflow

Prerequisites Phase

Step	Command	Gate
Install CLI	`npm install -g openspec-playwright`	—
Install gstack	`git clone https://github.com/garrytan/gstack.git ~/.claude/skills/gstack && cd ~/.claude/skills/gstack && ./setup`	—
Install OpenSpec	`openspec init`	—
Initialize E2E	`openspec-pw init`	—
Install Playwright MCP	`claude mcp add playwright ...`	—
Install browsers	`npx playwright install --with-deps`	—
Validate env	`npx playwright test tests/playwright/seed.spec.ts`	Must pass
Configure auth (if needed)	Record login session	—

E2E Pipeline (triggered by `/opsx:e2e <change-name>`)

Step	Phase	Artifact
1	Select change	Read `openspec/changes/<name>/specs/`
2	Detect auth	Check specs for auth markers
3	Validate env	Run `seed.spec.ts`
4	Explore	`app-exploration.md`, `app-knowledge.md`
5	Plan	`test-plan.md`
6	Generate	`<name>.spec.ts`, optional Page Objects
7	Configure auth	`auth.setup.ts`
8	Configure Playwright	`playwright.config.ts`
9	Execute	Test run
10	Heal (if failures)	Repaired test files via MCP
11	Report	`openspec/reports/playwright-e2e-<name>-<timestamp>.md`

Approval Gates

Gate	Type
`seed.spec.ts` must pass before /opsx:e2e	auto-validator
Selector verification in real browser before writing test	auto-validator
Manual verification for role-based auth setup	human-required

Isolation Rules

From README:

"Default: openspec-pw run <name> → one spec file. Do NOT run npx playwright test (no file) — this executes ALL spec files across ALL changes. Full regression is opt-in only."

App Bug Registry

openspec/reports/app-bug-registry.md — cumulative per-project bug tracking file that persists across changes.

Memory Context

OpenSpec Playwright — Memory & Context

State Storage

File-based across two directory trees:

# OpenSpec artifacts (planning context)
openspec/changes/<name>/specs/
openspec/changes/<name>/specs/playwright/
├── app-exploration.md      # This change's routes + selectors
└── test-plan.md

# Test assets (execution context)
tests/playwright/
├── app-knowledge.md         # Cross-change selector patterns (accumulates)
└── changes/<name>/
    └── <name>.spec.ts

# Reports (audit trail)
openspec/reports/
├── playwright-e2e-<name>-<timestamp>.md
└── app-bug-registry.md      # Cumulative across all changes

Cross-Session Knowledge

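app-knowledge.md is the only test-side knowledge file that persists and accumulates across changes (app-bug-registry.md under openspec/reports/ is also cumulative, but it tracks bugs rather than selectors). It stores: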

Known selectors per route
Authentication patterns
Known issues per page

This makes subsequent /opsx:e2e runs faster — the Explorer can reuse known selectors instead of re-navigating.
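
The reuse behavior described above amounts to a lookup-before-explore cache. The sketch below models it in memory only: SelectorKnowledge and its resolve method are hypothetical, the `explore` callback stands in for a real DOM exploration pass, and the on-disk Markdown format of app-knowledge.md is not modeled because the source does not specify it.

```typescript
type Route = string;
type Selector = string;

// In-memory stand-in for the cross-change selector knowledge.
class SelectorKnowledge {
  private known = new Map<Route, Selector[]>();

  // Returns known selectors for a route, exploring only on a cache miss.
  resolve(route: Route, explore: (route: Route) => Selector[]): Selector[] {
    const cached = this.known.get(route);
    if (cached) {
      return cached;
    }
    const found = explore(route);
    this.known.set(route, found);
    return found;
  }
}
```

Note that a cache like this has the same weakness the failure modes list calls selector drift: nothing here invalidates an entry when the app changes.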

Change-Level Isolation

Each change's exploration and tests are isolated:

app-exploration.md is per-change
<name>.spec.ts is per-change
Running tests for change A does not execute change B's tests

Persistence

Scope: project (per-repository)
Cross-session: Yes — all test files and exploration data persist
Context compaction: app-knowledge.md grows unbounded; no compaction mechanism

Session Handoff

If /opsx:e2e is interrupted mid-pipeline, resuming requires re-running the command. No checkpoint resume mechanism within the pipeline.

Orchestration

OpenSpec Playwright — Orchestration

Multi-Agent

Yes — three conceptual agent roles within the /opsx:e2e pipeline: Explorer, Planner/Generator, Healer.

Orchestration Pattern

Sequential. Explorer → Planner → Generator → (Healer if failures) → Reporter. No parallel execution within a single change.
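
The sequential pattern, with the Healer running only on failure, can be sketched as a plain function. This is an illustration of the control flow, not the tool's implementation: PhaseHandlers and runPipeline are hypothetical names, and the handlers are synchronous placeholders for work that the real pipeline performs inside a Claude Code session.

```typescript
// Hypothetical callbacks, one per pipeline phase.
interface PhaseHandlers {
  explore(): void;
  plan(): void;
  generate(): void;
  execute(): boolean; // true when every test passed
  heal(): void;
  report(): void;
}

// Runs the phases strictly in order and returns the sequence that actually ran.
function runPipeline(h: PhaseHandlers): string[] {
  const trace: string[] = [];
  trace.push("explore");
  h.explore();
  trace.push("plan");
  h.plan();
  trace.push("generate");
  h.generate();
  trace.push("execute");
  const passed = h.execute();
  if (!passed) {
    // The Healer is the only conditional phase.
    trace.push("heal");
    h.heal();
  }
  trace.push("report");
  h.report();
  return trace;
}
```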

Agent Definitions

Agents are not separately spawned processes — they are sequential phases within a single Claude Code session, distinguished by their tool access and task focus.

Role	Tools	Mode
Explorer	`/browse` (gstack), DOM navigation	Read-only exploration
Planner	—	Document generation
Generator	Playwright MCP (selector verification)	Write `.spec.ts`
Healer	Playwright MCP (full browser control)	Repair failed tests

Isolation Mechanism

Per-change test isolation (one .spec.ts per change, run via openspec-pw run <name>).

MCP Integration

Playwright MCP (@playwright/mcp) provides:

browser_snapshot — accessibility tree snapshot
browser_navigate — page navigation
browser_run_code — JavaScript execution in browser
Additional browser tools for the Healer

Multi-Model

No. Single model (Claude Code).

Execution Mode

One-shot per change invocation. Re-running /opsx:e2e <name> re-runs the full pipeline.

Auto-Validation

The seed test (seed.spec.ts) runs before the pipeline. Selector verification runs in a live browser before writing the test file. These are the two auto-validators.

Crash Recovery

None within pipeline. Re-run /opsx:e2e <name> from the start.

UI & CLI Surface

OpenSpec Playwright — UI & CLI Surface

Dedicated CLI Binary

Yes.

Binary: openspec-pw
Package: openspec-playwright on npm
Version analyzed: 0.3.26

Subcommands

Subcommand	Description
`init`	One-time setup: installs command + templates
`update`	Sync CLI + commands + templates from npm
`run <name>`	Execute E2E tests with server lifecycle
`migrate`	Migrate old test files to new structure
`audit`	Audit tests for orphaned specs
`doctor`	Check all prerequisites
`vision-check`	Screenshot layout analysis via Ollama VLM
`explore`	Parallel Playwright route exploration
`uninstall`	Remove integration from project

Vision Check Feature

Unique capability — local VLM screenshot analysis:

openspec-pw vision-check

Features:

Multi-viewport capture (desktop, tablet, mobile)
Baseline diff against previous captures
HTML report with anomalies highlighted
Powered by Ollama (requires local install)

Claude Code Integration

Installs /opsx:e2e command to .claude/commands/opsx/ via openspec-pw init.

Review UI

No web dashboard. Reports are Markdown files at openspec/reports/playwright-e2e-*.md.

Doctor Output

openspec-pw doctor
# Checks: Node.js version, gstack install, OpenSpec install, Playwright MCP, browser install
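
As an illustration of one such check, the Node.js >= 20 requirement could be tested as below. This is a hypothetical sketch, not the doctor command's actual code; nodeMajor and meetsNodeRequirement are invented names.

```typescript
// Extracts the major version from strings such as "20.11.1" or "v18.19.0".
function nodeMajor(version: string): number {
  const match = /^v?(\d+)\./.exec(version);
  if (!match) {
    throw new Error(`unrecognized Node.js version: ${version}`);
  }
  return Number(match[1]);
}

// True when the version satisfies the minimum major version (20 by default,
// matching the runtime requirement listed for this tool).
function meetsNodeRequirement(version: string, minMajor = 20): boolean {
  return nodeMajor(version) >= minMajor;
}
```

In a Node.js script the current version is available as process.versions.node (or process.version, which carries the leading "v").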

Audit Output

openspec-pw audit
# Reports: orphaned tests (specs deleted but tests exist), missing tests, selector issues
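
The orphaned and missing categories reduce to two set differences over change names. The sketch below assumes the caller has already listed which changes have specs and which have test files; auditChanges and AuditResult are hypothetical names, and selector issues are out of scope here.

```typescript
interface AuditResult {
  orphanedTests: string[]; // tests exist, but the change's specs are gone
  missingTests: string[]; // specs exist, but no test file was generated
}

// Compares change names that have specs against change names that have tests.
function auditChanges(changesWithSpecs: string[], changesWithTests: string[]): AuditResult {
  const specs = new Set(changesWithSpecs);
  const tests = new Set(changesWithTests);
  return {
    orphanedTests: changesWithTests.filter((name) => !specs.has(name)).sort(),
    missingTests: changesWithSpecs.filter((name) => !tests.has(name)).sort(),
  };
}
```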

Related frameworks

same archetype · same primary tool · same memory type

Taskmaster AI ★ 27k

A3 MCP-anchored

Converts a PRD into a dependency-ordered JSON task graph that AI coding agents execute one task at a time, eliminating context…

ccmemory ★ 1

A3 MCP-anchored

Accumulates decisions, corrections, and failed approaches from Claude Code sessions into a queryable Neo4j graph so each new…

Pimzino spec-workflow-mcp ★ 4.2k

A3 MCP-anchored

MCP server providing spec-driven development workflow with dashboard-backed approval gates, implementation logging, and VSCode…

MCP Shrimp Task Manager ★ 2.1k

A3 MCP-anchored

Convert natural language requests into structured AI development tasks with chain-of-thought enforcement, reflection gates, and…

Bernstein ★ 460

A3 MCP-anchored

Govern parallel CLI coding agents with a deterministic Python scheduler, HMAC-chained audit trail, and compliance-ready signed…

LeanSpec ★ 252

A3 MCP-anchored

Provides a unified spec CLI and MCP server over any existing spec backend (markdown, GitHub Issues, ADO), making spec-driven…

Distribution

Type: npm-package
License: MIT
Install: multi-step
Version: 0.3.26

Surfaces

CLI binary: openspec-pw
CLI subcmds: 9
Local UI: No

Components

Commands: 1
Skills: 0
Subagents: 3
Hooks: 0
MCP servers: 1
MCP tools: 10
Scripts: 1
Templates: 13

Workflow

Phases: 11
Approval gates: 1
Spec format: markdown
Spec storage: per-feature-folder
Delta or full: whole-file

Orchestration

Multi-agent: Yes
Pattern: sequential
Max concurrent: 1
Isolation: none
Consensus: none
Prompt chaining: Yes

Multi-model

Multi-model: No
BYOK: Yes
Modal: text+vision

Execution

Mode: one-shot
Crash recovery: No
Compaction: No
Session handoff: Yes
Streaming: No

Memory

Type: file-based
Persistence: project
Search: none
State files: 6 files

Quality

TDD: Yes
TDD mechanism: post-hook-test-runner
Validators: 3
Self-review: none

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: Yes
Audit format: structured-md
Replay: No

Tools

Primary: claude-code
Targets: 1
Portability: low

Signals

Stars: 4
Last commit: 2026-05-26
Contributors: 1
Maintainer: active
Quality score: 4.7/10
