Browser Harness

browser-harness · browser-use/browser-harness · ★ 14k · last commit 2026-05-20

Primitive shape 25 total

Commands 5 Skills 20

Summary

Browser Harness — Summary

Browser Harness is a self-healing CDP (Chrome DevTools Protocol) harness that connects an LLM directly to the user's real, running Chrome browser with minimal abstraction (~1k lines, 4 core files). The agent writes and improves agent-workspace/agent_helpers.py during execution, so the harness extends itself from run to run. It ships with a skill file (SKILL.md) that can be registered with Claude Code (via ~/.claude/CLAUDE.md) or Codex, an install guide (install.md), 17 interaction skill files covering CDP mechanics, and domain-specific playbooks (agent-workspace/domain-skills/) contributed by users and generated by the agent. The browser-harness binary is installed globally via uv tool install -e . so it can be invoked from any directory.

Browser Harness is closest to superpowers from the seeds (both use skill-md files for behavioral guidance) but differs fundamentally: it ships a Python CDP daemon (browser-harness binary) that connects to the user's real Chrome browser, not a sandboxed cloud VM. The self-improving agent_helpers.py pattern (agent writes missing helpers during execution) is not seen in any seed framework.

Overview

Browser Harness — Overview

Origin

Built by browser-use (browser-use.com), the team behind the Browser Use library. The team published two related posts: "The Bitter Lesson of Agent Harnesses" (browser-use.com/posts/bitter-lesson-agent-harnesses) and "Web Agents That Actually Learn" (browser-use.com/posts/web-agents-that-actually-learn).

Philosophy

From the README:

"Connect an LLM directly to your real browser with a thin, editable CDP harness. For browser tasks where you need complete freedom. One websocket to Chrome, nothing between."

"The agent writes what's missing during execution. The harness improves itself every run."

The central thesis ("Bitter Lesson"):

Thick harnesses add overhead without proportional benefit
Agents should fix their own missing capabilities (self-healing)
Skills should be generated by the agent from real execution, not hand-authored

From the README on contributing:

"Skills are written by the harness, not by you. Just run your task with the agent — when it figures something non-obvious out, it files the skill itself. Please don't hand-author skill files; agent-generated ones reflect what actually works in the browser."

Key Design Constraints (from SKILL.md)

Coordinate clicks default (Input.dispatchMouseEvent) — works through iframes/shadow DOM/cross-origin
Connect to user's running Chrome, don't launch own browser
cdp-use only for CDPClient.send_raw — prefer raw CDP strings
run.py stays tiny — no argparse, subcommands, or extra control layer
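
At the protocol level, a coordinate click is a press followed by a release at the same point. The sketch below builds the two raw messages, assuming plain JSON-RPC framing over the websocket. `click_messages` is a hypothetical name used for illustration and is not the harness's own helper; only the `Input.dispatchMouseEvent` method and its `type`, `x`, `y`, `button`, and `clickCount` parameters come from the Chrome DevTools Protocol.

```python
import json
from itertools import count

_ids = count(1)  # CDP requests carry a client-chosen integer id


def click_messages(x: float, y: float, button: str = "left") -> list:
    """Build the two raw CDP messages that make up one click at (x, y)."""
    messages = []
    for event_type in ("mousePressed", "mouseReleased"):
        messages.append(json.dumps({
            "id": next(_ids),
            "method": "Input.dispatchMouseEvent",
            "params": {
                "type": event_type,
                "x": x,
                "y": y,
                "button": button,
                "clickCount": 1,
            },
        }))
    return messages


if __name__ == "__main__":
    # Print the payloads instead of sending them; no browser is needed here.
    for raw in click_messages(120, 340):
        print(raw)
```

Because the event carries only viewport coordinates, nothing in the payload refers to a DOM node, which is why this style of click does not care about iframe or shadow-root boundaries.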

Self-Healing Pattern

agent needs to upload a file
  ↓
agent_helpers.py → helper missing
  ↓
agent writes it (adds custom helper)
  ↓
file uploaded

Architecture

Browser Harness — Architecture

Distribution

Clone + editable install: git clone ... && uv tool install -e .
PyPI: uv tool install browser-harness (non-editable)

Directory Structure

src/browser_harness/
  __init__.py
  run.py          # Main entry point (tiny by design)
  daemon.py       # CDP WebSocket daemon
  helpers.py      # Core pre-imported helpers
  admin.py        # Version, update, doctor commands
  _ipc.py         # IPC (Unix socket / TCP loopback)
agent-workspace/
  agent_helpers.py    # Agent-editable helper code
  domain-skills/      # Community per-site playbooks
    github/
    linkedin/
    amazon/
    ...
interaction-skills/   # CDP mechanics reference
  connection.md
  cookies.md
  cross-origin-iframes.md
  dialogs.md
  downloads.md
  drag-and-drop.md
  dropdowns.md
  iframes.md
  network-requests.md
  print-as-pdf.md
  profile-sync.md
  screenshots.md
  scrolling.md
  shadow-dom.md
  tabs.md
  uploads.md
  viewport.md
SKILL.md          # Day-to-day usage skill (register with Claude Code)
install.md        # Install + setup (skill file for agents)
AGENTS.md
docs/
tests/
pyproject.toml

Required Runtime

Python 3.x
uv (recommended) or pip
Chrome/Chromium running with remote debugging enabled

CLI Binary

browser-harness — global binary, heredoc invocation
Uses Unix socket at /tmp/bu-<NAME>.sock (POSIX) or TCP loopback (Windows) for IPC
Daemon auto-starts and connects to Chrome on first call
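
The per-name IPC address described above can be sketched as a small function. This is an illustration under assumptions, not the code in _ipc.py: `ipc_endpoint` is a hypothetical name, and the loopback port is a placeholder because the source does not state which port the Windows path uses.

```python
import sys


def ipc_endpoint(name: str, platform: str = sys.platform):
    """Return the IPC address for a named daemon.

    POSIX: a Unix socket path following the /tmp/bu-<NAME>.sock pattern.
    Windows: a (host, port) loopback pair; the port is a placeholder.
    """
    if platform.startswith("win"):
        # Assumption: port 0 stands in for "whatever port the daemon picked".
        return ("127.0.0.1", 0)
    return f"/tmp/bu-{name}.sock"


if __name__ == "__main__":
    print(ipc_endpoint("work", "linux"))
    print(ipc_endpoint("work", "win32"))
```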

Browser Connection

Two modes:

chrome://inspect/#remote-debugging checkbox (real Chrome profile, sticky per-profile)
--remote-debugging-port=9222 --user-data-dir=<custom-path> (isolated profile, no popups)
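
For the flag-based mode, Chrome serves a small HTTP endpoint on the debugging port, and `/json/version` reports the browser-level websocket URL. The following is a minimal sketch of that discovery step using only the standard library. The function names are hypothetical, and whether the harness discovers the socket this way is an assumption; the checkbox mode may behave differently.

```python
import json
from urllib.request import urlopen


def chrome_args(port: int, user_data_dir: str) -> list:
    """Flags for launching Chrome with an isolated, debuggable profile."""
    return [f"--remote-debugging-port={port}", f"--user-data-dir={user_data_dir}"]


def websocket_url(version_json: str) -> str:
    """Extract the browser websocket URL from a /json/version response body."""
    return json.loads(version_json)["webSocketDebuggerUrl"]


def discover(port: int = 9222, timeout: float = 2.0) -> str:
    """Ask a running Chrome for its websocket URL (requires Chrome to be up)."""
    with urlopen(f"http://127.0.0.1:{port}/json/version", timeout=timeout) as resp:
        return websocket_url(resp.read().decode("utf-8"))
```

`discover()` raises a `URLError` when nothing is listening on the port, which is a cheap way to tell "Chrome is not running with debugging enabled" apart from later protocol errors.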

Components

Browser Harness — Components

Core Files (4 files, ~1k lines total)

run.py

Entry point for browser-harness CLI
Pre-imports all helpers
Calls ensure_daemon() before exec
Thin by design — no argparse, no subcommands

daemon.py

CDP WebSocket daemon process
Connects to Chrome's DevTools Protocol
IPC via Unix socket or TCP loopback
BU_NAME namespaces daemon's IPC, pid, log files

helpers.py

Pre-imported helpers for agent use:
- capture_screenshot() — capture the current page as an image
- click_at_xy(x, y) — coordinate click (compositor-level)
- wait_for_load() — wait for page load
- page_info() — current page info
- ensure_real_tab() — recover from stale tab
- js(code) — execute JavaScript
- cdp(domain, params) — raw CDP call
- http_get(url) — non-browser HTTP (fast bulk)
- new_tab(url) — open new tab (not goto)

admin.py

_version(), NAME
daemon_alive(), ensure_daemon(), restart_daemon()
run_doctor(), run_doctor_fix_snap()
run_update(), print_update_banner()
start_remote_daemon(), stop_remote_daemon()
list_cloud_profiles(), list_local_profiles(), sync_local_profile()

Agent-Editable Code

agent-workspace/agent_helpers.py

Agent writes missing helpers here during execution
This is the self-healing mechanism — agent extends the harness

Skill Files

SKILL.md

Day-to-day usage instructions
Register with Claude Code via ~/.claude/CLAUDE.md import
Register with Codex via symlink to ~/.codex/skills/

install.md

First-time install + browser setup (named as skill for agents)

interaction-skills/ (17 files)

CDP mechanics reference: connection, cookies, cross-origin-iframes, dialogs, downloads, drag-and-drop, dropdowns, iframes, network-requests, print-as-pdf, profile-sync, screenshots, scrolling, shadow-dom, tabs, uploads, viewport

agent-workspace/domain-skills/

Community per-site playbooks (github/, linkedin/, amazon/, etc.)
Agent-generated, not hand-authored
BH_DOMAIN_SKILLS=1 to enable

CLI Commands

browser-harness <<'PY'  (heredoc invocation — primary usage)
browser-harness --version
browser-harness --doctor
browser-harness doctor [--fix-snap]
browser-harness --update [-y]
browser-harness --reload

Prompts

Browser Harness — Prompts

Verbatim Excerpt 1: SKILL.md — Core CDP Guidance

## What actually works

- Screenshots first: use capture_screenshot() to understand the current page quickly, find visible targets, and decide whether you need a click, a selector, or more navigation.
- Clicking: capture_screenshot() → read the pixel off the image → click_at_xy(x, y) → capture_screenshot() to verify. Suppress the Playwright-habit reflex of "locate first, then click" — no getBoundingClientRect, no selector hunt. Drop to DOM only when the target has no visible geometry (hidden input, 0×0 node).
- Bulk HTTP: http_get(url) + ThreadPoolExecutor. No browser for static pages (249 Netflix pages in 2.8s).
- After goto: wait_for_load().
- Wrong/stale tab: ensure_real_tab().
- Verification: print(page_info()) is the simplest "is this alive?" check, but screenshots are the default way to verify whether a visible action actually worked.
- DOM reads: use js(...) for inspection and extraction when the screenshot shows that coordinates are the wrong tool.

Technique: "Coordinate clicks default" opinionated guidance. Explicit suppression of Playwright-style habits. Describes decision tree as "if X then Y" rules. Anti-pattern prohibition ("no getBoundingClientRect, no selector hunt").
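
The "Bulk HTTP" rule in the excerpt pairs a plain HTTP fetch with a thread pool. A minimal sketch of that combination follows, assuming a stdlib stand-in for `http_get`; `fetch_text` and `fetch_all` are hypothetical names, and the worker count is an arbitrary choice rather than a value from the source.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable
from urllib.request import urlopen


def fetch_text(url: str, timeout: float = 10.0) -> str:
    """Stand-in for http_get(url): a plain GET with no browser involved."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def fetch_all(
    urls: Iterable[str],
    fetch: Callable[[str], str] = fetch_text,
    workers: int = 16,
) -> Dict[str, str]:
    """Fetch every URL concurrently and return a url -> body mapping."""
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so zip pairs each URL with its body.
        return dict(zip(urls, pool.map(fetch, urls)))
```

Passing `fetch` as a parameter keeps the fan-out logic testable without network access and lets the harness helper be swapped in where it is available.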

Verbatim Excerpt 2: SKILL.md — Design Constraints

## Design constraints

- Coordinate clicks default. Input.dispatchMouseEvent goes through iframes/shadow/cross-origin at the compositor level.
- Connect to the user's running Chrome. Don't launch your own browser.
- cdp-use is only for CDPClient.send_raw. Prefer raw CDP strings over typed wrappers.
- run.py stays tiny. No argparse, subcommands, or extra control layer.

Technique: Iron-Law constraints. Each constraint is a prohibition or a default behavior. Four constraints govern the entire architecture. Pattern is closest to superpowers' "explicit antipatterns" approach but applied to implementation details.

Verbatim Excerpt 3: Self-Healing Pattern (from README)

  ● agent: wants to upload a file
  │
  ● agent-workspace/agent_helpers.py → helper missing
  │
  ● agent writes it                         agent_helpers.py
  │                                                       + custom helper
  ✓ file uploaded

Technique: ASCII-art workflow diagram showing the self-healing loop. The prompt pattern: when a capability is missing, write it to agent_helpers.py rather than failing or asking the user.

Uniqueness

Browser Harness — Uniqueness

Differs From Seeds

Browser Harness is closest to superpowers from the seeds (both use skill-md files, both ship for Claude Code and Codex). However, Browser Harness differs fundamentally: it ships a working Python CDP daemon binary (browser-harness) rather than just behavioral skill files; it connects to the user's real running Chrome browser (not a cloud sandbox); and its self-improving agent_helpers.py pattern (agent writes missing helpers during execution) is unique across the entire corpus. The domain-skills system (agent-generated per-site playbooks) resembles BMAD's persona-driven knowledge base but is generated empirically from real browser sessions rather than hand-authored. Unlike E2B Desktop or AgentBay (both cloud-managed VMs), Browser Harness uses the user's actual Chrome profile — cookies, extensions, login state all available.

Positioning

The minimal, self-improving CDP bridge between an LLM and the user's real browser. Positioned explicitly against "thick harnesses" — the Bitter Lesson claim is that simpler is better for browser automation.

Observable Failure Modes

One stream at a time: Only one local Chrome instance per daemon — no parallelism on local browser
Remote-debugging dependency: Relies on Chrome remote debugging being enabled by the user; Chrome 144+ requires a popup click on every attach
Agent writes to agent_helpers.py: Agent can corrupt its own tool code in edge cases
Domain skills are opt-in: BH_DOMAIN_SKILLS=1 must be set explicitly; new users may not discover them
Editable install fragility: git pull required for updates in editable mode; uncommitted changes block update

What Makes It Extraordinary

The self-healing pattern — "the harness improves itself every run" — is the most novel agent execution paradigm in this batch. The principle that domain skills should be agent-generated (not hand-authored) and that agents should extend their own tool library during execution is not implemented in any other framework in the corpus.

Workflow

Browser Harness — Workflow

Setup Workflow (from SKILL.md / install.md)

Phase	Artifact	Description
Install	`browser-harness` binary	`git clone ... && uv tool install -e .`
Register Skill	`~/.claude/CLAUDE.md` import	`@~/Developer/browser-harness/SKILL.md`
Enable Chrome debugging	Chrome checkbox / CLI flag	`chrome://inspect/#remote-debugging` checkbox
First run	Daemon auto-starts	`browser-harness <<'PY' ... PY`

Typical Task Workflow

Step	Action	Note
1	Screenshot → `capture_screenshot()`	Understand current page
2	Read pixel → find target	Coordinate-first approach
3	Click → `click_at_xy(x, y)`	Compositor-level (works through iframes/shadow DOM)
4	Verify → `capture_screenshot()`	Confirm action worked
5	(If missing helper)	Agent writes to `agent_helpers.py`
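
Steps 1 to 4 of the table form a small retry loop. The sketch below expresses it with injected callables so it runs without a browser. In a real session `capture` and `click` would presumably be `capture_screenshot` and `click_at_xy`, while `locate` and `succeeded` stand for the model reading the image; all four parameter names and the retry count are assumptions.

```python
from typing import Callable, Optional, Tuple

Point = Tuple[int, int]


def click_and_verify(
    capture: Callable[[], bytes],
    locate: Callable[[bytes], Optional[Point]],
    click: Callable[[int, int], None],
    succeeded: Callable[[bytes], bool],
    attempts: int = 3,
) -> bool:
    """Screenshot, pick a pixel, click it, then screenshot again to verify."""
    for _ in range(attempts):
        target = locate(capture())   # steps 1-2: look, then choose coordinates
        if target is None:
            return False             # nothing visible to click
        click(*target)               # step 3: coordinate click
        if succeeded(capture()):     # step 4: verify from a fresh screenshot
            return True
    return False
```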

Self-Healing Loop

Task requires unknown capability
  ↓
Check agent_helpers.py — helper missing
  ↓
Agent writes helper to agent_helpers.py
  ↓
Re-run with new helper available
  ↓
If successful → helper persists for future runs
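
The loop above amounts to "append a helper if it is not already defined". A minimal sketch of that step follows, assuming helpers are top-level functions in a single file. `ensure_helper` and `has_function` are hypothetical names, and the syntax check before writing is an added safeguard rather than something the source describes.

```python
import ast
import tempfile
from pathlib import Path


def has_function(source: str, name: str) -> bool:
    """Return True if `source` defines a top-level function called `name`."""
    tree = ast.parse(source)
    return any(
        isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == name
        for node in tree.body
    )


def ensure_helper(helpers_file: Path, name: str, code: str) -> bool:
    """Append `code` to the helpers file unless `name` already exists.

    Returns True when the file was changed. The new code is parsed first, so a
    syntactically broken helper raises SyntaxError before anything is written.
    """
    existing = helpers_file.read_text() if helpers_file.exists() else ""
    if has_function(existing, name):
        return False
    if not has_function(code, name):
        raise ValueError(f"snippet does not define {name}()")
    prefix = existing.rstrip("\n") + "\n\n\n" if existing.strip() else ""
    helpers_file.write_text(prefix + code.strip() + "\n")
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "agent_helpers.py"
        snippet = "def upload_file(path):\n    return f'uploading {path}'\n"
        print(ensure_helper(path, "upload_file", snippet))  # first call writes the helper
        print(ensure_helper(path, "upload_file", snippet))  # second call leaves the file alone
```

Parsing before writing does not guarantee the helper is correct, but it does prevent the most direct way an agent could break its own tool file.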

Domain Skill Activation

BH_DOMAIN_SKILLS=1 browser-harness <<'PY'
new_tab("https://github.com")
# domain-skills/github/ playbooks auto-loaded
PY

Remote Browser (Parallel Sub-agents)

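# Python call: start a separate cloud browser via Browser Use Cloud
# (assumed here to be callable from inside a harness snippet)
browser-harness <<'PY'
start_remote_daemon("work")
PY
# Shell: drive that browser by selecting the same daemon name
BU_NAME=work browser-harness <<'PY'
new_tab("https://example.com")
PY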

Memory Context

Browser Harness — Memory & Context

State Storage

agent_helpers.py: Persistent agent-editable helper code. This is the primary memory artifact — capabilities the agent discovers are written here and persist across runs.
domain-skills/: Per-site playbooks the agent generates and files as PRs. Community knowledge that persists beyond individual runs.
interaction-skills/: Reference skill files (read-only to agent — contribute via PR).

Memory Type

File-based (Python code as memory). agent_helpers.py is the externalized capability memory.

Cross-Session Handoff

Yes — agent_helpers.py and domain-skills/ persist across sessions (they're in the repo checkout). When the harness is installed as an editable clone, helper improvements persist automatically.

Domain Skill Memory

Per-site playbook (domain-skills/github/)
  ↓ loaded when BH_DOMAIN_SKILLS=1
  ↓ agent reads all files in matching domain-skills/<site>/ directory
  ↓ uses saved selectors, flows, and edge cases
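
The lookup described above can be sketched as "map the URL to a site directory, then read every file in it". How the harness actually derives the directory name is not stated in the source, so the rule in `site_key` (second-to-last hostname label) is an assumption, and both function names are hypothetical.

```python
import os
import tempfile
from pathlib import Path
from urllib.parse import urlparse


def site_key(url: str) -> str:
    """Map a URL to a directory name, e.g. https://www.github.com/x -> github.

    Assumed rule: take the second-to-last hostname label. This is wrong for
    multi-part suffixes such as example.co.uk.
    """
    host = urlparse(url).hostname or ""
    labels = host.split(".")
    return labels[-2] if len(labels) >= 2 else host


def load_domain_skills(root: Path, url: str, env=os.environ) -> dict:
    """Return {filename: text} for the matching site, or {} when disabled."""
    if env.get("BH_DOMAIN_SKILLS") != "1":
        return {}
    site_dir = root / site_key(url)
    if not site_dir.is_dir():
        return {}
    return {p.name: p.read_text() for p in sorted(site_dir.iterdir()) if p.is_file()}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "github").mkdir()
        (root / "github" / "login.md").write_text("notes from an earlier run")
        print(load_domain_skills(root, "https://github.com", env={"BH_DOMAIN_SKILLS": "1"}))
```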

Profile Sync (Cloud)

sync_local_profile() — uploads local Chrome profile cookies to Browser Use cloud. Enables cookie-based session state to persist across cloud browser instances.

IPC State

Unix socket at /tmp/bu-<NAME>.sock (POSIX) or TCP loopback (Windows) — namespaced by BU_NAME. Daemon state in pid file at same namespace.

Update Memory

The harness tracks available updates — print_update_banner() prints once per day when newer version available.
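
A once-per-day notice is usually implemented with a timestamp file. The sketch below shows that pattern. It is not the body of `print_update_banner()`, and the stamp-file location and the `should_print_banner` name are hypothetical.

```python
import tempfile
import time
from pathlib import Path
from typing import Optional

DAY_SECONDS = 24 * 60 * 60


def should_print_banner(stamp_file: Path, now: Optional[float] = None) -> bool:
    """Return True at most once per 24 hours, recording the time on success."""
    now = time.time() if now is None else now
    try:
        last = float(stamp_file.read_text())
    except (FileNotFoundError, ValueError):
        last = 0.0  # no usable stamp yet: treat as "never printed"
    if now - last < DAY_SECONDS:
        return False
    stamp_file.write_text(str(now))
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        stamp = Path(tmp) / "update-banner.stamp"
        if should_print_banner(stamp):
            print("A newer version is available.")
```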

Compaction

Not applicable — the harness is a thin CDP bridge, not an LLM agent framework. Context management is delegated to the agent (Claude Code, Codex, etc.).

Orchestration

Browser Harness — Orchestration

Multi-Agent Support

Yes — via BU_NAME namespacing. Each sub-agent gets its own isolated cloud browser via a distinct BU_NAME value.

BU_NAME=work browser-harness <<'PY' ... PY
BU_NAME=research browser-harness <<'PY' ... PY

Each named daemon connects to a different cloud browser instance.
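
An orchestrating script can reproduce the two shell lines above by setting `BU_NAME` per subprocess and passing the snippet on stdin, which is what the heredoc does. This is a sketch under assumptions: `run_snippet` and `fan_out` are hypothetical names, and each named daemon is assumed to have been started already. The demo substitutes the Python interpreter for the binary so it runs without the harness installed.

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def harness_env(name: str, base=None) -> dict:
    """Copy the environment and select a daemon through BU_NAME."""
    env = dict(os.environ if base is None else base)
    env["BU_NAME"] = name
    return env


def run_snippet(name: str, code: str, binary: str = "browser-harness"):
    """Feed `code` on stdin, mirroring `BU_NAME=<name> browser-harness <<'PY'`."""
    return subprocess.run(
        [binary], input=code, text=True, capture_output=True, env=harness_env(name)
    )


def fan_out(jobs: dict, binary: str = "browser-harness") -> dict:
    """Run one snippet per daemon name in parallel; return name -> result."""
    with ThreadPoolExecutor(max_workers=len(jobs) or 1) as pool:
        futures = {n: pool.submit(run_snippet, n, c, binary) for n, c in jobs.items()}
        return {n: f.result() for n, f in futures.items()}


if __name__ == "__main__":
    # Stand-in binary: the Python interpreter also reads a program from stdin.
    demo = {n: "import os; print(os.environ['BU_NAME'])" for n in ("work", "research")}
    for name, result in fan_out(demo, binary=sys.executable).items():
        print(name, "->", result.stdout.strip())
```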

Orchestration Pattern

Parallel — named daemons run independent browser sessions simultaneously. No sequential coordination built-in.

Isolation Mechanism

Local: None (connects to user's running Chrome — shared browser state). Remote: Cloud browser per BU_NAME (Browser Use cloud API, isolated per daemon).

Subagent Definition Format

Not applicable as an agent framework. The "sub-agents" are parallel BU_NAME daemon instances.

Multi-Model Usage

Not applicable — harness is model-agnostic.

Execution Mode

Continuous (daemon persists) — the daemon auto-starts and stays running between browser-harness calls. Each call connects to the persistent daemon.

Crash Recovery

ensure_real_tab() recovers from stale tab state. Daemon auto-restarts if not alive (ensure_daemon()). --reload flag stops daemon for code update pickup.
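
The "start it only if it is not alive" behaviour can be sketched with a pid probe and an injected start function. This is not the implementation of `ensure_daemon()`: `pid_alive` and `ensure_running` are hypothetical names, and the signal-0 probe is POSIX-only and should not be used on Windows.

```python
import os
from typing import Callable


def pid_alive(pid: int) -> bool:
    """POSIX liveness probe: signal 0 checks existence without delivering a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def ensure_running(alive: Callable[[], bool], start: Callable[[], None]) -> bool:
    """Start the daemon only when the probe fails; return True if a start happened."""
    if alive():
        return False
    start()
    return True


if __name__ == "__main__":
    # The current process is certainly alive, so no start is triggered.
    print(ensure_running(lambda: pid_alive(os.getpid()), lambda: print("starting daemon")))
```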

Context Compaction

Not applicable at harness level.

Streaming Output

Real-time CDP events propagate through the IPC layer. Screenshots returned as binary data per call.

Remote Browser Billing

"Running remote daemons bill until timeout." Cloud browsers are metered.

UI CLI Surface

Browser Harness — UI / CLI Surface

CLI Binary

Name: browser-harness
Not a thin wrapper — own Python daemon runtime
Install: uv tool install -e . (editable, global) or uv tool install browser-harness (PyPI)

Invocation Pattern

# Primary — heredoc form (prevents shell quote mangling)
browser-harness <<'PY'
new_tab("https://example.com")
print(page_info())
PY

# Maintenance commands
browser-harness --version
browser-harness --doctor
browser-harness doctor [--fix-snap]
browser-harness --update [-y]
browser-harness --reload

Local Web UI

None — the harness connects to the user's Chrome browser (user provides the UI). No separate dashboard.

Observability

--doctor — diagnose install, daemon, and browser state
print_update_banner() — daily update check (once per day)
Browser's existing DevTools inspector (user-owned Chrome)

Skill Registration

To make harness available in any agent session:

Claude Code: Add @~/Developer/browser-harness/SKILL.md to ~/.claude/CLAUDE.md
Codex: Symlink SKILL.md to ~/.codex/skills/browser-harness/SKILL.md
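
Both registration steps are easy to make idempotent. The sketch below appends the import line only when it is missing and recreates the symlink on each run. The function names are hypothetical, the paths are passed in rather than hard-coded, and replacing an existing link is a design choice of this sketch rather than documented behaviour.

```python
import tempfile
from pathlib import Path


def register_claude(claude_md: Path, skill_path: str) -> bool:
    """Add an `@<skill_path>` import line to CLAUDE.md unless already present."""
    line = f"@{skill_path}"
    existing = claude_md.read_text() if claude_md.exists() else ""
    if line in existing.splitlines():
        return False
    claude_md.parent.mkdir(parents=True, exist_ok=True)
    prefix = existing if (not existing or existing.endswith("\n")) else existing + "\n"
    claude_md.write_text(prefix + line + "\n")
    return True


def register_codex(skill_file: Path, codex_skills_dir: Path) -> Path:
    """Symlink SKILL.md into <codex_skills_dir>/browser-harness/SKILL.md."""
    link = codex_skills_dir / "browser-harness" / "SKILL.md"
    link.parent.mkdir(parents=True, exist_ok=True)
    if link.is_symlink() or link.exists():
        link.unlink()  # replace a stale link so the call can be repeated safely
    link.symlink_to(skill_file)
    return link


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        print(register_claude(root / ".claude" / "CLAUDE.md", "~/Developer/browser-harness/SKILL.md"))
```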

Remote Browser Dashboard

liveUrl printed when starting remote daemon — share with user to watch in real-time via Browser Use cloud UI.

Cross-Tool Portability

Medium — designed for Claude Code and Codex explicitly. Skill file can be adapted for other agents. Python-only (pyproject.toml).

Distribution

Type: standalone-repo
License: MIT
Install: clone-and-configure

Surfaces

CLI binary: browser-harness
CLI subcmds: 5
Local UI: No
Tech stack: Python

Components

Commands: 5
Skills: 20
Subagents: 0
Hooks: 0
MCP servers: 0
MCP tools: 0
Scripts: 0
Templates: 1

Workflow

Phases: 5
Approval gates: 0
Spec format: none
Spec storage: none
Delta or full: none

Orchestration

Multi-agent: Yes
Pattern: parallel-fan-out
Isolation: process
Consensus: none
Prompt chaining: No

Multi-model

Multi-model: No
BYOK: Yes
Modal: text+vision

Execution

Mode: continuous-ralph
Crash recovery: Yes
Compaction: No
Session handoff: Yes
Streaming: No

Memory

Type: file-based
Persistence: project
Search: none
State files: 2 files

Quality

TDD: No
TDD mechanism: none
Self-review: none

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: No
Audit format: none
Replay: No

Tools

Primary: Claude Code
Targets: 2
Portability: medium

Signals

Stars: 14k
Last commit: 2026-05-20
Maintainer: active
Quality score: 2.3/10