Tier A · A8 Cross-runtime harness

wshobson/agents Plugin Marketplace

wshobson-agents · wshobson/agents · ★ 36k · last commit 2026-05-26

Primitive shape: 448 total (Commands 102 · Skills 155 · Subagents 191)

Summary

wshobson/agents — Summary

wshobson/agents is the most architecturally sophisticated catalog in this batch: 83 plugins, 191 agents, 155 skills, and 102 commands, authored once in canonical Markdown and transpiled by a Python adapter framework for five harnesses (Claude Code, Codex CLI, Cursor, OpenCode, Gemini CLI). Claude Code consumes the source natively; the other four receive generated artifacts. The organizing principle is the plugin as the unit of installation: each plugin is a domain-bounded folder whose agents, commands, and skills are auto-discovered from directory structure, so installing python-development loads only its three Python agents and 16 skills, not the whole marketplace. A four-tier model strategy assigns Opus 4.7 to architecture and security work, inherit (the user's chosen model) to mid-tier specialists, Sonnet to docs, testing, and debugging, and Haiku to fast operational tasks. A built-in plugin-eval framework provides three-layer quality certification (static structural, LLM judge, Monte Carlo). Cross-harness portability is the primary design constraint: source content is authored in harness-neutral Markdown, and adapters own all per-harness transforms, including model-alias mapping, 8 KB body caps for Codex, and TOML frontmatter generation for Gemini. Compared to the seeds, wshobson/agents differs from superpowers (14 skills, single-harness) by roughly an order of magnitude in skill count (155 vs 14) and by its multi-harness adapter architecture; it differs from BMAD-METHOD (personas) by its plugin-boundary isolation model and automated adapter-driven distribution rather than a monolithic install.

Overview

wshobson/agents — Overview

Origin

Created by Will Hobson with community contributors. The repository had 35,958 GitHub stars as of 2026-05-26 and is actively maintained (last push 2026-05-26). It began as a Claude Code plugin marketplace and expanded to multi-harness support as OpenAI Codex CLI, OpenCode, Cursor, and Gemini CLI gained skill/agent primitives.

Philosophy

From ARCHITECTURE.md:

"Single source of truth. All agent / skill / command authoring happens under plugins/<name>/. Generated harness-specific artifacts ... are produced by adapters and gitignored. Never hand-edit generated files."

"Adapters own per-harness mechanics; source content stays portable. Authors write Claude-Code-quality markdown. Adapters under tools/adapters/ handle every harness-specific transform (frontmatter rewriting, model-alias mapping, body-size caps, tool-name remapping). Source files never carry harness conditional logic."

"Progressive disclosure all the way down. Context files (AGENTS.md, CLAUDE.md, etc.) cap at ~150 lines. Skill bodies cap at ~8 KB (Codex's hard limit). Detail offloads to docs/ and references/details.md. Detail is loaded on demand, not pre-injected."

Manifesto-Style Statements

From README.md:

"One source-of-truth (plugins/), five harnesses. Each harness gets idiomatic, harness-native artifacts — not lowest-common-denominator translations."

"Each plugin is isolated and composable: agents, commands, and skills are auto-discovered from directory structure. Installing a plugin loads only its components into context — not the whole marketplace."

Domain Coverage

83 plugins span: programming languages (Python, JavaScript/TypeScript, Go, Rust, JVM, Julia, .NET, C/C++), infrastructure (Kubernetes, cloud, database), security, SEO, frontend/mobile, ML/AI, documentation, business analytics, startup tooling, quantitative trading, and meta-level tools (conductor, plugin-eval, block-no-verify).

Model Philosophy

Four explicit tiers:

Tier 1 (Opus 4.7): Architecture, security, code review, production-critical
Tier 2 (inherit): User-chosen model for backend, frontend, AI/ML specialists
Tier 3 (Sonnet): Docs, testing, debugging
Tier 4 (Haiku): Fast ops, SEO, deployment, content

Architecture

wshobson/agents — Architecture

Distribution

Primary: Claude Code plugin marketplace (/plugin marketplace add wshobson/agents)
Secondary (generated): Codex CLI, Cursor, OpenCode, Gemini CLI via make generate HARNESS=<x>
License: MIT

Directory Structure (canonical)

plugins/                          # SOURCE OF TRUTH (81 local + 2 external git-subdir)
  <name>/
    .claude-plugin/plugin.json
    agents/*.md                   # Domain experts (persona-md format)
    commands/*.md                 # Slash commands
    skills/<n>/
      SKILL.md                    # Skill body
      references/                 # Progressive-disclosure detail
      assets/                     # Templates
tools/
  adapters/
    base.py                       # Parser, HarnessAdapter ABC
    capabilities.py               # Model alias + capability matrix
    codex.py / cursor.py / opencode.py / gemini.py
  generate.py                     # `make generate HARNESS=<x>`
  validate_generated.py
  doc_gardener.py                 # Drift / dead-link / cap detection
  tests/                          # 386 pytest tests
.claude-plugin/marketplace.json   # Plugin registry
AGENTS.md                         # Canonical context file (committed)
CLAUDE.md                         # @AGENTS.md + Claude-specific addenda
GEMINI.md                         # Gemini CLI setup
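
The auto-discovery implied by this layout can be sketched in a few lines. This is a minimal illustration, assuming that an agent or command is any `*.md` file in its folder and that a skill is any directory containing `SKILL.md`. The function names `discover_plugin` and `demo` are hypothetical and do not come from the repository.

```python
import tempfile
from pathlib import Path


def discover_plugin(plugin_dir: Path) -> dict[str, list[str]]:
    """Infer a plugin's components from its folder layout alone."""
    return {
        "agents": sorted(p.stem for p in plugin_dir.glob("agents/*.md")),
        "commands": sorted(p.stem for p in plugin_dir.glob("commands/*.md")),
        # A skill is a directory holding SKILL.md; the directory name is the skill name.
        "skills": sorted(p.parent.name for p in plugin_dir.glob("skills/*/SKILL.md")),
    }


def demo() -> dict[str, list[str]]:
    """Build a throwaway plugin tree and run discovery on it."""
    with tempfile.TemporaryDirectory() as tmp:
        plugin = Path(tmp) / "plugins" / "python-development"
        for rel in ("agents/python-pro.md", "agents/django-pro.md",
                    "skills/async-python-patterns/SKILL.md"):
            target = plugin / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text("placeholder\n", encoding="utf-8")
        return discover_plugin(plugin)
```

Because nothing is registered by hand, adding a file to the right folder is enough to make it part of the plugin, which is what keeps per-plugin installs small.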

Install Methods

# Claude Code (native)
/plugin marketplace add wshobson/agents
/plugin install python-development

# Multi-harness generation (from clone)
gh repo clone wshobson/agents ~/agents
cd ~/agents
make generate HARNESS=codex        # or cursor / opencode / gemini
make generate-all                  # all four

Required Runtime

Python (for adapter scripts: make generate, uv run plugin-eval)
No Node runtime required for Claude Code native install

Target AI Tools

Claude Code (primary), OpenAI Codex CLI, Cursor, OpenCode, Gemini CLI

Generated Artifacts Per Harness

Harness	Output paths
Codex CLI	`.codex/skills/`, `.codex/agents/*.toml`
Cursor	`.cursor-plugin/`, `.cursor/rules/*.mdc`
OpenCode	`.opencode/agents/`, `.opencode/commands/`, `.opencode/skills/`
Gemini CLI	`skills/`, `agents/`, `commands/*.toml`
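
One way to picture the adapter layer behind this table is an abstract base class with one subclass per harness. The sketch below is an assumption about the general shape only: the source confirms that a `HarnessAdapter` ABC exists in `tools/adapters/base.py`, but the method names (`output_path`, `render`, `emit`), the `SourceDoc` type, and the `<name>.md` file naming are hypothetical.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceDoc:
    """One canonical Markdown file: parsed frontmatter plus body (hypothetical type)."""
    kind: str   # "agent", "skill", or "command"
    name: str
    meta: dict
    body: str


class HarnessAdapter(ABC):
    """Hypothetical interface; the real ABC may look different."""

    @abstractmethod
    def output_path(self, doc: SourceDoc) -> str:
        """Where the generated artifact for this document should be written."""

    @abstractmethod
    def render(self, doc: SourceDoc) -> str:
        """Harness-native text for this document."""

    def emit(self, docs: list[SourceDoc]) -> dict[str, str]:
        """Map every source document to (path, content) without touching disk."""
        return {self.output_path(d): self.render(d) for d in docs}


class ToyOpenCodeAdapter(HarnessAdapter):
    """Toy adapter: top-level directories follow the table above, the rest is assumed."""
    _DIRS = {"agent": "agents", "skill": "skills", "command": "commands"}

    def output_path(self, doc: SourceDoc) -> str:
        return f".opencode/{self._DIRS[doc.kind]}/{doc.name}.md"

    def render(self, doc: SourceDoc) -> str:
        header = "\n".join(f"{key}: {value}" for key, value in doc.meta.items())
        return f"---\n{header}\n---\n\n{doc.body}"
```

Keeping `emit` free of file I/O makes round-trip tests cheap, since the output is an in-memory mapping that can be compared directly.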

Quality Gates

make validate — structural checks, blocks CI on errors
make garden — drift detection, dead links, oversize skills, orphans
make test — 386 pytest tests (adapters + validators + round-trip + CLI smoke)
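
Dead-link detection of the kind `make garden` is described as performing could look roughly like the following. This is a sketch under stated assumptions, not the logic of `doc_gardener.py`: it works on an in-memory mapping of POSIX-style paths to file contents, only considers inline Markdown links, and the name `dead_links` is hypothetical.

```python
import posixpath
import re

# Inline Markdown link: [text](target) with an optional #fragment that is ignored.
LINK = re.compile(r"\[[^\]]*\]\(([^)\s#]+)(?:#[^)]*)?\)")


def dead_links(files: dict[str, str]) -> list[tuple[str, str]]:
    """Return (source_path, target) for relative links that resolve to no known file."""
    dead = []
    for path, text in files.items():
        for target in LINK.findall(text):
            if "://" in target or target.startswith("mailto:"):
                continue  # external links are out of scope for this sketch
            resolved = posixpath.normpath(posixpath.join(posixpath.dirname(path), target))
            if resolved not in files:
                dead.append((path, target))
    return dead


def demo() -> list[tuple[str, str]]:
    files = {
        "skills/a/SKILL.md": (
            "See [details](references/details.md), [gone](references/missing.md), "
            "[site](https://example.com) and [top](#top)."
        ),
        "skills/a/references/details.md": "Detail text.",
    }
    return dead_links(files)
```

Progressive disclosure pushes detail into `references/`, so a broken relative link means a skill silently loses the content it was supposed to load on demand.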

Components

wshobson/agents — Components

Totals

Type	Count
Plugins	83 (81 local + 2 external)
Agents	191
Skills	155
Commands	102
Orchestrators	16 (multi-agent coordination workflows)

Plugin Taxonomy (83 plugins, representative sample)

Language / Framework Specialists (~20 plugins)

python-development — 3 agents (python-pro, django-pro, fastapi-pro), 16 skills
javascript-typescript — JS/TS focused agents and skills
jvm-languages — Java, Kotlin, Scala agents
julia-development — Julia language expert
systems-programming — Rust, C/C++ specialists
dotnet-contribution — .NET ecosystem agents
functional-programming — FP-focused agents

Infrastructure / DevOps (~10 plugins)

kubernetes-operations — K8s deployment and management
cloud-infrastructure — AWS/GCP/Azure specialists
cicd-automation — CI/CD pipeline agents
deployment-strategies — Deployment pattern experts
deployment-validation — Post-deploy validation agents

Security (~5 plugins)

backend-api-security — API security scanning
security-compliance — Compliance auditing
security-scanning — Vulnerability detection
frontend-mobile-security — Client-side security
signed-audit-trails — Audit trail management

Quality / Testing (~5 plugins)

tdd-workflows — TDD specialist agents
unit-testing — Test writing agents
performance-testing-review — Load testing experts
data-validation-suite — Data quality agents

Orchestration / Meta (16 orchestrator plugins)

full-stack-orchestration — Coordinates backend, frontend, testing, deploy agents
agent-teams — Pre-configured multi-agent team setups
conductor — Context-Driven Development setup validator (conductor-validator agent)
agent-orchestration — General orchestration patterns
comprehensive-review — Multi-perspective code analysis coordinator
incident-response — Production incident orchestration
protect-mcp — MCP security governance agent

Business / Non-Engineering (~15 plugins)

content-marketing — Content creation agents
seo-analysis-monitoring — SEO monitoring agents
seo-content-creation — SEO content writing
seo-technical-optimization — Technical SEO auditing
business-analytics — Business data analysis
startup-business-analyst — Startup-focused analysis
quantitative-trading — Quant trading specialists
hr-legal-compliance — HR and legal agents

Developer Tools (~10 plugins)

code-refactoring — Refactoring specialists
debugging-toolkit — Debug-focused agents
code-documentation — Doc generation agents
dependency-management — Dependency audit agents
git-pr-workflows — Git and PR automation agents

Sample Agents (verbatim frontmatter)

python-pro (python-development plugin)

name: python-pro
description: Master Python 3.12+ with modern features, async programming, performance optimization... Use PROACTIVELY for Python development, optimization, or advanced Python patterns.
model: opus

conductor-validator (conductor plugin)

name: conductor-validator
description: Validates Conductor project artifacts for completeness, consistency, and correctness. Use after setup, when diagnosing issues, or before implementation to verify project context.
tools: Read, Glob, Grep, Bash
model: opus
color: cyan

Sample Skills (verbatim frontmatter)

async-python-patterns (python-development plugin)

name: async-python-patterns
description: Master Python asyncio, concurrent programming, and async/await patterns for high-performance applications. Use when building async APIs, concurrent systems, or I/O-bound applications requiring non-blocking operations.

Commands (102 total)

Slash commands are distributed across plugins. For example, python-development ships a scaffolding command for project setup. Categories include API scaffolding, security scanning, test generation, infrastructure setup, debugging, refactoring, and documentation generation.

Quality Evaluation Framework (plugin-eval)

plugin-eval score <path> --depth quick — Static + LLM judge
plugin-eval certify <path> — Full Monte Carlo certification
Three layers: static structural (<2s), LLM judge (~30s, Haiku+Sonnet), Monte Carlo (50-100 runs, 2-5 min)
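
The Monte Carlo layer can be read as repeated sampling of a pass/fail trial followed by a reliability estimate. The sketch below shows one plausible form, using a Wilson score interval for the pass rate. How plugin-eval actually aggregates its runs is not stated in the source, so the statistic, the function names, and the trial signature are all assumptions.

```python
import math
import random
from typing import Callable


def wilson_interval(passes: int, runs: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial pass rate (z=1.96 is roughly 95%)."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    p = passes / runs
    denom = 1 + z * z / runs
    centre = (p + z * z / (2 * runs)) / denom
    half = z * math.sqrt(p * (1 - p) / runs + z * z / (4 * runs * runs)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def monte_carlo(trial: Callable[[random.Random], bool], runs: int = 100, seed: int = 0) -> dict:
    """Run `trial` repeatedly with a seeded RNG and summarize how often it passes."""
    rng = random.Random(seed)
    passes = sum(1 for _ in range(runs) if trial(rng))
    low, high = wilson_interval(passes, runs)
    return {"runs": runs, "passes": passes, "rate": passes / runs, "ci95": (low, high)}
```

In a real evaluation the trial would invoke the skill and grade its output; here it is any callable returning a boolean. With only 50 to 100 runs the interval stays fairly wide, which is worth keeping in mind when comparing two scores.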

Prompts

wshobson/agents — Prompt Excerpts

Excerpt 1: python-pro Agent (plugins/python-development/agents/python-pro.md)

Technique: Role-definition with explicit capability matrix + progressive disclosure pattern (frontmatter + structured sections)

---
name: python-pro
description: Master Python 3.12+ with modern features, async programming, performance optimization, and production-ready practices. Expert in the latest Python ecosystem including uv, ruff, pydantic, and FastAPI. Use PROACTIVELY for Python development, optimization, or advanced Python patterns.
model: opus
---

You are a Python expert specializing in modern Python 3.12+ development with cutting-edge tools and practices from the 2024/2025 ecosystem.

## Purpose
Expert Python developer mastering Python 3.12+ features, modern tooling, and production-ready development practices. Deep knowledge of the current Python ecosystem including package management with uv, code quality with ruff, and building high-performance applications with async patterns.

## Capabilities
### Modern Python Features
- Python 3.12+ features including improved error messages, performance optimizations, and type system enhancements
- Advanced async/await patterns with asyncio, aiohttp, and trio
...
### Modern Tooling & Development Environment
- Package management with uv (2024's fastest Python package manager)
- Code formatting and linting with ruff (replacing black, isort, flake8)

Analysis: Standard persona-md pattern with capability taxonomy. Trigger phrase "Use PROACTIVELY" is deliberate — tells Claude Code to auto-spawn rather than wait for explicit invocation. Model pinned to Opus reflecting Tier 1 strategy.
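
Any adapter has to separate this frontmatter from the body before it can rewrite either. A minimal sketch follows, assuming flat `key: value` frontmatter like the excerpt above. It is not a YAML parser (no nesting, lists, or quoting), and `split_frontmatter` is a hypothetical name, not the repository's parser.

```python
def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a Markdown document into (frontmatter dict, body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    # Find the closing fence; without one, treat the whole text as body.
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        return {}, text
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        # Split on the first colon only, so values may contain colons themselves.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return meta, body


def demo() -> tuple[dict[str, str], str]:
    sample = (
        "---\n"
        "name: python-pro\n"
        "description: Master Python 3.12+. Use PROACTIVELY for Python development.\n"
        "model: opus\n"
        "---\n"
        "\n"
        "You are a Python expert.\n"
    )
    return split_frontmatter(sample)
```

With the metadata in a plain dictionary, a per-harness transform such as remapping `model` becomes an ordinary dictionary update.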

Excerpt 2: async-python-patterns Skill (plugins/python-development/skills/async-python-patterns/SKILL.md)

Technique: When-to-use decision table + sync/async decision guide (tabular knowledge encoding)

---
name: async-python-patterns
description: Master Python asyncio, concurrent programming, and async/await patterns for high-performance applications. Use when building async APIs, concurrent systems, or I/O-bound applications requiring non-blocking operations.
---

# Async Python Patterns

## When to Use This Skill
- Building async web APIs (FastAPI, aiohttp, Sanic)
- Implementing concurrent I/O operations (database, file, network)
...

## Sync vs Async Decision Guide

| Use Case | Recommended Approach |
|----------|---------------------|
| Many concurrent network/DB calls | `asyncio` |
| CPU-bound computation | `multiprocessing` or thread pool |
| Mixed I/O + CPU | Offload CPU work with `asyncio.to_thread()` |
| Simple scripts, few connections | Sync (simpler, easier to debug) |
| Web APIs with high concurrency | Async frameworks (FastAPI, aiohttp) |

**Key Rule:** Stay fully sync or fully async within a call path. Mixing creates hidden blocking and complexity.

Analysis: Progressive-disclosure skill — frontmatter description is the trigger, body provides reference knowledge loaded on demand. Decision table pattern encodes architectural judgment as structured rules rather than prose.
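
The "Mixed I/O + CPU" row of the decision table can be illustrated with standard-library calls only. In this sketch `fetch` and `cpu_heavy` are stand-ins invented for the example; `asyncio.to_thread` (Python 3.9+) is the real API named in the table.

```python
import asyncio
import hashlib


def cpu_heavy(data: bytes, rounds: int = 2000) -> str:
    """Blocking, CPU-bound work: repeated hashing."""
    digest = data
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()


async def fetch(name: str) -> bytes:
    """Stand-in for a network call; the sleep yields to the event loop."""
    await asyncio.sleep(0.01)
    return name.encode()


async def handle(name: str) -> str:
    payload = await fetch(name)
    # Run the blocking function in a worker thread so the loop keeps serving other tasks.
    return await asyncio.to_thread(cpu_heavy, payload)


async def main(names: list[str]) -> list[str]:
    return await asyncio.gather(*(handle(n) for n in names))


def demo() -> list[str]:
    return asyncio.run(main(["a", "b", "c"]))
```

Offloading to a thread keeps the event loop responsive, but it does not by itself give parallel speedup for pure-Python CPU work, which is why the table lists `multiprocessing` for fully CPU-bound cases.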

Excerpt 3: conductor-validator Agent

Technique: Validation checklist with structured categorization (A/B/C categories)

---
name: conductor-validator
description: Validates Conductor project artifacts for completeness, consistency, and correctness.
tools: Read, Glob, Grep, Bash
model: opus
color: cyan
---

## Validation Categories

### A. Setup Validation
Verify the foundational Conductor structure exists and is properly configured.
**Required Files:**
- `conductor/index.md` - Navigation hub
- `conductor/product.md` - Product vision and goals
...

### B. Content Validation
Verify required sections exist within each artifact.
**product.md Required Sections:**
- Overview or Introduction
- Problem Statement
- Target Users
- Value Proposition

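Analysis: Validator agent pattern with explicit tool grants (Read, Glob, Grep, Bash). Read, Glob, and Grep are read-only; Bash can run arbitrary commands, so the inspection-only scope rests on the prompt rather than on the grant itself. The checklist structure makes validation deterministic rather than heuristic.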
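
To show why a checklist makes validation deterministic, the A and B categories can be expressed as data plus a small checker. This is an illustrative sketch, not the agent's behavior: it covers only the two required files and four product.md sections visible in the excerpt (the rest of the list is elided there), and the names `REQUIRED_FILES`, `PRODUCT_SECTIONS`, and `validate` are hypothetical.

```python
import tempfile
from pathlib import Path

# Only the entries visible in the excerpt; the full checklist is longer.
REQUIRED_FILES = ["conductor/index.md", "conductor/product.md"]
# Each tuple lists acceptable alternative headings for one required section.
PRODUCT_SECTIONS = [
    ("Overview", "Introduction"),
    ("Problem Statement",),
    ("Target Users",),
    ("Value Proposition",),
]


def headings(markdown: str) -> set[str]:
    """Lower-cased heading titles; a rough scan that ignores code fences."""
    return {line.lstrip("#").strip().lower()
            for line in markdown.splitlines() if line.startswith("#")}


def validate(root: Path) -> list[str]:
    findings = []
    for rel in REQUIRED_FILES:
        if not (root / rel).is_file():
            findings.append(f"A. missing file: {rel}")
    product = root / "conductor/product.md"
    if product.is_file():
        found = headings(product.read_text(encoding="utf-8"))
        for alternatives in PRODUCT_SECTIONS:
            if not any(alt.lower() in found for alt in alternatives):
                findings.append(f"B. product.md missing section: {' or '.join(alternatives)}")
    return findings


def demo() -> list[str]:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "conductor").mkdir()
        (root / "conductor/product.md").write_text(
            "# Overview\n\n## Problem Statement\n\n## Target Users\n", encoding="utf-8")
        return validate(root)
```

The same project state always yields the same findings list, which is the property the checklist format buys over free-form review.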

Uniqueness

wshobson/agents — Uniqueness & Positioning

Differs From Seeds

Closest seed analogs are superpowers (14 skills, single-harness, skills-only behavioral framework) and BMAD-METHOD (34 skills + 6 persona files, monolithic install). wshobson/agents differs from both by: (1) catalog scale — 83 plugins / 191 agents / 155 skills vs superpowers' 14 skills; (2) multi-harness adapter architecture — a Python transpilation layer converts canonical Markdown to five distinct harness formats, whereas superpowers and BMAD ship one format; (3) plugin-boundary isolation — users install individual domain plugins rather than the whole catalog; (4) explicit model tiering baked into agent frontmatter (Opus/Sonnet/Haiku/inherit), whereas superpowers has no model routing. Compared to claude-flow (MCP-anchored, 305 tools in 1 server), wshobson/agents lives entirely in the file-based agent/skill primitive layer with no MCP bundling.

Distinctive Position

The only repo in this batch that treats multi-harness portability as a first-class engineering problem. The adapter framework (tools/adapters/) with its capability matrix, model-alias mapping, and body-size enforcement is more engineering infrastructure than most "skill packs." The plugin-eval quality certification system (static + LLM judge + Monte Carlo) has no analog in other repos in this batch.

Observable Failure Modes

Adapter drift: If a harness changes its format (Codex TOML, OpenCode schema), the adapters keep emitting the old shape without error, and the mismatch goes unnoticed until a make validate run catches it.
8 KB cap violations: Skills exceeding the Codex body cap will be flagged by make garden but may ship broken for Codex users if garden isn't run.
Model alias rot: MODEL_ALIASES in capabilities.py maps logical tier names to model IDs; as models deprecate, this mapping needs manual updates.
Scope creep per plugin: With 191 agents across 83 plugins, quality variance across domain areas (core engineering vs niche domains like quantitative-trading) is likely high.

Explicit Antipatterns (from architecture docs)

"Never hand-edit generated files"
"Source files never carry harness conditional logic"
Context files must not exceed ~150 lines (progressive disclosure violated = context bloat)

Workflow

wshobson/agents — Workflow

User Workflow

Install Phase

Add marketplace: /plugin marketplace add wshobson/agents
Install a plugin: /plugin install <plugin-name>
Plugin auto-discovers agents/skills/commands from directory structure

Development Phase (per plugin)

Agents activate automatically based on description trigger phrases ("Use PROACTIVELY when…")
Skills load on contextual match
Commands invoked explicitly as slash commands

Multi-Harness Generation Workflow

Author content in plugins/<name>/ (canonical Markdown)
Run make generate HARNESS=<x> → adapter emits harness-native artifacts
Run make validate → structural checks
Run make garden → drift detection
Run make test → 386-test suite + CLI smoke tests

Quality Certification Workflow

uv run plugin-eval score path/to/skill --depth quick    # fast
uv run plugin-eval certify path/to/skill                # full Monte Carlo

Phase-to-Artifact Map

Phase	Artifact
Plugin authoring	`plugins/<name>/{agents,skills,commands}/*.md`
Adapter generation	`.codex/`, `.cursor-plugin/`, `.opencode/`, `skills/` (Gemini)
Validation	CI pass/fail report
Quality eval	`plugin-eval` score + certification badge
Install	`~/.claude/plugins/<name>/` (Claude Code)

Approval Gates

None for automated installation. Plugin-eval certification is optional and advisory for users; it acts as a gate only for contributors.

Spec Format

None (no spec-driven development workflow). This is a skill/agent catalog, not a spec-to-implementation framework.

Memory Context

wshobson/agents — Memory & Context

State Storage

No persistent state managed by the marketplace itself. Individual plugins may write files as part of their workflows (e.g., conductor plugin writes conductor/ directory artifacts), but the marketplace has no central memory store.

Context Loading Strategy

Progressive disclosure is the architectural principle:

AGENTS.md / CLAUDE.md cap at ~150 lines (index, not full content)
Skill bodies cap at ~8 KB (Codex hard limit, enforced globally)
Detail offloads to references/ subdirectories loaded on demand
doc_gardener.py detects oversize skills that violate the cap
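
The two caps in this list are simple enough to check mechanically. The sketch below is a rough stand-in for that kind of check, not the code in `doc_gardener.py`. It assumes the skill cap is exactly 8192 bytes, measured on the UTF-8 body after the frontmatter, and that the context-file cap is exactly 150 lines; the source only gives both figures approximately.

```python
SKILL_BODY_CAP_BYTES = 8 * 1024   # assumption: "~8 KB" taken as 8192 bytes
CONTEXT_FILE_CAP_LINES = 150      # assumption: "~150 lines" taken literally
CONTEXT_FILES = ("AGENTS.md", "CLAUDE.md")


def body_of(text: str) -> str:
    """Return the text after a leading ----fenced frontmatter block, if one exists."""
    if text.startswith("---\n"):
        end = text.find("\n---\n", 3)
        if end != -1:
            return text[end + len("\n---\n"):]
    return text


def check_caps(files: dict[str, str]) -> list[str]:
    """Report skills and context files that exceed their size caps."""
    problems = []
    for path, text in files.items():
        if path.endswith("SKILL.md"):
            size = len(body_of(text).encode("utf-8"))
            if size > SKILL_BODY_CAP_BYTES:
                problems.append(f"{path}: body is {size} bytes (cap {SKILL_BODY_CAP_BYTES})")
        elif path in CONTEXT_FILES:
            count = len(text.splitlines())
            if count > CONTEXT_FILE_CAP_LINES:
                problems.append(f"{path}: {count} lines (cap {CONTEXT_FILE_CAP_LINES})")
    return problems


def demo() -> list[str]:
    return check_caps({
        "skills/big/SKILL.md": "---\nname: big\n---\n" + "x" * 9000,
        "skills/ok/SKILL.md": "---\nname: ok\n---\nshort body",
        "AGENTS.md": "line\n" * 151,
    })
```

Measuring bytes rather than characters matters for skills with non-ASCII content, where the two counts diverge.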

Conductor Plugin Memory Pattern

The conductor plugin writes project-scoped markdown files:

conductor/index.md — navigation hub
conductor/product.md — product vision
conductor/tech-stack.md — technology decisions
conductor/workflow.md — development practices
conductor/tracks.md — master track registry

These persist across sessions as file-based project context.

Pensyve Integration

The marketplace includes Pensyve as an external plugin (git-subdir entry) for Claude Code. Pensyve provides persistent external memory. Separate install required per harness.

Cross-Session Handoff

No built-in mechanism. The conductor plugin's markdown files serve as session-to-session context for projects using that plugin.

Context Compaction

AGENTS.md / CLAUDE.md are designed to stay under ~150 lines, which is the primary compaction mechanism — detail is not pre-loaded into context.

Orchestration

wshobson/agents — Orchestration

Multi-Agent Support

Yes. 16 orchestrator plugins coordinate multiple domain agents.

Orchestration Pattern

Hierarchical, with parallel fan-out: orchestrator agents (e.g., full-stack-orchestration) spawn specialized domain agents via the Task tool. The feature-development workflow in wshobson/commands (the predecessor repo) shows the explicit sequential pattern: backend-architect → frontend-developer → test-automator → deployment-engineer, with a tdd-orchestrator mode as the alternative.

Isolation Mechanism

Process-level (Claude Code's native subagent isolation): each spawned agent gets its own context window. No git-worktree or container isolation.

Multi-Model Routing

Yes — explicit four-tier model strategy:

Tier 1 (Opus 4.7): Architecture, security, code review, production-critical agents
Tier 2 (inherit): Mid-tier agents — user's chosen model
Tier 3 (Sonnet): Docs, testing, debugging agents
Tier 4 (Haiku): Fast ops, SEO, deployment, content agents

Each agent's frontmatter carries model: opus|sonnet|haiku|inherit. Adapters map these aliases to harness-native model IDs via tools/adapters/capabilities.py:MODEL_ALIASES.
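
The alias lookup described here can be pictured as a per-harness table plus a resolver. In this sketch the table contents are placeholders: `example-harness` and the `vendor/...` model IDs are invented for illustration and are not the values in `MODEL_ALIASES`. The handling of `inherit` (fall back to the user's model, or omit the field) is also an assumption.

```python
from typing import Optional

# Placeholder table: harness name and model IDs are made up for this example.
PLACEHOLDER_ALIASES: dict[str, dict[str, str]] = {
    "example-harness": {
        "opus": "vendor/large-model",
        "sonnet": "vendor/medium-model",
        "haiku": "vendor/small-model",
    },
}


def resolve_model(alias: str, harness: str, user_default: Optional[str] = None) -> Optional[str]:
    """Map a logical tier alias to a harness-native model ID.

    "inherit" defers to the user's chosen model; None means "omit the field".
    """
    if alias == "inherit":
        return user_default
    if harness not in PLACEHOLDER_ALIASES:
        raise KeyError(f"unknown harness: {harness}")
    table = PLACEHOLDER_ALIASES[harness]
    if alias not in table:
        # Failing loudly is one way to surface a stale or mistyped alias early.
        raise ValueError(f"no mapping for alias {alias!r} on {harness}")
    return table[alias]
```

Keeping the mapping in one table means a model deprecation is a one-line change, though someone still has to make that change by hand.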

Execution Mode

Interactive-loop — user installs plugin, Claude Code activates agents/skills contextually. No daemon, no scheduling.

Consensus

None.

Prompt Chaining

Implicit in orchestrator patterns: orchestrator agent passes results from one specialist as context to the next. Not a formal chaining protocol.
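
The informal chaining described here amounts to threading earlier outputs into each later call. The sketch below models specialists as plain callables. The stage names come from the feature-development sequence cited above, while `run_chain`, `stub`, and the `(task, prior)` signature are hypothetical.

```python
from typing import Callable

# A specialist takes the task plus prior outputs and returns its own output.
Specialist = Callable[[str, list[str]], str]


def run_chain(task: str, stages: list[tuple[str, Specialist]]) -> list[tuple[str, str]]:
    """Run specialists in order, handing each one every earlier result as context."""
    transcript: list[tuple[str, str]] = []
    for name, specialist in stages:
        prior = [f"{done}: {output}" for done, output in transcript]
        transcript.append((name, specialist(task, prior)))
    return transcript


def stub(role: str) -> Specialist:
    """Placeholder specialist that only reports how much context it received."""
    def run(task: str, prior: list[str]) -> str:
        return f"{role} handled {task!r} using {len(prior)} prior result(s)"
    return run


def demo() -> list[tuple[str, str]]:
    order = ["backend-architect", "frontend-developer", "test-automator", "deployment-engineer"]
    return run_chain("add login", [(name, stub(name)) for name in order])
```

In the real system the orchestrator is itself a model deciding what to forward, so context passing is a judgment call rather than a fixed rule as it is here.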

Cross-Tool Portability

High — the primary engineering goal. Five harnesses supported via adapter framework. Capability matrix in tools/adapters/capabilities.py tracks per-harness feature support.

UI / CLI Surface

wshobson/agents — UI / CLI Surface

Dedicated CLI Binary

No standalone binary. Tooling is Python scripts invoked via Makefile:

make generate HARNESS=<x> — adapter generation
make validate — structural validation
make garden — drift detection
make test — test suite
uv run plugin-eval score <path> — quality evaluation

Local Web Dashboard

None.

IDE Integration

Claude Code: Native plugin marketplace integration (/plugin marketplace add)
Cursor: Generated .cursor-plugin/ + .cursor/rules/*.mdc
Gemini CLI: Auto-discovers AGENTS.md via .gemini/settings.json
OpenCode: Generated .opencode/ directory artifacts
Codex CLI: Generated .codex/ directory artifacts

Observability / Audit

make garden provides drift detection: dead links, stale artifacts, oversize skills (> 8 KB), marketplace orphans
CI workflow (.github/workflows/validate.yml) runs validate + garden + test on every PR, plus CLI smoke tests against OpenCode and Gemini
plugin-eval certify provides Monte Carlo reliability scores

Plugin Marketplace Browser

/plugin marketplace add wshobson/agents, then the standard Claude Code plugin commands. No web companion URL could be confirmed.

Distribution

Type: claude-plugin
License: MIT
Install: one-liner
Version: unknown (no semver; last push 2026-05-26)

Surfaces

CLI binary: No
CLI subcmds: 0
Local UI: No

Components

Commands: 102
Skills: 155
Subagents: 191
Hooks: 0
MCP servers: 0
MCP tools: 0
Scripts: 5
Templates: 0

Workflow

Phases: 3
Approval gates: 0
Spec format: none
Spec storage: none
Delta or full: none

Orchestration

Multi-agent: Yes
Pattern: hierarchical
Isolation: process
Consensus: none
Prompt chaining: Yes

Multi-model

Multi-model: Yes
BYOK: Yes
Modal: text+vision

Execution

Mode: interactive-loop
Crash recovery: No
Compaction: Yes
Session handoff: No
Streaming: Yes

Memory

Type: file-based
Persistence: project
Search: none
State files: 3 files

Quality

TDD: Optional
TDD mechanism: dedicated-skill
Validators: 3
Self-review: adversarial-subagent

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: No
Audit format: none
Replay: No

Tools

Primary: claude-code
Targets: 5
Portability: high

Signals

Stars: 36k
Last commit: 2026-05-26
Maintainer: active
Quality score: 4.6/10