
Agent Skills (Addy Osmani)

agent-skills-addyosmani · addyosmani/agent-skills · ★ 46k · last commit 2026-05-24

Primitive shape: 34 total (Commands 7 · Skills 23 · Subagents 3 · Hooks 1)
00

Summary

Agent Skills (Addy Osmani) — Summary

Agent Skills by Addy Osmani is a production-grade, cross-platform skill pack for AI coding agents that encodes the full software development lifecycle — from idea refinement and spec writing through TDD, code review, and shipping — as a set of 23 skills and 7 slash commands installable as a Claude Code plugin, Cursor rules, or Gemini CLI skills.

Problem it solves: AI coding agents lack consistent workflows: they skip specs, write tests after code, miss security reviews, and produce brittle work. Agent Skills imposes a discovery tree that automatically routes each task to the right workflow skill through a meta-skill (using-agent-skills) injected at session start.

Distinctive trait: Every skill is organized around a named development phase with explicit "When to Use / When NOT to use" gates and carries a self-contained decision tree; the meta-skill using-agent-skills acts as a dispatcher that maps any incoming task to the correct skill using a single flowchart.

Target audience: Individual developers using Claude Code, Cursor, Gemini CLI, Windsurf, or OpenCode who want senior-engineer workflows without having to define them themselves.

Scope: 23 skills, 3 named subagents (code-reviewer, security-auditor, test-engineer), 7 commands mirroring the lifecycle, and a SessionStart hook; cross-platform with installation paths for 6+ coding agents.

Differs from seeds: Most similar to superpowers (same Archetype 1 — skills-only behavioral framework with SessionStart hook injection), but adds explicit slash commands alongside skills (breaking pure Archetype 1), ships named subagent persona files (code-reviewer, security-auditor, test-engineer) that superpowers lacks, and targets a broader cross-tool surface (Cursor, Windsurf, Gemini CLI) rather than the plugin-only model.

01

Overview

Agent Skills (Addy Osmani) — Overview

Origin

Created and maintained by Addy Osmani (Google Chrome DevRel, formerly of the V8 team), agent-skills launched as a community skill pack that brings battle-tested senior-engineer workflows into AI coding agent sessions. The first commit dates to around early 2025; the repo has since grown to 45k+ stars with 28 contributors.

Philosophy

The README states the core premise directly:

"Skills encode the workflows, quality gates, and best practices that senior engineers use when building software. These ones are packaged so AI agents follow them consistently across every phase of development."

The meta-skill using-agent-skills embeds an explicit philosophy:

"Agent Skills is a collection of engineering workflow skills organized by development phase. Each skill encodes a specific process that senior engineers follow. This meta-skill helps you discover and apply the right skill for your current task."

Core operating behaviors are declared as non-negotiable across all skills:

  1. Surface Assumptions — list assumptions before any implementation
  2. Manage Confusion Actively — STOP, name the confusion, present the tradeoff
  3. Push Back When Warranted — "You are not a yes-machine. Sycophancy is a failure mode."

Manifesto Excerpts

From using-agent-skills:

"Push Back When Warranted: You are not a yes-machine. When an approach has clear problems: Point out the issue directly. Explain the concrete downside (quantify when possible — 'this adds ~200ms latency' not 'this might be slower'). Propose an alternative. Accept the human's decision if they override with full information. Sycophancy is a failure mode. 'Of course!' followed by implementing a bad idea helps no one. Honest technical disagreement is more valuable than false agreement."

From spec-driven-development:

"Write a structured specification before writing any code. The spec is the shared source of truth between you and the human engineer — it defines what we're building, why, and how we'll know it's done. Code without a spec is guessing."

From test-driven-development:

"Tests are proof — 'seems right' is not done. A codebase with good tests is an AI agent's superpower; a codebase without tests is a liability."

Relationship to Agent Skills Standard

The repo references agentskills.io as the standard definition. Both anthropics/skills and openai/skills reference the same agentskills.io standard, indicating this is a community-wide format that Addy's implementation complies with.

02

Architecture

Agent Skills (Addy Osmani) — Architecture

Distribution

  • Type: Claude Code plugin (.claude-plugin/plugin.json) + cross-platform skill pack
  • Claude Code install:
    /plugin marketplace add addyosmani/agent-skills
    /plugin install agent-skills@addy-agent-skills
    
    Or local: claude --plugin-dir /path/to/agent-skills
  • Cursor: Copy any SKILL.md into .cursor/rules/
  • Gemini CLI:
    gemini skills install https://github.com/addyosmani/agent-skills.git --path skills
    
  • Windsurf: Add skill contents to Windsurf rules config
  • OpenCode: Via AGENTS.md + skill tool (skill execution pattern)
  • GitHub Copilot: See docs/copilot-setup.md

Required Runtime

None: the pack is pure markdown/shell. The SessionStart hook uses jq (brew/apt installable) when it is available and degrades gracefully if jq is absent.

Repository Structure

agent-skills/
├── .claude-plugin/
│   ├── marketplace.json        # Marketplace listing metadata (empty/legacy)
│   └── plugin.json             # Plugin manifest: commands, skills, agents paths
├── .claude/
│   └── commands/               # 7 slash commands (.md)
│       ├── spec.md
│       ├── plan.md
│       ├── build.md
│       ├── test.md
│       ├── review.md
│       ├── code-simplify.md
│       └── ship.md
├── .gemini/                    # Gemini CLI config
├── .opencode/                  # OpenCode config
├── agents/                     # 3 named subagent persona files
│   ├── code-reviewer.md
│   ├── security-auditor.md
│   └── test-engineer.md
├── skills/                     # 23 skill directories, each with SKILL.md
│   ├── using-agent-skills/     # Meta-dispatcher skill
│   ├── spec-driven-development/
│   ├── test-driven-development/
│   ├── planning-and-task-breakdown/
│   ├── incremental-implementation/
│   ├── code-review-and-quality/
│   ├── debugging-and-error-recovery/
│   ├── security-and-hardening/
│   ├── performance-optimization/
│   ├── frontend-ui-engineering/
│   ├── api-and-interface-design/
│   ├── git-workflow-and-versioning/
│   ├── ci-cd-and-automation/
│   ├── shipping-and-launch/
│   ├── documentation-and-adrs/
│   ├── browser-testing-with-devtools/
│   ├── context-engineering/
│   ├── source-driven-development/
│   ├── doubt-driven-development/
│   ├── deprecation-and-migration/
│   ├── code-simplification/
│   ├── idea-refine/
│   └── interview-me/
├── hooks/
│   ├── hooks.json              # SessionStart hook declaration
│   ├── session-start.sh        # Injects using-agent-skills meta-skill
│   ├── sdd-cache-pre.sh        # SDD-CACHE support (pre)
│   ├── sdd-cache-post.sh       # SDD-CACHE support (post)
│   ├── simplify-ignore.sh      # Code simplification filter
│   └── SDD-CACHE.md            # Spec-driven dev cache artifact
├── docs/                       # Multi-tool setup guides
└── references/                 # Reference materials

plugin.json manifest

{
  "name": "agent-skills",
  "description": "Production-grade engineering skills for AI coding agents...",
  "author": {"name": "Addy Osmani"},
  "license": "MIT",
  "commands": "./.claude/commands",
  "skills": "./skills",
  "agents": [
    "./agents/code-reviewer.md",
    "./agents/security-auditor.md",
    "./agents/test-engineer.md"
  ]
}
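
The manifest points at one commands directory, one skills directory, and a list of agent files, so a quick consistency check is possible. The following is a minimal sketch, assuming the manifest shape shown above; `missing_manifest_paths` is a hypothetical helper and not part of the repository.

```python
import json
from pathlib import Path


def missing_manifest_paths(plugin_root: Path) -> list[str]:
    """Return manifest-referenced paths that do not exist under plugin_root."""
    manifest_file = plugin_root / ".claude-plugin" / "plugin.json"
    manifest = json.loads(manifest_file.read_text(encoding="utf-8"))

    referenced: list[str] = []
    # "commands" and "skills" are single directory paths; "agents" is a list of files.
    for key in ("commands", "skills", "agents"):
        value = manifest.get(key, [])
        referenced.extend([value] if isinstance(value, str) else value)

    # Keep only the entries that cannot be found on disk.
    return [path for path in referenced if not (plugin_root / path).exists()]
```

Calling `missing_manifest_paths(Path("/path/to/agent-skills"))` on a complete checkout would be expected to return an empty list.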

Config Files

  • hooks/hooks.json — SessionStart hook config
  • .claude-plugin/plugin.json — Plugin manifest
  • CLAUDE.md — Claude-specific instructions
  • AGENTS.md — Cross-agent instructions

03

Components

Agent Skills (Addy Osmani) — Components

Slash Commands (7)

| Name | Purpose |
| --- | --- |
| /spec | Invoke spec-driven-development skill; define what to build |
| /plan | Invoke planning-and-task-breakdown skill; break into atomic tasks |
| /build | Invoke incremental-implementation skill; build one slice at a time |
| /test | Invoke test-driven-development skill; write tests as proof |
| /review | Invoke code-review-and-quality skill; review before merge |
| /code-simplify | Invoke code-simplification skill; reduce complexity |
| /ship | Invoke shipping-and-launch skill; deploy to production |

Commands are thin wrappers: each invokes its corresponding skill and adds contextual guidance for the session.

Skills (23)

| Name | Purpose |
| --- | --- |
| using-agent-skills | Meta-dispatcher: maps any task to correct skill via decision flowchart |
| spec-driven-development | 4-phase gated workflow: Specify→Plan→Tasks→Implement |
| planning-and-task-breakdown | Break specs into atomic, independently-testable tasks |
| incremental-implementation | Build in thin vertical slices; test after each |
| test-driven-development | Red→Green→Refactor cycle; tests are proof |
| code-review-and-quality | 5-dimension review: correctness, readability, arch, security, performance |
| debugging-and-error-recovery | Systematic root-cause investigation before fixes |
| security-and-hardening | OWASP top-10 patterns, input validation, auth checks |
| performance-optimization | Measure first, then optimize; N+1 detection |
| frontend-ui-engineering | Component-based UI patterns, accessibility, rendering |
| api-and-interface-design | REST/GraphQL contract-first design, versioning |
| git-workflow-and-versioning | Atomic commits, branch strategies, conventional commits |
| ci-cd-and-automation | Pipeline design, test gates, deploy strategies |
| shipping-and-launch | Pre-launch checklist, rollout, observability |
| documentation-and-adrs | Architecture Decision Records, API docs |
| browser-testing-with-devtools | Chrome DevTools MCP integration for UI testing |
| context-engineering | CLAUDE.md / AGENTS.md authoring best practices |
| source-driven-development | Verify against official docs before implementing |
| doubt-driven-development | Pause and verify when uncertain; conservative approach |
| deprecation-and-migration | Safe code removal with backward compatibility gates |
| code-simplification | Simplicity heuristics; complexity debt reduction |
| idea-refine | Turn rough ideas into structured requirements |
| interview-me | Interview-mode discovery; elicit requirements via questions |

Subagents / Agents (3)

| Name | Purpose |
| --- | --- |
| code-reviewer | Staff engineer persona: 5-dimension code review (correctness, readability, arch, security, perf) |
| security-auditor | Security-specialist persona: OWASP audit, threat model, dependency scan |
| test-engineer | Test-specialist persona: coverage analysis, TDD coaching, test design |

These are persona-md files (markdown with role/instructions) invokable as subagents.

Hooks (1 SessionStart)

  • SessionStart (hooks/session-start.sh) — Injects the full using-agent-skills/SKILL.md content as an IMPORTANT priority message at every session start

Additional Hook Scripts

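  • sdd-cache-pre.sh / sdd-cache-post.sh — Scripts for spec-driven development cache management, apparently intended as pre/post tool-use hooks (hooks.json does not declare them)
  • simplify-ignore.sh — Filter script for the code simplification workflow, apparently intended as a PostToolUse hook (hooks.json does not declare it)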

Scripts

| Path | Purpose |
| --- | --- |
| hooks/session-start.sh | SessionStart injection script |
| hooks/sdd-cache-pre.sh | SDD cache pre-tool hook |
| hooks/sdd-cache-post.sh | SDD cache post-tool hook |
| hooks/simplify-ignore.sh | Simplify-ignore filter |
| hooks/session-start-test.sh | Test for session-start hook |
| hooks/simplify-ignore-test.sh | Test for simplify-ignore hook |

Templates

None (skills are self-contained; no external template files referenced).

05

Prompts

Agent Skills (Addy Osmani) — Prompts

Prompt 1: spec-driven-development (Phase 1 gate)

File: skills/spec-driven-development/SKILL.md
Technique: Staged-gate workflow with explicit assumption surfacing

### Phase 1: Specify

Start with a high-level vision. Ask the human clarifying questions until requirements are concrete.

**Surface assumptions immediately.** Before writing any spec content, list what you're assuming:

ASSUMPTIONS I'M MAKING:

  1. This is a web application (not native mobile)
  2. Authentication uses session-based cookies (not JWT)
  3. The database is PostgreSQL (based on existing Prisma schema)
  4. We're targeting modern browsers only (no IE11)
→ Correct me now or I'll proceed with these.

Don't silently fill in ambiguous requirements. The spec's entire purpose is to surface misunderstandings *before* code gets written — assumptions are the most dangerous form of misunderstanding.

Technique analysis: Explicit assumption enumeration with a confirmation gate ("→ Correct me now or I'll proceed"). This is a specific prompting pattern for reducing silent hallucination of requirements.
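
The gate can be sketched in code to make its two parts explicit: render the assumptions in the quoted format, then hold spec writing while any correction is outstanding. This is an illustrative sketch only; the skill itself is prose, and `format_assumption_gate` and `may_write_spec` are hypothetical names.

```python
def format_assumption_gate(assumptions: list[str]) -> str:
    """Render assumptions in the numbered format quoted from the skill."""
    lines = ["ASSUMPTIONS I'M MAKING:"]
    lines += [f"  {i}. {text}" for i, text in enumerate(assumptions, start=1)]
    lines.append("→ Correct me now or I'll proceed with these.")
    return "\n".join(lines)


def may_write_spec(assumptions_shown: bool, open_corrections: list[str]) -> bool:
    """Spec writing starts only after the list was shown and no correction is pending."""
    return assumptions_shown and not open_corrections
```

For example, `may_write_spec(True, ["Auth uses JWT, not cookies"])` returns `False` until the correction is folded back into the assumption list.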


Prompt 2: using-agent-skills (Core Operating Behavior — Push Back)

File: skills/using-agent-skills/SKILL.md
Technique: Anti-sycophancy enforcement with explicit failure mode labeling

### 3. Push Back When Warranted

You are not a yes-machine. When an approach has clear problems:

- Point out the issue directly
- Explain the concrete downside (quantify when possible — "this adds ~200ms latency" not "this might be slower")
- Propose an alternative
- Accept the human's decision if they override with full information

Sycophancy is a failure mode. "Of course!" followed by implementing a bad idea helps no one. Honest technical disagreement is more valuable than false agreement.

Technique analysis: Names the failure mode directly ("sycophancy"), provides a negative example ("Of course!"), and sets a concrete standard (quantify downsides). Unusual in skill packs to have explicit anti-patterns for agent behavior rather than just workflow steps.


Prompt 3: test-driven-development (The Prove-It Pattern)

File: skills/test-driven-development/SKILL.md
Technique: Red-Green-Refactor with prove-it framing

## The TDD Cycle
    RED                 GREEN               REFACTOR

 Write a test      Write minimal code     Clean up the
 that fails   ──→  to make it pass   ──→  implementation  ──→  (repeat)
     │                   │                     │
     ▼                   ▼                     ▼
 Test FAILS          Test PASSES         Tests still PASS


### Step 1: RED — Write a Failing Test

Write the test first. It must fail. A test that passes immediately proves nothing.

Technique analysis: ASCII cycle diagram with an explicit constraint ("A test that passes immediately proves nothing") that frames TDD as an epistemological proof rather than a bureaucratic process.
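
The "must fail first" constraint can be checked mechanically. The sketch below is an assumption-laden illustration, not code from the skill: a test only counts as proof if it fails against the old implementation and passes against the new one. All names (`passes`, `is_valid_red_green`, `check_add`, `stub_add`, `real_add`) are hypothetical.

```python
from typing import Callable

Impl = Callable[[int, int], int]
CheckFn = Callable[[Impl], None]


def passes(check: CheckFn, impl: Impl) -> bool:
    """Run a check against an implementation; an AssertionError means failure."""
    try:
        check(impl)
        return True
    except AssertionError:
        return False


def is_valid_red_green(check: CheckFn, before: Impl, after: Impl) -> bool:
    """True only if the check fails on the old code (RED) and passes on the new (GREEN)."""
    return (not passes(check, before)) and passes(check, after)


def check_add(add: Impl) -> None:
    assert add(2, 3) == 5


def stub_add(a: int, b: int) -> int:
    return 0  # placeholder written before the feature exists


def real_add(a: int, b: int) -> int:
    return a + b
```

Here `is_valid_red_green(check_add, stub_add, real_add)` is `True`, while `is_valid_red_green(check_add, real_add, real_add)` is `False`: a check that already passed before the change proves nothing about the change.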


Prompt 4: incremental-implementation (Vertical Slicing)

File: skills/incremental-implementation/SKILL.md
Technique: Commit-gated incremental loop

## The Increment Cycle

┌──────────────────────────────────────┐
│                                      │
│   Implement ──→ Test ──→ Verify ──┐  │
│       ▲                           │  │
│       └───── Commit ◄─────────────┘  │
│              │                       │
│              ▼                       │
│          Next slice                  │
│                                      │
└──────────────────────────────────────┘

For each slice:
1. Implement the smallest complete piece of functionality
2. Test — run the test suite (or write a test if none exists)
3. Verify — confirm the slice works as expected
4. Commit -- save your progress with a descriptive message
5. Move to the next slice — carry forward, don't restart

Technique analysis: Loop diagram with commit as a hard gate before advancing. Prevents the common failure mode of "implement a huge batch and hope it all works."
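
The loop above can be sketched as a small driver in which the commit happens only after a slice has passed both test and verify. This is a hedged illustration of the pattern, not the skill's implementation (the skill is instructions to the agent); `Slice`, `run_slices`, and the commit message format are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Slice:
    name: str
    implement: Callable[[], None]  # writes the smallest complete piece
    test: Callable[[], bool]       # runs the test suite, True on success
    verify: Callable[[], bool]     # confirms the slice behaves as expected


def run_slices(slices: list[Slice], commit: Callable[[str], None]) -> list[str]:
    """Process slices in order, committing each one only after it passes."""
    committed: list[str] = []
    for current in slices:
        current.implement()
        if not (current.test() and current.verify()):
            break  # stop here; later slices would build on broken work
        commit(f"Implement slice: {current.name}")
        committed.append(current.name)
    return committed
```

Stopping at the first failing slice is a design choice of this sketch: it keeps every commit in a passing state, which is the property the commit gate is meant to protect.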

09

Uniqueness

Agent Skills (Addy Osmani) — Uniqueness

Differs from Seeds

Most similar to superpowers: both build on Archetype 1, a skills-only behavioral framework with SessionStart hook injection and no dedicated commands in the core model. The key architectural deltas:

  1. Commands alongside skills — Unlike superpowers (0 commands), agent-skills ships 7 slash commands that roughly mirror its 7 lifecycle phases (there is no command for ideation, and /code-simplify has no phase of its own). This is a hybrid of Archetype 1 and Archetype 2.

  2. Named persona subagents — Ships code-reviewer, security-auditor, and test-engineer as persona-md files, enabling subagent delegation patterns. Superpowers uses the generic Task tool without named personas.

  3. Wider cross-tool surface — Explicitly supports Cursor, Gemini CLI, Windsurf, OpenCode, and GitHub Copilot with dedicated setup guides. Superpowers supports multiple tools but the install model is more plugin-centric.

  4. Broader skill set (23 vs 14) — Covers more ground, including skills such as context-engineering, source-driven-development, doubt-driven-development, and interview-me that superpowers lacks.

  5. No worktree enforcement — Does not have a using-git-worktrees skill or worktree-based isolation mandate. Superpowers treats worktrees as a first-class pattern.

Compared to spec-kit (Archetype 2): agent-skills lacks spec-kit's mirrored command/skill pairing and its pattern of 2 hooks per command. Compared to BMAD-METHOD (also Archetype 1): agent-skills uses simpler persona files vs. BMAD's complex TOML+MD persona activation sequences.

Positioning

Community-created skill pack by a high-profile developer advocate that is well-documented and actively maintained, offering the broadest cross-tool compatibility of any skill pack in this batch. Its 45k+ stars suggest significant organic adoption.

Observable Failure Modes

  1. Token overhead — The SessionStart hook injects the full using-agent-skills/SKILL.md, a large file, at every session start. Every session therefore front-loads context with the meta-skill's dispatch flowchart regardless of the task type.

  2. No automated verification — All quality gates are instructions, not enforced hooks. A distracted or non-compliant agent will skip the TDD red phase without any system-level catch.

  3. Persona subagents not strongly integrated — The agent files exist but are not automatically invoked at any workflow stage. They require the user or primary agent to explicitly decide to spawn a code-reviewer or security-auditor.

  4. Cursor/Windsurf install is manual copy — Cross-tool portability for non-Claude-Code tools degrades to "copy paste files," losing the plugin lifecycle (no auto-updates).

Explicit Antipatterns (from skills)

  • Writing production code before a failing test ("tests are proof")
  • Silent assumption-filling ("Don't silently fill in ambiguous requirements")
  • Sycophancy ("Of course!" followed by bad ideas)
  • Implementing huge batches without intermediate test/verify cycles
  • Skipping spec for "simple" projects

04

Workflow

Agent Skills (Addy Osmani) — Workflow

Development Lifecycle Phases

Skills map to a linear-but-optional 7-phase lifecycle. Phases are not enforced sequentially — the meta-skill routes to whichever is appropriate.

| Phase | Skill | Key Artifact |
| --- | --- | --- |
| 1. Ideation | idea-refine / interview-me | Refined requirements |
| 2. Specification | spec-driven-development | SPEC.md in project root |
| 3. Planning | planning-and-task-breakdown | Task list (markdown) |
| 4. Implementation | incremental-implementation | Working code slices |
| 5. Testing | test-driven-development | Test suite (red→green) |
| 6. Review | code-review-and-quality | Categorized review output |
| 7. Ship | shipping-and-launch | Deployed artifact + runbook |
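
The phase table can also be read as a plain lookup structure, which makes the "not enforced sequentially" point concrete: any phase can be looked up directly. The sketch below is illustrative only; the actual routing is done by the meta-skill's flowchart in prose, and `LIFECYCLE` and `skills_for` are hypothetical names.

```python
# Phase -> (skills, key artifact), transcribed from the lifecycle table.
LIFECYCLE: dict[str, tuple[list[str], str]] = {
    "Ideation": (["idea-refine", "interview-me"], "Refined requirements"),
    "Specification": (["spec-driven-development"], "SPEC.md in project root"),
    "Planning": (["planning-and-task-breakdown"], "Task list (markdown)"),
    "Implementation": (["incremental-implementation"], "Working code slices"),
    "Testing": (["test-driven-development"], "Test suite (red→green)"),
    "Review": (["code-review-and-quality"], "Categorized review output"),
    "Ship": (["shipping-and-launch"], "Deployed artifact + runbook"),
}


def skills_for(phase: str) -> list[str]:
    """Look up a phase directly; no earlier phase has to be completed first."""
    skills, _artifact = LIFECYCLE[phase]
    return skills
```

For instance, `skills_for("Review")` yields `["code-review-and-quality"]` without any state about whether a spec or plan exists.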

Spec-Driven Development Phases (detailed)

SPECIFY ──→ PLAN ──→ TASKS ──→ IMPLEMENT
   │          │        │          │
   ▼          ▼        ▼          ▼
 Human      Human    Human      Human
 reviews    reviews  reviews    reviews

Phase 1: Specify

  • Ask clarifying questions until requirements are concrete
  • Surface assumptions explicitly before writing
  • Write spec covering: Objective, Commands, Project Structure, Code Style, Testing Strategy, Boundaries (Always/Ask/Never)
  • Save as SPEC.md

Phase 2: Plan

  • Break spec into atomic tasks
  • Each task must be independently testable

Phase 3: Tasks

  • Convert plan into numbered, ordered task list

Phase 4: Implement

  • One slice at a time
  • Tests first (TDD)
  • Verify before advancing

Approval Gates

| Gate | Type | Where |
| --- | --- | --- |
| Assumption confirmation before spec writing | freetext-clarify | spec-driven-development Phase 1 |
| Human spec review before planning | file-review | spec-driven-development Phase 2 |
| Human plan review before task list | file-review | spec-driven-development Phase 3 |
| Human task review before implementation | file-review | spec-driven-development Phase 4 |
| "Correct me now or I'll proceed" assumption gate | yes-no | any skill start |
| Stop and clarify when confused | freetext-clarify | using-agent-skills core behavior |

Phase-to-Artifact Map

| Phase | Artifact |
| --- | --- |
| Specify | SPEC.md (project root) |
| Plan | Task breakdown (markdown, often in spec or separate file) |
| TDD Red | Failing test file |
| TDD Green | Implementation file |
| Code review | Categorized findings (Critical/Important/Suggestion) |
| Ship | Deployment + runbook |

Spec Format

Markdown. Six mandated sections: Objective, Commands, Project Structure, Code Style, Testing Strategy, Boundaries.
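
Because the spec has a fixed set of sections, a SPEC.md can be checked for completeness. The sketch below assumes each mandated section appears as a markdown heading with exactly that title, which the source does not confirm; `REQUIRED_SECTIONS` and `missing_spec_sections` are hypothetical names.

```python
import re

REQUIRED_SECTIONS = [
    "Objective",
    "Commands",
    "Project Structure",
    "Code Style",
    "Testing Strategy",
    "Boundaries",
]

# Matches ATX headings such as "## Objective" and captures the title text.
HEADING = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", flags=re.MULTILINE)


def missing_spec_sections(spec_markdown: str) -> list[str]:
    """Return the mandated sections that have no matching heading."""
    found = {match.group(1).lower() for match in HEADING.finditer(spec_markdown)}
    return [name for name in REQUIRED_SECTIONS if name.lower() not in found]
```

A spec that omits its Boundaries heading would yield `["Boundaries"]`; a complete one yields an empty list.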

06

Memory & Context

Agent Skills (Addy Osmani) — Memory & Context

State Storage

  • Type: File-based
  • Persistence: Project-scoped
  • Primary artifact: SPEC.md — written to project root when spec-driven-development skill runs

Session Context Injection

The SessionStart hook injects the full using-agent-skills/SKILL.md content as an IMPORTANT priority message at every session start. This ensures the dispatch flowchart and core operating behaviors are active for the entire session without the user needing to reference them.

Hook output:

{
  "priority": "IMPORTANT",
  "message": "agent-skills loaded. Use the skill discovery flowchart to find the right skill for your task.\n\n<full SKILL.md content>"
}
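
To illustrate how a payload of this shape could be assembled, here is a minimal Python sketch. The real hook is a shell script, so this is not its implementation; it assumes the two-field shape shown above, and `build_session_start_payload` is a hypothetical name. The JSON encoder handles the escaping of newlines and quotes in the skill text.

```python
import json

PREAMBLE = (
    "agent-skills loaded. Use the skill discovery flowchart "
    "to find the right skill for your task."
)


def build_session_start_payload(skill_markdown: str) -> str:
    """Wrap the meta-skill text in the two-field JSON shape shown above."""
    message = f"{PREAMBLE}\n\n{skill_markdown}"
    # json.dumps escapes newlines and quotes so the result is one valid JSON document.
    return json.dumps({"priority": "IMPORTANT", "message": message})
```

In use, the caller would read skills/using-agent-skills/SKILL.md and print the returned string to standard output.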

SDD Cache

The hooks/SDD-CACHE.md artifact and sdd-cache-pre.sh / sdd-cache-post.sh scripts implement a pre/post tool-use caching mechanism for spec-driven development context. This caches spec context between tool invocations to reduce repeated loading.

Cross-Session Handoff

  • SPEC.md persists in the project root across sessions
  • The meta-skill using-agent-skills is re-injected at each session start (stateless injection)
  • No dedicated conversation history or summary mechanism; state lives in the spec file

Context Compaction

No explicit compaction handling. Skills are designed to be self-contained invocations. The SDD cache provides partial mitigation for spec-driven workflows.

There is no full-text search, vector search, or database. Context retrieval is plain file reads (Claude Code's native file tool).

07

Orchestration

Agent Skills (Addy Osmani) — Orchestration

Multi-Agent Support

Yes — the plugin ships 3 named subagent persona files:

  • agents/code-reviewer.md — Staff Engineer persona for code review
  • agents/security-auditor.md — Security specialist persona
  • agents/test-engineer.md — Test engineer persona

These are invokable as subagents from Claude Code's Task tool.

Orchestration Pattern

Sequential — skills map to sequential development lifecycle phases. The meta-skill routes to one skill at a time; there is no parallel fan-out or swarm coordination. Multi-agent usage is a manual decision by the user or the primary agent when invoking a reviewer/auditor.

Subagent Definition Format

persona-md — each agent file is a markdown file with a named role identity and review framework, e.g.:

"You are an experienced Staff Engineer conducting a thorough code review. Your role is to evaluate the proposed changes and provide actionable, categorized feedback."

Spawn Mechanism

claude-task-tool — agents can be invoked via Claude Code's Task tool; no custom spawn runtime.

Isolation Mechanism

None — edits are in-place. No git-worktree isolation is mandated (unlike superpowers which has a dedicated using-git-worktrees skill). The git-workflow-and-versioning skill encourages branch-based isolation but does not enforce it.

Multi-Model

No — single model only. No routing between different LLM models for different roles.

Execution Mode

Interactive-loop — skills are triggered on demand (either by the user typing a command or by the meta-skill's auto-routing logic). The SessionStart hook enables automatic dispatch without explicit invocation.

Consensus Mechanism

None.

Prompt Chaining

Yes — spec-driven-development produces SPEC.md, which feeds planning-and-task-breakdown, which feeds incremental-implementation. This is a manual chain (user advances phases explicitly), not automatic.

Auto-Validators

None enforced automatically. The test-driven-development skill mandates writing failing tests, but there is no post-hook that auto-runs the test suite. Validation is workflow-described, not hook-executed.

Git Automation

  • commits_automatically: No — the incremental-implementation skill instructs committing after each slice but does not execute git commands automatically
  • creates_pr_automatically: No
  • merges_automatically: No

08

UI & CLI Surface

Agent Skills (Addy Osmani) — UI & CLI Surface

Dedicated CLI Binary

None — no dedicated CLI binary. The plugin is invoked via the hosting agent's CLI (Claude Code, Gemini CLI, etc.).

Claude Code Plugin Install

/plugin marketplace add addyosmani/agent-skills
/plugin install agent-skills@addy-agent-skills

Or local:

claude --plugin-dir /path/to/agent-skills

Local Web Dashboard

None.

Terminal UI

None.

IDE Integration

  • Cursor: Copy SKILL.md files into .cursor/rules/; setup guide at docs/cursor-setup.md
  • Windsurf: Add skill contents to Windsurf rules configuration; docs/windsurf-setup.md
  • VS Code (Copilot): GitHub Copilot integration; docs/copilot-setup.md
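
The manual Cursor step can be scripted. The sketch below is one possible approach under stated assumptions: every skill file is named SKILL.md, so each copy is renamed after its skill directory to avoid collisions. That naming scheme and the `copy_skills_to_cursor` helper are hypothetical; docs/cursor-setup.md remains the authoritative guide.

```python
import shutil
from pathlib import Path


def copy_skills_to_cursor(repo_root: Path, project_root: Path) -> list[Path]:
    """Copy each skills/<name>/SKILL.md to <project>/.cursor/rules/<name>.md."""
    rules_dir = project_root / ".cursor" / "rules"
    rules_dir.mkdir(parents=True, exist_ok=True)

    copied: list[Path] = []
    for skill_file in sorted((repo_root / "skills").glob("*/SKILL.md")):
        # Rename after the skill directory, since every source file is SKILL.md.
        target = rules_dir / f"{skill_file.parent.name}.md"
        shutil.copyfile(skill_file, target)
        copied.append(target)
    return copied
```

Re-running the function overwrites earlier copies, which is the closest this manual route gets to an update mechanism.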

Cross-Tool Installation Surfaces

| Tool | Install Method |
| --- | --- |
| Claude Code | /plugin install via marketplace |
| Cursor | Copy SKILL.md files into .cursor/rules/ |
| Gemini CLI | gemini skills install |
| Windsurf | Rules config |
| OpenCode | AGENTS.md + skill tool |
| GitHub Copilot | Copilot extension config |

Observability

No audit log, no replay capability. Session output is visible in Claude Code's conversation.

Hook Events Used

  • SessionStart — injects meta-skill content

The hooks/ directory also contains additional hook scripts (sdd-cache-pre, sdd-cache-post, simplify-ignore). Judging by their names, they are presumably meant for PreToolUse/PostToolUse events, but hooks.json explicitly declares only the SessionStart hook.
