glebis/claude-skills

glebis-skills · glebis/claude-skills · ★ 225 · last commit 2026-05-25

Primitive shape 51 total

Skills 48 Subagents 3

Summary

glebis/claude-skills — Summary

The most expansive personal skill collection in this batch: 48 plugins spanning meetings, research, presentations, developer tools, integrations, and personal productivity, maintained by Gleb Kalinin (Berlin technologist, AI educator, solopreneur). The flagship tdd skill is architecturally the most sophisticated item in the batch: a multi-agent TDD orchestration system in which the Test Writer, Implementer, and Refactorer run as separate Task subagents, each given a restricted context (the Test Writer never sees implementation code; the Implementer never sees the spec). It includes session state via .tdd-state.json, --dry-run, --auto, and --resume modes, support for 7 test frameworks, 18 documented anti-patterns, 11 failure recovery scenarios, and a universal test runner, run_tests.sh. The gpt-image-2 skill ships a Python CLI with SOPS+age encrypted API key management, JSONL history tracking, and cost-estimation modes.

differs_from_seeds: The TDD skill is a superset of superpowers' test-driven-development skill (Archetype 1) but implemented as a multi-agent orchestrator with separate context isolation between test-writing and implementation phases — closer to BMAD's multi-persona pattern but with information boundary enforcement that none of the 11 seeds implement. The breadth of 48 plugins (integrating ElevenLabs, GPT Image 2, Telegram, Google Workspace, Zoom, etc.) is closest to a personal "Swiss Army knife" — no seed compares in scope diversity.

Overview

glebis/claude-skills — Overview

Origin

Created by Gleb Kalinin (glebis), described as "a Berlin-based technologist, AI educator, solopreneur and artist. He teaches in-depth agentic skills and workflows, productivity and values-based project management using AI tools." He runs Claude Code labs and a community.

Philosophy

The README has no single manifesto, but the breadth reveals the philosophy: a highly personal skill library that works as a curated productivity OS. Unlike most repos that target professional software teams, this one integrates personal tools (Gmail, Calendar, Telegram, ElevenLabs TTS, Chrome browsing history) alongside technical ones (TDD, GPT Image 2, RAG evaluation).

TDD Design Philosophy

From the TDD SKILL.md:

"The core innovation: the Test Writer never sees implementation code, and the Implementer never sees the specification. This prevents the LLM from leaking implementation intent into test design."

Design credits cited:

tdd-guard (hook-enforced TDD)
Matt Pocock's TDD Skill (vertical slicing)
TDFlow (test quality as ceiling for implementation quality)
alexop.dev (context isolation between phases)

GPT Image 2 Design Philosophy

The gpt-image-2 skill treats image generation as a professional production workflow:

Cost controls with confirmation prompts
Draft mode (10x cheaper) for iteration
14 style presets + 8 platform presets
SOPS+age encrypted key management
JSONL history with metadata sidecars

Community Orientation

Gleb teaches AI skills and runs a community, so this skill pack serves as both personal tooling and teaching material, which likely explains the unusual breadth and polish of its documentation.

Architecture

glebis/claude-skills — Architecture

Distribution

GitHub: glebis/claude-skills
Install: Via Claude Code marketplace (glebis-skills in marketplace.json)
License: None specified
Primary language: JavaScript + Python (scripts)

Directory Structure (selected)

claude-skills/
├── .claude-plugin/
│   ├── marketplace.json    # 48 plugin entries
│   └── README.md
├── BUNDLES.md              # Plugin bundle definitions
├── tdd/
│   ├── SKILL.md            # 744 lines — most complex skill in batch
│   ├── references/
│   │   ├── agent_prompts.md
│   │   ├── anti_patterns.md
│   │   ├── framework_configs.md
│   │   └── layer_guide.md
│   └── scripts/
│       ├── run_tests.sh
│       ├── extract_api.sh
│       └── discover_docs.sh
├── gpt-image-2/
│   ├── SKILL.md
│   └── scripts/
│       └── gpt_image_2.py
├── gws/                    # Google Workspace CLI skill
├── present/                # Narrated HTML presentations (ElevenLabs)
├── balanced/               # Anti-sycophancy skill
├── tg-responder/           # Telegram responder
├── gmail/                  # Gmail integration
├── zoom/                   # Zoom integration
├── granola/                # Meeting notes
├── fathom/                 # Meeting intelligence
├── linear/                 # Linear project management
├── telegram/               # Telegram integration
├── llm-cli/                # LLM CLI integration
├── deep-research/          # Deep research workflow
├── rag-eval/               # RAG evaluation
├── codex/                  # Codex integration
... (48 total plugins)

TDD Architecture (per SKILL.md)

ORCHESTRATOR (main Claude context)
├─ Phase 0: Setup (detect framework, extract API, create state)
├─ Phase 1: Decompose into vertical slices → user approves
├─ FOR EACH SLICE:
│   ├─ Phase 2 (RED):     Task(Test Writer)  ← spec + API only
│   ├─ Phase 3 (GREEN):   Task(Implementer)  ← failing test + error only
│   └─ Phase 4 (REFACTOR): Task(Refactorer)  ← all code + green results
└─ Summary

Required Runtime

Claude Code (primary)
Python 3 (for gpt-image-2, gws scripts)
run_tests.sh — universal test runner supporting Jest, Vitest, pytest, Go test, cargo test, RSpec, PHPUnit
API keys for: OpenAI (gpt-image-2), Google (gws), ElevenLabs (present), Telegram Bot, etc.
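
The framework detection step that precedes run_tests.sh could look roughly like the following. This is a minimal sketch, not the skill's actual logic: the marker file names, the `MARKERS` table, and the `detect_framework` function are all assumptions made for illustration.

```python
from typing import Optional

# Hypothetical marker files; the real Phase 0 detection may use other signals.
MARKERS = [
    ("Cargo.toml", "cargo test"),
    ("go.mod", "Go test"),
    ("phpunit.xml", "PHPUnit"),
    (".rspec", "RSpec"),
    ("pytest.ini", "pytest"),
    ("pyproject.toml", "pytest"),
]


def detect_framework(files: set, package_json: Optional[dict] = None) -> Optional[str]:
    """Guess the test framework from file names found at the project root."""
    if "package.json" in files:
        # Jest and Vitest share package.json, so look at devDependencies.
        deps = (package_json or {}).get("devDependencies", {})
        if "vitest" in deps:
            return "Vitest"
        if "jest" in deps:
            return "Jest"
    for marker, framework in MARKERS:
        if marker in files:
            return framework
    return None  # nothing recognised; the caller decides how to proceed


if __name__ == "__main__":
    print(detect_framework({"go.mod", "main.go"}))
```

Taking the file names as a set keeps the function free of filesystem access, which makes it easy to exercise on non-standard project layouts.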

Components

glebis/claude-skills — Components

Plugin Count: 48 total

From marketplace.json (verified count: 48 entries).

Developer Skills (selected)

Plugin	Description
`tdd`	Multi-agent TDD orchestration with context isolation between Test Writer, Implementer, Refactorer. 744-line SKILL.md. Session state via `.tdd-state.json`.
`gpt-image-2`	GPT Image 2 image generation — 14 style presets, 8 platform presets, thinking mode, cost controls, SOPS+age key management
`balanced`	Anti-sycophancy mode — constructive, evidence-based dialogue
`rag-eval`	RAG evaluation framework
`codex`	Codex agent integration
`llm-cli`	LLM CLI integration skill
`deep-research`	Deep research workflow

Communication + Productivity Skills (selected)

Plugin	Description
`gmail`	Gmail integration
`gws`	Google Workspace CLI — Gmail, Calendar, Drive, Sheets, Docs, Tasks, Chat, People, Meet
`telegram`	Telegram integration
`tg-responder`	Telegram auto-responder
`zoom`	Zoom integration
`granola`	Meeting notes processing
`fathom`	Meeting intelligence
`linear`	Linear project management
`present`	Narrated HTML presentations with ElevenLabs voiceover
`elevenlabs-tts`	ElevenLabs TTS integration

Data + Research Skills (selected)

Plugin	Description
`browsing-history`	Query cross-device browsing history
`chrome-history`	Chrome history querying
`deep-research`	Multi-step research workflows
`firecrawl-research`	Firecrawl-powered web research

TDD Sub-components (scripts)

Script	Purpose
`run_tests.sh`	Universal test runner wrapping Jest, Vitest, pytest, Go test, cargo test, RSpec, PHPUnit — returns structured JSON
`extract_api.sh`	Extract public API signatures (no bodies) for 7 languages, so the Test Writer sees the API surface without any implementation code
`discover_docs.sh`	Find and load project docs for context injection
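
To illustrate what "signatures without bodies" means, here is a small sketch for Python source only, using the standard `ast` module. It is not a port of extract_api.sh, whose implementation is not shown in the source; `extract_public_api` and `SAMPLE` are hypothetical names.

```python
import ast


def extract_public_api(source: str) -> list:
    """Return one-line stubs for public top-level functions and classes."""
    stubs = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if node.name.startswith("_"):
                continue  # leading underscore: treated as private
            prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            returns = f" -> {ast.unparse(node.returns)}" if node.returns else ""
            # Only the argument list is unparsed; the body is never emitted.
            stubs.append(f"{prefix} {node.name}({ast.unparse(node.args)}){returns}: ...")
        elif isinstance(node, ast.ClassDef) and not node.name.startswith("_"):
            stubs.append(f"class {node.name}: ...")
    return stubs


SAMPLE = '''
def validate_email(address: str) -> bool:
    return "@" in address

def _normalize(address):
    return address.strip().lower()
'''

if __name__ == "__main__":
    for stub in extract_public_api(SAMPLE):
        print(stub)
```

Because only names, parameters, and return annotations survive, a Test Writer given this output can target the real API without seeing how any function is implemented.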

TDD Reference Documents

File	Purpose
`references/agent_prompts.md`	Full prompts for Test Writer, Implementer, Refactorer subagents
`references/anti_patterns.md`	18 documented anti-patterns with prevention guidance
`references/framework_configs.md`	Per-framework test runner configs
`references/layer_guide.md`	Inside-out vertical slicing — layer-specific constraints

Prompts

glebis/claude-skills — Prompts

Prompt File 1: TDD SKILL.md — Identity + Information Isolation (verbatim)

Technique: Architectural information isolation via separate Task subagents with explicit context restrictions.

---
name: tdd
description: This skill should be used when the user wants to implement features or
  fix bugs using test-driven development. Enforces the RED-GREEN-REFACTOR cycle with
  vertical slicing, context isolation between test writing and implementation...
---

# Test-Driven Development — Multi-Agent Orchestration

Enforce disciplined RED-GREEN-REFACTOR cycles using **separate subagents** for test
writing and implementation. The core innovation: **the Test Writer never sees
implementation code, and the Implementer never sees the specification.** This prevents
the LLM from leaking implementation intent into test design.

Technique: Negative-constraint information isolation — the skill defines what each subagent is FORBIDDEN from seeing, not just what it receives. This is a security/integrity pattern applied to prompt engineering.

Prompt File 2: TDD SKILL.md — Dry-Run Mode (verbatim)

Technique: Validation pipeline for skill itself — the --dry-run mode treats the orchestration pipeline as testable infrastructure.

## Invocation Modes

| Invocation | Behavior |
|-----------|----------|
| `/tdd <feature>` | Interactive mode |
| `/tdd --auto <feature>` | Autonomous mode |
| `/tdd --resume` | Resume from .tdd-state.json |
| `/tdd --dry-run <feature>` | Validation mode — runs Phase 0 + Phase 1, renders all prompts, but skips Task() calls. No code written. |

In `--dry-run` mode, validate the entire orchestration pipeline without executing:
1. Phase 0 runs fully: detect framework, verify baseline, extract API
2. Phase 1 runs fully: decompose into slices (requires user approval)
3. For each slice: render all three agent prompts with actual variables
4. No Task() calls made. No code written.
5. Report summary:

DRY RUN COMPLETE: {feature name}

Phase 0:
  Framework: {framework}
  Language: {language}
  Baseline: {pass|greenfield}
  API surface: {line count} lines

Phase 1:
  Slices: {N} ({layer breakdown})

Prompts rendered: {N * 3} (all variables resolved)
  Test Writer:   {char count} chars
  Implementer:   {char count} chars
  Refactorer:    {char count} chars

State file: .tdd-state.json written
No code was modified.

Technique: Pipeline self-validation — the dry-run mode treats the TDD orchestration as software that needs testing before execution. The prompt length reporting ({char count} chars per subagent) surfaces potential context-overload issues before the run. This is unique in the entire corpus.
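
The dry-run idea can be sketched as follows, assuming prompts are simple templates with named variables. The template texts, variable names, and the `dry_run` function are hypothetical; the real prompts live in references/agent_prompts.md and are not reproduced here.

```python
from string import Template

# Hypothetical stand-ins for the three agent prompts.
TEMPLATES = {
    "Test Writer": Template("Write a failing test for: $spec\nAPI surface:\n$api"),
    "Implementer": Template("Make this test pass:\n$test\nError:\n$error"),
    "Refactorer": Template("Suggest refactorings for:\n$code\nResults:\n$results"),
}


def dry_run(slices: list) -> dict:
    """Render every prompt for every slice without spawning a subagent."""
    largest = {role: 0 for role in TEMPLATES}
    rendered = 0
    for variables in slices:
        for role, template in TEMPLATES.items():
            # substitute() raises KeyError if any variable is unresolved,
            # which is the failure a dry run is meant to surface early.
            prompt = template.substitute(variables)
            largest[role] = max(largest[role], len(prompt))
            rendered += 1
    return {"prompts_rendered": rendered, "largest_chars": largest}


if __name__ == "__main__":
    one_slice = dict(spec="validates email format", api="def validate_email(a): ...",
                     test="assert validate_email('a@b.c')", error="NameError",
                     code="...", results="1 passed")
    print(dry_run([one_slice, one_slice]))
```

Reporting the largest prompt per role is a design choice of this sketch; the report format quoted above only states that a character count is shown for each agent.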

Uniqueness

glebis/claude-skills — Uniqueness

differs_from_seeds

The TDD skill extends superpowers' test-driven-development skill (Archetype 1) along several dimensions: multi-agent context isolation (Test Writer, Implementer, Refactorer as separate Tasks with information boundaries), session state persistence (.tdd-state.json), --resume mode, --dry-run pipeline validation, 7-framework support, retry loops (up to 5 Implementer attempts and 3 regression-fix attempts), and 18 documented anti-patterns. The information isolation (the Implementer cannot see the spec) is not present in any of the 11 seeds and is the clearest application of context engineering in the corpus. The overall breadth (48 plugins integrating ElevenLabs, GPT Image 2, Telegram, Google Workspace, Zoom, RAG eval) exceeds any seed in scope diversity; it is closest to claude-flow in component count but is entirely markdown- and script-based, with no runtime.

Most Distinctive Feature

Information isolation in multi-agent TDD: The Test Writer never sees implementation code; the Implementer never sees the spec. This is not just multi-agent orchestration — it's an architectural information boundary enforced by selective context injection. No other framework in the 11 seeds or this batch implements this pattern.

Second: --dry-run mode that renders all subagent prompts with actual variables and reports character counts before any code is written. This treats the orchestration pipeline itself as testable software.

Positioning

Most sophisticated TDD implementation in any personal skill pack
Highest plugin breadth (48) in this batch — personal productivity OS
Community/teaching focus (Gleb runs Claude Code labs)
Primary language listed as JavaScript, with Python and shell used for scripts

Observable Failure Modes

No license: The repo has no license file — unclear if skills can be incorporated into other projects
API key sprawl: 48 plugins require API keys for OpenAI, Google, ElevenLabs, Telegram, etc. — high setup cost
SOPS+age dependency: Encrypted key management is powerful but adds complexity for casual users
7-framework run_tests.sh: Universal test runner may have edge cases on non-standard project setups
TDD context injection brittleness: If extract_api.sh doesn't correctly identify public API signatures, the Test Writer works from the wrong API surface
Cost estimation UX: The --estimate flag is opt-in — users who forget will get cost surprises on large image generation runs

Workflow

glebis/claude-skills — Workflow

TDD Workflow (the flagship)

/tdd "add user authentication with JWT tokens"

Phase 0: Setup
  ↓ Detect test framework (Jest/pytest/etc.)
  ↓ run_tests.sh: verify baseline passes (or greenfield)
  ↓ extract_api.sh: extract public API surface
  ↓ discover_docs.sh: load project docs
  ↓ Create .tdd-state.json

Phase 1: Decompose
  ↓ Decompose into vertical slices by architectural layer
    (domain → domain-service → application → infrastructure)
  ↓ [HUMAN CHECKPOINT] User approves slice list

FOR EACH SLICE:
  Phase 2 (RED):
    ↓ Task(Test Writer) ← spec + API surface ONLY (no implementation code)
    ↓ run_tests.sh: verify test FAILS as expected
    ↓ [HUMAN CHECKPOINT in interactive mode] "RED confirmed"

  Phase 3 (GREEN):
    ↓ Task(Implementer) ← failing test + error ONLY (no spec)
    ↓ Up to 5 retry attempts with previous-attempt context
    ↓ run_tests.sh: verify test passes
    ↓ Regression auto-fix: detect broken tests (3-attempt limit)

  Phase 4 (REFACTOR):
    ↓ Task(Refactorer) ← all code + green test results
    ↓ Apply/skip suggestions
    ↓ run_tests.sh: verify still passes

Summary

Modes

Mode	Behavior
`/tdd <feature>`	Interactive — pause at every RED checkpoint
`/tdd --auto <feature>`	Autonomous — run all slices, stop only on unrecoverable errors
`/tdd --resume`	Resume from `.tdd-state.json`
`/tdd --dry-run <feature>`	Validate pipeline without writing any code — render all prompts, check variables

Phase-to-Artifact Map

Phase	Artifact
Phase 0	`.tdd-state.json` with framework, API surface, baseline status
Phase 1	Slice decomposition (user-approved)
Phase 2 (RED)	Failing test file
Phase 3 (GREEN)	Implementation file(s)
Phase 4 (REFACTOR)	Refactored code, unchanged tests

Approval Gates

Phase 1: User approves slice decomposition (interactive mode)
Phase 2: User confirms RED state at each slice (interactive mode)
In --auto mode, only the following errors require user intervention:
- 5 failed implementation attempts
- 3 failed regression-fix attempts
- Script errors (missing binary, permission denied)

Memory & Context

glebis/claude-skills — Memory & Context

TDD State

The TDD skill persists state to .tdd-state.json in the project root:

Framework detected (Jest, pytest, etc.)
Language
Baseline test status (pass/greenfield)
API surface (line count)
Slice decomposition and current progress
Phase for each slice (PENDING/RED/GREEN/REFACTOR/DONE)

This enables --resume mode: sessions interrupted mid-TDD cycle can pick up where they left off without repeating earlier phases.
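
A resume step along these lines might be sketched as below. The JSON field names (`slices`, `name`, `phase`) are assumptions; the source lists what the state file records but not its schema.

```python
import json
from pathlib import Path

# Phase names as listed above; their order encodes progress.
PHASES = ["PENDING", "RED", "GREEN", "REFACTOR", "DONE"]


def save_state(path: Path, state: dict) -> None:
    """Write the session state as pretty-printed JSON."""
    path.write_text(json.dumps(state, indent=2), encoding="utf-8")


def load_state(path: Path) -> dict:
    """Read a previously saved session state."""
    return json.loads(path.read_text(encoding="utf-8"))


def next_slice(state: dict):
    """Return the first slice that is not DONE, or None if all are finished."""
    for item in state["slices"]:
        if item["phase"] not in PHASES:
            raise ValueError(f"unknown phase: {item['phase']}")
        if item["phase"] != "DONE":
            return item
    return None


if __name__ == "__main__":
    state = {
        "framework": "pytest",
        "slices": [
            {"name": "validates email format", "phase": "DONE"},
            {"name": "rejects duplicate users", "phase": "RED"},
        ],
    }
    print(next_slice(state)["name"])
```

Resuming then means loading the file, calling `next_slice`, and re-entering the cycle at that slice's recorded phase rather than starting from Phase 0.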

Information Isolation Architecture

The TDD skill implements deliberate context segmentation:

Test Writer receives: spec description + API surface (no implementation files)
Implementer receives: failing test + error output only (no spec, no other code)
Refactorer receives: all code + green test results

This is context management as a correctness constraint — not about efficiency but about preventing the model from "cheating" by using implementation knowledge to write tests that mirror the implementation.
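
The boundaries above amount to an allow-list per role. The sketch below shows one way to express that; the artifact keys and the `build_context` helper are hypothetical, and the real skill enforces the boundary through what the orchestrator places in each Task prompt.

```python
# Hypothetical artifact keys; each role sees only its allow-listed subset.
ALLOWED = {
    "test_writer": {"spec", "api_surface"},
    "implementer": {"test_code", "error_output"},
    "refactorer": {"test_code", "implementation", "test_results"},
}


def build_context(role: str, artifacts: dict) -> dict:
    """Return only the artifacts this role is permitted to see."""
    allowed = ALLOWED[role]  # KeyError on an unknown role is intentional
    return {key: value for key, value in artifacts.items() if key in allowed}


if __name__ == "__main__":
    artifacts = {
        "spec": "Reject emails without an @ sign",
        "api_surface": "def validate_email(address: str) -> bool: ...",
        "test_code": "assert not validate_email('nope')",
        "error_output": "NameError: validate_email is not defined",
        "implementation": "def validate_email(address): return '@' in address",
        "test_results": "1 passed",
    }
    for role in ALLOWED:
        print(role, sorted(build_context(role, artifacts)))
```

An allow-list fails closed: a newly added artifact reaches no subagent until a role is explicitly granted it.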

JSONL History (gpt-image-2)

The gpt-image-2 skill writes .jsonl history files with:

Each generation's prompt, style, platform, cost
Metadata JSON sidecar per image
Re-roll support via again command
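
An append-only JSONL log with a "repeat the last entry" lookup can be sketched as follows. The record fields and function names are assumptions for illustration; the actual layout written by gpt_image_2.py is not shown in the source.

```python
import json
from pathlib import Path


def append_record(history: Path, record: dict) -> None:
    """Append one generation record as a single JSON line."""
    with history.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")


def last_record(history: Path):
    """Return the most recent record, or None if there is no history yet."""
    if not history.exists():
        return None
    lines = [ln for ln in history.read_text(encoding="utf-8").splitlines() if ln.strip()]
    return json.loads(lines[-1]) if lines else None


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "history.jsonl"
        append_record(log, {"prompt": "neural networks", "preset": "editorial"})
        print(last_record(log))
```

A re-roll command in this model would read `last_record` and resubmit its prompt and settings.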

Memory Type

File-based: .tdd-state.json (project-scoped), JSONL history files.

Cross-Session Handoff

Via .tdd-state.json and --resume mode. The TDD skill is the only skill in the entire batch with explicit cross-session resumption.

Orchestration

glebis/claude-skills — Orchestration

Multi-Agent

Yes — TDD skill uses 3 subagents (Test Writer, Implementer, Refactorer) with the orchestrator as a 4th context.

Orchestration Pattern

Hierarchical (ORCHESTRATOR → Test Writer | Implementer | Refactorer in sequence per slice).

Subagent Definition Format

The subagent prompts are stored in references/agent_prompts.md and referenced from the main SKILL.md. The orchestrator reads the skill, then spawns Task subagents with the relevant sub-prompt plus a restricted context. The format is a SKILL.md file with external prompt references.

Spawn Mechanism

Claude's Task tool. From the SKILL.md:

Phase 2 (RED):     Task(Test Writer)  ← spec + API only
Phase 3 (GREEN):   Task(Implementer)  ← failing test + error only
Phase 4 (REFACTOR): Task(Refactorer)  ← all code + green results

Context Isolation (Information Boundaries)

This is the architectural centerpiece:

Test Writer: CANNOT see implementation code
Implementer: CANNOT see the spec
Refactorer: CAN see everything (code + tests + results)

This prevents "LLM cheating" — writing tests that mirror implementation rather than specifying behavior.

Isolation Mechanism

Information isolation only (no git-worktree or container). The isolation is enforced via what is passed to each Task subagent's prompt.

Multi-Model

No explicit multi-model. Single model for all subagents.

Execution Mode

Interactive-loop (default) or one-shot autonomous (--auto mode).

Retry Logic

Implementer: up to 5 fresh attempts on failure (with previous-attempt context but no accumulated history)
Regression fix: up to 3 attempts
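
The Implementer retry rule can be sketched as a bounded loop that carries forward only the most recent attempt. The callables and the `run_green_phase` name are hypothetical stand-ins for spawning a Task subagent and calling run_tests.sh.

```python
from typing import Callable, Optional, Tuple

MAX_IMPLEMENTER_ATTEMPTS = 5  # limit stated in the retry rules above


def run_green_phase(
    implement: Callable[[str, Optional[dict]], str],
    run_tests: Callable[[str], Optional[str]],
    failing_test: str,
) -> Tuple[bool, int]:
    """Retry until tests pass or the attempt limit is reached.

    run_tests returns None on success, or an error message on failure.
    """
    previous = None  # only the latest attempt is passed on, never the full history
    for attempt in range(1, MAX_IMPLEMENTER_ATTEMPTS + 1):
        code = implement(failing_test, previous)
        error = run_tests(code)
        if error is None:
            return True, attempt
        previous = {"code": code, "error": error}
    return False, MAX_IMPLEMENTER_ATTEMPTS


if __name__ == "__main__":
    ok, attempts = run_green_phase(
        lambda test, prev: "fixed" if prev else "broken",
        lambda code: None if code == "fixed" else "AssertionError",
        "test_validates_email",
    )
    print(ok, attempts)
```

Overwriting `previous` on each failure keeps every attempt's prompt bounded in size, which matches the "no accumulated history" wording.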

Prompt Chaining

Yes — Phase 0 output (API surface, framework) → Phase 1 input. RED test → GREEN input. GREEN results → REFACTOR input.

Crash Recovery

Partial — .tdd-state.json enables --resume for interrupted sessions.

UI & CLI Surface

glebis/claude-skills — UI & CLI Surface

CLI Binary

None for the skill pack itself. The gpt-image-2 skill wraps a Python CLI (gpt_image_2.py) invokable via slash command.

Local UI

None. No web dashboard.

`gpt-image-2` CLI Surface

The gpt_image_2.py script provides a sub-CLI:

scripts/gpt_image_2.py "prompt" ./output.png
scripts/gpt_image_2.py --preset editorial "neural networks" ./nn.png
scripts/gpt_image_2.py --edit photo.png --platform square "description" ./styled.png
scripts/gpt_image_2.py --draft --preset infographic "AI trends" ./draft.png
scripts/gpt_image_2.py --estimate --n 10 --thinking high "carousel"
scripts/gpt_image_2.py again  # Re-roll last prompt

Flags: --preset, --platform, --edit, --draft, --thinking, --n, --estimate, -y (skip cost confirmation), --provider openrouter
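
How --estimate, --draft, and -y might interact can be sketched as below. Every number here is a placeholder: the source gives no prices or confirmation threshold, only that draft mode is about 10x cheaper and that -y skips the cost confirmation.

```python
# Hypothetical values for illustration only; not the skill's real pricing.
UNIT_PRICE_USD = 0.20      # assumed price per image
DRAFT_DIVISOR = 10         # "10x cheaper" draft mode, per the skill description
CONFIRM_ABOVE_USD = 1.00   # assumed threshold for a confirmation prompt


def estimate_cost(n: int, draft: bool = False) -> float:
    """Estimate the cost of generating n images."""
    cost = n * UNIT_PRICE_USD
    return cost / DRAFT_DIVISOR if draft else cost


def needs_confirmation(cost: float, assume_yes: bool = False) -> bool:
    """Ask before spending above the threshold unless -y was passed."""
    return cost > CONFIRM_ABOVE_USD and not assume_yes


if __name__ == "__main__":
    for draft in (False, True):
        cost = estimate_cost(10, draft=draft)
        print(f"draft={draft} cost=${cost:.2f} confirm={needs_confirmation(cost)}")
```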

`gws` CLI Surface

The Google Workspace skill is a reference for the gws CLI with 15 helper commands across 7 services (Gmail, Calendar, Drive, Sheets, Docs, Tasks, Chat) plus raw API recipes.

TDD Observability

The .tdd-state.json file provides session state visibility. The --dry-run mode generates a complete pre-execution report. --auto mode prints structured status lines:

[auto] RED  slice 1/4: "validates email format" — test failing as expected
[auto] GREEN slice 1/4: passing (attempt 1)
[auto] REFACTOR slice 1/4: 1 suggestion applied, 0 skipped

SOPS+age Key Management

The gpt-image-2 skill uses SOPS+age for encrypted API key storage — production-grade key management in a personal skill.

Related frameworks

same archetype · same primary tool · same memory type

Claude-Flow / Ruflo ★ 55k

A6 Multi-agent orchestrator

Eliminates single-agent context limits and sequential bottlenecks by orchestrating fault-tolerant swarms of specialized AI agents…

Hermes Agent (NousResearch) ★ 168k

A6 Multi-agent orchestrator

Self-improving personal AI agent with closed learning loop, 7 terminal backends, and messaging gateway — not tied to any AI…

OpenCode ★ 165k

A6 Multi-agent orchestrator

Terminal-first AI coding agent with multi-model routing, native desktop app, and a typed .opencode/ configuration system for…

OpenHands ★ 75k

A6 Multi-agent orchestrator

Open-source AI software development platform (open-source Devin alternative) with Docker sandbox isolation, 77.6% SWE-bench…

DeerFlow ★ 70k

A6 Multi-agent orchestrator

Long-horizon superagent that researches, codes, and creates by orchestrating parallel sub-agents with isolated contexts in Docker…

oh-my-openagent (omo) ★ 60k

A6 Multi-agent orchestrator

Multi-provider AI agent orchestration for OpenCode: escape vendor lock-in by routing Sisyphus (Claude/Kimi/GLM) and Hephaestus…

Distribution

Type: claude-plugin
Install: multi-step

Surfaces

CLI binary: gpt_image_2.py (internal script)
CLI subcmds: 8
Local UI: No
Tech stack: none

Components

Commands: 0
Skills: 48
Subagents: 3
Hooks: 0
MCP servers: 0
MCP tools: 0
Scripts: 3
Templates: 0

Workflow

Phases: 6
Approval gates: 2
Spec format: none
Spec storage: none
Delta or full: none

Orchestration

Multi-agent: Yes
Pattern: hierarchical
Max concurrent: 1
Isolation: none
Consensus: none
Prompt chaining: Yes

Multi-model

Multi-model: No
BYOK: Yes
Modal: text+vision

Execution

Mode: interactive-loop
Crash recovery: Yes
Compaction: No
Session handoff: Yes
Streaming: Yes

Memory

Type: file-based
Persistence: project
Search: none
State files: 3 files

Quality

TDD: Yes
TDD mechanism: pre-impl-test-write
Validators: 1
Self-review: none

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: Yes
Audit format: structured-md
Replay: Yes

Tools

Primary: claude-code
Targets: 1
Portability: low

Signals

Stars: 225
Last commit: 2026-05-25
Contributors: 1
Maintainer: active
Quality score: 6.5/10
