glebis/claude-skills

glebis-skills · glebis/claude-skills · ★ 225 · last commit 2026-05-25

Primitive shape 51 total
Skills 48 Subagents 3
00

Summary

glebis/claude-skills — Summary

The most expansive personal skill collection in this batch: 48 plugins spanning meetings, research, presentations, developer tools, integrations, and personal productivity, maintained by Gleb Kalinin (Berlin technologist, AI educator, solopreneur). The flagship tdd skill is architecturally the most sophisticated item in the entire batch: a full multi-agent TDD orchestration system in which the Test Writer, Implementer, and Refactorer run as separate Task subagents with enforced context isolation (the Test Writer never sees implementation code; the Implementer never sees the spec). It includes session state via .tdd-state.json, a --dry-run mode, an --auto mode, support for 7 test frameworks, 18 documented anti-patterns, 11 failure recovery scenarios, and a universal test runner, run_tests.sh. The gpt-image-2 skill ships a Python MCP-style interface with SOPS+age encrypted API key management, JSONL history tracking, and cost-estimation modes.

differs_from_seeds: The TDD skill is a superset of superpowers' test-driven-development skill (Archetype 1) but implemented as a multi-agent orchestrator with separate context isolation between test-writing and implementation phases — closer to BMAD's multi-persona pattern but with information boundary enforcement that none of the 11 seeds implement. The breadth of 48 plugins (integrating ElevenLabs, GPT Image 2, Telegram, Google Workspace, Zoom, etc.) is closest to a personal "Swiss Army knife" — no seed compares in scope diversity.

01

Overview

glebis/claude-skills — Overview

Origin

Created by Gleb Kalinin (glebis), described as "a Berlin-based technologist, AI educator, solopreneur and artist. He teaches in-depth agentic skills and workflows, productivity and values-based project management using AI tools." He also runs Claude Code labs and a community.

Philosophy

The README has no single manifesto, but the breadth reveals the philosophy: a highly personal skill library treated as a curated productivity OS. Unlike most repos, which target professional software teams, this one integrates personal tools (Gmail, Calendar, Telegram, ElevenLabs TTS, Chrome browsing history) alongside technical ones (TDD, GPT Image 2, RAG evaluation).

TDD Design Philosophy

From the TDD SKILL.md:

"The core innovation: the Test Writer never sees implementation code, and the Implementer never sees the specification. This prevents the LLM from leaking implementation intent into test design."

Design credits cited:

GPT Image 2 Design Philosophy

The gpt-image-2 skill treats image generation as a professional production workflow:

  • Cost controls with confirmation prompts
  • Draft mode (10x cheaper) for iteration
  • 14 style presets + 8 platform presets
  • SOPS+age encrypted key management
  • JSONL history with metadata sidecars

Community Orientation

Gleb teaches AI skills and runs a community, so this skill pack doubles as personal tooling and teaching material, which may explain the unusual breadth and the polish of its documentation.

02

Architecture

glebis/claude-skills — Architecture

Distribution

  • GitHub: glebis/claude-skills
  • Install: Via Claude Code marketplace (glebis-skills in marketplace.json)
  • License: None specified
  • Primary language: JavaScript + Python (scripts)

Directory Structure (selected)

claude-skills/
├── .claude-plugin/
│   ├── marketplace.json    # 48 plugin entries
│   └── README.md
├── BUNDLES.md              # Plugin bundle definitions
├── tdd/
│   ├── SKILL.md            # 744 lines — most complex skill in batch
│   ├── references/
│   │   ├── agent_prompts.md
│   │   ├── anti_patterns.md
│   │   ├── framework_configs.md
│   │   └── layer_guide.md
│   └── scripts/
│       ├── run_tests.sh
│       ├── extract_api.sh
│       └── discover_docs.sh
├── gpt-image-2/
│   ├── SKILL.md
│   └── scripts/
│       └── gpt_image_2.py
├── gws/                    # Google Workspace CLI skill
├── present/                # Narrated HTML presentations (ElevenLabs)
├── balanced/               # Anti-sycophancy skill
├── tg-responder/           # Telegram responder
├── gmail/                  # Gmail integration
├── zoom/                   # Zoom integration
├── granola/                # Meeting notes
├── fathom/                 # Meeting intelligence
├── linear/                 # Linear project management
├── telegram/               # Telegram integration
├── llm-cli/                # LLM CLI integration
├── deep-research/          # Deep research workflow
├── rag-eval/               # RAG evaluation
├── codex/                  # Codex integration
... (48 total plugins)

TDD Architecture (per SKILL.md)

ORCHESTRATOR (main Claude context)
├─ Phase 0: Setup (detect framework, extract API, create state)
├─ Phase 1: Decompose into vertical slices → user approves
├─ FOR EACH SLICE:
│   ├─ Phase 2 (RED):     Task(Test Writer)  ← spec + API only
│   ├─ Phase 3 (GREEN):   Task(Implementer)  ← failing test + error only
│   └─ Phase 4 (REFACTOR): Task(Refactorer)  ← all code + green results
└─ Summary

Required Runtime

  • Claude Code (primary)
  • Python 3 (for gpt-image-2, gws scripts)
  • run_tests.sh — universal test runner supporting Jest, Vitest, pytest, Go test, cargo test, RSpec, PHPUnit
  • API keys for: OpenAI (gpt-image-2), Google (gws), ElevenLabs (present), Telegram Bot, etc.

03

Components

glebis/claude-skills — Components

Plugin Count: 48 total

From marketplace.json (verified count: 48 entries).

Developer Skills (selected)

| Plugin | Description |
|--------|-------------|
| tdd | Multi-agent TDD orchestration with context isolation between Test Writer, Implementer, Refactorer. 744-line SKILL.md. Session state via .tdd-state.json. |
| gpt-image-2 | GPT Image 2 image generation: 14 style presets, 8 platform presets, thinking mode, cost controls, SOPS+age key management |
| balanced | Anti-sycophancy mode: constructive, evidence-based dialogue |
| rag-eval | RAG evaluation framework |
| codex | Codex agent integration |
| llm-cli | LLM CLI integration skill |
| deep-research | Deep research workflow |

Communication + Productivity Skills (selected)

| Plugin | Description |
|--------|-------------|
| gmail | Gmail integration |
| gws | Google Workspace CLI: Gmail, Calendar, Drive, Sheets, Docs, Tasks, Chat, People, Meet |
| telegram | Telegram integration |
| tg-responder | Telegram auto-responder |
| zoom | Zoom integration |
| granola | Meeting notes processing |
| fathom | Meeting intelligence |
| linear | Linear project management |
| present | Narrated HTML presentations with ElevenLabs voiceover |
| elevenlabs-tts | ElevenLabs TTS integration |

Data + Research Skills (selected)

| Plugin | Description |
|--------|-------------|
| browsing-history | Query cross-device browsing history |
| chrome-history | Chrome history querying |
| deep-research | Multi-step research workflows |
| firecrawl-research | Firecrawl-powered web research |

TDD Sub-components (scripts)

| Script | Purpose |
|--------|---------|
| run_tests.sh | Universal test runner wrapping Jest, Vitest, pytest, Go test, cargo test, RSpec, PHPUnit; returns structured JSON |
| extract_api.sh | Extract public API signatures (no bodies) for 7 languages, so the Test Writer sees the API surface without implementation code |
| discover_docs.sh | Find and load project docs for context injection |
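
To make "signatures without bodies" concrete, here is a minimal sketch for a single language (Python source only). It is an illustration, not the logic of the repo's extract_api.sh, whose implementation is not shown here; the helper name `extract_public_api` and the rule that a leading underscore means "private" are assumptions.

```python
import ast


def extract_public_api(source: str) -> list[str]:
    """Return signature lines for public top-level functions and classes.

    Hypothetical helper: it illustrates the idea for Python source only and
    is not taken from the repo's extract_api.sh.
    """
    signatures = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if node.name.startswith("_"):
                continue  # assumption: leading underscore marks a private name
            prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            returns = f" -> {ast.unparse(node.returns)}" if node.returns else ""
            # Keep only the signature; the body is replaced by "..."
            signatures.append(
                f"{prefix} {node.name}({ast.unparse(node.args)}){returns}: ..."
            )
        elif isinstance(node, ast.ClassDef) and not node.name.startswith("_"):
            signatures.append(f"class {node.name}: ...")
    return signatures
```

Because function bodies never reach the output, a prompt built from this list can describe what exists without revealing how it works.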

TDD Reference Documents

| File | Purpose |
|------|---------|
| references/agent_prompts.md | Full prompts for Test Writer, Implementer, Refactorer subagents |
| references/anti_patterns.md | 18 documented anti-patterns with prevention guidance |
| references/framework_configs.md | Per-framework test runner configs |
| references/layer_guide.md | Inside-out vertical slicing with layer-specific constraints |

05

Prompts

glebis/claude-skills — Prompts

Prompt File 1: TDD SKILL.md — Identity + Information Isolation (verbatim)

Technique: Architectural information isolation via separate Task subagents with explicit context restrictions.

---
name: tdd
description: This skill should be used when the user wants to implement features or
  fix bugs using test-driven development. Enforces the RED-GREEN-REFACTOR cycle with
  vertical slicing, context isolation between test writing and implementation...
---

# Test-Driven Development — Multi-Agent Orchestration

Enforce disciplined RED-GREEN-REFACTOR cycles using **separate subagents** for test
writing and implementation. The core innovation: **the Test Writer never sees
implementation code, and the Implementer never sees the specification.** This prevents
the LLM from leaking implementation intent into test design.

Technique: Negative-constraint information isolation — the skill defines what each subagent is FORBIDDEN from seeing, not just what it receives. This is a security/integrity pattern applied to prompt engineering.

Prompt File 2: TDD SKILL.md — Dry-Run Mode (verbatim)

Technique: Validation pipeline for skill itself — the --dry-run mode treats the orchestration pipeline as testable infrastructure.

## Invocation Modes

| Invocation | Behavior |
|-----------|----------|
| `/tdd <feature>` | Interactive mode |
| `/tdd --auto <feature>` | Autonomous mode |
| `/tdd --resume` | Resume from .tdd-state.json |
| `/tdd --dry-run <feature>` | Validation mode — runs Phase 0 + Phase 1, renders all
  prompts, but skips Task() calls. No code written. |

In `--dry-run` mode, validate the entire orchestration pipeline without executing:
1. Phase 0 runs fully: detect framework, verify baseline, extract API
2. Phase 1 runs fully: decompose into slices (requires user approval)
3. For each slice: render all three agent prompts with actual variables
4. No Task() calls made. No code written.
5. Report summary:

DRY RUN COMPLETE: {feature name}

Phase 0:
  Framework: {framework}
  Language: {language}
  Baseline: {pass|greenfield}
  API surface: {line count} lines

Phase 1:
  Slices: {N} ({layer breakdown})

Prompts rendered: {N * 3} (all variables resolved)
  Test Writer:   {char count} chars
  Implementer:   {char count} chars
  Refactorer:    {char count} chars

State file: .tdd-state.json written
No code was modified.

Technique: Pipeline self-validation — the dry-run mode treats the TDD orchestration as software that needs testing before execution. The prompt length reporting ({char count} chars per subagent) surfaces potential context-overload issues before the run. This is unique in the entire corpus.
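
As a rough illustration of the "render prompts, report character counts" idea, the sketch below renders three templates and fails loudly on an unresolved variable. The template texts and the variable names (`spec`, `api`, `test`, `error`, `code`, `results`) are placeholders invented for this example; the real prompts live in references/agent_prompts.md and are not reproduced here.

```python
from string import Template

# Hypothetical templates, for illustration only.
TEMPLATES = {
    "Test Writer": Template("Write a failing test for: $spec\nAPI surface:\n$api"),
    "Implementer": Template("Make this test pass:\n$test\nError:\n$error"),
    "Refactorer": Template("Suggest refactorings for:\n$code\nResults:\n$results"),
}


def render_prompts(variables: dict[str, str]) -> dict[str, str]:
    """Render every template without calling any agent.

    Template.substitute raises KeyError when a variable is missing, which is
    the kind of problem a dry run is meant to surface early.
    """
    return {role: tpl.substitute(variables) for role, tpl in TEMPLATES.items()}


def report(prompts: dict[str, str]) -> list[str]:
    """Produce one 'role: N chars' line per rendered prompt."""
    return [f"{role}: {len(text)} chars" for role, text in prompts.items()]
```

Reporting lengths before execution makes an oversized prompt visible while it is still cheap to fix.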

09

Uniqueness

glebis/claude-skills — Uniqueness

differs_from_seeds

The TDD skill surpasses superpowers' test-driven-development skill (Archetype 1) in every dimension: multi-agent context isolation (Test Writer, Implementer, Refactorer as separate Tasks with information boundaries), session state persistence (.tdd-state.json), --resume mode, --dry-run pipeline validation, 7-framework support, retry loops (5 implementer + 3 regression), and 18 documented anti-patterns. The information isolation (Implementer cannot see the spec) is not present in any of the 11 seeds and is the most principled application of context engineering theory in the entire corpus. The overall breadth (48 plugins integrating ElevenLabs, GPT Image 2, Telegram, Google Workspace, Zoom, RAG eval) exceeds any seed in scope diversity — closest to claude-flow in component count but entirely markdown/script-based without a runtime.

Most Distinctive Feature

Information isolation in multi-agent TDD: The Test Writer never sees implementation code; the Implementer never sees the spec. This is not just multi-agent orchestration — it's an architectural information boundary enforced by selective context injection. No other framework in the 11 seeds or this batch implements this pattern.

Second: --dry-run mode that renders all subagent prompts with actual variables and reports character counts before any code is written. This treats the orchestration pipeline itself as testable software.

Positioning

  • Most sophisticated TDD implementation in any personal skill pack
  • Highest plugin breadth (48) in this batch — personal productivity OS
  • Community/teaching focus (Gleb runs Claude Code labs)
  • JavaScript-first repo (primary language JavaScript, with Python for scripts)

Observable Failure Modes

  1. No license: The repo has no license file — unclear if skills can be incorporated into other projects
  2. API key sprawl: 48 plugins require API keys for OpenAI, Google, ElevenLabs, Telegram, etc. — high setup cost
  3. SOPS+age dependency: Encrypted key management is powerful but adds complexity for casual users
  4. 7-framework run_tests.sh: Universal test runner may have edge cases on non-standard project setups
  5. TDD context injection brittleness: If extract_api.sh doesn't correctly identify public API signatures, the Test Writer works from the wrong API surface
  6. Cost estimation UX: The --estimate flag is opt-in — users who forget will get cost surprises on large image generation runs

04

Workflow

glebis/claude-skills — Workflow

TDD Workflow (the flagship)

/tdd "add user authentication with JWT tokens"

Phase 0: Setup
  ↓ Detect test framework (Jest/pytest/etc.)
  ↓ run_tests.sh: verify baseline passes (or greenfield)
  ↓ extract_api.sh: extract public API surface
  ↓ discover_docs.sh: load project docs
  ↓ Create .tdd-state.json

Phase 1: Decompose
  ↓ Decompose into vertical slices by architectural layer
    (domain → domain-service → application → infrastructure)
  ↓ [HUMAN CHECKPOINT] User approves slice list

FOR EACH SLICE:
  Phase 2 (RED):
    ↓ Task(Test Writer) ← spec + API surface ONLY (no implementation code)
    ↓ run_tests.sh: verify test FAILS as expected
    ↓ [HUMAN CHECKPOINT in interactive mode] "RED confirmed"

  Phase 3 (GREEN):
    ↓ Task(Implementer) ← failing test + error ONLY (no spec)
    ↓ Up to 5 retry attempts with previous-attempt context
    ↓ run_tests.sh: verify test passes
    ↓ Regression auto-fix: detect broken tests (3-attempt limit)

  Phase 4 (REFACTOR):
    ↓ Task(Refactorer) ← all code + green test results
    ↓ Apply/skip suggestions
    ↓ run_tests.sh: verify still passes

Summary
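
The inside-out ordering in Phase 1 (domain → domain-service → application → infrastructure) can be sketched as a simple sort. The slice dictionaries and their `name` and `layer` keys are hypothetical; the skill's actual slice representation is not shown in this document.

```python
# Layer order as listed in the workflow above, innermost first.
LAYER_ORDER = ["domain", "domain-service", "application", "infrastructure"]


def order_slices(slices: list[dict]) -> list[dict]:
    """Sort slices inside-out by layer.

    sorted() is stable, so slices within the same layer keep their
    original relative order.
    """
    return sorted(slices, key=lambda s: LAYER_ORDER.index(s["layer"]))
```

An unknown layer name raises ValueError here, which seems preferable to silently misordering a slice.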

Modes

| Mode | Behavior |
|------|----------|
| /tdd <feature> | Interactive: pause at every RED checkpoint |
| /tdd --auto <feature> | Autonomous: run all slices, stop only on unrecoverable errors |
| /tdd --resume | Resume from .tdd-state.json |
| /tdd --dry-run <feature> | Validate pipeline without writing any code: render all prompts, check variables |

Phase-to-Artifact Map

| Phase | Artifact |
|-------|----------|
| Phase 0 | .tdd-state.json with framework, API surface, baseline status |
| Phase 1 | Slice decomposition (user-approved) |
| Phase 2 (RED) | Failing test file |
| Phase 3 (GREEN) | Implementation file(s) |
| Phase 4 (REFACTOR) | Refactored code, unchanged tests |

Approval Gates

  • Phase 1: User approves slice decomposition (interactive mode)
  • Phase 2: User confirms RED state at each slice (interactive mode)
  • In --auto mode, only these errors require user intervention:
    • 5 failed implementation attempts
    • 3 failed regression-fix attempts
    • Script errors (missing binary, permission denied)

06

Memory Context

glebis/claude-skills — Memory & Context

TDD State

The TDD skill persists state to .tdd-state.json in the project root:

  • Framework detected (Jest, pytest, etc.)
  • Language
  • Baseline test status (pass/greenfield)
  • API surface (line count)
  • Slice decomposition and current progress
  • Phase for each slice (PENDING/RED/GREEN/REFACTOR/DONE)

This enables --resume mode: sessions interrupted mid-TDD cycle can pick up where they left off without repeating earlier phases.
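
A minimal sketch of how file-based resumption of this kind can work, assuming a JSON layout with a `slices` list whose entries carry `name` and `phase` keys. That schema is an assumption made for illustration; the actual structure of .tdd-state.json is not documented here.

```python
import json
from pathlib import Path

# Phase names as listed above.
PHASES = ["PENDING", "RED", "GREEN", "REFACTOR", "DONE"]


def save_state(path: Path, state: dict) -> None:
    """Write the whole state back to disk after each phase transition."""
    path.write_text(json.dumps(state, indent=2), encoding="utf-8")


def next_unfinished_slice(path: Path):
    """Return the first slice whose phase is not DONE, or None if all are done."""
    state = json.loads(path.read_text(encoding="utf-8"))
    for slice_ in state["slices"]:  # hypothetical key
        if slice_["phase"] != "DONE":
            return slice_
    return None
```

On resume, the orchestrator would only need the returned slice and its recorded phase to know where to re-enter the cycle.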

Information Isolation Architecture

The TDD skill implements deliberate context segmentation:

  • Test Writer receives: spec description + API surface (no implementation files)
  • Implementer receives: failing test + error output only (no spec, no other code)
  • Refactorer receives: all code + green test results

This is context management as a correctness constraint — not about efficiency but about preventing the model from "cheating" by using implementation knowledge to write tests that mirror the implementation.
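
The three rules above amount to an allow-list per role. The sketch below shows one way to express that; the class, field, and role names are hypothetical, and the real skill enforces the boundary through what the orchestrator places in each Task prompt rather than through code like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SliceArtifacts:
    """Everything the orchestrator knows about one slice (hypothetical fields)."""
    spec: str
    api_surface: str
    failing_test: str
    error_output: str
    all_code: str
    test_results: str


# Which fields each role may receive, mirroring the bullets above.
ALLOWED = {
    "test_writer": ("spec", "api_surface"),
    "implementer": ("failing_test", "error_output"),
    "refactorer": ("all_code", "test_results"),
}


def build_context(role: str, artifacts: SliceArtifacts) -> dict[str, str]:
    """Return only the fields the given role is allowed to see."""
    return {name: getattr(artifacts, name) for name in ALLOWED[role]}
```

Anything not on a role's allow-list simply never appears in its context, so the isolation holds by construction.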

JSONL History (gpt-image-2)

The gpt-image-2 skill writes .jsonl history files with:

  • Each generation's prompt, style, platform, cost
  • Metadata JSON sidecar per image
  • Re-roll support via again command
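
For readers unfamiliar with JSONL, a history of this kind is one JSON object per line, appended on each run. The sketch below shows the append-and-read-last pattern that a re-roll command could build on; the function names and record fields are illustrative and are not taken from gpt_image_2.py.

```python
import json
from pathlib import Path


def append_record(history: Path, record: dict) -> None:
    """Append one generation record as a single JSON line."""
    with history.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def last_record(history: Path):
    """Return the most recent record, or None if there is no history yet."""
    if not history.exists():
        return None
    lines = [ln for ln in history.read_text(encoding="utf-8").splitlines() if ln.strip()]
    return json.loads(lines[-1]) if lines else None
```

Appending keeps earlier records untouched, so an interrupted write can affect at most the final line.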

Memory Type

File-based: .tdd-state.json (project-scoped), JSONL history files.

Cross-Session Handoff

Via .tdd-state.json and --resume mode. The TDD skill is the only skill in the entire batch with explicit cross-session resumption.

07

Orchestration

glebis/claude-skills — Orchestration

Multi-Agent

Yes — TDD skill uses 3 subagents (Test Writer, Implementer, Refactorer) with the orchestrator as a 4th context.

Orchestration Pattern

Hierarchical (ORCHESTRATOR → Test Writer | Implementer | Refactorer in sequence per slice).

Subagent Definition Format

The subagent prompts are stored in references/agent_prompts.md and referenced from the main SKILL.md. The orchestrator reads the skill, then spawns each Task subagent with the relevant sub-prompt plus its restricted context. In other words, the format is a SKILL.md with external prompt references.

Spawn Mechanism

Claude's Task tool. From the SKILL.md:

Phase 2 (RED):     Task(Test Writer)  ← spec + API only
Phase 3 (GREEN):   Task(Implementer)  ← failing test + error only
Phase 4 (REFACTOR): Task(Refactorer)  ← all code + green results

Context Isolation (Information Boundaries)

This is the architectural centerpiece:

  • Test Writer: CANNOT see implementation code
  • Implementer: CANNOT see the spec
  • Refactorer: CAN see everything (code + tests + results)

This prevents "LLM cheating" — writing tests that mirror implementation rather than specifying behavior.

Isolation Mechanism

Information isolation only (no git-worktree or container). The isolation is enforced via what is passed to each Task subagent's prompt.

Multi-Model

No explicit multi-model. Single model for all subagents.

Execution Mode

Interactive-loop (default) or one-shot autonomous (--auto mode).

Retry Logic

  • Implementer: up to 5 fresh attempts on failure (with previous-attempt context but no accumulated history)
  • Regression fix: up to 3 attempts
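
The bounded retry described above can be sketched as a loop that carries forward only the previous attempt's error. The two callables are stand-ins for spawning the Implementer and running the tests; their names and signatures are assumptions made for this example.

```python
from typing import Callable, Optional

MAX_IMPLEMENTER_ATTEMPTS = 5  # limit stated in the source


def run_green_phase(
    attempt_fn: Callable[[Optional[str]], str],
    tests_pass: Callable[[str], tuple[bool, str]],
    max_attempts: int = MAX_IMPLEMENTER_ATTEMPTS,
) -> tuple[bool, int]:
    """Try up to max_attempts times; return (succeeded, attempts_used).

    Each attempt receives only the previous attempt's error, not the
    full history of earlier failures.
    """
    previous_error: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        code = attempt_fn(previous_error)
        ok, error = tests_pass(code)
        if ok:
            return True, attempt
        previous_error = error  # only the latest failure is carried forward
    return False, max_attempts
```

When the loop returns False, the caller would escalate to the user, matching the approval-gate behavior described for --auto mode.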

Prompt Chaining

Yes — Phase 0 output (API surface, framework) → Phase 1 input. RED test → GREEN input. GREEN results → REFACTOR input.

Crash Recovery

Partial — .tdd-state.json enables --resume for interrupted sessions.

08

UI CLI Surface

glebis/claude-skills — UI & CLI Surface

CLI Binary

None for the skill pack itself. The gpt-image-2 skill wraps a Python CLI (gpt_image_2.py) invokable via slash command.

Local UI

None. No web dashboard.

gpt-image-2 CLI Surface

The gpt_image_2.py script provides a sub-CLI:

scripts/gpt_image_2.py "prompt" ./output.png
scripts/gpt_image_2.py --preset editorial "neural networks" ./nn.png
scripts/gpt_image_2.py --edit photo.png --platform square "description" ./styled.png
scripts/gpt_image_2.py --draft --preset infographic "AI trends" ./draft.png
scripts/gpt_image_2.py --estimate --n 10 --thinking high "carousel"
scripts/gpt_image_2.py again  # Re-roll last prompt

Flags: --preset, --platform, --edit, --draft, --thinking, --n, --estimate, -y (skip cost confirmation), --provider openrouter

gws CLI Surface

The Google Workspace skill is a reference for the gws CLI with 15 helper commands across 7 services (Gmail, Calendar, Drive, Sheets, Docs, Tasks, Chat) plus raw API recipes.

TDD Observability

The .tdd-state.json file provides session state visibility. The --dry-run mode generates a complete pre-execution report. --auto mode prints structured status lines:

[auto] RED  slice 1/4: "validates email format" — test failing as expected
[auto] GREEN slice 1/4: passing (attempt 1)
[auto] REFACTOR slice 1/4: 1 suggestion applied, 0 skipped
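
Status lines in this shape are easy to consume from a wrapper script. The sketch below parses them with a regular expression inferred from the three sample lines above; the pattern is an assumption, since the skill does not document a formal grammar for its output.

```python
import re
from typing import Optional

# Pattern inferred from the sample lines; not an official format.
STATUS_RE = re.compile(
    r"^\[auto\]\s+(RED|GREEN|REFACTOR)\s+slice (\d+)/(\d+): (.*)$"
)


def parse_status(line: str) -> Optional[dict]:
    """Parse one '[auto] ...' status line, or return None if it doesn't match."""
    match = STATUS_RE.match(line)
    if match is None:
        return None
    phase, index, total, detail = match.groups()
    return {"phase": phase, "slice": int(index), "total": int(total), "detail": detail}
```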

SOPS+age Key Management

The gpt-image-2 skill uses SOPS+age for encrypted API key storage — production-grade key management in a personal skill.
