Agent Skills for Context Engineering

muratcankoylan-skills · muratcankoylan/Agent-Skills-for-Context-Engineering · ★ 16k · last commit 2026-05-26

Primitive shape 15 total
Skills 15
00

Summary

Agent Skills for Context Engineering (muratcankoylan) — Summary

This is the highest-starred personal skill collection in this batch (16,030 stars): a comprehensive 15-skill educational and operational library teaching context engineering and agent harness engineering principles. Unlike most skill packs, which target feature delivery, this repo is explicitly about teaching how to build better AI agent systems: context window management, attention mechanics, multi-agent coordination patterns, memory systems, evaluation frameworks, and autonomous harness design. The skills are organized into four tiers: foundational (context anatomy, degradation, compression), architectural (multi-agent, memory, tool design, filesystem, hosted agents), operational (optimization, latent briefing, evaluation, harness engineering), and methodological (project development, BDI mental states). The repository is cited in two peer-reviewed papers (one from Peking University and one survey by a consortium including CMU, Yale, and JHU), making it one of the few personal skill repos with academic validation.

differs_from_seeds: Differs from all 11 seeds in purpose — not an engineering workflow accelerator but a domain knowledge library about context engineering itself. Closest to superpowers (Archetype 1: skills-only) in structure, but where superpowers enforces workflow behaviors, this repo teaches underlying principles. Unlike BMAD (34 skills focused on project delivery), these 15 skills focus on meta-level agent system design. The harness-engineering skill directly competes conceptually with claude-flow's hive-mind orchestration but is a reference guide rather than a runtime.

01

Overview

Agent Skills for Context Engineering (muratcankoylan) — Overview

Origin

Created by Muratcan Koylan. The repository has been cited in two academic papers:

  1. "Meta Context Engineering via Agentic Skill Evolution" (Peking University, 2025): "While static skills are well-recognized [Anthropic, 2025b; Muratcan Koylan, 2025], MCE is among the first to dynamically evolve them, bridging manual skill engineering and autonomous self-improvement."
  2. "Agent Harness Engineering: A Survey" (CMU, Yale, JHU, NEU, Tulane, UAB, OSU, Virginia Tech, and Amazon, 2026)

Philosophy

The README opens with a foundational definition:

"Context engineering is the discipline of managing the language model's context window. Unlike prompt engineering, which focuses on crafting effective instructions, context engineering addresses the holistic curation of all information that enters the model's limited attention budget: system prompts, tool definitions, retrieved documents, message history, and tool outputs."

"The fundamental challenge is that context windows are constrained not by raw token capacity but by attention mechanics. As context length increases, models exhibit predictable degradation patterns: the 'lost-in-the-middle' phenomenon, U-shaped attention curves, and attention scarcity."

Key design principles:

  1. Progressive Disclosure — load only names at startup; load full skill content when activated
  2. Platform Agnosticism — transferable principles, not vendor-specific implementations
  3. Conceptual Foundation with Practical Examples — Python pseudocode that works across environments

Academic Positioning

The two citations position this repository as a reference point for static, hand-authored skills. The Peking University paper contrasts it with dynamic skill evolution (MCE), while the multi-institution survey includes it in a broad review of agent harness engineering patterns.

Organizing Philosophy

Skills are organized to cover a complete "stack" for agent systems:

  • Why context matters (foundational)
  • How to architect multi-agent systems (architectural)
  • How to optimize running systems (operational)
  • How to design the meta-level harness (harness engineering)
  • How to apply formal cognitive modeling (BDI)
02

Architecture

Agent Skills for Context Engineering (muratcankoylan) — Architecture

Distribution

  • GitHub: muratcankoylan/Agent-Skills-for-Context-Engineering
  • Install (Claude Code): /plugin marketplace add muratcankoylan/Agent-Skills-for-Context-Engineering then /plugin install context-engineering@context-engineering-marketplace
  • Install (Cursor): Listed on Cursor Plugin Directory
  • Install (Open Plugins): .plugin/plugin.json follows Open Plugins standard
  • Individual skill: Copy skills/<name>/SKILL.md to .claude/skills/
  • License: MIT
  • Primary language: Python (pseudocode examples in scripts)

Directory Structure

Agent-Skills-for-Context-Engineering/
├── .claude-plugin/          # Claude Code plugin manifest
│   └── marketplace.json     # Marketplace skill catalog with activation scenarios
├── .plugin/
│   └── plugin.json          # Open Plugins standard manifest
├── CLAUDE.md               # Top-level agent instructions
├── AGENTS.md               # Agent operating guide
├── SKILL.md                # Top-level skill reference
├── skills/
│   ├── context-fundamentals/
│   │   └── SKILL.md
│   ├── context-degradation/
│   │   └── SKILL.md
│   ├── context-compression/
│   │   └── SKILL.md
│   ├── context-optimization/
│   │   └── SKILL.md
│   ├── latent-briefing/
│   │   └── SKILL.md
│   ├── multi-agent-patterns/
│   │   └── SKILL.md
│   ├── memory-systems/
│   │   └── SKILL.md
│   ├── tool-design/
│   │   └── SKILL.md
│   ├── filesystem-context/
│   │   └── SKILL.md
│   ├── hosted-agents/
│   │   └── SKILL.md
│   ├── evaluation/
│   │   └── SKILL.md
│   ├── advanced-evaluation/
│   │   └── SKILL.md
│   ├── harness-engineering/
│   │   └── SKILL.md
│   ├── project-development/
│   │   └── SKILL.md
│   └── bdi-mental-states/
│       └── SKILL.md
├── researcher/              # Research/reference materials
├── examples/                # Usage examples
├── template/                # Skill template for new skills
└── docs/                    # Documentation

Target AI Tools

  • Claude Code (native)
  • Cursor (Plugin Directory listing)
  • Any agent platform conformant with the Open Plugins standard (Codex, GitHub Copilot, etc.)

Required Runtime

  • None — pure Markdown skills with Python pseudocode examples
  • No binary, no build step, no MCP server required
03

Components

Agent Skills for Context Engineering (muratcankoylan) — Components

Skills (15 total)

Foundational Skills

| Skill | Description | Activation Scenario |
| --- | --- | --- |
| context-fundamentals | Context anatomy, attention mechanics, U-shaped curve, progressive disclosure | Explaining context concepts, onboarding, first-principles reasoning |
| context-degradation | Lost-in-middle, context poisoning, distraction, attention failure patterns | Diagnosing attention failures, degraded agent performance |
| context-compression | Compaction strategies, session summarization, trajectory compression | Preserving state while reducing conversation size |

Architectural Skills

| Skill | Description | Activation Scenario |
| --- | --- | --- |
| multi-agent-patterns | Orchestrator, peer-to-peer, hierarchical coordination; handoff design | Choosing coordination patterns, designing multi-agent handoffs |
| memory-systems | Short-term, long-term, graph-based memory architectures | Persisting cross-session knowledge, entity tracking |
| tool-design | Agent-tool contracts, description writing, error recovery | Defining tools, improving tool descriptions |
| filesystem-context | Dynamic context discovery, tool-output offloading, plan persistence via files | Moving large context to files, coordinating agents through shared artifacts |
| hosted-agents (NEW v3.0) | Background coding agents, sandboxed VMs, pre-built images, multiplayer support | Building agents that run in remote sandboxes or background environments |

Operational Skills

| Skill | Description | Activation Scenario |
| --- | --- | --- |
| context-optimization | Compaction, masking, caching, budget allocation | Token efficiency, retrieval precision, prefix reuse |
| latent-briefing | Task-guided KV cache compaction for worker initialization | Sharing orchestrator state with workers when runtime is controllable |
| evaluation | Deterministic checks, rubrics, regression suites, production monitoring | Quality gates, agent behavior evaluation |
| advanced-evaluation | LLM-as-a-Judge, pairwise comparison, rubric generation, bias mitigation | Using LLM judges, calibration, human-aligned quality assessment |
| harness-engineering | Autonomous loops with locked evaluators, editable surfaces, durable logs, novelty gates, rollback | Designing autonomous agent harnesses |

Methodological Skills

| Skill | Description | Activation Scenario |
| --- | --- | --- |
| project-development | LLM project design, pipeline architecture, task-model fit | Deciding if LLM is appropriate, pipeline shaping, cost estimation |
| bdi-mental-states (NEW v3.0) | BDI ontology: beliefs, desires, intentions from RDF context | Formal cognitive modeling for deliberative reasoning and explainability |

Marketplace Metadata

The marketplace.json contains:

  • 15 skill entries with full description and activate_when scenarios
  • Each skill has explicit routing: "Do not activate this skill for X — use Y instead"

Templates

  • template/ — skill template for creating new context engineering skills

Total Primitive Count

  • Skills: 15
  • Commands: 0
  • Agents: 0
  • Hooks: 0
  • MCP servers: 0
  • Scripts: Python pseudocode examples (not executable scripts)
05

Prompts

Agent Skills for Context Engineering (muratcankoylan) — Prompts

Prompt File 1: context-fundamentals SKILL.md (verbatim excerpt)

Technique: Principle enumeration with explicit anti-examples and routing directives. The skill establishes conceptual ownership and then routes to other skills for operational work — creating a skill-to-skill handoff graph.

---
name: context-fundamentals
description: This skill should be used to explain or reason about the foundational concepts
  of context engineering: what context is, the anatomy of a context window, how attention
  mechanics work, the U-shaped attention curve, why context quality matters more than
  quantity, and the mental models needed to interpret every other context-engineering
  decision.
---

# Context Engineering Fundamentals

Context is the complete state available to a language model at inference time: system
instructions, tool definitions, retrieved documents, message history, and tool outputs.
Context engineering is the discipline of curating the smallest high-signal token set
that maximizes the likelihood of desired outcomes.

Apply four principles when assembling context:

1. **Informativity over exhaustiveness** — include only what matters for the current
   decision; design systems that can retrieve additional information on demand.
2. **Position-aware placement** — place critical constraints at the beginning and end of
   context because long-context evaluations show middle-position information is less
   reliably recovered.
3. **Progressive disclosure** — load skill names and summaries at startup; load full
   content only when a skill activates for a specific task.
4. **Iterative curation** — context engineering is not a one-time prompt-writing
   exercise but an ongoing discipline applied every time content is passed to the model.

Prompting technique: Foundational-concept anchoring — the SKILL.md establishes definitions and first-principles before any operational guidance. This grounds all downstream skills in a shared vocabulary.
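The position-aware placement principle in the excerpt above can be illustrated with a minimal sketch. The function name `assemble_context` and the "Reminder of constraints" wording are hypothetical and not taken from the skill; the sketch only assumes that critical constraints should appear at both the start and the end of the assembled context.

```python
def assemble_context(critical: list[str], supporting: list[str]) -> str:
    """Place critical constraints first and restate them last.

    Hypothetical helper: supporting material (retrieved documents,
    history) goes in the middle, where recall is least reliable.
    """
    body = "\n".join(supporting)
    if not critical:
        return body
    head = "\n".join(critical)
    # Restating the constraints keeps a copy near the end of the context.
    tail = "Reminder of constraints:\n" + head
    return "\n\n".join(part for part in (head, body, tail) if part)


context = assemble_context(
    critical=["Never delete files."],
    supporting=["Retrieved doc A", "Retrieved doc B"],
)
```

Repeating the constraints costs a few tokens, which is the trade-off this placement strategy accepts in exchange for more reliable recall.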

Prompt File 2: harness-engineering SKILL.md (verbatim excerpt)

Technique: Surface classification table — a structured four-class taxonomy for agent permissions with explicit rules.

### Harness Boundary

Separate the agent from the environment it operates inside. The agent proposes actions;
the harness defines allowed surfaces, feedback, persistence, and promotion rules.

Use four surface classes:

| Surface | Examples | Rule |
| --- | --- | --- |
| Locked | Eval metric, rubric, validation script, merge policy | Agent may read and propose changes, but cannot score itself with modified rules |
| Editable | Skill draft, experiment file, prompt, config under test | Agent may mutate during the loop |
| Append-only | Results log, research thread, rejected ideas | Agent may append, not rewrite |
| Human-controlled | Merge, production deploy, credentials, destructive operations | Requires explicit human approval |

### Tight Feedback Loops

Autonomy works when feedback is fast, unambiguous, and hard to game. Karpathy's
`autoresearch` is the minimal pattern: one editable file, one locked evaluation file,
fixed wall-clock budget, one scalar metric, git rollback, and a durable results log.

Prompting technique: Permission taxonomy with rule table — uses structured tabular format to make the permission model machine-readable and unambiguous. References Karpathy's autoresearch as a concrete anchor example.
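To make the four surface classes concrete, the following is a minimal sketch of how a harness might check a proposed operation. The operation names (`read`, `write`, `append`, `propose`) and the `is_allowed` function are assumptions made for illustration; the skill itself only defines the table of rules.

```python
from enum import Enum


class Surface(Enum):
    LOCKED = "locked"
    EDITABLE = "editable"
    APPEND_ONLY = "append-only"
    HUMAN_CONTROLLED = "human-controlled"


# Operations the agent may perform without human involvement.
# "propose" means suggesting a change for review, not applying it.
AGENT_ALLOWED = {
    Surface.LOCKED: {"read", "propose"},
    Surface.EDITABLE: {"read", "write", "append"},
    Surface.APPEND_ONLY: {"read", "append"},
}


def is_allowed(surface: Surface, operation: str, human_approved: bool = False) -> bool:
    """Return True if the harness should execute the proposed operation."""
    if surface is Surface.HUMAN_CONTROLLED:
        # Every operation on this surface needs explicit approval.
        return human_approved
    return operation in AGENT_ALLOWED[surface]
```

For example, `is_allowed(Surface.LOCKED, "write")` is false, which is what prevents an agent from scoring itself against modified evaluation rules.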

09

Uniqueness

Agent Skills for Context Engineering (muratcankoylan) — Uniqueness

differs_from_seeds

Differs from all 11 seeds in fundamental purpose. While seeds like superpowers, spec-driver, and BMAD use skills to enforce workflow behaviors (TDD, git discipline, spec-writing), this repo uses skills to teach the underlying theory of agent system design. The closest structural match is superpowers (Archetype 1: skills-only), but the content is conceptual rather than operational. No seed contains a BDI mental states skill, a latent-briefing skill, or formal attention-mechanics documentation. The harness-engineering skill's surface-class taxonomy (locked/editable/append-only/human-controlled) is conceptually related to claude-flow's hive-mind architecture but expressed as a design framework rather than a runtime implementation. This repo is the only one in the entire corpus to be cited in peer-reviewed academic literature.

Most Distinctive Feature

The most distinctive feature is the academic citation and explicit research grounding. Most skill repos cite "best practices" without evidence; this one is referenced by a Peking University paper and by a multi-institution survey (eight universities and Amazon). The harness-engineering skill's locked/editable/append-only/human-controlled taxonomy is the most sophisticated permission model documented in any skill pack in the corpus.

Positioning

  • Highest-starred personal skill collection in this batch (16,030 stars)
  • Educational + operational: teaches principles AND provides activatable skills
  • Platform-agnostic: Open Plugins standard + Cursor Directory + individual copy-paste
  • Academic validation: cited in two papers, which this report reads as treating it as foundational work on static skill architecture

Observable Failure Modes

  1. Knowledge without runtime: Skills teach patterns (e.g., harness loops, memory systems) but provide no implementation. Users must build the actual systems themselves.
  2. Star count misleading: 16,030 stars may reflect educational appeal rather than production adoption. Actual user count unclear.
  3. Latent-briefing skill is extremely niche: Requires controllable worker runtime AND model compatibility for KV cache compaction — conditions that rarely hold in practice.
  4. BDI mental states complexity: The formal ontology approach (RDF, SPARQL-style reasoning) is advanced enough that few practitioners would use it directly.
  5. No hooks or validators: Zero automated quality gates — relies entirely on the AI activating the right skill at the right time.
04

Workflow

Agent Skills for Context Engineering (muratcankoylan) — Workflow

Usage Pattern

This is a knowledge/reference library, not a workflow orchestrator. The workflow is:

  1. Install — Add to Claude Code as marketplace or install individual skills
  2. Task begins — Claude auto-activates the relevant skill based on the task description matching the skill's activation scenario
  3. Skill provides guidance — The SKILL.md content provides principles, patterns, and examples relevant to the current task
  4. User/agent acts — Applies the knowledge to the actual work

Activation Routing (from marketplace.json)

The skills explicitly route to each other to prevent overlap:

From context-fundamentals:

"Do not activate this skill for operational work. The specialized skills handle the doing:

  • Diagnosing lost-in-middle, context poisoning: context-degradation
  • Reducing token cost: context-optimization
  • Compressing a long session: context-compression
  • Offloading large tool outputs: filesystem-context
  • Deciding the shape of an LLM project: project-development"

This inter-skill routing creates a skill taxonomy graph where each skill "owns" a distinct problem domain and routes others away.
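The routing directives quoted above can be pictured as a small lookup graph. This is only an illustrative sketch: the `ROUTES` dictionary and `route` function are hypothetical, and the actual marketplace.json schema is not shown in this report. The skill names and needs are the ones listed in the context-fundamentals excerpt.

```python
# Hypothetical in-memory form of the "do not activate, use Y instead" rules.
ROUTES = {
    "context-fundamentals": {
        "diagnosing lost-in-middle or context poisoning": "context-degradation",
        "reducing token cost": "context-optimization",
        "compressing a long session": "context-compression",
        "offloading large tool outputs": "filesystem-context",
        "deciding the shape of an LLM project": "project-development",
    },
}


def route(skill: str, need: str) -> str:
    """Return the skill that owns `need`, or `skill` itself if no rule redirects it."""
    return ROUTES.get(skill, {}).get(need, skill)


owner = route("context-fundamentals", "reducing token cost")
```

Here `owner` is `"context-optimization"`, while an unlisted need stays with the originating skill.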

Phase-to-Artifact Map

There are no workflow phases in the traditional sense. Each skill provides:

  • Conceptual framework
  • Decision heuristics
  • Example patterns (Python pseudocode)
  • Anti-patterns to avoid

Approval Gates

None — reference library, no approval gates.

Skill Selection Logic

From the README's activation table:

| Skill | Activate When |
| --- | --- |
| context-fundamentals | Establishing context-window mental models, onboarding |
| context-degradation | Diagnosing attention failures, degraded performance |
| context-compression | Preserving useful state while reducing conversation size |
| context-optimization | Improving token efficiency, retrieval precision |
| latent-briefing | Sharing orchestrator trajectory with workers via KV cache compaction |
| multi-agent-patterns | Choosing coordination patterns, designing handoffs |
| memory-systems | Persisting cross-session knowledge, choosing memory frameworks |
| tool-design | Defining agent-tool contracts, improving tool descriptions |
| filesystem-context | Moving large context to files, creating scratchpads |
| hosted-agents | Running coding agents in remote sandboxes |
| evaluation | Creating deterministic checks, rubrics, regression suites |
| advanced-evaluation | Using LLM judges, pairwise comparison, bias mitigation |
| harness-engineering | Designing autonomous loops with locked evaluators |
| project-development | Deciding whether LLM is appropriate, shaping pipelines |
| bdi-mental-states | Modeling beliefs, desires, intentions for agents |
06

Memory & Context

Agent Skills for Context Engineering (muratcankoylan) — Memory & Context

Irony Note

This repository teaches context and memory engineering but does not implement any of it. The skills are pure knowledge documents with no runtime memory system of their own.

What the Skills Teach

The memory-systems skill covers:

  • Short-term memory: In-context conversation history (volatile, session-scoped)
  • Long-term memory: File-based or database-backed (persistent, project or global)
  • Graph-based memory: Entity-relationship tracking (e.g., Neo4j — as in ccmemory)
  • Retrieval semantics: When and how to load memory into context

The filesystem-context skill covers:

  • Using files as a "just-in-time retrieval index" (maintain paths, not copies)
  • Tool-output offloading: replace verbose tool outputs with compact file references
  • Shared scratchpads for multi-agent coordination
  • Durable plans persisted to disk across sessions
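Tool-output offloading, as described in the list above, might look like the following sketch. The function name `offload_output`, the 500-character threshold, and the file naming scheme are all assumptions for illustration; the skill describes the pattern, not this code.

```python
import hashlib
from pathlib import Path


def offload_output(output: str, directory: Path, max_inline_chars: int = 500) -> str:
    """Return short outputs unchanged; write long ones to disk and return a reference."""
    if len(output) <= max_inline_chars:
        return output
    # A content hash gives a stable file name for identical outputs.
    digest = hashlib.sha256(output.encode("utf-8")).hexdigest()[:12]
    path = directory / f"tool-output-{digest}.txt"
    path.write_text(output, encoding="utf-8")
    preview = output[:80].replace("\n", " ")
    return f"[output offloaded to {path} ({len(output)} chars); preview: {preview}...]"
```

The agent keeps only the path and a short preview in context and can read the file later if the full output turns out to matter.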

Progressive Disclosure in This Repo

The repo itself implements the progressive disclosure pattern it teaches:

  • The marketplace.json stores skill descriptions (minimal — loaded at startup)
  • Full SKILL.md content loads only when a skill is activated
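A minimal sketch of this loading pattern follows. The `SkillRegistry` class and its `load_body` callback are hypothetical; how Claude Code or other hosts actually load skills is not described in this report.

```python
from typing import Callable


class SkillRegistry:
    """Keeps only names and descriptions until a skill is activated."""

    def __init__(self, catalog: dict[str, str], load_body: Callable[[str], str]):
        self.catalog = catalog          # name -> short description
        self._load_body = load_body     # e.g. reads skills/<name>/SKILL.md
        self._loaded: dict[str, str] = {}

    def startup_context(self) -> str:
        """Cheap listing that is always present in context."""
        return "\n".join(f"{name}: {desc}" for name, desc in self.catalog.items())

    def activate(self, name: str) -> str:
        """Load the full skill body on first use and cache it."""
        if name not in self.catalog:
            raise KeyError(name)
        if name not in self._loaded:
            self._loaded[name] = self._load_body(name)
        return self._loaded[name]
```

In a real setup `load_body` might read a SKILL.md file from disk; here it is left as a callback so the sketch stays independent of any file layout.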

State Files

None. This repo writes no runtime state files. It is a knowledge library.

Cross-Session Handoff

The filesystem-context and harness-engineering skills document how to design handoffs, but the repo itself does not implement any handoff mechanism.

07

Orchestration

Agent Skills for Context Engineering (muratcankoylan) — Orchestration

Multi-Agent

No runtime multi-agent capability. The multi-agent-patterns skill teaches multi-agent design but does not implement it.

What the Skills Teach About Orchestration

From multi-agent-patterns:

  • Orchestrator pattern: Central coordinator dispatches tasks to workers, collects results
  • Peer-to-peer: Agents communicate directly, share context via shared scratchpads
  • Hierarchical: Multi-level trees with specialization at each layer
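As an illustration of the orchestrator pattern listed above, the sketch below dispatches tasks to a worker function in parallel and collects the results. The `orchestrate` function is hypothetical, and a plain callable stands in for a worker agent.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def orchestrate(tasks: list[str], worker: Callable[[str], str], max_workers: int = 4) -> dict[str, str]:
    """Central coordinator: fan tasks out to workers, gather results by task."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with tasks.
        results = list(pool.map(worker, tasks))
    return dict(zip(tasks, results))


summary = orchestrate(["parse logs", "run tests"], lambda task: f"done: {task}")
```

Peer-to-peer and hierarchical designs differ mainly in who holds this coordinating role, not in the fan-out mechanics shown here.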

From harness-engineering:

  • Agent-as-proposer: agent proposes actions; harness decides what to execute
  • Locked/editable/append-only/human-controlled surface classification
  • Novelty gates: prevent the agent from repeatedly trying similar failed approaches
  • Rollback: git-based state recovery when the agent corrupts the objective
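The harness ideas above can be combined into one small loop. This is a sketch under simplifying assumptions: candidates are hashable values, the novelty gate rejects only exact repeats (the skill speaks of similar approaches), a step count replaces a wall-clock budget, and keeping the previous best in memory stands in for git rollback. All names are hypothetical.

```python
def run_loop(propose, evaluate, initial, max_steps=10):
    """Agent proposes; harness scores with a locked evaluator and decides."""
    best, best_score = initial, evaluate(initial)
    seen = {initial}
    log = []  # append-only record of every attempt
    for step in range(max_steps):
        candidate = propose(best)
        if candidate in seen:
            # Novelty gate: do not spend budget on a repeated idea.
            log.append((step, candidate, None, "rejected: not novel"))
            continue
        seen.add(candidate)
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
            log.append((step, candidate, score, "promoted"))
        else:
            # Rollback: the previous best remains the working state.
            log.append((step, candidate, score, "rolled back"))
    return best, best_score, log
```

The agent never touches `evaluate` or the log directly, which mirrors the locked and append-only surfaces in the skill's taxonomy.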

Execution Mode

One-shot or interactive — skills activate on demand when relevant to the current task.

Isolation Mechanism

None for this repo. The harness-engineering skill teaches container and git-based isolation patterns.

Prompt Chaining

No — each skill activation is independent. Skills route to each other but do not pass artifacts between them.

Consensus

None. The advanced-evaluation skill covers LLM-as-a-Judge and pairwise comparison as evaluation patterns, which could be used in a consensus architecture, but this repo does not implement consensus.
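As a hedged illustration of pairwise comparison with bias mitigation, the sketch below asks a judge twice with the answer order swapped and accepts a winner only when both orderings agree. Position swapping is one common mitigation for position bias; whether the skill prescribes exactly this is not confirmed here. The `judge` callable is hypothetical and is assumed to return `"first"` or `"second"`.

```python
def pairwise_verdict(judge, a: str, b: str) -> str:
    """Return "a", "b", or "tie" after judging both presentation orders."""
    a_wins_original = judge(a, b) == "first"
    a_wins_swapped = judge(b, a) == "second"
    if a_wins_original and a_wins_swapped:
        return "a"
    if not a_wins_original and not a_wins_swapped:
        return "b"
    # The verdict flipped with position, so treat it as unreliable.
    return "tie"
```

A judge that always prefers whichever answer it sees first produces a tie under this scheme, which is the intended effect.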

08

UI & CLI Surface

Agent Skills for Context Engineering (muratcankoylan) — UI & CLI Surface

CLI Binary

None.

Local UI

None.

IDE Integration

  • Claude Code: Plugin marketplace install
  • Cursor: Listed on Cursor Plugin Directory
  • Open Plugins standard: .plugin/plugin.json for any conformant agent tool
  • Individual install: Copy SKILL.md to .claude/skills/

Observability

None — pure knowledge library. No audit log, no metrics.

Documentation

  • README.md with full skill table and activation scenarios
  • DeepWiki: https://deepwiki.com/muratcankoylan/Agent-Skills-for-Context-Engineering
  • Cited in two academic papers (links in README)

External References Cited in Skills

The skills cite specific external research:

  • "lost-in-the-middle" phenomenon (cited with an internal claim identifier, claim-context-degradation-lost-middle-ruler)
  • Karpathy's autoresearch pattern (harness-engineering skill)
  • Prime Intellect's autonomous nanoGPT work (harness-engineering skill)
  • U-shaped attention curve (context-fundamentals skill)
  • Position encoding interpolation limits (context-fundamentals skill)
