Agent Skills for Context Engineering

muratcankoylan-skills · muratcankoylan/Agent-Skills-for-Context-Engineering · ★ 16k · last commit 2026-05-26

Primitive shape 15 total

Skills 15

Summary

Agent Skills for Context Engineering (muratcankoylan) — Summary

This is the highest-starred personal skill collection in this batch (16,030 stars): a 15-skill educational and operational library that teaches context engineering and agent harness engineering. Unlike most skill packs, which target feature delivery, this repo is explicitly about how to build better AI agent systems: context window management, attention mechanics, multi-agent coordination patterns, memory systems, evaluation frameworks, and autonomous harness design. The skills are organized into four tiers: foundational (context anatomy, degradation, compression), architectural (multi-agent, memory, tool design, filesystem, hosted agents), operational (optimization, latent briefing, evaluation, advanced evaluation, harness engineering), and methodological (project development, BDI mental states). The repo is cited in two peer-reviewed papers (one from Peking University and a survey from a CMU/Yale/JHU-led consortium), making it one of the few personal skill repos with academic validation.

differs_from_seeds: Differs from all 11 seeds in purpose — not an engineering workflow accelerator but a domain knowledge library about context engineering itself. Closest to superpowers (Archetype 1: skills-only) in structure, but where superpowers enforces workflow behaviors, this repo teaches underlying principles. Unlike BMAD (34 skills focused on project delivery), these 15 skills focus on meta-level agent system design. The harness-engineering skill directly competes conceptually with claude-flow's hive-mind orchestration but is a reference guide rather than a runtime.

Overview

Agent Skills for Context Engineering (muratcankoylan) — Overview

Origin

Created by Muratcan Koylan. The repository has been cited in two academic papers:

"Meta Context Engineering via Agentic Skill Evolution" (Peking University, 2025): "While static skills are well-recognized [Anthropic, 2025b; Muratcan Koylan, 2025], MCE is among the first to dynamically evolve them, bridging manual skill engineering and autonomous self-improvement."
"Agent Harness Engineering: A Survey" (CMU, Yale, JHU, NEU, Tulane, UAB, OSU, Virginia Tech, and Amazon, 2026)

Philosophy

The README opens with a foundational definition:

"Context engineering is the discipline of managing the language model's context window. Unlike prompt engineering, which focuses on crafting effective instructions, context engineering addresses the holistic curation of all information that enters the model's limited attention budget: system prompts, tool definitions, retrieved documents, message history, and tool outputs."

"The fundamental challenge is that context windows are constrained not by raw token capacity but by attention mechanics. As context length increases, models exhibit predictable degradation patterns: the 'lost-in-the-middle' phenomenon, U-shaped attention curves, and attention scarcity."

Key design principles:

Progressive Disclosure — load only names at startup; load full skill content when activated
Platform Agnosticism — transferable principles, not vendor-specific implementations
Conceptual Foundation with Practical Examples — Python pseudocode that works across environments
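
The progressive disclosure principle above can be sketched in a few lines of Python. This is a minimal illustration, not the repo's or any platform's loader: `SkillRegistry`, `SkillEntry`, and `parse_frontmatter` are hypothetical names, and the sketch assumes each skill lives at `<root>/<name>/SKILL.md` with a simple `key: value` frontmatter block like the one shown in the excerpts further down.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, Optional


@dataclass
class SkillEntry:
    name: str
    description: str
    path: Path
    body: Optional[str] = None  # full SKILL.md text, filled in lazily


def parse_frontmatter(text: str) -> Dict[str, str]:
    """Parse a minimal `key: value` block delimited by `---` lines.

    Indented continuation lines are appended to the previous key.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: Dict[str, str] = {}
    key = None
    for line in lines[1:]:
        if line.strip() == "---":
            break
        if line[:1].isspace() and key is not None:
            meta[key] += " " + line.strip()
        elif ":" in line:
            key, value = line.split(":", 1)
            key = key.strip()
            meta[key] = value.strip()
    return meta


class SkillRegistry:
    """Hypothetical registry: keeps only metadata until a skill is activated."""

    def __init__(self, root: Path) -> None:
        self.skills: Dict[str, SkillEntry] = {}
        for skill_file in sorted(root.glob("*/SKILL.md")):
            meta = parse_frontmatter(skill_file.read_text(encoding="utf-8"))
            name = meta.get("name", skill_file.parent.name)
            self.skills[name] = SkillEntry(name, meta.get("description", ""), skill_file)

    def startup_context(self) -> str:
        # Only names and descriptions enter the context at startup.
        return "\n".join(f"{s.name}: {s.description}" for s in self.skills.values())

    def activate(self, name: str) -> str:
        # The full body is read and cached on first activation.
        entry = self.skills[name]
        if entry.body is None:
            entry.body = entry.path.read_text(encoding="utf-8")
        return entry.body
```

The point of the design is that the startup cost grows with the number of skills times the length of a description, not with the total size of every skill body.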

Academic Positioning

The two citations suggest that this repository is treated as a reference point for static skill architecture. The Peking University paper cites it as an example of well-recognized static skills and contrasts that approach with dynamic skill evolution (MCE), while the CMU-led survey includes it in a broad review of agent harness engineering patterns.

Organizing Philosophy

Skills are organized to cover a complete "stack" for agent systems:

Why context matters (foundational)
How to architect multi-agent systems (architectural)
How to optimize running systems (operational)
How to design the meta-level harness (harness engineering)
How to apply formal cognitive modeling (BDI)

Architecture

Agent Skills for Context Engineering (muratcankoylan) — Architecture

Distribution

GitHub: muratcankoylan/Agent-Skills-for-Context-Engineering
Install (Claude Code): `/plugin marketplace add muratcankoylan/Agent-Skills-for-Context-Engineering`, then `/plugin install context-engineering@context-engineering-marketplace`
Install (Cursor): Listed on Cursor Plugin Directory
Install (Open Plugins): .plugin/plugin.json follows Open Plugins standard
Individual skill: Copy skills/<name>/SKILL.md to .claude/skills/
License: MIT
Primary language: Python (pseudocode examples in scripts)

Directory Structure

Agent-Skills-for-Context-Engineering/
├── .claude-plugin/          # Claude Code plugin manifest
├── .plugin/
│   └── plugin.json          # Open Plugins standard manifest
├── CLAUDE.md               # Top-level agent instructions
├── AGENTS.md               # Agent operating guide
├── SKILL.md                # Top-level skill reference
├── skills/
│   ├── context-fundamentals/
│   │   └── SKILL.md
│   ├── context-degradation/
│   │   └── SKILL.md
│   ├── context-compression/
│   │   └── SKILL.md
│   ├── context-optimization/
│   │   └── SKILL.md
│   ├── latent-briefing/
│   │   └── SKILL.md
│   ├── multi-agent-patterns/
│   │   └── SKILL.md
│   ├── memory-systems/
│   │   └── SKILL.md
│   ├── tool-design/
│   │   └── SKILL.md
│   ├── filesystem-context/
│   │   └── SKILL.md
│   ├── hosted-agents/
│   │   └── SKILL.md
│   ├── evaluation/
│   │   └── SKILL.md
│   ├── advanced-evaluation/
│   │   └── SKILL.md
│   ├── harness-engineering/
│   │   └── SKILL.md
│   ├── project-development/
│   │   └── SKILL.md
│   └── bdi-mental-states/
│       └── SKILL.md
├── .claude-plugin/
│   └── marketplace.json     # Marketplace skill catalog with activation scenarios
├── researcher/              # Research/reference materials
├── examples/                # Usage examples
├── template/                # Skill template for new skills
└── docs/                    # Documentation

Target AI Tools

Claude Code (native)
Cursor (Plugin Directory listing)
Any agent platform that conforms to the Open Plugins standard (Codex, GitHub Copilot, etc.)

Required Runtime

None — pure Markdown skills with Python pseudocode examples
No binary, no build step, no MCP server required

Components

Agent Skills for Context Engineering (muratcankoylan) — Components

Skills (15 total)

Foundational Skills

Skill	Description	Activation Scenario
`context-fundamentals`	Context anatomy, attention mechanics, U-shaped curve, progressive disclosure	Explaining context concepts, onboarding, first-principles reasoning
`context-degradation`	Lost-in-middle, context poisoning, distraction, attention failure patterns	Diagnosing attention failures, degraded agent performance
`context-compression`	Compaction strategies, session summarization, trajectory compression	Preserving state while reducing conversation size

Architectural Skills

Skill	Description	Activation Scenario
`multi-agent-patterns`	Orchestrator, peer-to-peer, hierarchical coordination; handoff design	Choosing coordination patterns, designing multi-agent handoffs
`memory-systems`	Short-term, long-term, graph-based memory architectures	Persisting cross-session knowledge, entity tracking
`tool-design`	Agent-tool contracts, description writing, error recovery	Defining tools, improving tool descriptions
`filesystem-context`	Dynamic context discovery, tool-output offloading, plan persistence via files	Moving large context to files, coordinating agents through shared artifacts
`hosted-agents` (NEW v3.0)	Background coding agents, sandboxed VMs, pre-built images, multiplayer support	Building agents that run in remote sandboxes or background environments

Operational Skills

Skill	Description	Activation Scenario
`context-optimization`	Compaction, masking, caching, budget allocation	Token efficiency, retrieval precision, prefix reuse
`latent-briefing`	Task-guided KV cache compaction for worker initialization	Sharing orchestrator state with workers when runtime is controllable
`evaluation`	Deterministic checks, rubrics, regression suites, production monitoring	Quality gates, agent behavior evaluation
`advanced-evaluation`	LLM-as-a-Judge, pairwise comparison, rubric generation, bias mitigation	Using LLM judges, calibration, human-aligned quality assessment
`harness-engineering`	Autonomous loops with locked evaluators, editable surfaces, durable logs, novelty gates, rollback	Designing autonomous agent harnesses

Methodological Skills

Skill	Description	Activation Scenario
`project-development`	LLM project design, pipeline architecture, task-model fit	Deciding if LLM is appropriate, pipeline shaping, cost estimation
`bdi-mental-states` (NEW v3.0)	BDI ontology: beliefs, desires, intentions from RDF context	Formal cognitive modeling for deliberative reasoning and explainability

Marketplace Metadata

The marketplace.json contains:

15 skill entries with full description and activate_when scenarios
Each skill has explicit routing: "Do not activate this skill for X — use Y instead"

Templates

template/ — skill template for creating new context engineering skills

Total Primitive Count

Skills: 15
Commands: 0
Agents: 0
Hooks: 0
MCP servers: 0
Scripts: Python pseudocode examples (not executable scripts)

Prompts

Agent Skills for Context Engineering (muratcankoylan) — Prompts

Prompt File 1: `context-fundamentals` SKILL.md (verbatim excerpt)

Technique: Principle enumeration with explicit anti-examples and routing directives. The skill establishes conceptual ownership and then routes to other skills for operational work — creating a skill-to-skill handoff graph.

---
name: context-fundamentals
description: This skill should be used to explain or reason about the foundational concepts
  of context engineering: what context is, the anatomy of a context window, how attention
  mechanics work, the U-shaped attention curve, why context quality matters more than
  quantity, and the mental models needed to interpret every other context-engineering
  decision.
---

# Context Engineering Fundamentals

Context is the complete state available to a language model at inference time: system
instructions, tool definitions, retrieved documents, message history, and tool outputs.
Context engineering is the discipline of curating the smallest high-signal token set
that maximizes the likelihood of desired outcomes.

Apply four principles when assembling context:

1. **Informativity over exhaustiveness** — include only what matters for the current
   decision; design systems that can retrieve additional information on demand.
2. **Position-aware placement** — place critical constraints at the beginning and end of
   context because long-context evaluations show middle-position information is less
   reliably recovered.
3. **Progressive disclosure** — load skill names and summaries at startup; load full
   content only when a skill activates for a specific task.
4. **Iterative curation** — context engineering is not a one-time prompt-writing
   exercise but an ongoing discipline applied every time content is passed to the model.

Prompting technique: Foundational-concept anchoring — the SKILL.md establishes definitions and first-principles before any operational guidance. This grounds all downstream skills in a shared vocabulary.
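
The position-aware placement principle in the excerpt can be illustrated with a small helper. This is a sketch under assumptions: `assemble_context` and its parameters are hypothetical, and restating the constraints at the end is one plausible reading of "beginning and end", not something the skill prescribes verbatim.

```python
from typing import List, Sequence


def assemble_context(
    critical: Sequence[str],
    supporting: Sequence[str],
    repeat_critical_at_end: bool = True,
) -> str:
    """Place critical constraints first, supporting material in the middle,
    and (optionally) restate the constraints at the end."""
    parts: List[str] = list(critical) + list(supporting)
    if repeat_critical_at_end and supporting:
        parts.append("Reminder of critical constraints:")
        parts.extend(critical)
    return "\n\n".join(parts)
```

Restating constraints costs tokens, so it pulls against the "informativity over exhaustiveness" principle; whether the trade is worth it probably depends on how long the middle section is.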

Prompt File 2: `harness-engineering` SKILL.md (verbatim excerpt)

Technique: Surface classification table — a structured four-class taxonomy for agent permissions with explicit rules.

### Harness Boundary

Separate the agent from the environment it operates inside. The agent proposes actions;
the harness defines allowed surfaces, feedback, persistence, and promotion rules.

Use four surface classes:

| Surface | Examples | Rule |
| --- | --- | --- |
| Locked | Eval metric, rubric, validation script, merge policy | Agent may read and propose changes, but cannot score itself with modified rules |
| Editable | Skill draft, experiment file, prompt, config under test | Agent may mutate during the loop |
| Append-only | Results log, research thread, rejected ideas | Agent may append, not rewrite |
| Human-controlled | Merge, production deploy, credentials, destructive operations | Requires explicit human approval |

### Tight Feedback Loops

Autonomy works when feedback is fast, unambiguous, and hard to game. Karpathy's
`autoresearch` is the minimal pattern: one editable file, one locked evaluation file,
fixed wall-clock budget, one scalar metric, git rollback, and a durable results log.

Prompting technique: Permission taxonomy with rule table — uses structured tabular format to make the permission model machine-readable and unambiguous. References Karpathy's autoresearch as a concrete anchor example.
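
One way to make the four surface classes enforceable is a small permission check in the harness. The sketch below is an assumption-laden illustration: the `Surface`, `Action`, and `is_allowed` names and the exact action sets are hypothetical, chosen to match the rule column of the table rather than taken from the skill itself.

```python
from enum import Enum


class Surface(Enum):
    LOCKED = "locked"
    EDITABLE = "editable"
    APPEND_ONLY = "append-only"
    HUMAN_CONTROLLED = "human-controlled"


class Action(Enum):
    READ = "read"
    WRITE = "write"    # overwrite or rewrite existing content
    APPEND = "append"


# Actions the agent may take without approval (assumed mapping).
# Proposing a change to a locked surface is treated as out-of-band,
# so the agent itself only gets READ there.
ALLOWED = {
    Surface.LOCKED: {Action.READ},
    Surface.EDITABLE: {Action.READ, Action.WRITE, Action.APPEND},
    Surface.APPEND_ONLY: {Action.READ, Action.APPEND},
    Surface.HUMAN_CONTROLLED: set(),
}


def is_allowed(surface: Surface, action: Action, human_approved: bool = False) -> bool:
    """Return True if the harness should let the agent perform the action."""
    if surface is Surface.HUMAN_CONTROLLED and human_approved:
        return True
    return action in ALLOWED[surface]
```

Keeping the mapping as data rather than branching logic means the harness, not the agent, owns the rules, which is the separation the excerpt argues for.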

Uniqueness

Agent Skills for Context Engineering (muratcankoylan) — Uniqueness

differs_from_seeds

Differs from all 11 seeds in fundamental purpose. While seeds like superpowers, spec-driver, and BMAD use skills to enforce workflow behaviors (TDD, git discipline, spec-writing), this repo uses skills to teach the underlying theory of agent system design. The closest structural match is superpowers (Archetype 1: skills-only), but the content is conceptual rather than operational. No seed contains a BDI mental states skill, a latent-briefing skill, or formal attention-mechanics documentation. The harness-engineering skill's surface-class taxonomy (locked/editable/append-only/human-controlled) is conceptually related to claude-flow's hive-mind architecture but expressed as a design framework rather than a runtime implementation. This repo is the only one in the entire corpus to be cited in peer-reviewed academic literature.

Most Distinctive Feature

The academic citations and explicit research grounding. Most skill repos cite "best practices" without evidence; this one is referenced by a Peking University paper and by a survey from an eight-university consortium with Amazon. The harness-engineering skill's locked/editable/append-only/human-controlled taxonomy is the most sophisticated permission model documented in any skill pack in the corpus.

Positioning

Highest-starred personal skill collection in this batch (16,030 stars)
Educational + operational: teaches principles AND provides activatable skills
Platform-agnostic: Open Plugins standard + Cursor Directory + individual copy-paste
Academic validation: cited in two papers, one of which names it as an example of well-recognized static skills

Observable Failure Modes

Knowledge without runtime: Skills teach patterns (e.g., harness loops, memory systems) but provide no implementation. Users must build the actual systems themselves.
Star count may mislead: 16,030 stars may reflect educational appeal rather than production adoption. The actual number of users is unclear.
Latent-briefing skill is extremely niche: Requires controllable worker runtime AND model compatibility for KV cache compaction — conditions that rarely hold in practice.
BDI mental states complexity: The formal ontology approach (RDF, SPARQL-style reasoning) is advanced enough that few practitioners would use it directly.
No hooks or validators: Zero automated quality gates — relies entirely on the AI activating the right skill at the right time.

Workflow

Agent Skills for Context Engineering (muratcankoylan) — Workflow

Usage Pattern

This is a knowledge/reference library, not a workflow orchestrator. The workflow is:

Install — Add to Claude Code as marketplace or install individual skills
Task begins — Claude auto-activates the relevant skill based on the task description matching the skill's activation scenario
Skill provides guidance — The SKILL.md content provides principles, patterns, and examples relevant to the current task
User/agent acts — Applies the knowledge to the actual work

Activation Routing (from marketplace.json)

The skills explicitly route to each other to prevent overlap:

From context-fundamentals:

"Do not activate this skill for operational work. The specialized skills handle the doing:
Diagnosing lost-in-middle, context poisoning: context-degradation
Reducing token cost: context-optimization
Compressing a long session: context-compression
Offloading large tool outputs: filesystem-context
Deciding the shape of an LLM project: project-development"

This inter-skill routing creates a skill taxonomy graph where each skill "owns" a distinct problem domain and routes others away.
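
The routing directives quoted above amount to a lookup from problem domain to owning skill. The sketch below encodes only the five routes listed in the excerpt; the `ROUTES` table and `route` function are hypothetical, and the exact-match lookup is a simplification, since in practice the model matches tasks against skill descriptions rather than fixed strings.

```python
from typing import Dict

# Routes quoted in the context-fundamentals excerpt: task -> owning skill.
ROUTES: Dict[str, str] = {
    "diagnosing lost-in-middle, context poisoning": "context-degradation",
    "reducing token cost": "context-optimization",
    "compressing a long session": "context-compression",
    "offloading large tool outputs": "filesystem-context",
    "deciding the shape of an llm project": "project-development",
}


def route(task: str, default: str = "context-fundamentals") -> str:
    """Return the skill that owns the task, falling back to the conceptual skill."""
    return ROUTES.get(task.strip().lower(), default)
```

For example, `route("Reducing token cost")` resolves to `context-optimization`, while an unlisted conceptual question falls back to `context-fundamentals`.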

Phase-to-Artifact Map

There are no workflow phases in the traditional sense. Each skill provides:

Conceptual framework
Decision heuristics
Example patterns (Python pseudocode)
Anti-patterns to avoid

Approval Gates

None — reference library, no approval gates.

Skill Selection Logic

From the README's activation table:

Skill	Activate When
`context-fundamentals`	Establishing context-window mental models, onboarding
`context-degradation`	Diagnosing attention failures, degraded performance
`context-compression`	Preserving useful state while reducing conversation size
`context-optimization`	Improving token efficiency, retrieval precision
`latent-briefing`	Sharing orchestrator trajectory with workers via KV cache compaction
`multi-agent-patterns`	Choosing coordination patterns, designing handoffs
`memory-systems`	Persisting cross-session knowledge, choosing memory frameworks
`tool-design`	Defining agent-tool contracts, improving tool descriptions
`filesystem-context`	Moving large context to files, creating scratchpads
`hosted-agents`	Running coding agents in remote sandboxes
`evaluation`	Creating deterministic checks, rubrics, regression suites
`advanced-evaluation`	Using LLM judges, pairwise comparison, bias mitigation
`harness-engineering`	Designing autonomous loops with locked evaluators
`project-development`	Deciding whether LLM is appropriate, shaping pipelines
`bdi-mental-states`	Modeling beliefs, desires, intentions for agents

Memory & Context

Agent Skills for Context Engineering (muratcankoylan) — Memory & Context

Irony Note

This repository teaches context and memory engineering but does not implement any of it. The skills are pure knowledge documents with no runtime memory system of their own.

What the Skills Teach

The memory-systems skill covers:

Short-term memory: In-context conversation history (volatile, session-scoped)
Long-term memory: File-based or database-backed (persistent, project or global)
Graph-based memory: Entity-relationship tracking (e.g., Neo4j — as in ccmemory)
Retrieval semantics: When and how to load memory into context

The filesystem-context skill covers:

Using files as a "just-in-time retrieval index" (maintain paths, not copies)
Tool-output offloading: replace verbose tool outputs with compact file references
Shared scratchpads for multi-agent coordination
Durable plans persisted to disk across sessions
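
Tool-output offloading, as described in the list above, might look roughly like the following. This is a sketch, not the skill's code: `offload_tool_output`, the size threshold, and the reference format are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def offload_tool_output(
    output: str,
    scratch_dir: Path,
    threshold: int = 1000,
    preview_chars: int = 200,
) -> str:
    """Write large tool output to a file and return a compact reference.

    Outputs at or below `threshold` characters are returned unchanged.
    """
    if len(output) <= threshold:
        return output
    scratch_dir.mkdir(parents=True, exist_ok=True)
    # A content hash gives a stable file name for identical outputs.
    digest = hashlib.sha256(output.encode("utf-8")).hexdigest()[:12]
    path = scratch_dir / f"tool-output-{digest}.txt"
    path.write_text(output, encoding="utf-8")
    preview = output[:preview_chars]
    return f"[tool output offloaded: {path} ({len(output)} chars)]\nPreview: {preview}"
```

The context then carries a path and a short preview instead of the full payload, and the agent can read the file back only if a later step needs it.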

Progressive Disclosure in This Repo

The repo itself implements the progressive disclosure pattern it teaches:

The marketplace.json stores skill descriptions (minimal — loaded at startup)
Full SKILL.md content loads only when a skill is activated

State Files

None. This repo writes no runtime state files. It is a knowledge library.

Cross-Session Handoff

The filesystem-context and harness-engineering skills document how to design handoffs, but the repo itself does not implement any handoff mechanism.

Orchestration

Agent Skills for Context Engineering (muratcankoylan) — Orchestration

Multi-Agent

No runtime multi-agent capability. The multi-agent-patterns skill teaches multi-agent design but does not implement it.

What the Skills Teach About Orchestration

From multi-agent-patterns:

Orchestrator pattern: Central coordinator dispatches tasks to workers, collects results
Peer-to-peer: Agents communicate directly, share context via shared scratchpads
Hierarchical: Multi-level trees with specialization at each layer
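
As a rough illustration of the orchestrator pattern listed above, the sketch below dispatches tasks to workers in parallel and collects the results. The `orchestrate` function and the `Worker` type are hypothetical; a real worker would wrap a model or agent call, which is replaced here by any plain callable.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# A worker takes a task description and returns its result (assumed interface).
Worker = Callable[[str], str]


def orchestrate(tasks: List[str], worker: Worker, max_workers: int = 4) -> Dict[str, str]:
    """Central coordinator: fan tasks out to workers, then gather results by task."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with tasks.
        results = list(pool.map(worker, tasks))
    return dict(zip(tasks, results))
```

Each worker sees only its own task string, so the coordinator decides how much shared context to pass along.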

From harness-engineering:

Agent-as-proposer: agent proposes actions; harness decides what to execute
Locked/editable/append-only/human-controlled surface classification
Novelty gates: prevent agent from repeatedly trying similar failed approaches
Rollback: git-based state recovery when agent corrupts objective

Execution Mode

One-shot or interactive — skills activate on demand when relevant to the current task.

Isolation Mechanism

None for this repo. The harness-engineering skill teaches container and git-based isolation patterns.

Prompt Chaining

No — each skill activation is independent. Skills route to each other but do not pass artifacts between them.

Consensus

None. The advanced-evaluation skill covers LLM-as-a-Judge and pairwise comparison as evaluation patterns, which could be used in a consensus architecture, but this repo does not implement consensus.
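
For readers unfamiliar with pairwise comparison, the sketch below shows one common way to reduce position bias: ask the judge twice with the order swapped and accept a winner only when both verdicts agree. The position-swap approach, the `pairwise_verdict` function, and the `Judge` interface are assumptions for illustration; the source does not say which mitigation the skill recommends.

```python
from typing import Callable

# A judge sees two responses in order and returns "first" or "second" (assumed interface).
Judge = Callable[[str, str], str]


def pairwise_verdict(a: str, b: str, judge: Judge) -> str:
    """Return "a", "b", or "tie" after judging both presentation orders."""
    forward = judge(a, b)    # a is shown first
    backward = judge(b, a)   # b is shown first
    a_wins_forward = forward == "first"
    a_wins_backward = backward == "second"
    if a_wins_forward and a_wins_backward:
        return "a"
    if not a_wins_forward and not a_wins_backward:
        return "b"
    # The verdict flipped with the order, so treat it as position bias.
    return "tie"
```

A judge that always prefers whichever response appears first yields a tie under this scheme, which is the behavior the swap is meant to expose.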

UI & CLI Surface

Agent Skills for Context Engineering (muratcankoylan) — UI & CLI Surface

CLI Binary

None.

Local UI

None.

IDE Integration

Claude Code: Plugin marketplace install
Cursor: Listed on Cursor Plugin Directory
Open Plugins standard: .plugin/plugin.json for any conformant agent tool
Individual install: Copy SKILL.md to .claude/skills/

Observability

None — pure knowledge library. No audit log, no metrics.

Documentation

README.md with full skill table and activation scenarios
DeepWiki: https://deepwiki.com/muratcankoylan/Agent-Skills-for-Context-Engineering
Cited in two academic papers (links in README)

External References Cited in Skills

The skills cite specific external research:

"Lost-in-the-middle" phenomenon (referenced through the internal claim ID claim-context-degradation-lost-middle-ruler)
Karpathy's autoresearch pattern (harness-engineering skill)
Prime Intellect's autonomous nanoGPT work (harness-engineering skill)
U-shaped attention curve (context-fundamentals skill)
Position encoding interpolation limits (context-fundamentals skill)

Related frameworks

same archetype · same primary tool · same memory type

MemPalace ★ 53k

A10 Memory engine

Verbatim local-first AI memory with 96.6% R@5 retrieval on LongMemEval using zero API calls — structured into a palace hierarchy…

Beads (Yegge) ★ 24k

A10 Memory engine

Dolt-powered distributed graph issue tracker where AI agents track tasks with hierarchical IDs and dependency edges, claim work…

deepagents (LangChain) ★ 23k

A10 Memory engine

Opinionated Python agent harness on top of LangGraph with sub-agents, filesystem, memory, and context compaction bundled in

agentmemory ★ 18k

A10 Memory engine

Persistent, searchable memory for AI coding agents that captures every tool interaction, compresses it via LLM, and injects…

Open Multi-Agent ★ 6.3k

A10 Memory engine

Give a natural-language goal to a coordinator agent and get a dynamically decomposed, parallelized task DAG executed by…

Basic Memory ★ 3.1k

A10 Memory engine

Gives AI agents a persistent, human-readable knowledge graph of project decisions, observations, and relations stored as plain…

Distribution

Type: claude-plugin
License: MIT
Install: multi-step

Surfaces

CLI binary: No
CLI subcmds: 0
Local UI: No
Tech stack: none

Components

Commands: 0
Skills: 15
Subagents: 0
Hooks: 0
MCP servers: 0
MCP tools: 0
Scripts: 0
Templates: 1

Workflow

Phases: 3
Approval gates: 0
Spec format: markdown
Spec storage: none
Delta or full: none

Orchestration

Multi-agent: No
Pattern: none
Max concurrent: 1
Isolation: none
Consensus: none
Prompt chaining: No

Multi-model

Multi-model: No
BYOK: Yes
Modal: text

Execution

Mode: one-shot
Crash recovery: No
Compaction: No
Session handoff: No
Streaming: No

Memory

Type: none
Persistence: none
Search: none

Quality

TDD: No
TDD mechanism: none
Self-review: none

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: No
Audit format: none
Replay: No

Tools

Primary: claude-code
Targets: 3
Portability: high

Signals

Stars: 16k
Last commit: 2026-05-26
Contributors: 1
Maintainer: active
Quality score: 0/10
