meta-agent-teams (jbrahy)

meta-agent-teams · jbrahy/meta-agent-teams · ★ 2 · last commit 2026-03-17

Build self-improving AI agent teams via a supervised training loop: specialist agents advise, a meta-agent evolves prompts based on human feedback, and an independent auditor prevents constitutional drift.

Best when: Agent improvement requires governance (constitution + auditor), not just better prompts; maximum 30% change per cycle with documented rationale prevents cata…

Skip if: Meta-agent self-modifying its own constraints, Inferred feedback (only respond to what was actually said)

vs seeds

bmad-method: in using persona-md agent definitions and supporting multi-agent teams, but the defining difference is that meta-agent-t…

Primitive shape 1 total

Skills 1

Summary

Meta-Agent Teams (2 stars) is a framework for building self-improving AI agent teams through structured human feedback. Its core architecture has four layers: specialist agents (domain work), a meta-agent (processes feedback and proposes prompt modifications), an auditor (independently reviews every meta-agent proposal across 6 dimensions), and a constitution that no agent can modify and only the human operator can amend. The meta-agent may propose at most a 30% prompt change per cycle, must document its rationale, and is checked by the auditor for drift, regression, and cross-agent incoherence. Every change is git-committed. The repo ships 230 pre-built team templates, a Claude Code skill (team-builder), and a portable prompt (agent-team-builder.md) that works with any LLM. The shell script run-agent.sh executes individual agents against any supported LLM provider (Claude, Ollama, OpenAI-compatible). This framework is fundamentally different from all batch-24 peers: it is not about AI orchestrating a software development workflow; it is about humans iteratively improving AI agent teams through a supervised training loop. The closest seed is BMAD-METHOD (persona-md format, multi-agent teams), but the iterative-improvement and auditor-as-constitutional-check pattern has no equivalent in any seed.

Overview

Origin

Created by jbrahy (GitHub). AGPL-3.0 license. Shell-based tool runner with multi-LLM support. 230 pre-built agent teams across multiple domains.

Philosophy

"An open framework for building self-improving AI agent teams."

"Agents advise, humans execute. No agent takes autonomous action without explicit human approval."
"Everything in git. Every agent modification is a commit with a documented rationale."
"Feedback-driven improvement. Agents only change based on structured human feedback."
"Constrained evolution. Max 30% change per cycle, documented rationale, auditor review."
"No single point of control. The auditor independently reviews the meta-agent's decisions."

The framework treats agent development as an iterative, auditable process analogous to software development with code review — except the "code" being reviewed is agent prompts.

Key Principles

Agents advise, humans execute — All output is advisory; humans approve changes
Constitutional constraints — Inviolable rules the meta-agent cannot modify
Independent auditor — No entity in the system has unchecked authority
Feedback-driven — Meta-agent cannot self-optimize; only responds to human feedback
Git-backed evolution — All changes committed with rationale for diff/bisect/revert

What It Is Not

This framework does not orchestrate software development tasks. It is a meta-system for improving the prompts of agents that do other things. The "work" the framework produces is better agent prompts, not code.

Architecture

Distribution

Type: standalone-repo (bash scripts + markdown prompt files)
Install: git clone
Language: Shell (runner scripts), Markdown (agent prompts + system prompts)
License: AGPL-3.0

Team Structure (per domain)

teams/<domain>/
├── README.md
├── agents/
│   └── <agent-name>/
│       ├── system-prompt.md      # Core persona + capabilities
│       └── agent.yaml            # model, temperature, context_sources
├── meta-agent/
│   ├── system-prompt.md          # Feedback processing + evolution workflow
│   ├── agent.yaml
│   └── CHANGELOG.md
├── auditor/
│   ├── system-prompt.md          # 6-dimension independent review
│   ├── agent.yaml
│   └── CHANGELOG.md
├── shared/
│   ├── constitution.md           # Inviolable constraints
│   └── glossary.md
├── feedback/
│   └── template.md               # Structured feedback form
└── evals/
    └── baseline-scores.json      # Performance tracking
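The layout above lends itself to a quick structural check before a team is used. The following is a minimal sketch, assuming the tree shown here is the complete required set; the function name `missing_team_files` and the example team and agent names (`devops`, `sre`) are hypothetical and not part of the framework.

```python
import tempfile
from pathlib import Path

# Files every team directory is expected to contain, copied from the tree above.
REQUIRED_FILES = [
    "README.md",
    "meta-agent/system-prompt.md",
    "meta-agent/agent.yaml",
    "meta-agent/CHANGELOG.md",
    "auditor/system-prompt.md",
    "auditor/agent.yaml",
    "auditor/CHANGELOG.md",
    "shared/constitution.md",
    "shared/glossary.md",
    "feedback/template.md",
    "evals/baseline-scores.json",
]
# Files expected inside each agents/<agent-name>/ directory.
AGENT_FILES = ["system-prompt.md", "agent.yaml"]


def missing_team_files(team_dir: Path) -> list[str]:
    """Return the sorted relative paths that are missing from a team directory."""
    missing = [rel for rel in REQUIRED_FILES if not (team_dir / rel).is_file()]
    agents_dir = team_dir / "agents"
    agent_dirs = sorted(p for p in agents_dir.iterdir() if p.is_dir()) if agents_dir.is_dir() else []
    if not agent_dirs:
        missing.append("agents/<agent-name>/")
    for agent in agent_dirs:
        for name in AGENT_FILES:
            if not (agent / name).is_file():
                missing.append(f"agents/{agent.name}/{name}")
    return sorted(missing)


# Usage: build a throwaway team that lacks a constitution, then check it.
with tempfile.TemporaryDirectory() as tmp:
    team = Path(tmp) / "devops"  # hypothetical team name
    for rel in REQUIRED_FILES + ["agents/sre/system-prompt.md", "agents/sre/agent.yaml"]:
        if rel == "shared/constitution.md":
            continue
        path = team / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("placeholder\n")
    result = missing_team_files(team)

print(result)
```

A check like this only confirms that files exist; it says nothing about whether the constitution or prompts are well written.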

Repo-Level Structure

meta-agent-teams/
├── teams/                # 230 pre-built team directories
├── skill/                # Claude Code team-builder skill
│   ├── SKILL.md
│   └── references/       # Architecture docs, domain constitutions
├── prompt/
│   └── agent-team-builder.md  # LLM-agnostic portable prompt
├── bin/
│   ├── run-agent.sh      # Execute individual agents
│   └── run-cycle.sh      # Full feedback cycle (meta → auditor → commit)
├── docs/
│   └── architecture.md, getting-started.md, domain-guide.md
└── .agent-teams.env.example  # Provider config

Required Runtime

Bash
llm CLI (Simon Willison's tool) for model execution, or direct API access
Any LLM provider (Claude default; Ollama, OpenAI-compatible supported)

Config Files

.agent-teams.env — AGENT_PROVIDER, AGENT_MODEL
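To illustrate how the two settings might be consumed, here is a small sketch that reads them from env-file text. It assumes the file uses plain `KEY=VALUE` lines with optional quotes and `#` comments; the actual format of `.agent-teams.env.example` is not shown in the source, and the values below are placeholders.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of matching-style quotes.
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


# Placeholder content; the real example file may differ.
EXAMPLE = """\
# provider config (values are placeholders)
AGENT_PROVIDER=ollama
AGENT_MODEL="example-model"
"""

config = parse_env(EXAMPLE)
print(config["AGENT_PROVIDER"], config["AGENT_MODEL"])
```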

Components

Scripts (2)

Script	Purpose
`bin/run-agent.sh <team> <agent> [prompt]`	Execute a specialist agent against configured LLM provider
`bin/run-cycle.sh <team>`	Full feedback cycle: meta-agent processes feedback → auditor reviews → human approves → git commit

Skill (1, Claude Code only)

Skill	Purpose
`skill/SKILL.md` (name: `team-builder`)	Generate complete agent team repository from description

Portable Prompt (1)

File	Purpose
`prompt/agent-team-builder.md`	LLM-agnostic system prompt for building teams with any LLM

Per-Team Components (×230 teams)

System Agents per Team

Role	Component	Constraints
Specialist agents	`agents/<name>/system-prompt.md`	Domain work only; advisory output
Meta-agent	`meta-agent/system-prompt.md`	Cannot modify itself, auditor, or constitution; max 30% change per cycle
Auditor	`auditor/system-prompt.md`	Independent review only; no authority to make changes

Shared Documents per Team

Document	Purpose
`shared/constitution.md`	Inviolable constraints (ethical, regulatory, quality)
`shared/glossary.md`	Domain terminology
`feedback/template.md`	Structured human feedback form
`evals/baseline-scores.json`	Performance tracking baseline

Pre-Built Team Categories (230 total)

Engineering, Marketing, Sales, Customer Success, Product Management, Finance, HR/People, Legal, Security, Data Science, DevOps, Design Systems, Mobile, Platform, QA, Content, SEO, Competitive Intelligence, and ~200 more domain variants.

Auditor Review Dimensions (6)

Constitutional compliance
Feedback fidelity (did meta-agent address the feedback?)
Drift detection (moving away from purpose?)
Regression risk (could this break what works?)
Cross-agent coherence (agents still aligned?)
Change magnitude (within 30% threshold?)
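One way to picture an audit report is as one finding per dimension, rolled up into a single outcome. The sketch below is an assumption about shape only: the source does not say whether the auditor issues per-dimension verdicts, and the worst-wins aggregation, the `Finding` type, and the identifier spellings are hypothetical. The APPROVE / FLAG / REJECT labels are the ones the workflow section uses.

```python
from dataclasses import dataclass
from enum import IntEnum


class Verdict(IntEnum):
    # Ordered so that a larger value is a more severe outcome.
    APPROVE = 0
    FLAG = 1
    REJECT = 2


# The six review dimensions listed above, as identifiers.
DIMENSIONS = (
    "constitutional_compliance",
    "feedback_fidelity",
    "drift_detection",
    "regression_risk",
    "cross_agent_coherence",
    "change_magnitude",
)


@dataclass(frozen=True)
class Finding:
    dimension: str
    verdict: Verdict
    rationale: str


def overall_verdict(findings: list[Finding]) -> Verdict:
    """Worst-wins aggregation (an assumption); refuses reports that skip a dimension."""
    seen = {f.dimension for f in findings}
    missing = [d for d in DIMENSIONS if d not in seen]
    if missing:
        raise ValueError(f"audit report is missing dimensions: {missing}")
    return max(f.verdict for f in findings)


# Usage: every dimension passes except regression risk, which is flagged.
findings = [
    Finding(d, Verdict.FLAG if d == "regression_risk" else Verdict.APPROVE, "example rationale")
    for d in DIMENSIONS
]
result = overall_verdict(findings)
print(result.name)  # FLAG
```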

Prompts

Verbatim Excerpt 1: `team-builder` SKILL.md (Team Analysis Phase)

## When you receive a team-building request

Read `references/architecture.md` first to understand the system design philosophy. Then follow the generation workflow below.

## Step 1: Analyze the team description

Before generating anything, think through:

- **Which roles are genuinely agentic** (observe → decide → act loops) vs. which are **pipelines** (input → transform → output)?
  Label them honestly. Both are valid, but they need different architectures...
- **What are the dependency relationships?** Which agents consume another agent's output?
  Draw this graph mentally — it determines context_sources in each agent.yaml.
- **What are the domain-specific ethical constraints?** Every domain has them.
  Medical teams can't diagnose. Legal teams can't give advice as attorneys...
- **What's the feedback signal?** How will the human know if an agent's output is good?
  This determines the eval dimensions.

Prompting technique: "Think before generating" section forces a structured pre-generation analysis pass (agentic vs. pipeline classification, dependency graph, ethical constraints, feedback signal). This prevents superficial team generation where every role becomes an "agent" regardless of whether it actually needs an agentic architecture.

Verbatim Excerpt 2: `prompt/agent-team-builder.md` (Constitutional Design Pattern)

**Constitution** — This is the most important document. It defines what NO agent (including the meta-agent) can violate. Structure it as:
1. Scope of authority (advisory only unless explicitly configured otherwise)
2. Domain-specific ethical boundaries (research these — every domain has regulatory and ethical constraints)
3. Evolution rules (incremental changes, documented rationale, rollback path)
4. Data handling constraints
5. Quality/voice/brand integrity
6. Inter-agent coherence requirements

The constitution must be written so that the meta-agent cannot weaken it, the auditor enforces it, and only the human operator can amend it.

Prompting technique: Three-party constitutional design (meta-agent proposes, auditor enforces, human amends). The constitution is designed to resist circumvention: the meta-agent's prompt references a document it is instructed never to modify, so the constraint is meant to hold even when the meta-agent has write access to other agents' prompts.

Verbatim Excerpt 3: Auditor Six-Dimension Framework

**Auditor system prompt** — Six evaluation dimensions:
1. Constitutional compliance
2. Feedback fidelity (did the meta-agent actually address the feedback?)
3. Drift detection (is the agent moving away from its purpose?)
4. Regression risk (could this change break something that works?)
5. Cross-agent coherence (are agents still aligned?)
6. Change magnitude (within the 30% threshold?)

The auditor must be explicitly independent — the meta-agent has no authority over it.

Prompting technique: Structured multi-dimension evaluation rubric for the auditor. Each dimension is a specific, actionable check — not general quality assessment. The independence constraint ("meta-agent has no authority over it") is a prompt-level constraint that prevents adversarial self-modification.

Uniqueness

Differs from Seeds

No seed addresses the problem this framework solves. Closest to BMAD-METHOD in using persona-md files for agent definitions and supporting multi-agent teams. Critical difference: BMAD defines static agent personas for software development; Meta-Agent Teams defines a dynamic evolution system for improving agent personas over time. The meta-agent/auditor/constitution triad has no equivalent in any of the 11 seeds. The AGPL-3.0 license (vs. MIT in all seeds) is another distinctive signal of the framework's philosophy — it implies a "viral" sharing requirement for improvements. The 230 pre-built teams span non-engineering domains (marketing, legal, HR, finance) that no seed addresses.

Positioning

This is the only framework in this corpus designed for prompt engineering as a practice rather than for software development automation. It is a governance framework for AI agents — how to improve, constrain, and evolve them responsibly — not a workflow automation framework.

Observable Failure Modes

AGPL compliance: Teams that distribute a modified version, or let users interact with one over a network, must make their modified source available under AGPL-3.0. This requirement is easy to overlook.
Meta-agent constraint bypass: If a human modifies the meta-agent's system prompt outside the framework's governance, all constitutional constraints are void.
Auditor capture: The meta-agent is barred from modifying the auditor or the constitution, but that bar is a prompt-level rule. If proposed changes nonetheless gradually shift the auditor's evaluation criteria (e.g., via the constitution), the auditor loses independence. The framework relies on humans reading audit reports to catch this.
30% change limit fragility: The 30% constraint is specified in the meta-agent's prompt, not enforced programmatically. A sufficiently capable meta-agent could rationalize changes that technically comply while violating the spirit.
Scale limitation: ./bin/run-agent.sh executes one agent per invocation and does not parallelize, so large teams with many specialists require multiple sequential runs.
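Because the 30% limit lives only in a prompt, a reader might wonder what a programmatic check could look like. The sketch below is one possible interpretation, not the framework's method: the source never defines how "30% change" is measured, so the line-based metric (one minus the `difflib.SequenceMatcher` ratio over lines) and the names `change_fraction` and `within_limit` are assumptions.

```python
import difflib

MAX_CHANGE = 0.30  # the per-cycle limit quoted in the source


def change_fraction(old: str, new: str) -> float:
    """Share of lines that differ, measured as 1 - SequenceMatcher ratio over lines."""
    old_lines = old.splitlines()
    new_lines = new.splitlines()
    if not old_lines and not new_lines:
        return 0.0
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    return 1.0 - matcher.ratio()


def within_limit(old: str, new: str, limit: float = MAX_CHANGE) -> bool:
    """True when the proposed prompt stays at or under the change limit."""
    return change_fraction(old, new) <= limit


# Usage: a ten-line prompt with one line revised, then with five lines revised.
old_prompt = "\n".join(f"rule {i}" for i in range(10))
small_edit = old_prompt.replace("rule 3", "rule 3 (revised)")
large_edit = "\n".join(f"new rule {i}" if i < 5 else f"rule {i}" for i in range(10))

print(within_limit(old_prompt, small_edit))  # True
print(within_limit(old_prompt, large_edit))  # False
```

A line-level metric ignores how much meaning a single edited line carries, so it complements the auditor's judgment and does not replace it.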

Workflow

Agent Usage Workflow (daily)

bin/run-agent.sh <team> <agent> "task description"
→ Agent produces advisory output
→ Human evaluates output
→ Human writes structured feedback → feedback/YYYY-MM/YYYY-MM-DD.md
→ bin/run-cycle.sh <team>
→ Meta-agent categorizes feedback → diagnoses root cause → proposes changes
→ Auditor reviews proposal (6 dimensions) → APPROVE / FLAG / REJECT
→ Human reviews audit findings
→ git commit (if approved)

Team Building Workflow

Option A (Claude Code with skill):

/team-builder "Build me a DevOps team"
→ Skill generates full team repo structure
→ All files created in one pass
→ Human reviews and customizes

Option B (any LLM with portable prompt):

Paste agent-team-builder.md as system prompt
→ Describe team
→ LLM generates all files (README, constitution, meta-agent, auditor, specialist agents, feedback template, evals)

Phase/Artifact Map

Phase	Artifact
Agent run	Advisory output (markdown report)
Feedback	`feedback/YYYY-MM/YYYY-MM-DD.md`
Meta-agent processing	Proposed diffs to agent system-prompts
Auditor review	Audit report with APPROVE/FLAG/REJECT + rationale
Commit	Git commit with rationale in message
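The dated feedback path in the table can be derived mechanically from a calendar date. This is a small sketch assuming that dated feedback files sit inside the team's own `feedback/` directory next to `template.md`; the function name `feedback_path` and the team name `devops` are hypothetical.

```python
from datetime import date
from pathlib import PurePosixPath


def feedback_path(team: str, day: date) -> PurePosixPath:
    """Build teams/<team>/feedback/YYYY-MM/YYYY-MM-DD.md for a given date."""
    return PurePosixPath(
        "teams", team, "feedback", f"{day:%Y-%m}", f"{day:%Y-%m-%d}.md"
    )


# Usage with a hypothetical team name.
example = feedback_path("devops", date(2026, 3, 17))
print(example)  # teams/devops/feedback/2026-03/2026-03-17.md
```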

Approval Gates

Gate	Who	Type
Auditor review	Auditor agent	APPROVE / FLAG / REJECT
Commit approval	Human	explicit git commit
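The two gates can be read as a simple conjunction. The sketch below assumes that a REJECT from the auditor blocks the commit outright and that a FLAG still allows it once the human approves; the source does not state whether a human may override a REJECT, so treat that rule, and the function name `may_commit`, as hypothetical.

```python
VERDICTS = {"APPROVE", "FLAG", "REJECT"}


def may_commit(auditor_verdict: str, human_approved: bool) -> bool:
    """Both gates must pass: explicit human approval and no auditor REJECT (assumed rule)."""
    if auditor_verdict not in VERDICTS:
        raise ValueError(f"unknown auditor verdict: {auditor_verdict!r}")
    return human_approved and auditor_verdict != "REJECT"


# Usage: a flagged proposal proceeds only with explicit human approval.
print(may_commit("FLAG", human_approved=True))    # True
print(may_commit("FLAG", human_approved=False))   # False
print(may_commit("REJECT", human_approved=True))  # False
```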

Feedback-to-Evolution Loop (4-step meta-agent pipeline)

Categorize feedback (which agent, type, severity)
Diagnose root cause (why did agent produce that output?)
Propose modifications (specific diffs + rationale + side effects)
Document in CHANGELOG

Constraints: max 30% prompt change per cycle, one variable at a time, preserve what works, cross-agent coherence check, never infer feedback not given.

Memory & Context

State Storage

Type: File-based (git-committed markdown + YAML + JSON)
Persistence: Project + global (per-team directory, committed to git)

State Files per Team

File	Content
`agents/<name>/system-prompt.md`	Current agent system prompt (evolves via meta-agent)
`agents/<name>/agent.yaml`	Model, temperature, context_sources config
`agents/<name>/CHANGELOG.md`	History of prompt modifications with rationale
`meta-agent/CHANGELOG.md`	History of meta-agent's own changes (it can't modify itself)
`auditor/CHANGELOG.md`	Auditor review history
`shared/constitution.md`	Inviolable constraints
`feedback/YYYY-MM/YYYY-MM-DD.md`	Structured human feedback forms
`evals/baseline-scores.json`	Performance baselines

Git as Memory

All prompt changes are git commits. The "memory" of what worked and what didn't is the git history. Teams can:

git diff to see prompt evolution
git bisect to find when a regression was introduced
git revert a bad meta-agent update
Branch for experimental prompt changes

Context Sources

Each agent.yaml defines context_sources — which other agents' outputs this agent reads before producing its own. This creates a declared dependency graph at the prompt-configuration level.
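A declared dependency graph of this kind implies a run order: an agent should run after the agents whose output it reads. The sketch below assumes `context_sources` is a list of agent names, which the source does not confirm; the agent names and the `run_order` helper are hypothetical. It uses the standard-library `graphlib.TopologicalSorter`.

```python
from graphlib import CycleError, TopologicalSorter


def run_order(context_sources: dict[str, list[str]]) -> list[str]:
    """Order agents so each one runs after every agent it reads from."""
    # TopologicalSorter expects a mapping of node -> predecessors, which is
    # the same shape as agent -> context_sources.
    return list(TopologicalSorter(context_sources).static_order())


# Usage: a three-agent chain with hypothetical names.
team = {
    "researcher": [],
    "writer": ["researcher"],
    "editor": ["writer"],
}
order = run_order(team)
print(order)  # ['researcher', 'writer', 'editor']

# Circular context_sources cannot be ordered and raise CycleError.
try:
    run_order({"a": ["b"], "b": ["a"]})
    has_cycle = False
except CycleError:
    has_cycle = True
print(has_cycle)  # True
```

Since run-agent.sh executes one agent at a time, an ordering like this would only tell the human which agent to invoke next.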

Cross-Session State

The team's state (agent prompts, changelogs, feedback) persists entirely in git. Any new Claude Code session that reads the team directory has full context. No external database required.

Orchestration

Multi-Agent Pattern

Hierarchical — three tiers: specialist agents (domain work), meta-agent (evolution), auditor (governance). The human is the orchestrator.

There is no automated orchestration — run-agent.sh executes individual agents, run-cycle.sh runs the feedback cycle, but the human decides when to run each and what feedback to provide.

Isolation

None — agents run as separate LLM calls via the llm CLI or API. No shared process, no git worktrees, no containers.

Multi-Model Support

Yes: .agent-teams.env configures AGENT_PROVIDER and AGENT_MODEL. Individual agents could use different providers by specifying them in agent.yaml. Works with Claude (default), Ollama (local), and any OpenAI-compatible API.

Execution Mode

Manual / event-driven — agents run when the human invokes bin/run-agent.sh. No continuous loop, no scheduling.

Autonomy Level

Very low — explicitly constrained:

Agents advise, humans execute
Meta-agent only responds to human feedback (cannot self-initiate evolution)
Constitution cannot be changed by agents
Every change requires human git commit

Subagent Definition Format

Persona-md — each agent has a system-prompt.md and agent.yaml. The persona files are hand-crafted, not generated by an orchestrator.

Constitutional Constraints on Meta-Agent

The meta-agent's constraints are:

Cannot modify itself, the auditor, or the constitution
Changes must be incremental (max 30% per cycle)
One variable changed at a time
Must preserve what works
Must check cross-agent coherence
Never infer feedback not given

UI, CLI & Surface

Shell Scripts (primary interface)

bin/run-agent.sh <team> <agent> [prompt]

Reads agent.yaml for model/provider config
Executes agent system-prompt against configured LLM via llm CLI (or direct API)
Outputs agent response to terminal

bin/run-cycle.sh <team>

Full feedback cycle: meta-agent → auditor → human review → commit
Interactive (human reviews and approves)

Claude Code Skill (optional)

When using Claude Code, skill/SKILL.md provides /team-builder — a guided generation workflow for creating complete team repositories.

Portable Prompt (LLM-agnostic)

prompt/agent-team-builder.md — paste into any LLM's system prompt or first message, describe your team, get the full file tree back.

No Dashboard / No TUI

The framework has no dedicated UI. Teams are files; the "UI" is a text editor and terminal.

Cross-Tool Portability

Maximum — run-agent.sh uses Simon Willison's llm CLI which works with Claude, Ollama, OpenAI, and any configured backend. Agent system prompts are pure markdown — paste them into any LLM directly.

Git as UI

The git log is the primary observability surface. git log --all --oneline on a team directory shows the evolution history. git diff HEAD~1 shows what the meta-agent changed; the commit message records why.

Related frameworks

same archetype · same primary tool · same memory type

alirezarezvani/claude-skills ★ 16k

A18 Self-evolving

313+ skills for 12 AI tools covering engineering, marketing, C-level advisory, compliance, research, and finance — all from one…

MoAI-ADK ★ 1.0k

A18 Self-evolving

Implements Harness Engineering as a Go-binary-installed Claude Code environment with auto-TDD/DDD methodology selection, 20-event…

REAP (c-d-cc/reap) ★ 41

A18 Self-evolving

Prevent context loss, scattered development, and forgotten lessons through a generation-based lifecycle where AI and human…

Codex Harness MCP ★ 7

A18 Self-evolving

Gives MCP-capable coding agents a local contract-lifecycle harness with governance audits and explicit completion gates.

Browser Harness ★ 14k

A18 Self-evolving

Thin, self-healing CDP harness connecting an LLM to the user's real Chrome browser with coordinate-first clicking and…

SwarmVault ★ 492

A18 Self-evolving

Production-grade CLI for Karpathy's LLM Wiki pattern: ingests any content into a local-first durable markdown wiki + knowledge…

Distribution

Type: standalone-repo
License: AGPL-3.0
Install: clone-and-configure

Surfaces

CLI binary: No
CLI subcmds: 0
Local UI: No
Tech stack: none

Components

Commands: 0
Skills: 1
Subagents: 0
Hooks: 0
MCP servers: 0
MCP tools: 0
Scripts: 2
Templates: 230

Workflow

Phases: 6
Approval gates: 2
Spec format: markdown
Spec storage: per-feature-folder
Delta or full: delta-diff

Orchestration

Multi-agent: Yes
Pattern: hierarchical
Max concurrent: 1
Isolation: none
Consensus: none
Prompt chaining: Yes

Multi-model

Multi-model: Yes
BYOK: Yes
Modal: text

Execution

Mode: event-driven
Crash recovery: No
Compaction: No
Session handoff: Yes
Streaming: No

Memory

Type: file-based
Persistence: global
Search: none
State files: 4 files

Quality

TDD: No
TDD mechanism: none
Self-review: adversarial-subagent

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: Yes
Audit format: structured-md
Replay: Yes

Tools

Primary: claude-code
Targets: 2
Portability: high

Signals

Stars: 2
Last commit: 2026-03-17
Maintainer: active
Quality score: 4.2/10
