meta-agent-teams (jbrahy)

meta-agent-teams · jbrahy/meta-agent-teams · ★ 2 · last commit 2026-03-17

Build self-improving AI agent teams via a supervised training loop: specialist agents advise, a meta-agent evolves prompts based on human feedback, and an independent auditor prevents constitutional drift.

Best when: Agent improvement requires governance (constitution + auditor), not just better prompts; maximum 30% change per cycle with documented rationale prevents cata…
Skip if: Meta-agent self-modifying its own constraints, Inferred feedback (only respond to what was actually said)
vs seeds
bmad-method: in using persona-md agent definitions and supporting multi-agent teams, but the defining difference is that meta-agent-t…
Primitive shape 1 total
Skills 1
00

Summary

Meta-Agent Teams (jbrahy) — Summary

Meta-Agent Teams (2 stars) is a framework for building self-improving AI agent teams through structured human feedback. Its core architecture has four layers: specialist agents (domain work), a meta-agent (processes feedback, proposes prompt modifications), an auditor (independently reviews every meta-agent proposal across 6 dimensions), and a constitution that no agent can modify and only the human operator can amend. The meta-agent can propose up to 30% prompt change per cycle, must document its rationale, and is checked by the auditor for drift, regression, and cross-agent incoherence. Every change is git-committed.

The repo ships 230 pre-built team templates, a Claude Code skill (team-builder), and a portable prompt (agent-team-builder.md) that works with any LLM. A shell script, run-agent.sh, executes individual agents against any LLM provider (Claude, Ollama, OpenAI-compatible).

This framework is fundamentally different from all batch-24 peers: it is not about AI orchestrating a software development workflow; it is about humans iteratively improving AI agent teams through a supervised training loop. The closest seed is BMAD-METHOD (persona-md format, multi-agent teams), but the iterative-improvement and auditor-as-constitutional-check pattern has no equivalent in any seed.

01

Overview

Meta-Agent Teams (jbrahy) — Overview

Origin

Created by jbrahy (GitHub). AGPL-3.0 license. Shell-based tool runner with multi-LLM support. 230 pre-built agent teams across multiple domains.

Philosophy

"An open framework for building self-improving AI agent teams."

"Agents advise, humans execute. No agent takes autonomous action without explicit human approval."
"Everything in git. Every agent modification is a commit with a documented rationale."
"Feedback-driven improvement. Agents only change based on structured human feedback."
"Constrained evolution. Max 30% change per cycle, documented rationale, auditor review."
"No single point of control. The auditor independently reviews the meta-agent's decisions."

The framework treats agent development as an iterative, auditable process analogous to software development with code review — except the "code" being reviewed is agent prompts.

Key Principles

  1. Agents advise, humans execute — All output is advisory; humans approve changes
  2. Constitutional constraints — Inviolable rules the meta-agent cannot modify
  3. Independent auditor — No entity in the system has unchecked authority
  4. Feedback-driven — Meta-agent cannot self-optimize; only responds to human feedback
  5. Git-backed evolution — All changes committed with rationale for diff/bisect/revert

What It Is Not

This framework does not orchestrate software development tasks. It is a meta-system for improving the prompts of agents that do other things. The "work" the framework produces is better agent prompts, not code.

02

Architecture

Meta-Agent Teams (jbrahy) — Architecture

Distribution

  • Type: standalone-repo (bash scripts + markdown prompt files)
  • Install: git clone
  • Language: Shell (runner scripts), Markdown (agent prompts + system prompts)
  • License: AGPL-3.0

Team Structure (per domain)

teams/<domain>/
├── README.md
├── agents/
│   └── <agent-name>/
│       ├── system-prompt.md      # Core persona + capabilities
│       └── agent.yaml            # model, temperature, context_sources
├── meta-agent/
│   ├── system-prompt.md          # Feedback processing + evolution workflow
│   ├── agent.yaml
│   └── CHANGELOG.md
├── auditor/
│   ├── system-prompt.md          # 6-dimension independent review
│   ├── agent.yaml
│   └── CHANGELOG.md
├── shared/
│   ├── constitution.md           # Inviolable constraints
│   └── glossary.md
├── feedback/
│   └── template.md               # Structured feedback form
└── evals/
    └── baseline-scores.json      # Performance tracking

Repo-Level Structure

meta-agent-teams/
├── teams/                # 230 pre-built team directories
├── skill/                # Claude Code team-builder skill
│   ├── SKILL.md
│   └── references/       # Architecture docs, domain constitutions
├── prompt/
│   └── agent-team-builder.md  # LLM-agnostic portable prompt
├── bin/
│   ├── run-agent.sh      # Execute individual agents
│   └── run-cycle.sh      # Full feedback cycle (meta → auditor → commit)
├── docs/
│   └── architecture.md, getting-started.md, domain-guide.md
└── .agent-teams.env.example  # Provider config

Required Runtime

  • Bash
  • llm CLI (Simon Willison's tool) for model execution, or direct API access
  • Any LLM provider (Claude default; Ollama, OpenAI-compatible supported)

Config Files

  • .agent-teams.env: AGENT_PROVIDER, AGENT_MODEL
03

Components

Meta-Agent Teams (jbrahy) — Components

Scripts (2)

| Script | Purpose |
| --- | --- |
| bin/run-agent.sh <team> <agent> [prompt] | Execute a specialist agent against configured LLM provider |
| bin/run-cycle.sh <team> | Full feedback cycle: meta-agent processes feedback → auditor reviews → human approves → git commit |

Skill (1, Claude Code only)

| Skill | Purpose |
| --- | --- |
| skill/SKILL.md (name: team-builder) | Generate complete agent team repository from description |

Portable Prompt (1)

| File | Purpose |
| --- | --- |
| prompt/agent-team-builder.md | LLM-agnostic system prompt for building teams with any LLM |

Per-Team Components (×230 teams)

System Agents per Team

| Role | Component | Constraints |
| --- | --- | --- |
| Specialist agents | agents/<name>/system-prompt.md | Domain work only; advisory output |
| Meta-agent | meta-agent/system-prompt.md | Cannot modify itself, auditor, or constitution; max 30% change per cycle |
| Auditor | auditor/system-prompt.md | Independent review only; no authority to make changes |

Shared Documents per Team

| Document | Purpose |
| --- | --- |
| shared/constitution.md | Inviolable constraints (ethical, regulatory, quality) |
| shared/glossary.md | Domain terminology |
| feedback/template.md | Structured human feedback form |
| evals/baseline-scores.json | Performance tracking baseline |

Pre-Built Team Categories (230 total)

Engineering, Marketing, Sales, Customer Success, Product Management, Finance, HR/People, Legal, Security, Data Science, DevOps, Design Systems, Mobile, Platform, QA, Content, SEO, Competitive Intelligence, and ~200 more domain variants.

Auditor Review Dimensions (6)

  1. Constitutional compliance
  2. Feedback fidelity (did meta-agent address the feedback?)
  3. Drift detection (moving away from purpose?)
  4. Regression risk (could this break what works?)
  5. Cross-agent coherence (agents still aligned?)
  6. Change magnitude (within 30% threshold?)
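
The six dimensions above lend themselves to a structured audit report. The following is a minimal sketch, not the repository's implementation: the dimension identifiers, the `Finding` type, and the "most severe verdict wins" aggregation rule are all assumptions made for illustration. Only the dimension names and the APPROVE / FLAG / REJECT outcomes come from the framework's description.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    APPROVE = "APPROVE"
    FLAG = "FLAG"
    REJECT = "REJECT"


# Identifiers are illustrative; they mirror the six review dimensions listed above.
DIMENSIONS = (
    "constitutional_compliance",
    "feedback_fidelity",
    "drift_detection",
    "regression_risk",
    "cross_agent_coherence",
    "change_magnitude",
)


@dataclass(frozen=True)
class Finding:
    dimension: str
    verdict: Verdict
    note: str = ""


def overall_verdict(findings):
    """Combine per-dimension findings into one verdict.

    Assumed rule (not taken from the source): every dimension must be
    reviewed, and the most severe per-dimension verdict wins.
    """
    seen = {f.dimension for f in findings}
    missing = [d for d in DIMENSIONS if d not in seen]
    if missing:
        raise ValueError(f"unreviewed dimensions: {missing}")
    verdicts = {f.verdict for f in findings}
    if Verdict.REJECT in verdicts:
        return Verdict.REJECT
    if Verdict.FLAG in verdicts:
        return Verdict.FLAG
    return Verdict.APPROVE
```

Requiring a finding for every dimension makes a skipped check fail loudly instead of passing silently.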
05

Prompts

Meta-Agent Teams (jbrahy) — Prompts

Verbatim Excerpt 1: team-builder SKILL.md (Team Analysis Phase)

## When you receive a team-building request

Read `references/architecture.md` first to understand the system design philosophy. Then follow the generation workflow below.

## Step 1: Analyze the team description

Before generating anything, think through:

- **Which roles are genuinely agentic** (observe → decide → act loops) vs. which are **pipelines** (input → transform → output)?
  Label them honestly. Both are valid, but they need different architectures...
- **What are the dependency relationships?** Which agents consume another agent's output?
  Draw this graph mentally — it determines context_sources in each agent.yaml.
- **What are the domain-specific ethical constraints?** Every domain has them.
  Medical teams can't diagnose. Legal teams can't give advice as attorneys...
- **What's the feedback signal?** How will the human know if an agent's output is good?
  This determines the eval dimensions.

Prompting technique: "Think before generating" section forces a structured pre-generation analysis pass (agentic vs. pipeline classification, dependency graph, ethical constraints, feedback signal). This prevents superficial team generation where every role becomes an "agent" regardless of whether it actually needs an agentic architecture.


Verbatim Excerpt 2: prompt/agent-team-builder.md (Constitutional Design Pattern)

**Constitution** — This is the most important document. It defines what NO agent (including the meta-agent) can violate. Structure it as:
1. Scope of authority (advisory only unless explicitly configured otherwise)
2. Domain-specific ethical boundaries (research these — every domain has regulatory and ethical constraints)
3. Evolution rules (incremental changes, documented rationale, rollback path)
4. Data handling constraints
5. Quality/voice/brand integrity
6. Inter-agent coherence requirements

The constitution must be written so that the meta-agent cannot weaken it, the auditor enforces it, and only the human operator can amend it.

Prompting technique: Three-party constitutional design (meta-agent proposes, auditor enforces, human amends). The constitution is designed to resist circumvention: the meta-agent's prompt references a document it is instructed not to modify, so the constraint holds at the prompt level even when the meta-agent has write access to other agents' prompts.


Verbatim Excerpt 3: Auditor Six-Dimension Framework

**Auditor system prompt** — Six evaluation dimensions:
1. Constitutional compliance
2. Feedback fidelity (did the meta-agent actually address the feedback?)
3. Drift detection (is the agent moving away from its purpose?)
4. Regression risk (could this change break something that works?)
5. Cross-agent coherence (are agents still aligned?)
6. Change magnitude (within the 30% threshold?)

The auditor must be explicitly independent — the meta-agent has no authority over it.

Prompting technique: Structured multi-dimension evaluation rubric for the auditor. Each dimension is a specific, actionable check — not general quality assessment. The independence constraint ("meta-agent has no authority over it") is a prompt-level constraint that prevents adversarial self-modification.

09

Uniqueness

Meta-Agent Teams (jbrahy) — Uniqueness

Differs from Seeds

No seed addresses the problem this framework solves. Closest to BMAD-METHOD in using persona-md files for agent definitions and supporting multi-agent teams. Critical difference: BMAD defines static agent personas for software development; Meta-Agent Teams defines a dynamic evolution system for improving agent personas over time. The meta-agent/auditor/constitution triad has no equivalent in any of the 11 seeds. The AGPL-3.0 license (vs. MIT in all seeds) is another distinctive signal of the framework's philosophy — it implies a "viral" sharing requirement for improvements. The 230 pre-built teams span non-engineering domains (marketing, legal, HR, finance) that no seed addresses.

Positioning

This is the only framework in this corpus designed for prompt engineering as a practice rather than for software development automation. It is a governance framework for AI agents — how to improve, constrain, and evolve them responsibly — not a workflow automation framework.

Observable Failure Modes

  1. AGPL compliance — AGPL-3.0 requires that modified versions be released under the same license when they are distributed or offered to users over a network. Teams building commercial products on the framework may overlook this obligation.
  2. Meta-agent constraint bypass — If the meta-agent's system prompt is modified by a human (outside the framework's governance), all constitutional constraints are void.
  3. Auditor capture — If the meta-agent's proposed changes gradually shift the auditor's evaluation criteria (e.g., via the constitution), the auditor loses independence. The framework relies on humans reading audit reports to catch this.
  4. 30% change limit fragility — The 30% constraint is specified in the meta-agent's prompt, not enforced programmatically. A sufficiently capable meta-agent could rationalize changes that technically comply while violating the spirit.
  5. Scale limitation — ./bin/run-agent.sh executes one agent per invocation and does not parallelize, so a team with many specialists requires multiple sequential runs.
04

Workflow

Meta-Agent Teams (jbrahy) — Workflow

Agent Usage Workflow (daily)

bin/run-agent.sh <team> <agent> "task description"
→ Agent produces advisory output
→ Human evaluates output
→ Human writes structured feedback → feedback/YYYY-MM/YYYY-MM-DD.md
→ bin/run-cycle.sh <team>
→ Meta-agent categorizes feedback → diagnoses root cause → proposes changes
→ Auditor reviews proposal (6 dimensions) → APPROVE / FLAG / REJECT
→ Human reviews audit findings
→ git commit (if approved)

Team Building Workflow

Option A (Claude Code with skill):

/team-builder "Build me a DevOps team"
→ Skill generates full team repo structure
→ All files created in one pass
→ Human reviews and customizes

Option B (any LLM with portable prompt):

Paste agent-team-builder.md as system prompt
→ Describe team
→ LLM generates all files (README, constitution, meta-agent, auditor, specialist agents, feedback template, evals)

Phase/Artifact Map

| Phase | Artifact |
| --- | --- |
| Agent run | Advisory output (markdown report) |
| Feedback | feedback/YYYY-MM/YYYY-MM-DD.md |
| Meta-agent processing | Proposed diffs to agent system-prompts |
| Auditor review | Audit report with APPROVE/FLAG/REJECT + rationale |
| Commit | Git commit with rationale in message |
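
The feedback artifact follows a dated path convention, feedback/YYYY-MM/YYYY-MM-DD.md. As a small illustration, the sketch below derives that path from a date. The helper name `feedback_path` and the team directory `teams/devops` are hypothetical; only the path pattern is taken from the table above.

```python
from datetime import date
from pathlib import PurePosixPath


def feedback_path(team_dir, day):
    """Return <team_dir>/feedback/YYYY-MM/YYYY-MM-DD.md for the given date."""
    month = day.strftime("%Y-%m")
    return PurePosixPath(team_dir) / "feedback" / month / f"{day.isoformat()}.md"


# "teams/devops" is a made-up team directory used only for this example.
print(feedback_path("teams/devops", date(2026, 3, 17)))
# teams/devops/feedback/2026-03/2026-03-17.md
```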

Approval Gates

| Gate | Who | Type |
| --- | --- | --- |
| Auditor review | Auditor agent | APPROVE / FLAG / REJECT |
| Commit approval | Human | explicit git commit |
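
The two gates can be read as a conjunction: nothing is committed unless the human approves, and the auditor's verdict informs that decision. The sketch below is one possible reading, not the behavior of run-cycle.sh. In particular, treating REJECT as blocking and requiring FLAG findings to be explicitly acknowledged are assumptions; the source only says that the human reviews the audit findings before committing.

```python
VERDICTS = {"APPROVE", "FLAG", "REJECT"}


def may_commit(auditor_verdict, human_approved, flags_acknowledged=False):
    """Decide whether a proposed prompt change may be committed."""
    if auditor_verdict not in VERDICTS:
        raise ValueError(f"unknown auditor verdict: {auditor_verdict!r}")
    if not human_approved:
        # Human gate: no change is committed without explicit approval.
        return False
    if auditor_verdict == "REJECT":
        # Assumption: a rejected proposal goes back to the meta-agent.
        return False
    if auditor_verdict == "FLAG":
        # Assumption: flagged findings must be acknowledged by the human.
        return flags_acknowledged
    return True
```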

Feedback-to-Evolution Loop (4-step meta-agent pipeline)

  1. Categorize feedback (which agent, type, severity)
  2. Diagnose root cause (why did agent produce that output?)
  3. Propose modifications (specific diffs + rationale + side effects)
  4. Document in CHANGELOG

Constraints: max 30% prompt change per cycle, one variable at a time, preserve what works, cross-agent coherence check, never infer feedback not given.
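
The source does not say how the 30% figure is measured. One plausible way to check it mechanically is a line-level diff ratio between the old and new prompt, sketched below with Python's standard `difflib`. The metric, the function names, and the line-based granularity are all assumptions; a word-level or token-level measure would give different numbers.

```python
import difflib


def change_ratio(old_prompt, new_prompt):
    """Return the share of changed lines in [0.0, 1.0].

    Uses 1 - SequenceMatcher.ratio() over lines. This is an assumed
    metric, not one defined by the framework.
    """
    matcher = difflib.SequenceMatcher(
        a=old_prompt.splitlines(),
        b=new_prompt.splitlines(),
        autojunk=False,
    )
    return 1.0 - matcher.ratio()


def within_limit(old_prompt, new_prompt, limit=0.30):
    """True when the proposed prompt stays within the per-cycle change limit."""
    return change_ratio(old_prompt, new_prompt) <= limit
```

A check like this could run before the auditor sees a proposal, so that the "change magnitude" dimension rests on a number as well as on the model's judgment.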

06

Memory Context

Meta-Agent Teams (jbrahy) — Memory & Context

State Storage

Type: File-based (git-committed markdown + YAML + JSON)
Persistence: Project + global (per-team directory, committed to git)

State Files per Team

| File | Content |
| --- | --- |
| agents/<name>/system-prompt.md | Current agent system prompt (evolves via meta-agent) |
| agents/<name>/agent.yaml | Model, temperature, context_sources config |
| agents/<name>/CHANGELOG.md | History of prompt modifications with rationale |
| meta-agent/CHANGELOG.md | History of meta-agent's own changes (it can't modify itself) |
| auditor/CHANGELOG.md | Auditor review history |
| shared/constitution.md | Inviolable constraints |
| feedback/YYYY-MM/YYYY-MM-DD.md | Structured human feedback forms |
| evals/baseline-scores.json | Performance baselines |

Git as Memory

All prompt changes are git commits. The "memory" of what worked and what didn't is the git history. Teams can:

  • git diff to see prompt evolution
  • git bisect to find when a regression was introduced
  • git revert a bad meta-agent update
  • Branch for experimental prompt changes

Context Sources

Each agent.yaml defines context_sources — which other agents' outputs this agent reads before producing its own. This creates a declared dependency graph at the prompt-configuration level.
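
Because context_sources declares which agents read from which, a run order can be derived with a topological sort. The sketch below assumes the agent.yaml files have already been parsed into a plain mapping; the agent names are hypothetical and `run_order` is not part of the repository. It relies on `graphlib` from the Python standard library (3.9+).

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical agents; each value stands in for that agent's context_sources list.
CONTEXT_SOURCES = {
    "researcher": [],
    "strategist": ["researcher"],
    "writer": ["researcher", "strategist"],
}


def run_order(context_sources):
    """Order agents so each one runs after every agent it reads from."""
    referenced = {src for sources in context_sources.values() for src in sources}
    unknown = referenced - set(context_sources)
    if unknown:
        raise ValueError(f"context_sources reference undefined agents: {sorted(unknown)}")
    # TopologicalSorter expects {node: predecessors}, which matches {agent: sources}.
    return list(TopologicalSorter(context_sources).static_order())


print(run_order(CONTEXT_SOURCES))
# ['researcher', 'strategist', 'writer']
```

A circular dependency between agents raises `CycleError`, which is a useful early signal that the declared graph cannot be executed in any order.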

Cross-Session State

The team's state (agent prompts, changelogs, feedback) persists entirely in git. Any new Claude Code session that reads the team directory has full context. No external database required.

07

Orchestration

Meta-Agent Teams (jbrahy) — Orchestration

Multi-Agent Pattern

Hierarchical — three tiers: specialist agents (domain work), meta-agent (evolution), auditor (governance). The human is the orchestrator.

There is no automated orchestration — run-agent.sh executes individual agents, run-cycle.sh runs the feedback cycle, but the human decides when to run each and what feedback to provide.

Isolation

None — agents run as separate LLM calls via the llm CLI or API. No shared process, no git worktrees, no containers.

Multi-Model Support

Yes: .agent-teams.env configures AGENT_PROVIDER and AGENT_MODEL. Different agents could use different providers by specifying them in agent.yaml. Works with Claude (default), Ollama (local), and any OpenAI-compatible API.
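
The layering described here (team-wide defaults in .agent-teams.env, optional per-agent settings in agent.yaml) can be sketched as a simple precedence rule. This is an illustration under stated assumptions: the `provider` key in agent.yaml, the "agent overrides env" precedence, and both function names are hypothetical, and the real scripts may resolve settings differently.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def resolve_model(env, agent_config):
    """Return (provider, model) for one agent.

    Assumed precedence: values in the agent's own config win over
    the team-wide AGENT_PROVIDER / AGENT_MODEL defaults.
    """
    provider = agent_config.get("provider") or env.get("AGENT_PROVIDER")
    model = agent_config.get("model") or env.get("AGENT_MODEL")
    return provider, model
```

For example, an env file that sets a default provider and model applies to every agent whose config leaves those keys out, while an agent that names its own provider and model uses those instead.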

Execution Mode

Manual / event-driven — agents run when the human invokes bin/run-agent.sh. No continuous loop, no scheduling.

Autonomy Level

Very low — explicitly constrained:

  • Agents advise, humans execute
  • Meta-agent only responds to human feedback (cannot self-initiate evolution)
  • Constitution cannot be changed by agents
  • Every change requires human git commit

Subagent Definition Format

Persona-md — each agent has a system-prompt.md and agent.yaml. The persona files are hand-crafted, not generated by an orchestrator.

Constitutional Constraints on Meta-Agent

The meta-agent's constraints are:

  1. Cannot modify itself, the auditor, or the constitution
  2. Changes must be incremental (max 30% per cycle)
  3. One variable changed at a time
  4. Must preserve what works
  5. Must check cross-agent coherence
  6. Never infer feedback not given
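
The first constraint (the meta-agent cannot modify itself, the auditor, or the constitution) is stated at the prompt level. A path-based guard is one way it could also be checked in code before a proposal is applied. The sketch below is hypothetical: the protected paths follow the per-team directory layout, but `is_protected` and `rejected_targets` are illustrative names, not functions from the repository.

```python
from pathlib import PurePosixPath

# Paths relative to a team directory, following the per-team layout.
PROTECTED = ("meta-agent", "auditor", "shared/constitution.md")


def is_protected(relative_path):
    """True if a proposed edit targets a file the meta-agent must not touch."""
    path = PurePosixPath(relative_path)
    if path.is_absolute() or ".." in path.parts:
        # Anything that could escape the team directory is treated as off-limits.
        return True
    for entry in PROTECTED:
        guard = PurePosixPath(entry)
        if path == guard or guard in path.parents:
            return True
    return False


def rejected_targets(proposed_paths):
    """Return the subset of proposed file paths that violate the constraint."""
    return [p for p in proposed_paths if is_protected(p)]
```

With this guard, a proposal touching agents/<name>/system-prompt.md passes, while one touching meta-agent/system-prompt.md or shared/constitution.md is returned as a violation.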
08

UI CLI Surface

Meta-Agent Teams (jbrahy) — UI, CLI & Surface

Shell Scripts (primary interface)

bin/run-agent.sh <team> <agent> [prompt]

  • Reads agent.yaml for model/provider config
  • Executes agent system-prompt against configured LLM via llm CLI (or direct API)
  • Outputs agent response to terminal

bin/run-cycle.sh <team>

  • Full feedback cycle: meta-agent → auditor → human review → commit
  • Interactive (human reviews and approves)

Claude Code Skill (optional)

When using Claude Code, skill/SKILL.md provides /team-builder — a guided generation workflow for creating complete team repositories.

Portable Prompt (LLM-agnostic)

prompt/agent-team-builder.md — paste into any LLM's system prompt or first message, describe your team, get the full file tree back.

No Dashboard / No TUI

The framework has no dedicated UI. Teams are files; the "UI" is a text editor and terminal.

Cross-Tool Portability

Maximum: run-agent.sh uses Simon Willison's llm CLI, which works with Claude, Ollama, OpenAI, and any configured backend. Agent system prompts are pure markdown, so they can be pasted into any LLM directly.

Git as UI

The git log is the primary observability surface. git log --all --oneline on a team directory shows the evolution history. git diff HEAD~1 shows what the meta-agent changed; the commit message records why.
