Conductor (microsoft)

conductor-microsoft · microsoft/conductor · ★ 156 · last commit 2026-05-21

Deterministic YAML-defined multi-agent workflow engine with Jinja2 routing, parallel execution, multi-model support, and a built-in real-time web dashboard.

Best when: Routing logic should be deterministic code (Jinja2 expressions), never an LLM. This makes multi-agent workflows repeatable and version-controllable.

Skip if: LLM-decided routing (all routing must be deterministic Jinja2); monolithic single-agent workflows

vs seeds

claude-flow (multi-agent orchestration with parallel execution), but microsoft/conductor is a standalone Python runtime with determi…

Primitive shape: 6 total

Commands: 5 · Skills: 1

Summary

Conductor (microsoft) — Summary

Microsoft's Conductor is a production-grade Python CLI tool (conductor run workflow.yaml) for defining and executing multi-agent workflows from YAML configuration files against the GitHub Copilot SDK and the Anthropic Claude API. It is architecturally unrelated to the CDD-methodology "conductor" variants in this batch: it is a workflow orchestration engine in which agents, routing logic, and tool configurations are described in YAML, routing is evaluated deterministically as Jinja2 expressions (no LLM in the routing loop), and runs are monitored through a real-time web dashboard. Key features include parallel execution (static groups and dynamic for-each fan-out), sub-workflow composition, script steps with exit-code routing, in-browser human-in-the-loop gates, and a reasoning.effort setting translated to each provider's native API. At v0.1.17, with 156 stars, 15 contributors, and an MIT license, it is maintained by a Microsoft team (Melty Labs adjacent) and ships as both a standalone CLI tool and a dual-purpose Claude Code / GitHub Copilot CLI skill that teaches the YAML schema. Compared to the CDD conductor variants, it is closest to the "MCP-anchored toolserver" archetype but is implemented as a Python runtime rather than an MCP server, which is the defining difference from all other conductors in this batch.

Overview

Conductor (microsoft) — Overview

Origin

Author: Microsoft (Conductor Team)
Repo: https://github.com/microsoft/conductor
Version: 0.1.17 (pyproject.toml)
License: MIT
Stars: 156 | Forks: 21 | Contributors: 15 | Last commit: 2026-05-21
Install binary: conductor (PyPI package: conductor-cli)

Philosophy

From the README:

"Conductor makes multi-agent workflows — code review pipelines, research-then-synthesize flows, plan-then-implement loops — repeatable, deterministic, and version-controlled."

Three core principles:

Repeatable — Same inputs follow the same path through the same agents.
Deterministic — Routing uses Jinja2 templates and expression evaluation. First matching condition wins. No LLM in the orchestration loop, no tokens spent deciding what runs next.
Source-controlled — Plain YAML files. Diff workflows in pull requests, version them with code, run identically locally and in CI.

This is a deliberate design reaction against "LLM-decides-routing" approaches — Conductor's orchestration logic is purely deterministic (Jinja2 expressions), making it suitable for production CI pipelines.

Use cases (from examples directory)

Code review pipelines (implement.yaml)
Parallel research workflows (parallel-research.yaml)
Plan-then-implement loops (plan.yaml, implement.yaml)
Design-code-validate cycles (design.yaml)
Multi-provider routing (multi-provider-research.yaml)
Dialog mode for uncertainty handling (dialog-mode.yaml)
Script steps with exit-code routing (script-step.yaml)

Dual-purpose: CLI + Claude Code / Copilot skill

The repo also ships a skill for Claude Code and Copilot CLI that teaches the agent the YAML schema and CLI commands — so the tool can be used both standalone and as an AI-assisted workflow builder.

Architecture

Conductor (microsoft) — Architecture

Distribution

Type: Python CLI tool (+ optional Claude Code / Copilot CLI skill)

Install methods:

# Quick install (recommended):
curl -sSfL https://aka.ms/conductor/install.sh | sh      # macOS/Linux
irm https://aka.ms/conductor/install.ps1 | iex           # Windows

# pip/uv:
uv tool install git+https://github.com/microsoft/conductor.git
pip install git+https://github.com/microsoft/conductor.git

CLI binary: conductor
Required runtime: Python >= 3.12
Package name: conductor-cli v0.1.17

Key dependencies

typer — CLI framework
rich — terminal output formatting
pydantic — workflow schema validation
ruamel.yaml — YAML parsing
jinja2 + simpleeval — expression evaluation for routing
anthropic>=0.77.0 — Anthropic SDK
github-copilot-sdk>=0.3.0 — GitHub Copilot SDK
fastapi + uvicorn + websockets — web dashboard
mcp>=1.0.0 — MCP tool support

Source directory structure

src/conductor/
├── cli/
│   ├── app.py          # Typer app + subcommand registration
│   ├── run.py          # `conductor run` command
│   ├── validate.py     # `conductor validate` command
│   ├── update.py       # `conductor update` command
│   ├── bg_runner.py    # Background mode runner
│   ├── pid.py          # PID management for bg mode
│   └── registry.py     # Registry subcommand group
├── engine/             # Workflow execution engine
├── executor/           # Agent executor (Anthropic + Copilot providers)
├── providers/          # LLM provider adapters
├── config/             # YAML schema definitions (Pydantic)
├── gates/              # Human-in-the-loop gates
├── interrupt/          # Ctrl+C / signal handling
├── mcp/                # MCP server integration
├── web/                # FastAPI web dashboard
│   ├── server.py
│   ├── frontend/       # React/JS dashboard UI
│   └── replay.py       # Replay capability
└── registry/           # Workflow registry

Workflow YAML schema

workflow:
  name: <string>
  entry_point: <agent-name>
  runtime:
    provider: copilot | anthropic
    default_model: <model-name>
    mcp_servers:
      <name>: {command, args, tools}
  input:
    <key>: {type, required, default, description}
  limits:
    max_iterations: <int>
    timeout_seconds: <int>

parallel:
  - name: <group-name>
    agents: [<list>]
    failure_mode: halt_on_error | continue_on_error
    routes:
      - to: <agent>

agents:
  - name: <string>
    model: <model-name>
    prompt: <jinja2-template>
    system_prompt: <string>
    output:
      <key>: {type: string|json|bool}
    routes:
      - condition: <jinja2-expr>
        to: <agent-name>
      - to: $end
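
A pre-flight check in the spirit of conductor validate could catch dangling route targets before any agent runs. The sketch below is an illustration only, not Conductor's validator (which, judging by the dependency list, is built on Pydantic). find_routing_errors is a hypothetical helper that takes a workflow already parsed into a Python dict and checks just two things: that entry_point names a defined step, and that every route target is either a defined step or $end.

```python
from typing import Any


def find_routing_errors(workflow: dict[str, Any]) -> list[str]:
    """Return routing problems found in a parsed workflow (hypothetical helper)."""
    # Agents and parallel groups are both valid route targets in the schema above.
    steps = workflow.get("agents", []) + workflow.get("parallel", [])
    known = {step["name"] for step in steps}
    errors: list[str] = []

    entry = workflow.get("workflow", {}).get("entry_point")
    if entry not in known:
        errors.append(f"entry_point {entry!r} is not a defined step")

    for step in steps:
        for route in step.get("routes", []):
            target = route["to"]
            if target != "$end" and target not in known:
                errors.append(f"{step['name']}: route target {target!r} is not defined")
    return errors


# Example workflow with a deliberate typo in one route target.
example = {
    "workflow": {"name": "demo", "entry_point": "planner"},
    "agents": [
        {"name": "planner", "routes": [{"to": "reviewr"}]},
        {"name": "reviewer", "routes": [{"to": "$end"}]},
    ],
}
errors = find_routing_errors(example)
```

Because routes are plain data rather than LLM decisions, this kind of check can run statically, which is one practical benefit of keeping routing in YAML.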

Target AI tools

Primary: conductor CLI (standalone)
Secondary: Claude Code + GitHub Copilot CLI (skill that teaches YAML schema)
LLM providers: GitHub Copilot SDK, Anthropic Claude

Components

Conductor (microsoft) — Components

CLI subcommands

Subcommand	Purpose
`conductor run <workflow.yaml>`	Execute a workflow; supports `--web`, `--web-bg`, `--input key=val`, `-V` verbose
`conductor validate <workflow.yaml>`	Validate YAML schema before runtime
`conductor update`	Check for newer releases and upgrade
`conductor stop`	Stop a background-mode workflow
`conductor registry`	Manage workflow registry (subcommand group)

Workflow primitives (YAML)

Primitive	Purpose
`agent`	An LLM-powered step with prompt, model, output schema, and conditional routes
`parallel group`	Concurrent execution of multiple agents with failure_mode
`for_each group`	Dynamic fan-out — runs an agent once per element of a list
`sub-workflow`	Reusable workflow embedded in another with `input_mapping`
`script step`	Shell command with exit-code or JSON-stdout routing
`dialog step`	Agent pauses for multi-turn human conversation
`human gate`	Pause for human decision with rendered Markdown and file links
`mcp_server`	Per-workflow MCP server configuration
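
To make the script step primitive concrete, here is a minimal sketch of exit-code routing. It assumes a step is a command plus a mapping from exit codes to next steps. The function name run_script_step, the routes mapping, and the default argument are hypothetical and do not reflect Conductor's actual YAML keys, which this page does not show.

```python
import subprocess
import sys


def run_script_step(command: list[str], routes: dict[int, str], default: str) -> str:
    """Run a command and choose the next step from its exit code."""
    completed = subprocess.run(command, capture_output=True, text=True)
    # Unknown exit codes fall back to the default target.
    return routes.get(completed.returncode, default)


# sys.executable keeps the example portable; a real step would run a shell command.
failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
passing = [sys.executable, "-c", "print('tests passed')"]

after_failure = run_script_step(failing, routes={0: "committer", 1: "coder"}, default="$end")
after_success = run_script_step(passing, routes={0: "committer", 1: "coder"}, default="$end")
```

Here a failing check sends the workflow back to a coder step and a passing one moves on to a committer step, with no model call involved in the decision.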

Web dashboard

Real-time FastAPI+WebSocket dashboard (--web flag):

Interactive DAG graph (zoomable, draggable)
Live agent streaming (reasoning, tool calls, outputs)
Three-pane layout: graph / agent detail / tabbed output
In-browser human gates (no terminal needed)
Per-node detail: prompt, metadata (model, tokens, cost), activity stream, output
Background mode: --web-bg + conductor stop

Example workflows (16 in `examples/`)

Example	Demonstrates
`simple-qa.yaml`	Single-agent Q&A
`parallel-research.yaml`	3 parallel researchers + synthesizer
`implement.yaml`	Epic selector → Coder → Reviewer → Committer → Plan reviewer
`design.yaml`	Design-code-validate pipeline
`multi-provider-research.yaml`	Cross-provider routing
`script-step.yaml`	Shell script integration
`dialog-mode.yaml`	Multi-turn dialog agent
`reasoning-effort.yaml`	`reasoning.effort` configuration

Skill (for Claude Code / Copilot)

plugins/conductor/skills/conductor/ — markdown skill teaching the YAML schema and CLI usage. Ships only markdown, no executables.

Prompts

Conductor (microsoft) — Prompts

Excerpt 1: `examples/implement.yaml` — Multi-model orchestration with Jinja2 routing

agents:
  - name: epic_selector
    description: Reads the plan and selects exactly one epic to implement next
    model: claude-sonnet-4.6
    input:
      - workflow.input.plan
      - workflow.input.epic?
      - committer.output?
    system_prompt: |
      You are an Implementation Planner agent. Your ONLY job is to read an implementation
      plan document and identify the SINGLE next epic that needs to be implemented.
    ...
    routes:
      - condition: "epic_selector.output.epic_id == 'DONE'"
        to: $end
      - to: coder

  - name: coder
    model: claude-opus-4-5
    description: Deep analysis, research, and implementation of one epic
    input:
      - workflow.input.plan
      - epic_selector.output.epic_id
    ...
    routes:
      - to: epic_reviewer

Technique: Jinja2 condition expressions ("epic_selector.output.epic_id == 'DONE'") for deterministic routing. Different models for different roles (Sonnet for coordination, Opus for deep implementation). Output fields explicitly typed (epic_id: {type: string}). ? suffix marks optional inputs.
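
The ? suffix can be read as "resolve this reference if it exists, otherwise skip it". The following sketch shows one way such resolution might work. It is an assumption about the behavior, not Conductor's code, and resolve_inputs and the nested-dict state layout are hypothetical.

```python
from typing import Any


def resolve_inputs(refs: list[str], state: dict[str, Any]) -> dict[str, Any]:
    """Resolve dotted references; refs ending in '?' are skipped when missing."""
    resolved: dict[str, Any] = {}
    for ref in refs:
        optional = ref.endswith("?")
        path = ref[:-1] if optional else ref
        node: Any = state
        try:
            for key in path.split("."):
                node = node[key]
        except (KeyError, TypeError):
            if optional:
                continue  # optional input that has not been produced yet
            raise KeyError(f"required input missing: {path}") from None
        resolved[path] = node
    return resolved


# On the first pass, the committer has not run and no epic was supplied.
state = {"workflow": {"input": {"plan": "plan.md"}}}
inputs = resolve_inputs(
    ["workflow.input.plan", "workflow.input.epic?", "committer.output?"], state
)
```

This matters for loops: epic_selector can list committer.output? as an input even though that output only exists from the second iteration onward.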

Excerpt 2: `examples/parallel-research.yaml` — Parallel execution with resilient failure mode

parallel:
  - name: parallel_researchers
    description: Research from multiple independent sources in parallel
    agents:
      - academic_researcher
      - web_researcher
      - technical_researcher
    failure_mode: continue_on_error
    routes:
      - to: synthesizer

agents:
  - name: planner
    model: gpt-5.2
    prompt: |
      You are a research planning expert. Create a comprehensive research plan for:
      
      Topic: {{ workflow.input.topic }}
      Depth: {{ workflow.input.depth }}

Technique: Declarative parallel groups with continue_on_error resilience. Jinja2 template interpolation in prompts ({{ workflow.input.topic }}). Agents expressed as configuration, not code. Runtime MCP server injection via mcp_servers: block.

Prompting techniques observed

YAML-native prompt composition — prompts are Jinja2 templates inline in YAML; no separate prompt files
Typed output schema — each agent declares output: {key: {type: string|json|bool}} so downstream Jinja2 can access structured values
Optional input ? suffix — lets an agent reference context that may not exist yet (for example, output from a later step in a loop) without failing on the missing value
Multi-model assignment in YAML — different agents get different models in the same workflow file
Workspace instructions auto-injection — AGENTS.md, CLAUDE.md, .github/copilot-instructions.md auto-discovered and injected into every agent's system prompt
reasoning.effort — unified low|medium|high|xhigh per-agent setting, translated to each provider's native API (Claude's thinking tokens, Copilot's o3-pro reasoning)

Uniqueness

Conductor (microsoft) — Uniqueness

differs_from_seeds

Microsoft's Conductor is architecturally distinct from all seeds and all other conductors in this batch. The closest seed is claude-flow (MCP-anchored toolserver with multi-agent orchestration), but microsoft/conductor is neither MCP-based nor a Claude Code plugin: it is a standalone Python CLI with its own execution runtime. The deterministic-routing principle (Jinja2 expressions, no LLM in the orchestration loop) directly contrasts with claude-flow's consensus protocols and superpowers' Iron Law skill activation. It shares the "multi-agent, parallel execution" dimension with claude-flow but implements it via declarative YAML rather than SQLite-backed skill queues. Relative to the seeds, it is the most novel entry: none of the 11 seeds ships a standalone CLI with a built-in web dashboard, per-agent model assignment, and parallel group orchestration.

What makes this conductor unique

Deterministic YAML-driven routing — Jinja2 expressions route agents; zero LLM tokens spent on orchestration decisions
Multi-model per workflow — YAML assigns different LLM models to different agents in the same workflow file
Built-in real-time web dashboard — FastAPI + WebSocket DAG visualization with live streaming; no other conductor variant has a local UI
For-each dynamic fan-out — execute one agent N times over a list, accumulating results
Sub-workflow composition — reusable nested workflows with input_mapping
Dual provider support — GitHub Copilot SDK and Anthropic in the same tool, with reasoning.effort unified across both
--web-bg background mode — run a workflow headlessly, stop it with conductor stop
Microsoft backing — institutional maintainer with SHA-256 install verification and Microsoft install CDN

Observable failure modes

No checkpointing: workflow failure means restart from scratch; long workflows (implement.yaml sets max_iterations to 100) can lose significant progress
YAML complexity ceiling: the implement.yaml example with 6 agents, multi-provider MCP servers, and for-each is already complex to read/debug
Python >=3.12 requirement: excludes users on older systems
No isolation: all agents write to the same working directory; parallel agents could create conflicts

Positioning

This is the only "workflow engine" in the batch — the others are development methodology plugins. microsoft/conductor competes with LangGraph, CrewAI, and Temporal for the deterministic AI workflow orchestration niche, not with Claude Code plugin frameworks.

Workflow

Conductor (microsoft) — Workflow

Workflow definition → execution model

Unlike the CDD variants, Conductor (microsoft) does not prescribe a development lifecycle. Instead, it provides an engine that executes workflows users define themselves:

1. Define workflow YAML (one-time per use case)
2. conductor validate workflow.yaml   (optional pre-flight)
3. conductor run workflow.yaml --input key=val [--web]
4. Workflow executes deterministically:
   - entry_point agent runs
   - Jinja2 conditions evaluated for routing
   - Parallel groups run concurrently
   - Human gates pause for input
   - Script steps run shell commands
   - Workflow ends at $end or max_iterations

Routing model

First-matching-condition wins: routes evaluated top-down, first condition that is truthy routes to target agent
No LLM in orchestration loop: all routing is Jinja2 expression evaluation
$end is the terminal sentinel
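
The first-match rule above can be sketched in a few lines of Python. This is a minimal illustration, not Conductor's implementation: conditions are plain Python callables standing in for Jinja2 expressions, and Route, pick_route, and the context layout are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

Context = dict[str, Any]


@dataclass
class Route:
    """One routing rule; a route with no condition always matches."""
    to: str
    condition: Optional[Callable[[Context], bool]] = None


def pick_route(routes: list[Route], ctx: Context) -> str:
    """Return the target of the first route whose condition is truthy."""
    for route in routes:  # evaluated top-down
        if route.condition is None or route.condition(ctx):
            return route.to
    raise LookupError("no route matched and no unconditional fallback was given")


# Hypothetical routes mirroring the epic_selector example.
routes = [
    Route(to="$end", condition=lambda c: c["epic_selector"]["output"]["epic_id"] == "DONE"),
    Route(to="coder"),
]

next_step = pick_route(routes, {"epic_selector": {"output": {"epic_id": "E-3"}}})
```

Because the function depends only on the routes and the context, the same agent outputs always produce the same next step, which is the property the routing model relies on.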

Approval gates

Human gates in YAML:

- name: reviewer
  human_gate:
    prompt: "Review the design and approve or reject"
    options: [approve, reject, revise]

Human gates render in-browser when --web is active; in terminal otherwise.

Parallel execution

parallel:
  - name: researchers
    agents: [academic_researcher, web_researcher, technical_researcher]
    failure_mode: continue_on_error
    routes:
      - to: synthesizer
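
The two failure_mode values map naturally onto two ways of awaiting concurrent tasks. The sketch below uses asyncio.gather to illustrate the difference. It is an assumption about the semantics rather than Conductor's engine code, and run_group and the stand-in agents are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable

AgentFn = Callable[[], Awaitable[str]]


async def run_group(agents: dict[str, AgentFn], failure_mode: str) -> dict[str, object]:
    """Run all agents concurrently and collect results by agent name."""
    names = list(agents)
    if failure_mode == "halt_on_error":
        # The first exception propagates to the caller and the group fails.
        results = await asyncio.gather(*(agents[n]() for n in names))
    elif failure_mode == "continue_on_error":
        # Exceptions are returned in place of results instead of being raised.
        results = await asyncio.gather(*(agents[n]() for n in names), return_exceptions=True)
    else:
        raise ValueError(f"unknown failure_mode: {failure_mode}")
    return dict(zip(names, results))


async def ok() -> str:
    return "findings"  # stand-in for an LLM call


async def broken() -> str:
    raise RuntimeError("source unavailable")


outputs = asyncio.run(
    run_group({"web_researcher": ok, "academic_researcher": broken}, "continue_on_error")
)
```

With continue_on_error, a downstream synthesizer would still receive the successful researchers' findings and could decide how to treat the failed entry.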

For-each (dynamic fan-out)

- name: per_issue_agent
  for_each:
    input: planner.output.issues
    agent: implementer
  routes:
    - to: $end
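
Dynamic fan-out amounts to mapping one agent over a list that is only known at run time. A minimal sketch, assuming a thread pool as the concurrency mechanism: fan_out, the max_workers parameter, and the stand-in implementer function are hypothetical, and this page does not describe how Conductor schedules for_each items internally.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def fan_out(items: list[str], agent: Callable[[str], str], max_workers: int = 4) -> list[str]:
    """Run `agent` once per item and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of `items` regardless of completion order.
        return list(pool.map(agent, items))


def implementer(issue: str) -> str:
    # Stand-in for an LLM call that handles a single issue.
    return f"patched {issue}"


# In the YAML above, this list would come from planner.output.issues.
results = fan_out(["issue-1", "issue-2", "issue-3"], implementer)
```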

Safety limits

limits:
  max_iterations: 100
  timeout_seconds: 3600
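
The two limits guard against different failure shapes: max_iterations stops routing loops, and timeout_seconds bounds wall-clock time. A minimal sketch of how an engine loop might enforce both, assuming the limits are checked before each step; run_with_limits and the returned status strings are hypothetical, and how Conductor reports a limit hit is not shown on this page.

```python
import time
from typing import Callable


def run_with_limits(
    step: Callable[[str], str],
    entry_point: str,
    max_iterations: int,
    timeout_seconds: float,
) -> tuple[str, int]:
    """Follow `step` from node to node until $end or a limit is reached."""
    deadline = time.monotonic() + timeout_seconds
    current, iterations = entry_point, 0
    while current != "$end":
        if iterations >= max_iterations:
            return "max_iterations", iterations
        if time.monotonic() >= deadline:
            return "timeout", iterations
        current = step(current)  # run the node and get the next node's name
        iterations += 1
    return "completed", iterations


# A two-node loop that never reaches $end, so the iteration cap must stop it.
loop = {"coder": "reviewer", "reviewer": "coder"}
status, count = run_with_limits(loop.__getitem__, "coder", max_iterations=100, timeout_seconds=3600)
```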

Sub-workflow composition

- name: research_step
  sub_workflow:
    path: research.yaml
    input_mapping:
      topic: "{{ workflow.input.topic }}"

Memory & Context

Conductor (microsoft) — Memory & Context

Context model

Conductor supports two context modes; each workflow selects one:

context:
  mode: explicit        # only declared inputs are available to agents
  # mode: accumulate    # alternative: all previous agents' outputs are available downstream

explicit: each agent only sees what it declares in input: — clean, predictable, minimal
accumulate: downstream agents accumulate all upstream outputs automatically — richer context, harder to reason about
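
The difference between the two modes comes down to which upstream outputs are visible when an agent's prompt is rendered. A minimal sketch under that reading: build_context is a hypothetical helper, and keying outputs by agent name is an assumption.

```python
from typing import Any


def build_context(mode: str, declared_inputs: list[str], outputs: dict[str, Any]) -> dict[str, Any]:
    """Select which upstream outputs an agent can see."""
    if mode == "accumulate":
        # Everything produced so far is visible.
        return dict(outputs)
    if mode == "explicit":
        # Only the agents named in the step's input list are visible.
        return {name: outputs[name] for name in declared_inputs if name in outputs}
    raise ValueError(f"unknown context mode: {mode}")


outputs = {"planner": {"plan": "three epics"}, "coder": {"diff": "+1 -1"}}
explicit_view = build_context("explicit", ["coder"], outputs)
accumulate_view = build_context("accumulate", ["coder"], outputs)
```

In explicit mode the prompt size stays proportional to what each agent declares, while accumulate grows with every completed step.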

State during workflow execution

In-memory: workflow state lives in the Python runtime during execution; no persistent disk state by default
Agent outputs: typed output fields (output: {key: {type: string}}) flow between agents via Jinja2 references ({{ agent_name.output.field }})
Checkpointing: no native checkpointing — if a workflow crashes, it restarts from the beginning

Workspace instructions auto-injection

If AGENTS.md, CLAUDE.md, or .github/copilot-instructions.md exist in the working directory, Conductor auto-discovers and injects them into every agent's system prompt. This provides cross-session project context without explicit configuration.
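
As a rough picture of what auto-injection involves, the sketch below looks for the three files named above and prepends any it finds to an agent's system prompt. The function name, the ordering, and the "# filename" section headers are assumptions made for illustration. Conductor's actual discovery and formatting rules are not documented on this page.

```python
from pathlib import Path

INSTRUCTION_FILES = ("AGENTS.md", "CLAUDE.md", ".github/copilot-instructions.md")


def build_system_prompt(workdir: Path, agent_prompt: str) -> str:
    """Prepend any discovered instruction files to an agent's system prompt."""
    sections = []
    for relative in INSTRUCTION_FILES:
        path = workdir / relative
        if path.is_file():
            text = path.read_text(encoding="utf-8").strip()
            sections.append(f"# {relative}\n{text}")
    sections.append(agent_prompt)
    return "\n\n".join(sections)
```

Calling build_system_prompt(Path.cwd(), "You are a reviewer.") in a directory without any of these files returns the agent prompt unchanged.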

Replay capability

The web module includes replay.py — workflow runs can be replayed for debugging. The dashboard captures the execution trace.

Update check persistence

~/.conductor/update-check.json   # Caches version check results for 24 hours
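
A 24-hour cache of this kind typically stores a timestamp next to the last seen version and treats older entries as stale. The sketch below shows that pattern. The JSON field names (checked_at, latest_version) and both function names are hypothetical, since the real file's schema is not shown here.

```python
import json
import time
from pathlib import Path
from typing import Optional

CACHE_TTL_SECONDS = 24 * 60 * 60  # 24 hours


def read_cached_version(cache_file: Path, now: Optional[float] = None) -> Optional[str]:
    """Return the cached version, or None if the cache is missing, invalid, or stale."""
    now = time.time() if now is None else now
    try:
        data = json.loads(cache_file.read_text())
    except (OSError, ValueError):
        return None
    if not isinstance(data, dict):
        return None
    if now - data.get("checked_at", 0.0) > CACHE_TTL_SECONDS:
        return None  # stale: the caller should query for releases again
    return data.get("latest_version")


def write_cached_version(cache_file: Path, version: str, now: Optional[float] = None) -> None:
    """Record the latest known version together with the time it was checked."""
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    payload = {"checked_at": time.time() if now is None else now, "latest_version": version}
    cache_file.write_text(json.dumps(payload))
```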

Memory type

Ephemeral (in-process for workflow execution): no database and no file-based workflow state between runs, apart from the update-check cache noted above. Each conductor run is a fresh execution. Contrast with CDD variants, where conductor/tracks.md persists across sessions.

Cross-session handoff

No — workflows are stateless between invocations. Context accumulation applies within a single workflow run, not across runs.

Orchestration

Conductor (microsoft) — Orchestration

Multi-agent

Yes — full multi-agent orchestration engine. Supports:

Sequential agent chains (default routing)
Parallel groups (static parallel execution)
Dynamic for-each fan-out (one agent per list element)
Sub-workflow composition (reusable nested workflows)
Script steps (shell commands in the agent graph)
Dialog agents (multi-turn conversation)
Human gates (pause for human decision)

Orchestration pattern

Parallel-fan-out + hierarchical — the most sophisticated in this batch. Workflows can combine sequential chains, parallel groups, dynamic for-each expansion, and nested sub-workflows in a single YAML file.

Deterministic routing (key differentiator)

No LLM in the orchestration loop. All routing uses Jinja2 expression evaluation:

routes:
  - condition: "epic_selector.output.status == 'DONE'"
    to: $end
  - condition: "coder.output.confidence < 0.7"
    to: reviewer
  - to: committer

This makes workflows behave like deterministic programs, not probabilistic agent graphs.

Isolation mechanism

No built-in isolation (no worktrees, no containers). Agents run against the same working directory. The workflow definition controls what files agents can read/write via tool configurations.

Multi-model

Yes — per-agent model assignment in YAML:

- name: epic_selector
  model: claude-sonnet-4.6    # coordination role
- name: coder
  model: claude-opus-4-5     # deep implementation
- name: reviewer
  model: claude-sonnet-4.6   # review

Both GitHub Copilot and Anthropic providers supported; can mix in one workflow (multi-provider-research.yaml).

Execution mode

One-shot — conductor run workflow.yaml runs to completion (or max_iterations). No background daemon by default; --web-bg enables background mode.

Crash recovery

No native checkpointing. Workflows restart from scratch on failure. Max iterations safety limit prevents infinite loops.

Context window management

context.mode: explicit — minimal; only declared inputs
context.mode: accumulate — growing context as agents complete

Consensus mechanism

None — routing is deterministic; no voting or consensus required.

Max concurrent agents

Parallel groups run all listed agents concurrently. No explicit limit configured; bounded by the runtime's parallelism.

UI & CLI Surface

Conductor (microsoft) — UI & CLI Surface

Dedicated CLI binary

Yes — conductor (package: conductor-cli).

Subcommand	Description
`conductor run <yaml> [--input k=v] [--web] [--web-bg] [-V]`	Execute workflow
`conductor validate <yaml>`	Schema validation
`conductor update [--apply]`	Version check and upgrade
`conductor stop`	Stop background workflow
`conductor registry`	Workflow registry management

The CLI is not a thin wrapper — it is the full runtime. It spawns agents, manages routing, handles parallel execution, runs the web server, and manages background processes.

Local web dashboard

Yes — built-in FastAPI + WebSocket dashboard.

Property	Value
Activate	`conductor run workflow.yaml --web`
Background	`conductor run workflow.yaml --web-bg` (prints URL, exits)
Port	dynamic (printed on launch)
Tech stack	FastAPI + uvicorn + WebSocket (backend), React/JS (frontend in `web/frontend/`)

Dashboard features

Interactive DAG graph (zoomable, draggable, animated edges showing execution flow)
Live agent streaming (reasoning, tool calls, outputs stream in real-time)
Three-pane layout: graph / agent detail / tabbed output (Log, Activity, Output)
In-browser human gates (respond to human-in-the-loop without terminal)
Per-node detail: prompt, metadata (model, tokens, cost), activity stream, output
Breadcrumb navigation into sub-workflows
Resizable panels

IDE integration

Optional: Claude Code skill (/plugin marketplace add microsoft/conductor + /plugin install conductor@conductor) teaches the YAML schema to the AI assistant. Also available for GitHub Copilot CLI (gh skill install microsoft/conductor conductor).

Observability

Rich terminal output: colored, streaming agent output via Rich library
Web dashboard: real-time workflow visualization
Verbosity control: --verbosity full|minimal|silent
Update check: cached version check every 24h with hint in terminal

Cross-platform install

macOS/Linux: curl install script (aka.ms/conductor/install.sh)
Windows: PowerShell install script (aka.ms/conductor/install.ps1)
SHA-256 integrity verification on install
conductor update --apply for in-place upgrades

Related frameworks

same archetype · same primary tool · same memory type

claude-mem (thedotmack) ★ 78k

A8 Cross-runtime harness

Background worker service captures every tool call as an observation, AI-compresses sessions, and auto-injects relevant past…

pi (badlogic/earendil) ★ 55k

A8 Cross-runtime harness

A minimal, hackable, multi-provider terminal coding agent that adapts to your workflows via npm-installable TypeScript Extensions…

Agent Skills (Addy Osmani) ★ 46k

A8 Cross-runtime harness

Encodes senior-engineer software development lifecycle as 23 auto-routed skills and 7 slash commands for any AI coding agent.

wshobson/agents Plugin Marketplace ★ 36k

A8 Cross-runtime harness

Single Markdown source for 83 domain-specialized plugins that auto-generates idiomatic artifacts for five AI coding harnesses.

TabbyML/Tabby ★ 34k

A8 Cross-runtime harness

Self-hosted AI coding assistant server (alternative to GitHub Copilot) with admin dashboard, RAG-based completions, and multi-IDE…

Compound Engineering ★ 17k

A8 Cross-runtime harness

Make each unit of engineering work compound into easier future work via brainstorm→plan→execute→review→learn cycles.

Distribution

Type: cli-tool
License: MIT
Install: one-liner
Version: 0.1.17

Surfaces

CLI binary: conductor
CLI subcmds: 5
Local UI: web-dashboard
Tech stack: FastAPI + uvicorn + WebSocket (backend); React/JS (frontend)

Components

Commands: 5
Skills: 1
Subagents: 0
Hooks: 0
MCP servers: 0
MCP tools: 0
Scripts: 2
Templates: 16

Workflow

Phases: 5
Approval gates: 1
Spec format: yaml
Spec storage: flat-files
Delta or full: whole-file

Orchestration

Multi-agent: Yes
Pattern: parallel-fan-out
Isolation: none
Consensus: none
Prompt chaining: Yes

Multi-model

Multi-model: Yes
BYOK: Yes
Modal: text

Execution

Mode: one-shot
Crash recovery: No
Compaction: No
Session handoff: No
Streaming: Yes

Memory

Type: none
Persistence: none
Search: none
State files: 1 file

Quality

TDD: No
TDD mechanism: none
Self-review: none

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: Yes
Audit format: proprietary
Replay: Yes

Tools

Primary: conductor-cli
Targets: 3
Portability: high

Signals

Stars: 156
Last commit: 2026-05-21
Contributors: 15
Maintainer: active
Quality score: 3.7/10
