Parlant

parlant · emcie-co/parlant · ★ 18k · last commit 2026-05-24

Primitive shape

No installable primitives

Summary

Parlant — Summary

Parlant is a Python framework for building reliable customer-facing AI agents through optimized context engineering. Instead of large system prompts, Parlant builds dynamic, per-turn context windows from composable behavioral primitives: Guidelines (condition → action), Observations (conditional tool/data triggers), Journeys (multi-step SOPs), Retrievers (domain knowledge), Glossary (domain terms), and Variables (session memory). A matching engine evaluates these primitives against each conversation turn and assembles only the relevant subset into the LLM prompt — preventing the prompt overload that kills compliance in production agents. Used in production at banks (JPMorgan Chase, Slice Bank).

Compared to seeds: Parlant has no close match among the 11 seeds, though it overlaps philosophically with kiro (behavioral control in a specialized IDE) and spec-driver (structured agent behavior definition). Unlike all seeds, Parlant is a runtime conversation governance framework: it doesn't help developers write code, it governs how AI agents behave in customer conversations. Its context engineering approach (assembling only relevant instructions per turn) is technically distinct from the "big CLAUDE.md" approach all seeds use.

Overview

Parlant — Overview

Origin

Developed by Emcie Co Ltd. (emcie-co). Python, Apache-2.0. 18,084 stars and 1,537 forks as of 2026-05-26, the highest-starred framework in this batch by a significant margin. Active development happens on the develop branch. Used in production at JPMorgan Chase, Slice Bank, and Oracle AI.

Philosophy

"Build reliable customer-facing AI agents with Parlant: an interaction control harness optimized for controlled, consistent, and predictable LLM interactions."

The core insight: system prompts fail at scale because the more instructions you add, the less the LLM attends to any of them. Parlant's response: stop treating the system prompt as a document and start treating behavioral rules as individually matchable components assembled dynamically per conversation turn.

Problem Statement (from README)

"System prompts work until production complexity kicks in. The more instructions you add to a prompt, the faster your agent stops paying attention to any of them."

"Routed graphs solve the prompt-overload problem, but the more routing you add, the more fragile it becomes when faced with the chaos of natural interactions."

Solution: Context Engineering

Parlant's contextual matching engine evaluates the current turn against all defined Guidelines, Observations, and Journeys, then assembles only the matching subset into the LLM prompt. "Adding more rules makes the agent smarter, not more confused — because the engine filters context relevance, not the LLM."
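The filtering idea can be illustrated with a minimal sketch. It assumes a toy keyword predicate in place of Parlant's actual matching, and the names `Guideline`, `match_guidelines`, and `build_prompt` are hypothetical, chosen for illustration rather than taken from the Parlant SDK.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Guideline:
    """A condition -> action rule; the predicate stands in for real matching."""
    condition: Callable[[str], bool]
    action: str


def match_guidelines(turn: str, guidelines: list[Guideline]) -> list[Guideline]:
    """Return only the guidelines whose condition holds for this turn."""
    return [g for g in guidelines if g.condition(turn)]


def build_prompt(turn: str, guidelines: list[Guideline]) -> str:
    """Assemble a prompt that contains only the matched actions."""
    matched = match_guidelines(turn, guidelines)
    rules = "\n".join(f"- {g.action}" for g in matched)
    return f"Instructions:\n{rules}\n\nCustomer: {turn}"


# Hypothetical rule set; only the refund rule applies to the turn below.
guidelines = [
    Guideline(lambda t: "refund" in t.lower(), "explain the refund policy"),
    Guideline(lambda t: "angry" in t.lower(), "acknowledge and de-escalate"),
    Guideline(lambda t: "upgrade" in t.lower(), "describe premium tiers"),
]

prompt = build_prompt("I want a refund for my ticket", guidelines)
print(prompt)
```

The prompt grows with the number of matched rules, not with the number of defined rules, which is the property the quoted claim relies on.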

Manifesto Quotes from Users

"Parlant isn't just a framework. It's a high-level software that solves the conversational modeling problem head-on." — Sarthak Dalabehera, Principal Engineer, Slice Bank

"Parlant dramatically reduces the need for prompt engineering and complex flow control." — Diogo Santiago, AI Engineer, Oracle

Target Domain

Customer-facing B2C and B2B agents in regulated industries: finance, insurance, healthcare, telecom. The README explicitly mentions banks as production users.

Architecture

Parlant — Architecture

Distribution

PyPI: pip install parlant
CLI binaries: parlant (client), parlant-server (server), parlant-prepare-migration (schema migration)

Architecture

Parlant is a client-server architecture:

parlant-server: The conversation engine — runs the contextual matching engine, stores guidelines/agents, serves the API
parlant (client): Command-line client for interacting with the server; the Python SDK itself is exposed through parlant.sdk (sdk.py)
parlant-client (separate package, pip): REST client auto-generated from API

Source Structure

src/parlant/
  core/
    agents.py               # Agent definition and management
    guidelines.py           # Guideline storage and retrieval
    sessions.py             # Conversation session management
    context_variables.py    # Variable store
    glossary.py             # Domain glossary
    journeys.py             # Journey (SOP) definitions
    evaluations.py          # Evaluation framework
    engines/
      alpha/
        engine.py           # Main contextual matching engine
        guideline_matching/ # Matching algorithms
        tool_calling/       # Tool execution
        message_generator.py
        hooks.py            # Engine lifecycle hooks
        planners/           # Planning algorithms
    adapters/               # LLM provider adapters
    api/                    # REST API layer (FastAPI-style)
    persistence/            # Document database abstraction
  sdk.py                    # Public SDK interface
  bin/
    server.py               # parlant-server entry point
    client.py               # parlant client entry point

Config Files

pyproject.toml — project configuration and entry points
CLAUDE.md — developer AI instructions

Required Runtime

Python ≥ 3.10
poetry or uv for dependency management

Target AI Tools

Any LLM via adapter layer. Developer tooling: Claude Code (CLAUDE.md present).

Components

Parlant — Components

Behavioral Primitives (SDK API)

Primitive	Purpose	Example
`Agent`	An AI agent with a name, description, and `CompositionMode`	`await server.create_agent(name="Support")`
`Guideline`	A `condition → action` behavioral rule; matched per turn	`condition="customer angry", action="acknowledge and de-escalate"`
`Observation`	Conditional trigger for tools/data; activates when condition holds	`condition="uses financial terminology", tools=[research_deep_answer]`
`Journey`	Multi-step SOP with state; the agent's current step affects matching	Escalation workflow, onboarding sequence
`Retriever`	Domain knowledge source attached to context when relevant	Product catalog, policy documents
`Glossary`	Domain-specific terms injected when mentioned	"DTI" → "debt-to-income ratio"
`Variable`	Session or agent-scoped memory variable	`customer_tier: "premium"`

Context Exclusion / Priority

Concept	Purpose
`guideline.exclude(other)`	When both guidelines match, `other` is excluded from context
Dependencies	A guideline can depend on an Observation — activates only when observation holds
`MATCH_ALWAYS`	Guideline always included regardless of condition
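How these three concepts might combine can be sketched as a small resolution function. This is an illustration under stated assumptions, not Parlant's implementation: the function name `resolve_active`, the string identifiers, and the order of application (exclusions first, then dependencies) are all assumed.

```python
def resolve_active(
    matched: set[str],
    always: set[str],
    dependencies: dict[str, set[str]],
    exclusions: dict[str, set[str]],
) -> set[str]:
    """Apply MATCH_ALWAYS, then exclusions, then dependencies to the matched set."""
    candidates = matched | always

    # Exclusion: if the excluding item is a candidate, drop its targets.
    excluded: set[str] = set()
    for winner, losers in exclusions.items():
        if winner in candidates:
            excluded |= losers
    active = candidates - excluded

    # Dependency: keep an item only while all of its dependencies are active.
    # Repeat until stable so removals propagate through dependency chains.
    changed = True
    while changed:
        changed = False
        for item in list(active):
            if not dependencies.get(item, set()) <= active:
                active.discard(item)
                changed = True
    return active


# Mirrors the expert/beginner example used elsewhere in this document.
active = resolve_active(
    matched={"expert_customer", "beginner_answers"},
    always={"expert_answers"},
    dependencies={"expert_answers": {"expert_customer"}},
    exclusions={"beginner_answers": {"expert_customer"}},
)
print(active)
```

Under these assumptions, excluding the observation also silences the guideline that depends on it, so only the beginner guideline reaches the context.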

Output Modes

Mode	Behavior
Fluid Output	LLM generates free-form response from assembled context
Strict/Canned Output	Deterministic response selected from predefined options

Core Engine

Contextual Matching Engine (engines/alpha/engine.py):

1. Receives user turn
2. Evaluates all Guidelines + Journeys + Observations against turn
3. Assembles focused context window (matching subset only)
4. Calls contextually-associated tools if needed
5. Iterates tool results back through matching if needed
6. Generates message from narrowed context
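The loop above can be sketched in a few lines. This is a toy model under stated assumptions: keyword checks replace the real matcher, `GUIDELINES`, `OBSERVATIONS`, and `run_turn` are hypothetical names, and the final LLM call is omitted.

```python
from typing import Callable

# Hypothetical rule shapes: (keyword, action) and (keyword, tool).
GUIDELINES: list[tuple[str, str]] = [
    ("balance", "state the balance clearly"),
    ("overdrawn", "offer overdraft protection"),
]
OBSERVATIONS: list[tuple[str, Callable[[], str]]] = [
    ("balance", lambda: "account is overdrawn by $20"),
]


def run_turn(turn: str, max_iterations: int = 3) -> dict:
    """Match, call triggered tools, re-match on tool results, then compose."""
    evidence = [turn]              # text that matching runs against
    tool_results: list[str] = []
    called: set[int] = set()       # indexes of tools already called this turn

    for _ in range(max_iterations):
        text = " ".join(evidence).lower()
        pending = [
            i for i, (keyword, _) in enumerate(OBSERVATIONS)
            if keyword in text and i not in called
        ]
        if not pending:
            break                  # nothing new triggered; stop iterating
        for i in pending:
            result = OBSERVATIONS[i][1]()
            called.add(i)
            tool_results.append(result)
            evidence.append(result)  # tool output feeds the next matching pass

    text = " ".join(evidence).lower()
    actions = [action for keyword, action in GUIDELINES if keyword in text]
    # A real engine would now pass this narrowed context to an LLM.
    return {"actions": actions, "tool_results": tool_results}


context = run_turn("What is my balance?")
print(context)
```

The second guideline matches only because the tool result mentions "overdrawn", which is the reason for feeding tool output back through matching.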

ARQ (Attentive Reasoning Queries): Based on research paper "Attentive Reasoning Queries" — domain-specialized reasoning blueprints that improve model accuracy and consistency. The engine uses ARQs internally.

Evaluation Framework

core/evaluations.py — stochastic testing framework for agent behavior. pytest_stochastics.json configures probabilistic test thresholds.

CLI Binaries

Binary	Purpose
`parlant-server`	Start the conversation engine server
`parlant`	Python SDK client CLI
`parlant-prepare-migration`	Database schema migration preparation

Persistence

Document database abstraction (core/persistence/document_database.py) with BaseDocument → FindResult interface. Supports versioning, cursor-based pagination, sort directions.

Prompts

Parlant — Prompts

Parlant's "prompts" are its behavioral primitives — Guidelines, Observations, and Journeys defined in Python code. The framework explicitly moves away from prose system prompts toward structured, matchable behavioral rules.

SDK API as Prompt Language

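import asyncio

import parlant.sdk as p


@p.tool
async def research_deep_answer(context: p.ToolContext, query: str) -> p.ToolResult:
    # Placeholder tool body (not part of the original snippet);
    # a real implementation would query a knowledge source.
    return p.ToolResult(data=f"Detailed research for: {query}")


async def main() -> None:
    # Bind the server so the agent can be created on it
    async with p.Server() as server:
        agent = await server.create_agent(
            name="Customer Support",
            description="Handles customer inquiries for an airline",
        )

        # Evaluate and call tools only under the right conditions
        expert_customer = await agent.create_observation(
            condition="customer uses financial terminology like DTI or amortization",
            tools=[research_deep_answer],
        )

        # When the expert observation holds, always respond with depth
        expert_answers = await agent.create_guideline(
            matcher=p.MATCH_ALWAYS,
            action="respond with technical depth",
            dependencies=[expert_customer],
        )

        beginner_answers = await agent.create_guideline(
            condition="customer seems new to the topic",
            action="simplify and use concrete examples",
        )

        # When both match, the beginner guideline wins:
        # the expert observation is excluded from context
        await beginner_answers.exclude(expert_customer)


asyncio.run(main())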

Prompting technique: Code-as-behavior-specification. Each guideline is a discrete, independently matchable unit with explicit dependencies and exclusions. This is "prompt engineering as software engineering" — behavioral rules compose, override, and depend on each other like objects.

CLAUDE.md (Developer Instructions)

# Parlant Developer Guide

## Repository Structure
The repository uses a develop branch as the main integration branch.
All pull requests should target develop.

## Testing
- Unit tests: pytest
- Stochastic behavioral tests: pytest --stochastics pytest_stochastics.json
  (Tests probabilistic behavior with configurable pass thresholds)
- Integration tests require a running parlant-server

## Contribution
Follow the DCO (Developer Certificate of Origin) — sign off commits:
git commit -s -m "..."

Prompting technique: Minimal developer instructions focused on process (DCO, stochastic testing). Not prescriptive about code style — trusts the framework's own test infrastructure.

ARQ Research Foundation

From the README and code imports, Parlant uses "Attentive Reasoning Queries (ARQs)" — domain-specialized reasoning blueprints based on an arXiv paper (2503.03669). These are internal prompt patterns that the engine uses to improve model accuracy in conversational contexts. The ARQ mechanism is not directly exposed in the SDK but runs inside engines/alpha/.

Prompting technique: Research-derived structured reasoning queries that decompose complex conversational decisions into smaller, more reliably answered sub-questions — reducing model attention failures.
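The decomposition idea can be illustrated with a small sketch. It is not the ARQ format Parlant uses internally, which this document does not show; the function `build_reasoning_query` and every field name below are hypothetical. The sketch only shows how one decision can be turned into an ordered set of narrower questions that a model fills in.

```python
import json


def build_reasoning_query(guideline_condition: str, last_message: str) -> str:
    """Render a fixed sequence of sub-questions as a JSON template to fill in."""
    blueprint = {
        # Restating the inputs first keeps them close to the final decision.
        "last_customer_message": last_message,
        "condition_under_evaluation": guideline_condition,
        # Narrow sub-questions, answered in order, before the final verdict.
        "evidence_for_condition": "<quote supporting text, or 'none'>",
        "condition_applies": "<true or false>",
    }
    return json.dumps(blueprint, indent=2)


query = build_reasoning_query(
    guideline_condition="customer seems new to the topic",
    last_message="Sorry, what does amortization mean?",
)
print(query)
```

Placing the verdict last means the evidence question is answered before the decision is written, which is the general shape of the sub-question approach described above.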

Uniqueness

Parlant — Uniqueness & Positioning

Differs from Seeds

Parlant has no close seed equivalent. It occupies a layer that seeds don't address: runtime conversation governance for customer-facing agents. The 11 seeds all deal with developer AI assistants (coding); Parlant governs customer-facing agents (support, sales, compliance). The closest philosophical match is kiro (behavioral specification with precise rules) but kiro is a developer IDE while Parlant is a runtime conversation engine. spec-driver and superpowers use structured markdown rules for AI assistants — Parlant uses the same "structured rules" philosophy but applies it to a running service with per-turn matching, not a static CLAUDE.md. The contextual matching engine (only matching rules enter each prompt) is architecturally novel in the corpus.

Unique Characteristics

Highest stars in batch: 18,084 stars — significantly higher than any other framework in this batch. Production usage at JPMorgan Chase and Slice Bank provides credibility.
Per-turn contextual matching: The engine assembles only matching Guidelines/Observations/Journeys into each LLM call. Adding 1,000 rules doesn't expand context if only 5 match. No other framework in the corpus does this.
ARQ (Attentive Reasoning Queries): Research-derived structured reasoning blueprints (arXiv 2503.03669) integrated into the engine to improve accuracy and consistency — a published academic paper backing the implementation.
Strict/Canned output mode: Deterministic response selection without LLM generation — for the highest-compliance scenarios (regulated finance, healthcare).
Stochastic behavioral testing: pytest_stochastics.json framework for probabilistic coverage testing of agent behavior. Not unit tests — statistical behavioral reliability tests.
Guideline dependency/exclusion graph: Guidelines can depend on Observations (only activate when observation holds) and exclude each other (when both match, one wins). This creates a computable behavior DAG, not a flat rule list.
Banking production deployments: The only framework in this batch with confirmed major-bank production usage.

Positioning

"Enterprise behavioral governance for customer-facing AI agents." Solves the problem that system prompts become ungovernable at 50+ rules — by replacing the monolithic prompt with a matching engine.

Observable Failure Modes

Behavioral complexity management: As the number of Guidelines grows, understanding the emergent interaction between matching rules, dependencies, and exclusions becomes non-trivial.
No multi-agent orchestration: Parlant governs single agents; teams building complex multi-agent systems need a separate orchestration layer.
ARQ opacity: The internal ARQ mechanism is not user-configurable — operators can't tune the reasoning blueprints.
Fluid mode non-determinism: Even with precise rules, fluid output mode produces LLM responses that can vary. Parlant mitigates this but does not eliminate it.

Cross-References

Research foundation: arXiv 2503.03669 (ARQ)
Production users: JPMorgan Chase, Slice Bank, Oracle AI
Competitor territory: Sierra AI, Ada, Decagon (mentioned in README as alternatives)

Workflow

Parlant — Workflow

Agent Development Workflow

Phase	Artifact
Install	`pip install parlant`
Start server	`parlant-server`
Define agent	`await server.create_agent(name=..., description=...)`
Define guidelines	`await agent.create_guideline(condition=..., action=...)`
Define observations	`await agent.create_observation(condition=..., tools=[...])`
Define journeys	`await agent.create_journey(...)`
Connect customer	SDK creates session, passes turns to engine

Per-Turn Execution Flow

1. Customer sends message
2. Match Phase: Engine evaluates all Guidelines + Observations + Journeys against the current turn
3. Tool Phase: If observations triggered, call their associated tools; re-run matching with tool results
4. Compose Phase: Assemble focused context from matched items only
   - Fluid mode: LLM generates response from focused context
   - Strict mode: Select canned response from predefined options
5. Response streamed to customer

Journey Execution

Journeys are SOPs — multi-step procedures where the agent's current step affects matching:

Current step's requirements are activated
Completion conditions advance the step
Agent state persists across turns
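The three properties above amount to a small state machine. The sketch below is illustrative only: `Step`, `Journey`, and `on_turn` are hypothetical names, completion conditions are plain Python predicates rather than anything Parlant evaluates, and the step index lives in memory instead of a session store.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Step:
    name: str
    instruction: str                      # requirement activated while current
    is_complete: Callable[[str], bool]    # completion condition for this step


@dataclass
class Journey:
    steps: list[Step]
    index: int = 0  # a real system would persist this with the session

    @property
    def current(self) -> Optional[Step]:
        return self.steps[self.index] if self.index < len(self.steps) else None

    def on_turn(self, message: str) -> Optional[str]:
        """Advance if the current step is complete; return the active instruction."""
        step = self.current
        if step is not None and step.is_complete(message):
            self.index += 1
        step = self.current
        return step.instruction if step else None


journey = Journey([
    Step("collect_order", "ask for the order number",
         lambda m: any(ch.isdigit() for ch in m)),
    Step("confirm", "ask the customer to confirm the refund",
         lambda m: "yes" in m.lower()),
])

first = journey.on_turn("I'd like a refund")
second = journey.on_turn("Order 12345")
third = journey.on_turn("Yes please")
print(first, second, third, sep="\n")
```

Because the instruction returned depends on `index`, the same customer message can activate different requirements at different points in the procedure.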

Evaluation Workflow

# Stochastic behavioral testing
pytest --stochastics pytest_stochastics.json

Tests verify that behavioral rules fire when expected, don't fire when they shouldn't, and maintain accuracy over probabilistic samples.

Approval Gates

None in the conversational path: Parlant is designed for automated, consistent responses. Human oversight happens outside the conversation, through the evaluation framework (reviewing behavioral coverage in tests) and through MATCH_ALWAYS and exclusion rules, which make rule activation predictable.

The closest thing to an approval gate is the Strict/Canned Output mode, where responses are selected deterministically from approved options with no LLM generation.

Migration Workflow

parlant-prepare-migration — schema evolution tool for updating existing deployments without conversation history loss.

Memory Context

Parlant — Memory & Context

Context Assembly (Per-Turn)

Parlant's key innovation is the contextual matching engine that assembles a focused context window on every turn — not a static system prompt. State that feeds into this assembly:

Guidelines: all defined condition→action rules (evaluated against current turn)
Observations: conditional tool triggers (evaluated, fire or not fire)
Journeys: current step state (which step is the agent at?)
Retrievers: relevant domain knowledge retrieved per turn
Glossary: domain terms found in the message
Variables: current session/agent variable values
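Two of these inputs, glossary terms and variables, are simple enough to sketch directly. The function `assemble_context` and its output format are hypothetical; the example assumes whole-word matching for glossary terms, which is a guess at reasonable behavior rather than a description of Parlant's matcher.

```python
import re


def assemble_context(
    message: str,
    glossary: dict[str, str],
    variables: dict[str, str],
) -> str:
    """Build a context snippet from mentioned glossary terms and variables."""
    lines = []
    for term, definition in glossary.items():
        # Whole-word, case-insensitive match so "APR" does not match "April".
        if re.search(rf"\b{re.escape(term)}\b", message, flags=re.IGNORECASE):
            lines.append(f"Term {term}: {definition}")
    for name, value in sorted(variables.items()):
        lines.append(f"Variable {name}: {value}")
    return "\n".join(lines)


context = assemble_context(
    "What DTI do I need for a mortgage?",
    glossary={"DTI": "debt-to-income ratio", "APR": "annual percentage rate"},
    variables={"customer_tier": "premium"},
)
print(context)
```

Only the mentioned term is injected, while variables are always included in this sketch; a real engine may also filter variables by relevance.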

Session Memory (Variables)

ContextVariable / ContextVariableStore — key-value memory scoped to session or agent. Persisted in the document database. Used to inject per-customer state (tier, preferences, history) into the context window.

Conversation History

Standard conversation history (messages array) stored in the session. sessions.py manages session state. History is passed to the LLM along with the assembled focused context.

Persistence Layer

Document database abstraction (core/persistence/document_database.py):

BaseDocument with id, creation_utc, version fields
Cursor-based pagination (FindResult)
Sort and filter operations

The actual storage backend is configurable. The versioning field on every document enables schema migration via parlant-prepare-migration.
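Cursor-based pagination over versioned documents can be sketched as follows. Only the names `BaseDocument`, `FindResult`, and the fields `id`, `creation_utc`, and `version` come from the description above; the `next_cursor` field, the `find` function, and the in-memory list are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class BaseDocument:
    id: str
    creation_utc: datetime
    version: str


@dataclass(frozen=True)
class FindResult:
    items: list[BaseDocument]
    next_cursor: Optional[str]  # id of the last item returned, None at the end


def find(
    docs: list[BaseDocument],
    limit: int,
    cursor: Optional[str] = None,
) -> FindResult:
    """Return up to `limit` documents after `cursor`, ordered by id."""
    ordered = sorted(docs, key=lambda d: d.id)
    if cursor is not None:
        ordered = [d for d in ordered if d.id > cursor]
    page = ordered[:limit]
    has_more = len(ordered) > limit
    return FindResult(page, page[-1].id if has_more else None)


now = datetime.now(timezone.utc)
docs = [BaseDocument(id=i, creation_utc=now, version="0.1.0") for i in "edcba"]

page = find(docs, limit=2)
while True:
    print([d.id for d in page.items])
    if page.next_cursor is None:
        break
    page = find(docs, limit=2, cursor=page.next_cursor)
```

A cursor tied to the last returned id stays valid when documents are inserted between requests, which an offset-based page number does not guarantee.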

Memory Persistence Scope

Session: conversation history, variable values, journey step state
Global/Project: guidelines, observations, journeys, glossary, retrievers — these are defined once and apply to all conversations for the agent

Cross-Session Handoff

Yes — the document database persists guidelines, journeys, and agent configuration across server restarts. Customer conversation history also persists.

Context Compaction

No explicit compaction. The contextual matching engine naturally limits context size by including only matching items — this is Parlant's alternative to compaction. Adding 1,000 guidelines doesn't expand the per-turn context if only 5 match.

Evaluation State

pytest_stochastics.json: test configuration for probabilistic behavioral coverage. Tracks pass rates across repeated samples, which gives statistical evidence (not a hard guarantee) of behavioral reliability.

Orchestration

Parlant — Orchestration

Multi-Agent Support

Limited. Parlant is designed for single-agent conversations with behavioral complexity, not multi-agent task orchestration. The core primitives (Guidelines, Observations, Journeys) govern a single agent's behavior. Multiple Parlant agents can be created independently but they don't orchestrate each other.

Orchestration Pattern

Sequential within a conversation (turn-by-turn). The Journey primitive provides multi-step sequential SOP execution within a single agent.

Execution Mode

Background daemon — parlant-server runs as a persistent service. AI clients submit conversation turns via the Python SDK or REST API and receive responses.

Multi-Tool Execution (within a turn)

Within a single turn, multiple tools can be triggered:

Matching phase triggers relevant observations
All relevant tools called (parallel or sequential — implementation detail)
Results fed back through another matching iteration
Final response composed from accumulated context

This is iterative tool calling within a turn, not multi-agent orchestration.

Isolation Mechanism

None — Parlant manages behavioral logic, not execution isolation. Tool implementations are provided by the integrator.

Multi-Model Support

Yes — the adapter layer (core/adapters/) supports multiple LLM providers. Each agent can be configured with a different model. No dynamic routing between models within a single agent turn.

Fluid vs. Strict Output Modes

Fluid (default): LLM generates response from assembled context — probabilistic, natural language
Strict/Canned: Deterministic response selection from approved options — no LLM generation for response text

Strict mode is effectively a rule-based output selector: response text comes only from approved options, which suits compliance-critical scenarios.
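A selector of this kind can be sketched with plain word overlap. The scoring method (Jaccard similarity), the threshold, the fallback text, and the names `jaccard` and `select_canned` are all assumptions for illustration; this document does not describe how Parlant actually chooses among approved responses.

```python
import re


def jaccard(a: str, b: str) -> float:
    """Word-set overlap between two strings, from 0.0 to 1.0."""
    words_a = set(re.findall(r"\w+", a.lower()))
    words_b = set(re.findall(r"\w+", b.lower()))
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def select_canned(
    draft: str,
    canned: list[str],
    min_score: float = 0.3,
    fallback: str = "Sorry, I can't help with that.",
) -> str:
    """Return the approved response closest to the draft, or the fallback."""
    best = max(canned, key=lambda c: jaccard(draft, c), default=None)
    if best is None or jaccard(draft, best) < min_score:
        return fallback
    return best


canned = ["Your refund has been issued.", "Your flight is on time."]

reply = select_canned("Your refund has been issued to your card", canned)
off_topic = select_canned("The weather is nice today", canned)
print(reply)
print(off_topic)
```

Whatever the draft says, the customer only ever sees one of the approved strings or the fallback, which is the property that matters for compliance review.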

Stochastic Testing as Quality Gate

The evaluation framework with pytest_stochastics.json runs behavioral tests statistically:

pass_threshold: 0.95  # 95% of samples must produce correct behavior
sample_count: 100     # Test each scenario 100 times

This is the framework's answer to "how do you verify agent behavior?" — probabilistic coverage testing rather than exact test oracles.
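The threshold-over-samples idea can be sketched without any test framework. The helper names below are hypothetical and the "agent" is a random stand-in; only the concepts of a pass threshold and a sample count come from the text above.

```python
import random
from typing import Callable


def stochastic_pass_rate(
    scenario: Callable[[random.Random], bool],
    sample_count: int,
    seed: int = 0,
) -> float:
    """Run the scenario `sample_count` times and return the fraction that passed."""
    rng = random.Random(seed)  # seeded so a run can be reproduced
    passes = sum(1 for _ in range(sample_count) if scenario(rng))
    return passes / sample_count


def meets_threshold(rate: float, pass_threshold: float) -> bool:
    """A scenario passes overall when its pass rate reaches the threshold."""
    return rate >= pass_threshold


def flaky_agent_behaves(rng: random.Random) -> bool:
    """Stand-in for 'run the agent once and check its behavior'."""
    return rng.random() < 0.98  # assumed per-sample success probability


rate = stochastic_pass_rate(flaky_agent_behaves, sample_count=100)
passed = meets_threshold(rate, pass_threshold=0.95)
print(f"pass rate {rate:.2f}, meets threshold: {passed}")
```

A single failing sample does not fail the scenario; the verdict depends on the aggregate rate, which is why such tests suit behavior that is correct most of the time rather than always.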

UI/CLI Surface

Parlant — UI/CLI Surface

CLI Binaries

Binary	Purpose
`parlant-server`	Starts the conversation engine server
`parlant`	Python SDK client (REST client)
`parlant-prepare-migration`	Database schema migration

Entry points (from pyproject.toml):

parlant = "parlant.bin.client:main"
parlant-server = "parlant.bin.server:main"
parlant-prepare-migration = "parlant.bin.prepare_migration:main"

Local Web Dashboard

Exists: Unknown (not found in repository structure). The api/ directory contains REST endpoints but no frontend UI was found.

The parlant-client package is a separate Python library for API access. No web dashboard confirmed.

API Surface

The src/parlant/api/ directory contains REST API modules:

agents.py — agent CRUD
guidelines.py — guideline management
sessions.py — conversation session management
context_variables.py — variable management
customers.py — customer management
evaluations.py — evaluation runs
glossary.py — glossary management
journeys.py — journey management
logs.py — request/response logging
health.py — health endpoint

Observability

logs.py API endpoint for structured request/response logging
health.py health endpoint
core/health.py internal health reporter with ENGINE_TURN_KIND and ENGINE_TURNS_COUNTER metrics
Meter class for custom metrics instrumentation

Evaluation / Testing Surface

# Standard tests
pytest

# Stochastic behavioral tests (probabilistic coverage)
pytest --stochastics pytest_stochastics.json

The stochastic test framework is Parlant's primary quality assurance surface — it verifies behavioral reliability statistically rather than deterministically.

IDE Integration

Claude Code development tooling: CLAUDE.md with developer instructions. No end-user IDE integration (not applicable — server framework).

Related frameworks


OpenHarness ★ 13k

A11 Governance

Open-source Python agent runtime providing complete harness infrastructure: tools, memory, governance, swarm coordination, and…

Trae Agent ★ 12k

A11 Governance

Research-friendly open-source CLI coding agent by ByteDance, designed for academic ablation studies and modular LLM provider…

Sweep AI ★ 7.7k

A11 Governance

Autonomous GitHub bot that converts issues to pull requests using a sequential multi-agent pipeline.

Agent Governance Toolkit (microsoft) ★ 2.3k

A11 Governance

Enterprise-grade AI agent governance: YAML policy enforcement, 12-vector prompt injection defense, zero-trust identity,…

TDD Guard ★ 2.1k

A11 Governance

Mechanically enforces the Red-Green-Refactor TDD cycle by blocking file writes that violate TDD principles via a PreToolUse hook…

Agentic Coding Flywheel Setup (ACFS) ★ 1.5k

A11 Governance

Take a complete beginner from laptop to three AI coding agents running on a VPS in 30 minutes via an idempotent manifest-driven…

Distribution

Type: standalone-repo
License: Apache-2.0
Install: pip-install
Version: unknown (develop branch, 2026-05-24)

Surfaces

CLI binary: parlant
CLI subcmds: 3
Local UI: No
Tech stack: null

Components

Commands: 0
Skills: 0
Subagents: 0
Hooks: 0
MCP servers: 0
MCP tools: 0
Scripts: 2
Templates: 0

Workflow

Phases: 6
Approval gates: 0
Spec format: none
Spec storage: db
Delta or full: none

Orchestration

Multi-agent: No
Pattern: sequential
Max concurrent: 1
Isolation: none
Consensus: none
Prompt chaining: Yes

Multi-model

Multi-model: Yes
BYOK: Yes
Modal: text

Execution

Mode: background-daemon
Compaction: No
Session handoff: Yes
Streaming: Yes

Memory

Type: hybrid
Persistence: project
Search: none
State files: 2 files

Quality

TDD: Optional
TDD mechanism: dedicated-skill
Validators: 2
Self-review: inline-self

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: Yes
Audit format: proprietary
Replay: No

Tools

Primary: any-llm
Targets: 1
Portability: high

Signals

Stars: 18k
Last commit: 2026-05-24
Contributors: 30
Maintainer: active
Quality score: 4.1/10