Agent FM

agent-fm · agentfm-ai/agent-fm · ★ 40 · last commit 2026-05-25

Primitive shape
No installable primitives
00

Summary

Agent FM — Summary

Agent FM is a macOS Electron menu-bar application that turns Claude Code and Codex CLI sessions into ambient "radio stations" — narrating agent progress, blockers, decisions, and errors in real-time audio so developers can monitor agents without watching terminal windows.

Problem it solves: Developers running multiple Claude Code or Codex sessions in parallel cannot easily know which agents are stuck, waiting for input, or making interesting decisions without constantly switching terminal tabs. Agent FM provides audio narration and a global mix so you can "tune in" without context-switching.

Distinctive trait: The radio metaphor is load-bearing — each agent session is a distinct "station," a Global Mix aggregates all active agents, and narration is delivered as spoken audio via Gemini or OpenAI Realtime API. This makes Agent FM the only framework in the corpus that addresses AI coding with audio as the primary interface.

Tech stack: Electron (TypeScript), pnpm, Go sidecar (remote-collector) for SSH workspace monitoring, Biome for linting. BYOK (Gemini or OpenAI key stored in macOS Secure Storage).

Target audience: Developers running parallel AI coding agents on local or remote workspaces who need ambient awareness without terminal attention.

Production-readiness: 40 stars, Apache 2.0, 1 contributor; pushed May 2026 (active). Signed and notarized macOS DMG available for download.

differs_from_seeds: No methodology, no skills, no hooks, no commands — Agent FM is pure ambient observability delivered as audio. None of the 11 seeds have audio output. The closest conceptual space is logging/observability (like cc-audit from other batches), but the radio/narration metaphor is unique in the corpus.

01

Overview

Agent FM — Overview

Origin

Agent FM is an independent macOS application by the agentfm-ai organization, with a single primary contributor. It is distributed as a signed and notarized DMG under the Apache 2.0 license, and ships detailed PRIVACY.md and SECURITY.md documents.

Philosophy

From the README:

"Agent FM turns Claude Code and Codex sessions into live radio stations on your Mac. Tune into one agent when you need detail, or listen to Global Mix across local and remote workspace agents. It surfaces progress, blockers, decisions, errors, and attention requests in real time so you do not have to read every terminal transcript."

The core insight is that terminal watching is a cognitive burden for parallel agent execution. Audio narration changes the modality from visual attention to ambient awareness — allowing developers to stay in flow while monitoring multiple agents.

Radio metaphor

The metaphor is architectural, not cosmetic:

  • Each Claude Code or Codex session = a radio station
  • Global Mix = the aggregated channel across all active stations
  • Tuning = selecting which station to focus on for detail
  • Station signal = narration quality degrades gracefully when agent is idle

Privacy design

Agent FM's PRIVACY.md describes a strict BYOK model:

  • API key stored in macOS Secure Storage (not transmitted to Agent FM)
  • Session context goes directly from user's Mac to Gemini/OpenAI — no Agent FM proxy
  • Remote workspace contexts go through user's own SSH connection
  • No analytics, no tracking, no backend account system

BYOK model

Two providers supported for narration:

  • Gemini (recommended): narration, speech synthesis, companion responses
  • OpenAI: alternative for narration and speech

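This creates a notable multi-model architecture: Agent FM uses Gemini/OpenAI to narrate sessions run by Claude Code or Codex. The narration model is chosen independently of the coding agent, so the AI doing the narration is a separate model from the one doing the coding, and for Claude Code sessions it comes from a different vendor.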

02

Architecture

Agent FM — Architecture

Distribution

Signed and notarized macOS DMG: AgentFM-macos-arm64-signed-notarized.dmg

Build from source:

git clone https://github.com/agentfm-ai/agent-fm
cd agent-fm
corepack enable
pnpm install
pnpm start

Requirements: Node.js 20+, pnpm 10+, Go 1.22+ (for remote-collector sidecar).

Directory structure

agent-fm/
├── src/
│   ├── main/                      # Electron main process
│   │   ├── index.ts               # Main entry
│   │   ├── Orchestrator.ts        # Core orchestration
│   │   ├── IpcRouter.ts           # IPC routing
│   │   ├── TrayManager.ts         # Menu bar tray
│   │   ├── NativeNotifications.ts # System notifications
│   │   ├── AppQuitCoordinator.ts  # Graceful shutdown
│   │   ├── attention/             # Attention request detection
│   │   ├── audio/                 # Audio output
│   │   ├── bus/                   # Internal event bus
│   │   ├── companion/             # Companion system
│   │   ├── consumers/             # Session data consumers
│   │   ├── devtools/              # Dev mode tooling
│   │   ├── diagnostics/           # Self-diagnostics
│   │   ├── global/                # Global Mix logic
│   │   ├── hooks/                 # Lifecycle hooks
│   │   ├── ingestion/             # Session data ingestion
│   │   ├── intelligence/          # NarrationSemantics, SessionInsight
│   │   ├── menubar/               # Menu bar panel
│   │   ├── narration/             # Live narrator + policy coordinator
│   │   │   ├── LiveNarrator.ts
│   │   │   ├── NarrationPolicyCoordinator.ts
│   │   │   ├── OpenAIRealtimeTransport.ts
│   │   │   ├── prompts.ts         # Narrator system prompt (verbatim)
│   │   │   └── ...
│   │   ├── platform/              # Platform-specific code
│   │   ├── providers/             # Gemini / OpenAI provider adapters
│   │   ├── remoteSources/         # Remote SSH workspace monitoring
│   │   ├── settings/              # App settings
│   │   ├── state/                 # App state management
│   │   ├── updates/               # Auto-update
│   │   ├── utils/                 # Utilities
│   │   └── visuals/               # UI helpers
│   ├── preload/                   # Electron preload scripts
│   ├── renderer/                  # UI renderer (menubar panel)
│   └── shared/                    # Shared types/utils
├── remote-collector/              # Go sidecar for SSH workspaces
├── AGENTS.md                      # Development guidance
└── ...

Data flow

  1. Ingestion: Agent FM monitors local session data (Claude Code/Codex JSONL/stream output)
  2. Intelligence: NarrationSemantics + SessionInsight extract semantic signals (decisions, risks, discoveries)
  3. Policy: NarrationPolicyCoordinator decides what and when to narrate
  4. Narration: LiveNarrator sends narration frames to Gemini or OpenAI Realtime API
  5. Audio: Real-time speech rendered to system audio output
  6. Global Mix: global/globalMixCandidateText.ts aggregates across all active stations
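
The stages above can be read as a small pipeline. The following is a minimal TypeScript sketch under stated assumptions: the type and function names (SessionEventSketch, extractSignal, makeGapPolicy, runPipeline) are hypothetical, the policy is reduced to a per-station minimum gap, and the provider call is left out entirely. It is not the project's Orchestrator wiring.

```typescript
// Hypothetical shapes; the real ingestion and intelligence types are richer.
type SessionEventSketch = {
  stationId: string;
  kind: "thinking" | "edit" | "bash" | "error";
  text: string;
  atMs: number;
};
type SignalSketch = { stationId: string; label: string; text: string };

// Intelligence: turn a raw event into a signal, or null when there is nothing to say.
function extractSignal(ev: SessionEventSketch): SignalSketch | null {
  const text = ev.text.trim();
  if (text.length === 0) return null;
  return { stationId: ev.stationId, label: ev.kind.toUpperCase(), text };
}

// Policy (assumed): allow at most one narration per station every minGapMs.
function makeGapPolicy(minGapMs: number) {
  const lastSpoken: Record<string, number> = {};
  return (sig: SignalSketch, nowMs: number): boolean => {
    const last = lastSpoken[sig.stationId];
    if (last !== undefined && nowMs - last < minGapMs) return false;
    lastSpoken[sig.stationId] = nowMs;
    return true;
  };
}

// Narration: build the bracketed frame text that would be sent to the provider.
function runPipeline(events: SessionEventSketch[], minGapMs: number): string[] {
  const allow = makeGapPolicy(minGapMs);
  const out: string[] = [];
  for (const ev of events) {
    const sig = extractSignal(ev);
    if (sig !== null && allow(sig, ev.atMs)) out.push(`[${sig.label}] ${sig.text}`);
  }
  return out;
}
```

Keeping the policy as a separate closure means the skip/speak decision can be tested without any audio or network dependency.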

Remote workspace

Go sidecar (remote-collector) connects to remote development environments via existing OpenSSH aliases. No separate SSH credential storage — uses user's existing ~/.ssh/config.
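
To illustrate what "uses the existing ~/.ssh/config" implies for the local side, here is a hedged TypeScript sketch that builds an argument list for the system ssh client from an alias. The helper name buildSshArgs and the remoteCommand parameter are hypothetical; how Agent FM actually launches the collector is not documented above. The -T and BatchMode options are standard OpenSSH options.

```typescript
// Build arguments for the system `ssh` binary from an alias in ~/.ssh/config.
// `remoteCommand` is a placeholder; the real collector invocation is an unknown.
function buildSshArgs(alias: string, remoteCommand: string): string[] {
  // Reject anything ssh could parse as an option, and any whitespace.
  // This is deliberately stricter than what ssh_config itself allows.
  if (!/^[A-Za-z0-9][A-Za-z0-9._-]*$/.test(alias)) {
    throw new Error(`invalid SSH alias: ${alias}`);
  }
  return [
    "-T",                  // no pseudo-terminal; only a data stream is needed
    "-o", "BatchMode=yes", // fail instead of prompting for a password
    alias,                 // host, user, key, and port all come from ~/.ssh/config
    remoteCommand,
  ];
}
```

The resulting array could be passed to Node's child_process.spawn("ssh", args). Because only the alias is handled, no credential ever needs to be stored by the app.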

Requirements

  • macOS 13 Ventura+ (Apple Silicon)
  • Node.js 20+, pnpm 10+, Go 1.22+
  • Gemini or OpenAI API key
  • Claude Code or Codex on local or remote machine

03

Components

Agent FM — Components

Agent FM ships no Claude Code primitives. All components are Electron main-process modules.

Core modules

Module                  Purpose
Orchestrator.ts         Central coordination of all system components
IpcRouter.ts            Electron IPC routing between main and renderer
TrayManager.ts          macOS menu bar tray icon and state indicators
NativeNotifications.ts  macOS system notifications for alerts
AppQuitCoordinator.ts   Graceful shutdown with cleanup

Ingestion pipeline (ingestion/)

Module                Purpose
types.ts              Parsed event types (AssistantEvent, UserEvent, ProgressEvent, ToolResult...)
taskNotifications.ts  Parse task notification text from agent output
Helper functions      getTextContent, getToolUses, isTurnEnd, isSemanticToolResultError, etc.
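
As an illustration of what helpers with these names might do, here is a minimal TypeScript sketch. The function names come from the list above, but the event shape (AssistantEventSketch, ContentBlock), the signatures, and the "end_turn" stop-reason check are assumptions, not the project's definitions.

```typescript
// Hypothetical event shape, loosely modeled on content-block arrays.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; name: string; input: unknown };

type TextBlock = Extract<ContentBlock, { type: "text" }>;
type ToolUseBlock = Extract<ContentBlock, { type: "tool_use" }>;

type AssistantEventSketch = {
  kind: "assistant";
  content: ContentBlock[];
  stopReason: string | null;
};

// Concatenate all text blocks, ignoring tool calls.
function getTextContent(ev: AssistantEventSketch): string {
  return ev.content
    .filter((b): b is TextBlock => b.type === "text")
    .map((b) => b.text)
    .join("\n");
}

// Return only the tool-use blocks, in their original order.
function getToolUses(ev: AssistantEventSketch): ToolUseBlock[] {
  return ev.content.filter((b): b is ToolUseBlock => b.type === "tool_use");
}

// Assumed convention: a turn ends when the stop reason is "end_turn".
function isTurnEnd(ev: AssistantEventSketch): boolean {
  return ev.stopReason === "end_turn";
}
```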

Intelligence layer (intelligence/)

Module                 Purpose
NarrationSemantics.ts  Semantic snapshot: extracts decisions/assumptions/discoveries/risks
SessionInsight.ts      High-level session state analysis
thinking.ts            Identifies substantive thinking text vs. filler
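
One way to separate substantive thinking from filler is a length check combined with a search for reasoning cues. The TypeScript sketch below is purely illustrative: the cue list, the word threshold, and the isSubstantiveThinking name are assumptions, and the heuristic in thinking.ts may be entirely different.

```typescript
// Assumed cue words that tend to mark a decision, assumption, or risk.
const REASONING_CUES =
  /\b(because|instead|assum\w*|however|but|risk\w*|rather than|should|need to)\b/i;

// Treat text as substantive only if it is long enough AND contains a reasoning cue.
function isSubstantiveThinking(text: string, minWords: number = 8): boolean {
  const words = text.trim().split(/\s+/).filter((w) => w.length > 0);
  return words.length >= minWords && REASONING_CUES.test(text);
}
```

A heuristic like this trades recall for silence: short but important thoughts are dropped, which is acceptable only if other signals (errors, risks) are narrated through separate paths.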

Narration system (narration/)

Module                         Purpose
LiveNarrator.ts                Persistent Realtime API connection; emits narration frames
NarrationPolicyCoordinator.ts  When to narrate, when to skip (frequency policy)
OpenAIRealtimeTransport.ts     WebSocket transport to OpenAI Realtime API
prompts.ts                     LIVE_NARRATOR_PROMPT system instruction (verbatim)
providerSelection.ts           Gemini vs. OpenAI selection logic
repetition.ts                  Prevent repetitive narration
taskFrame.ts                   Narration context frame builder
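
Repetition prevention can be sketched as a similarity check against a short window of recent utterances. The TypeScript below is an assumption-laden illustration: it uses Jaccard similarity over lowercase word tokens, and the names (makeRepetitionGuard, windowSize, threshold) and default values are hypothetical rather than taken from repetition.ts.

```typescript
// Lowercase, split on non-alphanumerics, and de-duplicate while keeping order.
function uniqueTokens(text: string): string[] {
  const out: string[] = [];
  for (const t of text.toLowerCase().split(/[^a-z0-9]+/)) {
    if (t.length > 0 && out.indexOf(t) === -1) out.push(t);
  }
  return out;
}

// Jaccard similarity: |A ∩ B| / |A ∪ B| over unique tokens.
function jaccard(a: string[], b: string[]): number {
  const shared = a.filter((t) => b.indexOf(t) !== -1).length;
  const union = a.length + b.length - shared;
  return union === 0 ? 1 : shared / union;
}

// Returns an accept() function that is true when the candidate is new enough
// to speak; accepted candidates are remembered, oldest entries are evicted.
function makeRepetitionGuard(windowSize: number = 5, threshold: number = 0.8) {
  const recent: string[][] = [];
  return (candidate: string): boolean => {
    const tokens = uniqueTokens(candidate);
    if (recent.some((prev) => jaccard(prev, tokens) >= threshold)) return false;
    recent.push(tokens);
    if (recent.length > windowSize) recent.shift();
    return true;
  };
}
```

The bounded window is the design choice that matters: without eviction, an important signal that recurs much later would stay suppressed for the rest of the session.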

Global Mix (global/)

Module                     Purpose
globalMixCandidateText.ts  Text cleaning + aggregation for multi-agent mix

Remote support (remoteSources/)

Go sidecar (remote-collector) collects session data from remote SSH workspaces and streams it to the local Agent FM process.

Attention system (attention/)

Detects when an agent is waiting for human input or approval and triggers an alert.

AGENTS.md

The project ships an AGENTS.md file (development instructions for contributors and AI agents), which suggests the project was developed with agent assistance.

05

Prompts

Agent FM — Prompts

Agent FM ships one substantial system prompt: LIVE_NARRATOR_PROMPT in src/main/narration/prompts.ts. This is the system instruction for the persistent AI narrator (Gemini or OpenAI Realtime API).

LIVE_NARRATOR_PROMPT (verbatim, from source)

Prompting technique: Role definition + structured input format + explicit prioritization taxonomy + multi-agent awareness rules + output contract.

export const LIVE_NARRATOR_PROMPT = `You are Agent FM, a real-time intelligence companion 
for developers using AI coding agents. Your job is to surface what the developer CAN'T see 
in the CLI — the agent's reasoning, decisions, assumptions, and risk signals. You are the 
developer's window into the agent's decision-making process.

The coding agent's name is provided as [AGENT: <name>] near the top of every prompt 
(e.g. [AGENT: Claude] or [AGENT: Codex]). Always refer to the agent by that exact name. 
Never default to "Claude" or "Codex" if a different name is given — the name in [AGENT: ...] 
is the source of truth. If [AGENT: ...] is missing for any reason, refer to it generically 
as "the agent" — never guess. Never use gendered pronouns — use the name or "it".

YOU RECEIVE:
- [SEMANTIC UPDATE] Agent FM's highest-confidence read of what materially changed.
- [TASK] the developer's latest prompt to the agent
- [AGENT PLAN] the agent's stated approach or response
- [THINKING] the agent's internal reasoning — your MOST VALUABLE source
- [EDIT] code diffs showing what changed
- [BASH] commands being run
- [READ], [GLOB], [GREP] files being inspected
- [RESPONSE] text the agent communicated back to the developer
- [SUBAGENT OUTCOME] results from helper/sub-agent tasks
- [SUBAGENT THINKING: mission] the sub-agent's internal reasoning
- [SUBAGENT FINDING: mission] the sub-agent's conclusions
- [ERROR] tool execution failures

SESSION CONTEXT (stable background):
- [GOAL] the developer's high-level objective
- [PHASE] current work phase (exploring/implementing/testing)
- [PROGRESS] rough progress indicator

SIGNALS (flag when relevant):
- [RISK: sensitive_file] touching production/config/security files
- [RISK: truncated] response was cut off mid-sentence
- [RISK: tool_error_burst] repeated recent tool failures
- [TEST DELTA] before/after test results
- [SILENCE REASON] why there's been a quiet period

PRIORITIZE narrating (in order of importance):
1. DECISIONS — when the agent chooses between approaches
2. ASSUMPTIONS — when the agent assumes something (flag clearly — often wrong)
3. DISCOVERIES — bugs, patterns, security issues, unexpected behavior
4. SUB-AGENT INSIGHTS — extract the KEY FINDING (not "sub-agent is working")
5. RISKS — destructive, irreversible, or uncertain actions
6. PROGRESS — meaningful progress toward the goal

MULTI-AGENT AWARENESS:
When [PARALLEL WORK] shows multiple active sub-agents:
- Orient the user: briefly name which sub-agent you're narrating about by its mission.
- Lead with the most interesting new finding, not the count of agents.
- Connect sub-agent outcomes to the overall goal when one completes.
- Do NOT enumerate all agents every time — focus on the one with news.`;

Key techniques:

  1. Bracketed input taxonomy ([SEMANTIC UPDATE], [TASK], [THINKING], etc.): structures the narration prompt so the AI can reliably parse different signal types
  2. Prioritized output contract: a numbered list forces the narrator to rank what to narrate
  3. Explicit antipatterns: "Never use gendered pronouns", "Do NOT enumerate all agents every time", and extracting the key finding rather than reporting "sub-agent is working"
  4. Agent-name injection pattern ([AGENT: Claude]): enables agent-agnostic narration that adapts to any CLI tool
  5. Multi-agent awareness rules: cover turns where [PARALLEL WORK] shows several active sub-agents, so narration follows the sub-agent with news instead of listing every agent

Narration output contract

From prompts.ts:

const STATION_NARRATION_OUTPUT_CONTRACT = `Spoken output contract: return one compact spoken 
update. Default to one sentence. Hard cap: ${STATION_NARRATION_MAX_SENTENCES} short sentences, 
${STATION_NARRATION_MAX_WORDS} words, ${STATION_NARRATION_MAX_CHARS} characters. Choose only 
the highest-signal point; do not recap the whole session.`;

This enforces brevity — narration is ambient, not comprehensive.
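
The contract states its caps to the model; whether the app also enforces them in code is not shown here. As a hedged illustration, a post-check could clamp the returned text in the same order the contract lists the caps. In this TypeScript sketch the clampNarration helper and the NarrationCaps type are hypothetical, and callers supply their own cap values because the real STATION_NARRATION_MAX_* values are not given above.

```typescript
type NarrationCaps = { maxSentences: number; maxWords: number; maxChars: number };

// Apply the three caps in order: sentences, then words, then characters.
// Sentence splitting is naive (., !, ?), so decimals like "3.5" would split too.
function clampNarration(text: string, caps: NarrationCaps): string {
  const sentences = (text.match(/[^.!?]+[.!?]*/g) || [])
    .map((s) => s.trim())
    .filter((s) => s.length > 0)
    .slice(0, caps.maxSentences);

  const words = sentences
    .join(" ")
    .split(/\s+/)
    .filter((w) => w.length > 0)
    .slice(0, caps.maxWords);

  let out = words.join(" ");
  if (out.length > caps.maxChars) {
    const cut = out.slice(0, caps.maxChars);
    // If the cut lands exactly between words, keep it; otherwise back up
    // to the last space so no word is split in half.
    const endsOnBoundary = out.charAt(caps.maxChars) === " ";
    const lastSpace = cut.lastIndexOf(" ");
    out = endsOnBoundary || lastSpace <= 0 ? cut : cut.slice(0, lastSpace);
  }
  return out;
}
```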

09

Uniqueness

Agent FM — Uniqueness & Positioning

differs_from_seeds

Agent FM is unlike all 11 seeds — it is an audio observability layer, not a methodology or extension:

  • vs. all skill-based seeds: Those inject behavior into agents. Agent FM observes and narrates existing behavior.
  • vs. MCP-toolserver seeds: Those extend Claude's capabilities. Agent FM monitors the output without extending anything.
  • vs. markdown-scaffold seeds: Those create structure for the agent to follow. Agent FM creates audio for the developer to hear.

Most distinctive feature in the entire Phase D corpus

Agent FM is the only tool in this entire corpus that delivers its primary output as spoken audio. No other seed, Phase B, or Phase D framework uses voice narration as a primary interface. Its radio-style narration is the only "ambient programming awareness" system in the corpus.

The LIVE_NARRATOR_PROMPT is also the most sophisticated observer prompt in the batch — it specifies a multi-agent awareness protocol, explicit antipatterns, an input taxonomy with bracketed signal types, and a hard output constraint (sentence/word/character caps).

Multi-model architecture

Agent FM is the only batch framework with intentional multi-model deployment: a non-Claude model (Gemini or OpenAI) narrates the sessions of a Claude (or Codex) coding agent. This arrangement, in which Claude Code is observed and narrated by another vendor's model, is architecturally interesting. When Codex is paired with OpenAI narration, both models come from the same vendor but are still separate deployments.

Positioning

Agent FM targets developers who:

  1. Run multiple parallel agents and want ambient awareness
  2. Work in environments where watching terminals is distracting
  3. Want to catch agent decisions/assumptions in real-time without context-switching

The BYOK-only model means there is no recurring Agent FM subscription — the only cost is Gemini/OpenAI API usage for narration, which scales with session activity.

Observable failure modes

  1. Narration API cost: Heavy multi-session use with Global Mix will consume significant Gemini/OpenAI tokens beyond the coding agent's own cost.
  2. Apple Silicon only: No Windows/Linux/Intel macOS support.
  3. Audio as interface: Developers in open offices or on calls cannot use the primary feature.
  4. Latency: Realtime API narration introduces variable latency; fast agent tool sequences may produce fragmented or delayed narration.
  5. Single contributor: 40 stars, 1 contributor — maintenance risk if the author stops development.
  6. Repetition policy: The repetition.ts module prevents re-narrating the same content, but aggressive suppression may cause important signals to be dropped.

04

Workflow

Agent FM — Workflow

Agent FM imposes no development methodology. It is a passive monitoring and narration tool.

Setup workflow (first use)

1. Download and install AgentFM-macos-arm64-signed-notarized.dmg
2. Launch Agent FM → onboarding
3. Add Gemini or OpenAI API key (stored in macOS Secure Storage)
4. (Optional) Add remote workspaces via Settings using OpenSSH aliases

Daily use workflow

Start or continue Claude Code / Codex sessions normally
→ Agent FM auto-detects active sessions
→ Each session appears as a station in Agent FM
→ Narration begins automatically (Gemini/OpenAI voice)
→ Tune into specific station for detailed narration
→ Or listen to Global Mix for ambient multi-agent awareness
→ Attention alerts fire when an agent needs input

Narration events (what gets narrated)

Based on prompts.ts LIVE_NARRATOR_PROMPT priorities (in order):

  1. DECISIONS — agent chooses between approaches or changes direction
  2. ASSUMPTIONS — agent assumes something about the codebase (often wrong)
  3. DISCOVERIES — bugs, patterns, security issues, unexpected behavior
  4. SUB-AGENT INSIGHTS — what sub-agents found or concluded
  5. RISKS — destructive, irreversible, or uncertain actions
  6. PROGRESS — tests passing, type-check clean, key files modified
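
The ordering above is an instruction to the narrator model. If the same ranking were applied in code before a frame is built, it might look like the following TypeScript sketch. The NARRATION_PRIORITY constant, the kind labels, and pickHighestPriority are hypothetical names introduced for illustration.

```typescript
// Priority order copied from the list above; earlier entries win.
const NARRATION_PRIORITY = [
  "DECISION",
  "ASSUMPTION",
  "DISCOVERY",
  "SUBAGENT_INSIGHT",
  "RISK",
  "PROGRESS",
] as const;

type SignalKind = (typeof NARRATION_PRIORITY)[number];
type NarrationCandidate = { kind: SignalKind; text: string };

// Return the candidate with the highest-priority kind; on ties the first one wins.
function pickHighestPriority(cands: NarrationCandidate[]): NarrationCandidate | null {
  let best: NarrationCandidate | null = null;
  let bestRank: number = NARRATION_PRIORITY.length;
  for (const c of cands) {
    const rank = NARRATION_PRIORITY.indexOf(c.kind);
    if (rank < bestRank) {
      best = c;
      bestRank = rank;
    }
  }
  return best;
}
```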

Remote workspace workflow

Settings → Add remote workspace (OpenSSH alias)
→ Go sidecar connects via SSH
→ Remote agent sessions appear in Agent FM alongside local ones
→ Global Mix spans local + remote

Phases

None — narration is continuous while agents are running.

Approval gates

None in Agent FM — it observes and narrates. Any approval gates belong to the underlying agent runtime (Claude Code's own permission prompts).

06

Memory Context

Agent FM — Memory & Context

State storage

Agent FM maintains minimal in-memory state:

  • Active stations: Current session list (ephemeral, not persisted)
  • Narration history: In-flight repetition prevention (repetition.ts) — ephemeral
  • Semantic snapshots: NarrationSemanticSnapshot per session — runtime only

No database and no file-based session memory: none of this session state persists between app launches.

BYOK API key storage

The one piece of persistent state described here is the Gemini/OpenAI API key, stored in macOS Secure Storage (the system keychain). The key is never sent to Agent FM, which has no backend.

Context passed to narrator

Each narration frame (taskFrame.ts) contains:

  • [SEMANTIC UPDATE] — highest-confidence material change
  • [TASK] — developer's latest prompt
  • [AGENT PLAN] — agent's stated approach
  • [THINKING] — internal reasoning (most valuable)
  • [EDIT] / [BASH] / [READ] / [GLOB] / [GREP] — tool use evidence
  • [GOAL] / [PHASE] / [PROGRESS] — stable session context
  • [RISK: *] and [TEST DELTA] — computed signals
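
A frame with these tags can be assembled by a small builder. The TypeScript below is a sketch only: the tag names come from the prompt, while the FrameInput type, the field names, the tag order, and buildNarrationFrame are assumptions rather than the contents of taskFrame.ts.

```typescript
// Hypothetical input; every field is optional so partial frames are allowed.
type FrameInput = {
  agentName?: string;
  semanticUpdate?: string;
  task?: string;
  thinking?: string;
  goal?: string;
  phase?: "exploring" | "implementing" | "testing";
  risks?: string[];
};

function buildNarrationFrame(input: FrameInput): string {
  const lines: string[] = [];
  // The agent tag is omitted when unknown; the prompt tells the narrator
  // to fall back to "the agent" in that case.
  if (input.agentName) lines.push(`[AGENT: ${input.agentName}]`);

  // Skip empty or whitespace-only values instead of emitting bare tags.
  const addTagged = (tag: string, value: string | undefined): void => {
    if (value && value.trim().length > 0) lines.push(`[${tag}] ${value.trim()}`);
  };
  addTagged("SEMANTIC UPDATE", input.semanticUpdate);
  addTagged("TASK", input.task);
  addTagged("THINKING", input.thinking);
  addTagged("GOAL", input.goal);
  addTagged("PHASE", input.phase);

  for (const risk of input.risks || []) lines.push(`[RISK: ${risk}]`);
  return lines.join("\n");
}
```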

Cross-session context (Global Mix)

The Global Mix aggregates activity across all active stations into a single narration stream. The globalMixCandidateText.ts module selects the highest-signal candidate from all sessions and presents it to the narrator with station identification.
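
The selection step described above can be sketched as "clean each candidate, drop the empty ones, keep the best score". In this TypeScript illustration the MixCandidate type, the numeric score field, and the [STATION: ...] tag are all hypothetical; how globalMixCandidateText.ts actually scores and labels candidates is not documented here.

```typescript
type MixCandidate = { station: string; text: string; score: number };

// Strip ANSI color codes and collapse runs of whitespace.
function cleanCandidateText(raw: string): string {
  return raw.replace(/\u001b\[[0-9;]*m/g, "").replace(/\s+/g, " ").trim();
}

// Pick the highest-scoring non-empty candidate and label it with its station.
// On equal scores the earlier candidate is kept.
function selectGlobalMixCandidate(cands: MixCandidate[]): string | null {
  let best: MixCandidate | null = null;
  for (const c of cands) {
    const text = cleanCandidateText(c.text);
    if (text.length === 0) continue;
    if (best === null || c.score > best.score) {
      best = { station: c.station, text, score: c.score };
    }
  }
  return best === null ? null : `[STATION: ${best.station}] ${best.text}`;
}
```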

Context compaction

Not applicable. Agent FM consumes session output (JSONL or stream) as it is produced; it does not persist sessions or perform compaction itself.

Remote workspace state

The Go remote-collector sidecar maintains an SSH connection and streams session data to the local Agent FM. Connection state is ephemeral — reconnects on app restart using the user's SSH aliases.

07

Orchestration

Agent FM — Orchestration

Multi-agent awareness

Agent FM is designed for multi-agent scenarios: the Global Mix feature aggregates narration across all active local and remote agent sessions. Within a session, the multi-agent awareness rules in LIVE_NARRATOR_PROMPT apply when [PARALLEL WORK] shows several active sub-agents; they keep narration on the sub-agent with news and prevent over-reporting agent counts.

Orchestration pattern

None for the coding agents. Agent FM observes; it does not orchestrate.

Isolation mechanism

None for the coding agents — Agent FM monitors sandboxed or non-sandboxed agents identically.

Execution mode

Continuous background daemon (Electron menu-bar app). Always running while the developer works.

Multi-model

Yes — the first framework in the batch with deliberate multi-model split:

  • Coding model: Claude (via Claude Code) or OpenAI (via Codex) — external
  • Narration model: Gemini (recommended) or OpenAI — managed by Agent FM

This is the only tool in this batch where the narrating/observing AI is explicitly configured separately from the coding AI.

Consensus mechanism

None.

Auto-validators

None — Agent FM narrates but does not validate or interrupt agent execution.

Orchestrator

Orchestrator.ts coordinates the internal system: ingestion pipeline → intelligence → narration policy → narrator. This is internal application orchestration, not AI agent orchestration.

08

UI CLI Surface

Agent FM — UI & CLI Surface

Menu bar application (primary surface)

Agent FM lives in the macOS menu bar — clicking the icon opens the station panel.

Main panel views:

View                Function
Station list        All active agent sessions (Claude Code + Codex) with status
Individual station  Focused narration for one agent session
Global Mix          Aggregated narration across all active agents
Remote workspaces   SSH-connected remote environments
Settings            API key management, provider selection
Onboarding          First-run setup (API key, remote workspace)

Status indicators:

  • Active agents (running)
  • Waiting agents (blocked, needs attention)
  • Done agents
  • Error states
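
The four indicators map naturally onto a small union type. The TypeScript sketch below shows one possible derivation; the StationSnapshot fields and the precedence order (waiting before error before done) are assumptions made for illustration, not the app's documented behavior.

```typescript
type StationStatus = "active" | "waiting" | "done" | "error";

// Hypothetical per-station facts an ingestion layer might track.
type StationSnapshot = {
  processAlive: boolean;
  awaitingInput: boolean;
  lastToolFailed: boolean;
  turnEnded: boolean;
};

// Assumed precedence: a blocked agent outranks everything else, because that
// is the state in which the developer's attention is actually required.
function deriveStationStatus(s: StationSnapshot): StationStatus {
  if (s.awaitingInput) return "waiting";
  if (s.lastToolFailed) return "error";
  if (!s.processAlive || s.turnEnded) return "done";
  return "active";
}
```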

Audio output

Agent FM uses the Gemini or OpenAI Realtime API for speech synthesis. The narration is played through the system audio output — no separate speaker selection.

CLI binary

None.

IDE integration

None — Agent FM monitors the terminal agents from outside.

Distribution

  • Signed and notarized macOS DMG (AgentFM-macos-arm64-signed-notarized.dmg)
  • Apple Silicon only (macOS 13+)
  • Build from source: pnpm start

Remote workspace support

Go sidecar (remote-collector) enables monitoring agents on remote development environments via existing SSH aliases. Configuration via Settings panel — no credential storage beyond existing ~/.ssh/config.

BYOK cost implications

Every session generates API calls to Gemini/OpenAI for narration, so running multiple parallel sessions with Global Mix active can consume a significant narration API budget. Those calls are billed directly to the user's own key: as PRIVACY.md puts it, "Agent FM does not run a proxy server or hosted account system."
