Utah (Inngest)

utah-inngest · inngest/utah · ★ 116 · last commit 2026-05-18

Durable Inngest-powered personal agent where every LLM call and tool execution is a checkpointed step with automatic retries, singleton concurrency, and cancel-on-new-message — no ngrok needed.

Best when: Durability should come from the execution platform (Inngest), not from application code; an agent harness should delegate crash recovery, retries, and concu…
Skip if: Building durability into agent application code, or running an ngrok tunnel for local development
vs seeds: claude-conductor (markdown workspace files for agent behavioral configuration) but Utah is a full TypeScript runtime with Inngest-durable…
Primitive shape: No installable primitives
00

Summary

Utah (Inngest) — Summary

Utah (Universally Triggered Agent Harness) is a minimal TypeScript reference implementation of an Inngest-powered durable AI agent with Slack and Telegram messaging channels. Built by the Inngest team, it demonstrates how to build a production-grade personal AI agent where every LLM call and every tool execution is an Inngest step — gaining automatic retries, singleton concurrency, cancel-on-new-message, and WebSocket-based local development (no ngrok needed) for free. The agent uses pi-ai for multi-provider LLM abstraction (Anthropic/OpenAI/Google), pi-coding-agent for battle-tested coding tools (bash, read, write, edit, grep, find, ls), and a two-tier memory system (daily logs + long-term MEMORY.md via cron-triggered distillation). The workspace context (SOUL.md, USER.md, MEMORY.md) is injected into the system prompt. Context compaction, tool result pruning, and conversation summarization are all built in. Utah is explicitly positioned as a reference harness, not a framework.

differs_from_seeds: Utah is unique in this batch in that durability (crash recovery, step retries) is the primary architectural concern — not prompt engineering, not multi-model routing, not UI. Every LLM call and tool invocation is wrapped in step.run(), giving Inngest-level durability guarantees absent from all 11 seeds. Closest to claude-conductor (markdown scaffold, simple workspace files) but Utah is a full TypeScript runtime with durable execution rather than a template repository. The Inngest-powered execution mode — event-driven via WebSocket, no public endpoints — is architecturally novel versus all seeds.

01

Overview

Utah (Inngest) — Overview

Origin

Utah is authored by the Inngest team (GitHub: inngest/utah) as a showcase/reference for Inngest-powered AI agents. Inngest is a durable workflow platform. Utah's README subtitle: "An OpenClaw-like Inngest-powered personal agent."

Philosophy

From the README:

"A durable AI agent built with Inngest and pi-ai. No framework. Just a think/act/observe loop — Inngest provides durability, retries, and observability, while pi-ai provides a unified LLM interface across providers."

"Every LLM call and tool execution is an Inngest step — giving you durability, retries, and observability for free."

Key Inngest features:

  • Singleton execution: One conversation at a time per chat, no race conditions
  • Cancel on new message: User sends again? Current run cancels, new one starts
  • Automatic retries: LLM API timeouts handled by Inngest, not your code
  • WebSocket-based local dev: connect() — no public endpoint, no ngrok, no VPS

The "Always-On" Boundary

Utah's always-on boundary is the Inngest Cloud event bus + the local worker. The worker connects to Inngest Cloud via WebSocket (connect()). Messages flow through Inngest as events; the worker processes them locally. The local worker can be stopped and restarted without losing in-flight work — Inngest checkpoints every step.

Design Philosophy (Agent Architecture)

From agent-loop.ts docstring:

"Every LLM call and tool execution is an Inngest step — giving you durability, retries, and observability for free. pi-ai differences from raw Anthropic API: Unified Message/Tool types that work across providers; TypeBox schemas for tool parameters (validated at runtime); Content blocks use 'toolCall'/'toolResult' instead of 'tool_use'/'tool_result'"

Context Pruning Philosophy

From code comments:

"Two-tier pruning inspired by OpenClaw/pi-agent-core: Soft trim: keep head + tail of old tool results. Hard clear: replace entirely when total context is huge."

02

Architecture

Utah (Inngest) — Architecture

Distribution

  • Type: Clone-and-configure (TypeScript, Node.js 23+)
  • License: Apache-2.0
  • Runtime: Node.js 23+ (uses native TypeScript strip-types)
  • Dependencies: Inngest Cloud account required

Architecture Overview

Channel (e.g. Telegram) → Inngest Cloud (webhook + transform) → WebSocket → Local Worker → LLM (Anthropic/OpenAI/Google) → Reply Event → Channel API

The worker connects to Inngest Cloud via WebSocket — no public endpoint needed.

Directory Structure

src/
├── worker.ts                  # Entry point — connect() or serve()
├── client.ts                  # Inngest client
├── config.ts                  # Configuration from env vars
├── agent-loop.ts              # Core think → act → observe cycle
├── setup.ts                   # Channel setup orchestration
├── lib/
│   ├── llm.ts                 # pi-ai wrapper (multi-provider)
│   ├── tools.ts               # Tool definitions (TypeBox schemas) + execution
│   ├── context.ts             # System prompt builder with workspace file injection
│   ├── session.ts             # JSONL session persistence
│   ├── memory.ts              # File-based memory (daily logs + distillation)
│   └── compaction.ts          # LLM-powered conversation summarization
├── functions/
│   ├── message.ts             # Main agent function (singleton + cancelOn)
│   ├── send-reply.ts          # Channel-agnostic reply dispatch
│   ├── acknowledge-message.ts # Message acknowledgment (typing indicator)
│   ├── heartbeat.ts           # Cron-based memory maintenance (every 30 min)
│   └── failure-handler.ts     # Global error handler with notifications
└── channels/
    ├── telegram/              # Telegram channel implementation
    └── slack/                 # Slack channel implementation
workspace/
├── SOUL.md                    # Agent personality and behavioral guidelines
├── USER.md                    # User information (name, timezone, preferences)
├── MEMORY.md                  # Long-term memory (agent-writable)
├── memory/                    # Daily logs (YYYY-MM-DD.md, auto-managed)
└── sessions/                  # JSONL conversation files

Inngest Functions

| Function | Purpose | Execution |
| --- | --- | --- |
| agent-handle-message | Main agent loop | Singleton per chat, cancel on new message |
| acknowledge-message | Show "typing..." immediately | No retries (best effort) |
| send-reply | Format and send response | 3 retries, channel dispatch |
| agent-heartbeat | Distill daily logs into long-term memory | Cron (every 30 min) |
| global-failure-handler | Catch errors, notify user | Triggered by inngest/function.failed |
03

Components

Utah (Inngest) — Components

Agent Loop (Core)

The think → act → observe cycle in agent-loop.ts:

  1. Think: step.run("think") — LLM call via pi-ai
  2. Act: if tools requested, each tool runs as step.run("tool-<name>")
  3. Observe: tool results fed back into conversation
  4. Repeat: until LLM responds with text (no tools) or max iterations hit

Inngest automatically indexes the repeated step IDs across iterations (think:0, think:1, etc.).
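
The cycle above can be sketched as a small loop. This is a minimal illustration, not Utah's agent-loop.ts: the `Step`, `Llm`, and `ToolRunner` types are hypothetical stand-ins for Inngest's step API and pi-ai, and the in-memory `fakeStep` only records step IDs instead of checkpointing anything.

```typescript
// Hypothetical stand-in types; not the real Inngest or pi-ai interfaces.
type ToolCall = { name: string; args: Record<string, unknown> };
type LlmReply = { text?: string; toolCalls: ToolCall[] };
type Message = { role: "user" | "assistant" | "toolResult"; content: string };

interface Step {
  run<T>(id: string, fn: () => Promise<T>): Promise<T>;
}
type Llm = (history: Message[]) => Promise<LlmReply>;
type ToolRunner = (call: ToolCall) => Promise<string>;

async function agentLoop(
  step: Step,
  llm: Llm,
  runTool: ToolRunner,
  userText: string,
  maxIterations = 10,
): Promise<string> {
  const history: Message[] = [{ role: "user", content: userText }];

  for (let i = 0; i < maxIterations; i++) {
    // Think: one step per LLM call.
    const reply = await step.run("think", () => llm(history));

    // No tool requests means the model produced its final answer.
    if (reply.toolCalls.length === 0) return reply.text ?? "";

    history.push({ role: "assistant", content: JSON.stringify(reply.toolCalls) });

    // Act + observe: each tool call is its own step; results go back into history.
    for (const call of reply.toolCalls) {
      const result = await step.run(`tool-${call.name}`, () => runTool(call));
      history.push({ role: "toolResult", content: result });
    }
  }
  return "[max iterations reached]";
}

// In-memory stand-in for a step runner: records IDs and runs the callback directly.
const stepIds: string[] = [];
const fakeStep: Step = {
  run: async (id, fn) => {
    stepIds.push(id);
    return fn();
  },
};

// Scripted model: asks for one tool, then answers once a tool result is present.
const fakeLlm: Llm = async (history) =>
  history.some((m) => m.role === "toolResult")
    ? { text: "done", toolCalls: [] }
    : { toolCalls: [{ name: "ls", args: { path: "." } }] };

const loopResult = agentLoop(fakeStep, fakeLlm, async () => "README.md", "list files");
```

With the scripted model, the recorded step IDs are think, tool-ls, think, which mirrors the think → act → observe → think sequence described above.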

Tools (from pi-coding-agent + custom)

| Tool | Source | Purpose |
| --- | --- | --- |
| read | pi-coding-agent | Read files with offset/limit, binary detection, truncation |
| edit | pi-coding-agent | Exact text match + replace (surgical edits) |
| write | pi-coding-agent | Create/overwrite files with directory creation |
| bash | pi-coding-agent | Shell execution with timeout + output truncation |
| grep | pi-coding-agent | Regex search respecting .gitignore |
| find | pi-coding-agent | Glob-based file discovery respecting .gitignore |
| ls | pi-coding-agent | Directory listing with tree display |
| remember | custom (Utah) | Save note to today's daily log |
| web_fetch | custom (Utah) | Fetch URL and return text |
| delegate_task | custom (Utah) | Delegate to sub-agent (blocking, isolated context) |
| delegate_async_task | custom (Utah) | Delegate to sub-agent (non-blocking) |

Memory System (Two-Tier)

  • Daily logs (workspace/memory/YYYY-MM-DD.md): append-only notes via remember tool
  • Long-term memory (workspace/MEMORY.md): curated summary distilled by heartbeat
  • Heartbeat: cron every 30 min — checks if enough content accumulated, then distills

Context Injection

Workspace files injected into system prompt at every turn:

  • workspace/SOUL.md — agent personality + behavioral guidelines
  • workspace/USER.md — user info (name, timezone, preferences)
  • workspace/MEMORY.md — curated long-term memory

Context Compaction

  • Token estimation: chars/4 heuristic
  • Threshold: 80% of configured max (150K default)
  • LLM summarization: old messages summarized into checkpoint (goals, progress, decisions, next steps)
  • Recent messages preserved: ~20K tokens verbatim
  • Persisted: compacted session replaces JSONL file
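
As a rough illustration of the compaction trigger, the sketch below applies the chars/4 estimate, the 80% threshold, and the ~20K-token recent window from the list above. The function and constant names are hypothetical, and the LLM summarization of the older messages is left out.

```typescript
type Msg = { role: string; content: string };

const MAX_CONTEXT_TOKENS = 150_000; // configured max (default per the text above)
const COMPACT_THRESHOLD = 0.8;      // compact at 80% of the max
const KEEP_RECENT_TOKENS = 20_000;  // recent messages kept verbatim

// chars/4 heuristic for token estimation.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function totalTokens(messages: Msg[]): number {
  return messages.reduce((sum, m) => sum + estimateTokens(m.content), 0);
}

function shouldCompact(messages: Msg[]): boolean {
  return totalTokens(messages) > MAX_CONTEXT_TOKENS * COMPACT_THRESHOLD;
}

// Walk backwards from the newest message, keeping messages until the
// recent-token budget is used up. Everything before the cut is "old" and
// would be handed to the summarizer.
function splitForCompaction(messages: Msg[]): { old: Msg[]; recent: Msg[] } {
  let budget = KEEP_RECENT_TOKENS;
  let cut = messages.length;
  while (cut > 0) {
    const cost = estimateTokens(messages[cut - 1].content);
    if (cost > budget) break;
    budget -= cost;
    cut--;
  }
  return { old: messages.slice(0, cut), recent: messages.slice(cut) };
}
```

With the defaults, compaction starts once the estimate passes 120K tokens (0.8 × 150K).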

Context Pruning (per-turn)

Two-tier:

  • Soft trim: tool results over 4K chars → head+tail trim (first 1,500 + last 1,500 chars)
  • Hard clear: total old tool content over 50K chars → replace entirely with placeholder

Channel Adapters

  • Telegram: fully automated setup (bot token only)
  • Slack: Slack app + Event Subscriptions configuration

Each channel implements ChannelHandler interface (sendReply, acknowledge, setup).
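
One possible shape for such an interface is sketched below. Only the three method names come from the text; the parameter lists, the `ChannelName` type, and the `dispatchReply` helper are assumptions made for illustration.

```typescript
// Hypothetical adapter contract: method names from the text, signatures assumed.
interface ChannelHandler {
  sendReply(chatId: string, text: string): Promise<void>;
  acknowledge(chatId: string): Promise<void>;
  setup(): Promise<void>;
}

type ChannelName = "telegram" | "slack";

// Channel-agnostic dispatch: look up the adapter by name and delegate to it.
async function dispatchReply(
  handlers: Partial<Record<ChannelName, ChannelHandler>>,
  channel: ChannelName,
  chatId: string,
  text: string,
): Promise<void> {
  const handler = handlers[channel];
  if (!handler) throw new Error(`No handler registered for channel: ${channel}`);
  await handler.sendReply(chatId, text);
}

// In-memory adapter for illustration; a real one would call the channel's HTTP API.
const sent: string[] = [];
const fakeTelegram: ChannelHandler = {
  sendReply: async (chatId, text) => {
    sent.push(`${chatId}:${text}`);
  },
  acknowledge: async () => {},
  setup: async () => {},
};

const dispatched = dispatchReply({ telegram: fakeTelegram }, "telegram", "42", "hello");
```

Keeping the dispatch keyed by channel name means a new channel only needs a new adapter object, which matches the "no other channels without adding them" framing of the project.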

05

Prompts

Utah (Inngest) — Prompts

Prompt Architecture

System prompt assembled from workspace markdown files at every turn:

  1. workspace/SOUL.md — agent personality + behavioral guidelines
  2. workspace/USER.md — user info
  3. workspace/MEMORY.md — long-term memory

Plus budget warnings when iterations are running low.
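
A minimal sketch of this assembly, assuming a `readFile` callback so the example stays free of filesystem access. The section headers, the warning wording, and the `warnWhenRemaining` threshold are hypothetical; only the file order and the existence of a budget warning come from the text.

```typescript
type ReadFile = (path: string) => string | undefined;

// Injection order follows the numbered list above.
const WORKSPACE_FILES = [
  "workspace/SOUL.md",
  "workspace/USER.md",
  "workspace/MEMORY.md",
];

function buildSystemPrompt(
  readFile: ReadFile,
  iteration: number,
  maxIterations: number,
  warnWhenRemaining = 3, // hypothetical threshold
): string {
  const sections: string[] = [];
  for (const path of WORKSPACE_FILES) {
    const content = readFile(path);
    // Missing or empty files are skipped rather than injected as blank sections.
    if (content && content.trim() !== "") {
      sections.push(`## ${path}\n${content.trim()}`);
    }
  }
  const remaining = maxIterations - iteration;
  if (remaining <= warnWhenRemaining) {
    sections.push(`Budget warning: ${remaining} iteration(s) left. Wrap up and reply.`);
  }
  return sections.join("\n\n");
}
```

Because the prompt is rebuilt on every turn, an edit to SOUL.md or a heartbeat update to MEMORY.md would take effect on the next model call without restarting the worker.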

Verbatim Excerpt 1 — Agent Loop Docstring (agent-loop.ts)

/**
 * Agent Loop — the core think → act → observe cycle, powered by pi-ai.
 *
 * Each iteration:
 * 1. Call the LLM via pi-ai's complete() with conversation history + tools
 * 2. If the LLM wants tools, validate args with pi-ai and execute as Inngest steps
 * 3. Feed results back into the conversation
 * 4. Repeat until the LLM responds with text (no tools) or max iterations
 *
 * Every LLM call and tool execution is an Inngest step —
 * giving you durability, retries, and observability for free.
 *
 * pi-ai differences from raw Anthropic API:
 * - Unified Message/Tool types that work across providers
 * - TypeBox schemas for tool parameters (validated at runtime)
 * - Content blocks use "toolCall" / "toolResult" instead of "tool_use" / "tool_result"
 */

Technique: The agent loop is documented as a behavioral contract: the docstring specifies exactly what the loop does, what the provider abstraction changes, and what Inngest adds. This is an architecture-as-documentation pattern.

Verbatim Excerpt 2 — Context Pruning Constants

const PRUNING = {
  keepLastAssistantTurns: 3,
  softTrim: {
    maxChars: 4000,
    headChars: 1500,
    tailChars: 1500,
  },
  hardClear: {
    threshold: 50_000,
    placeholder: "[Tool result cleared — old context]",
  },
} as const;

Technique: Explicit numeric constants for context management — not a prompt engineering technique per se, but the [Tool result cleared — old context] placeholder string IS a model-facing prompt: it tells the LLM that historical tool content was removed and why, preventing the model from being confused by the gap.
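
To show how constants like these might be applied, here is a minimal sketch. The `softTrim` and `pruneOldToolResults` functions are hypothetical, the "..." joiner is an assumption, and so is the choice to measure the hard-clear threshold after soft trimming. Selecting which results count as "old" (keepLastAssistantTurns) is not shown.

```typescript
const PRUNING = {
  softTrim: { maxChars: 4000, headChars: 1500, tailChars: 1500 },
  hardClear: {
    threshold: 50_000,
    // Same string as the excerpt above; the dash is written as a Unicode escape.
    placeholder: "[Tool result cleared \u2014 old context]",
  },
} as const;

// Soft trim: keep the head and tail of an oversized tool result.
function softTrim(content: string): string {
  const { maxChars, headChars, tailChars } = PRUNING.softTrim;
  if (content.length <= maxChars) return content;
  return content.slice(0, headChars) + "\n...\n" + content.slice(-tailChars);
}

// Takes only the tool results already classified as old.
// Hard clear: if the old results are still too large in total after trimming,
// every one of them is replaced by the placeholder.
function pruneOldToolResults(oldResults: string[]): string[] {
  const trimmed = oldResults.map(softTrim);
  const total = trimmed.reduce((sum, r) => sum + r.length, 0);
  if (total > PRUNING.hardClear.threshold) {
    return trimmed.map(() => PRUNING.hardClear.placeholder);
  }
  return trimmed;
}
```

A 5,000-character result becomes roughly 3,000 characters after the soft trim, so in this sketch about 17 such results are needed before the hard clear replaces them all.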

Workspace Files as Prompts

The workspace/SOUL.md file is the primary behavioral prompt:

  • Agent personality
  • Behavioral guidelines
  • Tone and boundaries
  • Capabilities description

Users edit this file to customize their agent. The file content is injected verbatim into the system prompt at every turn. This is the "CLAUDE.md for personal agents" pattern.

09

Uniqueness

Utah (Inngest) — Uniqueness

differs_from_seeds

Utah has no close seed equivalent — it occupies a unique position as an event-driven, Inngest-durable personal agent with Slack/Telegram channels and no web UI. The closest seed is claude-conductor (markdown workspace files, simple behavioral configuration) but Utah is a full TypeScript runtime with durable execution guarantees. The decisive differentiator is Inngest durability: every step.run() is checkpointed, retried on failure, and idempotent — something no seed implements. The singleton concurrency pattern (cancelOn: newMessageEvent) elegantly solves the "user sends follow-up while previous response is still generating" race condition. The workspace files (SOUL.md, USER.md, MEMORY.md) are functionally similar to agent-os's standards/ markdown, but Utah's files are actively read/written by the agent and maintained by a cron heartbeat. Unlike all seeds, Utah requires a third-party cloud service (Inngest) as part of its core architecture — the "always-on" boundary is not Utah's process but Inngest's managed event infrastructure.

Positioning

Utah is positioned as a reference harness for Inngest — demonstrating how to build a personal AI agent where the hard problems (durability, retries, concurrency) are solved by a managed workflow platform rather than application code.

Observable Failure Modes

  • Inngest Cloud dependency: Free tier has limits; cost scales with usage; no self-hosted Inngest option mentioned in Utah's docs
  • Slack/Telegram only: No web chat UI, no email, no other channels without adding them
  • No TDD, no spec workflow: Purely reactive conversational agent — not a coding workflow tool
  • 116 stars: Minimal community; reference project more than production harness
  • Node.js 23+ requirement: Native TypeScript strip-types requires Node 23; not compatible with older Node versions
  • No approval gates: Fully autonomous — bash tool runs commands without confirmation
  • pi-ai dependency: Relies on @mariozechner/pi-ai and @mariozechner/pi-coding-agent — third-party libraries with unclear maintenance guarantees
04

Workflow

Utah (Inngest) — Workflow

Setup Workflow (once)

| Step | Description | Artifact |
| --- | --- | --- |
| 1. Create Inngest account | Get Event Key + Signing Key from app.inngest.com | Keys |
| 2. Clone + configure | cp .env.example .env and fill in LLM + Inngest keys | .env file |
| 3. Configure channel | Add Telegram token or Slack app credentials | Channel creds |
| 4. Start worker | npm start | Worker connected via WebSocket to Inngest Cloud |
| 5. Channel webhook auto-setup | setup.ts auto-configures webhooks + transforms | Inngest webhook registered |

Per-Message Workflow

| Step | Description | Inngest behavior |
| --- | --- | --- |
| 1. Message arrives | Via Telegram/Slack | Inngest receives webhook event |
| 2. Event transforms | Plain JS transform converts channel payload to agent.message.received | Typed event |
| 3. acknowledge-message | Shows typing indicator | Best-effort, no retry |
| 4. agent-handle-message | Singleton check: if another run active for this chat, cancel it | Singleton enforcement |
| 5. Agent loop | Think → Act → Observe until done | Each step durable via Inngest |
| 6. send-reply | Format + send response to channel | 3 retries |
| 7. Failure case | global-failure-handler triggers on inngest/function.failed | User notified |
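
Step 2 of the table can be illustrated with a small mapping function. This is a sketch only: the input type follows the general shape of a Telegram update (update_id, message.chat.id, message.text), while the `data` field names on the output event and the function name `telegramToAgentEvent` are assumptions. A real Inngest transform is written in plain JavaScript; types are added here for readability.

```typescript
type TelegramUpdate = {
  update_id: number;
  message?: {
    message_id: number;
    text?: string;
    chat: { id: number };
    from?: { id: number; first_name?: string };
  };
};

type AgentMessageEvent = {
  name: "agent.message.received";
  data: {
    channel: "telegram";
    chatId: string;
    messageId: number;
    text: string;
    userName?: string;
  };
};

// Convert a channel payload into the channel-neutral event the agent consumes.
// Updates without text (stickers, joins, etc.) are ignored in this sketch.
function telegramToAgentEvent(update: TelegramUpdate): AgentMessageEvent | null {
  const msg = update.message;
  if (!msg || !msg.text) return null;
  return {
    name: "agent.message.received",
    data: {
      channel: "telegram",
      chatId: String(msg.chat.id),
      messageId: msg.message_id,
      text: msg.text,
      userName: msg.from?.first_name,
    },
  };
}
```

Normalizing the chat ID to a string is a design choice in this sketch, so that one key format can serve Slack and Telegram alike in later steps.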

Memory Maintenance Workflow (cron)

Every 30 minutes:

  1. agent-heartbeat checks daily logs
  2. If enough content accumulated → distills into MEMORY.md
  3. Daily logs pruned after 30-day retention period
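
The file-handling side of this workflow can be sketched with three small helpers. The names are hypothetical, the `minChars` distillation threshold is an invented value (the source only says "enough content"), and dates are taken in UTC for simplicity, whereas a real agent might use the timezone from USER.md.

```typescript
const RETENTION_DAYS = 30;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Path of the daily log for a given date (UTC date used for simplicity).
function dailyLogPath(date: Date): string {
  return `workspace/memory/${date.toISOString().slice(0, 10)}.md`;
}

// A log is expired once it is older than the retention window.
function isExpired(fileName: string, today: Date, retentionDays = RETENTION_DAYS): boolean {
  const match = /^(\d{4}-\d{2}-\d{2})\.md$/.exec(fileName);
  if (!match) return false; // leave files that are not daily logs alone
  const logDate = new Date(`${match[1]}T00:00:00Z`);
  const ageDays = Math.floor((today.getTime() - logDate.getTime()) / MS_PER_DAY);
  return ageDays > retentionDays;
}

// Distillation gate: only spend an LLM call when enough new notes exist.
// The 500-character default is a made-up value for illustration.
function shouldDistill(newLogChars: number, minChars = 500): boolean {
  return newLogChars >= minChars;
}
```

Gating distillation on accumulated content keeps a 30-minute cron from calling the LLM on quiet days.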

Approval Gates

None. Utah is fully autonomous — no human-in-the-loop approval gates in the default implementation.

Phase-to-Artifact Map

| Phase | Artifact |
| --- | --- |
| Agent loop | Conversation stored in JSONL session file |
| Memory distillation | Updated workspace/MEMORY.md |
| Context compaction | Compacted JSONL session file |
| Reply | Message sent back to Slack/Telegram |
06

Memory Context

Utah (Inngest) — Memory & Context

Two-Tier Memory System

Tier 1: Daily Logs

  • Path: workspace/memory/YYYY-MM-DD.md
  • Mechanism: Agent writes via remember tool during conversations
  • Content: Append-only notes — decisions, facts, user preferences, task outcomes
  • Retention: 30 days (configurable)

Tier 2: Long-Term Memory

  • Path: workspace/MEMORY.md
  • Mechanism: Distilled from daily logs by agent-heartbeat Inngest function (cron every 30 min)
  • Content: Curated summary of what's important to remember
  • Agent-writable: Agent can update MEMORY.md directly via write tool

Context Compaction

Automatic when conversation size exceeds 80% of configured max (150K tokens):

  1. Token estimation via chars/4 heuristic
  2. LLM-powered summarization of old messages into structured checkpoint:
    • Goals
    • Progress made
    • Decisions taken
    • Next steps
  3. Recent ~20K tokens preserved verbatim
  4. Compacted session replaces JSONL file on disk

Runs as an Inngest step (step.run("compact")) — durable and retryable.

Context Pruning (per-turn)

Two-tier pruning on each model call:

  • Soft trim: Tool results > 4K chars → head(1,500) + "..." + tail(1,500)
  • Hard clear: Total old tool content > 50K chars → all old results replaced with placeholder
  • Budget warnings: System message injected when iterations are running low

Session Persistence

JSONL files in workspace/sessions/ — gitignored. Resume is implicit via Inngest's singleton concurrency (the same chat ID always routes to the same function run state).
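
For readers unfamiliar with JSONL, the format is one JSON object per line, which makes appending a message a single-line write. The sketch below shows serialization and parsing only. The `SessionMessage` fields and the per-chat file naming in `sessionPath` are assumptions, not Utah's actual schema.

```typescript
type SessionMessage = {
  role: "user" | "assistant" | "toolResult";
  content: string;
  ts: number; // hypothetical timestamp field
};

// One JSON object per line, with a trailing newline so later appends stay valid.
function toJsonl(messages: SessionMessage[]): string {
  return messages.map((m) => JSON.stringify(m)).join("\n") + (messages.length ? "\n" : "");
}

// Blank lines are skipped so a trailing newline does not break parsing.
function fromJsonl(text: string): SessionMessage[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as SessionMessage);
}

// Hypothetical naming scheme: one file per chat, with unsafe characters replaced.
function sessionPath(chatId: string): string {
  return `workspace/sessions/${chatId.replace(/[^a-zA-Z0-9_-]/g, "_")}.jsonl`;
}
```

Because newlines inside message content are escaped by JSON.stringify, multi-line tool output still occupies exactly one line in the file.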

Cross-Session Handoff

Yes — Inngest checkpoints mean if the worker restarts, the in-flight agent loop resumes from the last completed step. This is the key durability feature: the session survives worker restarts.

State Files

  • workspace/SOUL.md — agent personality (human-edited)
  • workspace/USER.md — user info (human-edited)
  • workspace/MEMORY.md — long-term memory (agent-writable)
  • workspace/memory/YYYY-MM-DD.md — daily logs
  • workspace/sessions/ — JSONL conversation files (gitignored)
07

Orchestration

Utah (Inngest) — Orchestration

Multi-Agent Support

Yes — via delegate_task (blocking) and delegate_async_task (non-blocking) tools. Sub-agents get isolated contexts but access the same workspace and tools.

Orchestration Pattern

Sequential by default. delegate_task and delegate_async_task enable task delegation, but there is no formal parallel fan-out or multi-agent consensus.

Execution Mode

Event-driven — the primary execution model:

  1. Incoming Slack/Telegram message → Inngest event
  2. agent-handle-message Inngest function fires
  3. Agent loop runs as durable steps
  4. Reply sent back to channel

Cron: agent-heartbeat runs every 30 minutes for memory maintenance.

Durability (Key Differentiator)

Every Inngest step (step.run()) is:

  • Checkpointed: If worker dies, Inngest resumes from last completed step
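  • Retried: If an LLM API call fails, Inngest retries the step automatically (3 retries for send-reply; agent-handle-message falls back to Inngest's default retry count, which is finite, 4 retries by default, unless the function overrides it)
  • Idempotent on replay: A completed step's saved result is reused rather than re-executed; duplicate step IDs are auto-indexed (think:0, think:1, etc.) so each call keeps its own checkpoint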
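
The checkpoint-and-replay behavior can be illustrated with a toy model. This is not Inngest's implementation: `CheckpointedRun` is a hypothetical class that stores step results in a Map, and the index suffix simply follows the think:0, think:1 notation used above.

```typescript
// Toy model of step memoization: results are saved under an indexed step ID
// and reused when the same workflow is replayed against the same store.
class CheckpointedRun {
  private store: Map<string, unknown>;
  private counts = new Map<string, number>();

  constructor(store: Map<string, unknown>) {
    this.store = store;
  }

  run<T>(id: string, fn: () => T): T {
    const n = this.counts.get(id) ?? 0;
    this.counts.set(id, n + 1);
    const key = `${id}:${n}`; // repeated IDs get an index
    if (this.store.has(key)) return this.store.get(key) as T; // replay: reuse saved result
    const result = fn();
    this.store.set(key, result);
    return result;
  }
}

let llmCalls = 0;

// Two "think" steps with the same ID; optionally crash between them.
function workflow(run: CheckpointedRun, crashAfterFirstThink: boolean): string {
  const plan = run.run("think", () => {
    llmCalls++;
    return "plan";
  });
  if (crashAfterFirstThink) throw new Error("worker crashed");
  return run.run("think", () => {
    llmCalls++;
    return `${plan} -> answer`;
  });
}

const store = new Map<string, unknown>();
try {
  workflow(new CheckpointedRun(store), true);
} catch {
  // The worker died after the first step's result was saved.
}
// Second attempt replays from the same store: the first step is not re-executed.
const answer = workflow(new CheckpointedRun(store), false);
```

Across both attempts the simulated LLM is called twice in total, once per distinct step, even though the workflow function itself ran twice.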

Singleton Concurrency

With cancelOn: newMessageEvent, a new message for the same chat cancels the current run and starts a new one. This prevents a stale response from completing after the user has already sent a follow-up.
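
The effect of this pattern can be modeled in a few lines. In Utah the cancellation is performed by Inngest, not by application code; the `ChatRuns` registry and `RunToken` type below are hypothetical and exist only to show the semantics of one active run per chat.

```typescript
type RunToken = { chatId: string; runId: number; cancelled: boolean };

class ChatRuns {
  private active = new Map<string, RunToken>();
  private nextId = 1;

  // A new message arrives: cancel any active run for this chat, then start a new one.
  start(chatId: string): RunToken {
    const previous = this.active.get(chatId);
    if (previous) previous.cancelled = true;
    const token: RunToken = { chatId, runId: this.nextId++, cancelled: false };
    this.active.set(chatId, token);
    return token;
  }

  // A run ends: free the slot only if this run still owns it.
  finish(token: RunToken): void {
    if (this.active.get(token.chatId) === token) this.active.delete(token.chatId);
  }
}

// A run only delivers its reply if it has not been superseded.
function replyIfCurrent(token: RunToken, text: string, outbox: string[]): void {
  if (!token.cancelled) outbox.push(text);
}

const runs = new ChatRuns();
const first = runs.start("chat-1");
const second = runs.start("chat-1"); // follow-up message in the same chat
const other = runs.start("chat-2");  // different chat, unaffected

const outbox: string[] = [];
replyIfCurrent(first, "stale answer", outbox);
replyIfCurrent(second, "fresh answer", outbox);
```

Only the reply from the second run reaches the outbox, while the run for the other chat is untouched because the key is the chat ID.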

Isolation Mechanism

None beyond Inngest's function-level isolation. All files accessed from the same workspace directory.

Multi-Model

Yes — pi-ai provides unified interface across Anthropic, OpenAI, Google. Model configured via environment variable. No per-task model routing.

Consensus Mechanism

None. Single agent, no multi-model consensus.

Infrastructure Dependency

Inngest Cloud account required — free tier available. The durability, retries, and singleton execution come from Inngest's managed infrastructure, not from Utah's code. This is the "always-on" boundary: Inngest Cloud is the durable execution layer that Utah delegates to.

08

UI / CLI Surface

Utah (Inngest) — UI / CLI Surface

CLI Binary

None. Utah is a Node.js application started with npm start or npm run dev.

User Interface

None — Utah is a headless agent accessed entirely through IM channels (Slack, Telegram). There is no web dashboard, no chat UI, no admin panel.

Channel Surfaces

  • Slack: Users interact via Slack workspace messages
  • Telegram: Users interact via Telegram bot messages

Local Development

# Production mode (connects to Inngest Cloud via WebSocket)
npm start

# Development mode (uses local Inngest dev server)
npx inngest-cli@latest dev &
npm run dev

No ngrok needed — connect() establishes a WebSocket to Inngest Cloud.

Inngest Dashboard

The Inngest Cloud dashboard (app.inngest.com) provides:

  • Function run history
  • Step-level execution logs
  • Retry status
  • Event stream

This is the observability surface — not a Utah-specific dashboard.

Configuration

Only .env file — no web configuration interface:

ANTHROPIC_API_KEY=sk-ant-...
INNGEST_EVENT_KEY=...
INNGEST_SIGNING_KEY=signkey-prod-...
# + channel-specific vars (bot token, signing secrets)
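
A configuration loader for these variables might look like the sketch below. It is not Utah's config.ts: the `Config` shape and `loadConfig` name are hypothetical, and treating ANTHROPIC_API_KEY as required assumes the Anthropic provider is the one in use.

```typescript
type Env = Record<string, string | undefined>;

type Config = {
  anthropicApiKey: string;
  inngestEventKey: string;
  inngestSigningKey: string;
};

// Variable names taken from the .env excerpt above.
const REQUIRED = ["ANTHROPIC_API_KEY", "INNGEST_EVENT_KEY", "INNGEST_SIGNING_KEY"] as const;

// Fail fast at startup with one error that lists every missing variable.
function loadConfig(env: Env): Config {
  const missing = REQUIRED.filter((key) => (env[key] ?? "").trim() === "");
  if (missing.length > 0) {
    throw new Error(`Missing required env vars: ${missing.join(", ")}`);
  }
  return {
    anthropicApiKey: env.ANTHROPIC_API_KEY as string,
    inngestEventKey: env.INNGEST_EVENT_KEY as string,
    inngestSigningKey: env.INNGEST_SIGNING_KEY as string,
  };
}
```

In a Node.js worker this would typically be called once at startup as loadConfig(process.env), so a misconfigured .env fails before the worker connects.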

Workspace Files

Human-edited files in workspace/:

  • workspace/SOUL.md — agent personality
  • workspace/USER.md — user info
  • workspace/MEMORY.md — long-term memory (also agent-writable)

These are the only "configuration UI" — text files edited manually.
