Utah (Inngest)

utah-inngest · inngest/utah · ★ 116 · last commit 2026-05-18

Durable Inngest-powered personal agent where every LLM call and tool execution is a checkpointed step with automatic retries, singleton concurrency, and cancel-on-new-message — no ngrok needed.

Best when: Durability should come from the execution platform (Inngest), not from application code; an agent harness should delegate crash recovery, retries, and concu…

Skip if: Building durability into agent application code; running an ngrok tunnel for local development

vs seeds

claude-conductor (markdown workspace files for agent behavioral configuration) but Utah is a full TypeScript runtime with Inngest-durable…

Primitive shape

No installable primitives

Summary

Utah (Universally Triggered Agent Harness) is a minimal TypeScript reference implementation of an Inngest-powered durable AI agent with Slack and Telegram messaging channels. Built by the Inngest team, it demonstrates how to build a production-grade personal AI agent where every LLM call and every tool execution is an Inngest step — gaining automatic retries, singleton concurrency, cancel-on-new-message, and WebSocket-based local development (no ngrok needed) for free. The agent uses pi-ai for multi-provider LLM abstraction (Anthropic/OpenAI/Google), pi-coding-agent for battle-tested coding tools (bash, read, write, edit, grep, find, ls), and a two-tier memory system (daily logs + long-term MEMORY.md via cron-triggered distillation). The workspace context (SOUL.md, USER.md, MEMORY.md) is injected into the system prompt. Context compaction, tool result pruning, and conversation summarization are all built in. Utah is explicitly positioned as a reference harness, not a framework.

differs_from_seeds: Utah is unique in this batch in that durability (crash recovery, step retries) is the primary architectural concern — not prompt engineering, not multi-model routing, not UI. Every LLM call and tool invocation is wrapped in step.run(), giving Inngest-level durability guarantees absent from all 11 seeds. Closest to claude-conductor (markdown scaffold, simple workspace files) but Utah is a full TypeScript runtime with durable execution rather than a template repository. The Inngest-powered execution mode — event-driven via WebSocket, no public endpoints — is architecturally novel versus all seeds.

Overview

Origin

Utah is authored by the Inngest team (GitHub: inngest/utah) as a showcase/reference for Inngest-powered AI agents. Inngest is a durable workflow platform. Utah's README subtitle: "An OpenClaw-like Inngest-powered personal agent."

Philosophy

From the README:

"A durable AI agent built with Inngest and pi-ai. No framework. Just a think/act/observe loop — Inngest provides durability, retries, and observability, while pi-ai provides a unified LLM interface across providers."

"Every LLM call and tool execution is an Inngest step — giving you durability, retries, and observability for free."

Key Inngest features:

Singleton execution: One conversation at a time per chat, no race conditions
Cancel on new message: User sends again? Current run cancels, new one starts
Automatic retries: LLM API timeouts handled by Inngest, not your code
WebSocket-based local dev: connect() — no public endpoint, no ngrok, no VPS

The "Always-On" Boundary

Utah's always-on boundary is the Inngest Cloud event bus + the local worker. The worker connects to Inngest Cloud via WebSocket (connect()). Messages flow through Inngest as events; the worker processes them locally. The local worker can be stopped and restarted without losing in-flight work — Inngest checkpoints every step.

Design Philosophy (Agent Architecture)

From agent-loop.ts docstring:

"Every LLM call and tool execution is an Inngest step — giving you durability, retries, and observability for free. pi-ai differences from raw Anthropic API: Unified Message/Tool types that work across providers; TypeBox schemas for tool parameters (validated at runtime); Content blocks use 'toolCall'/'toolResult' instead of 'tool_use'/'tool_result'"

Context Pruning Philosophy

From code comments:

"Two-tier pruning inspired by OpenClaw/pi-agent-core: Soft trim: keep head + tail of old tool results. Hard clear: replace entirely when total context is huge."

Architecture

Distribution

Type: Clone-and-configure (TypeScript, Node.js 23+)
License: Apache-2.0
Runtime: Node.js 23+ (uses native TypeScript strip-types)
Dependencies: Inngest Cloud account required

Architecture Overview

Channel (e.g. Telegram) → Inngest Cloud (webhook + transform) → WebSocket → Local Worker → LLM (Anthropic/OpenAI/Google) → Reply Event → Channel API

The worker connects to Inngest Cloud via WebSocket — no public endpoint needed.

Directory Structure

src/
├── worker.ts                  # Entry point — connect() or serve()
├── client.ts                  # Inngest client
├── config.ts                  # Configuration from env vars
├── agent-loop.ts              # Core think → act → observe cycle
├── setup.ts                   # Channel setup orchestration
├── lib/
│   ├── llm.ts                 # pi-ai wrapper (multi-provider)
│   ├── tools.ts               # Tool definitions (TypeBox schemas) + execution
│   ├── context.ts             # System prompt builder with workspace file injection
│   ├── session.ts             # JSONL session persistence
│   ├── memory.ts              # File-based memory (daily logs + distillation)
│   └── compaction.ts          # LLM-powered conversation summarization
├── functions/
│   ├── message.ts             # Main agent function (singleton + cancelOn)
│   ├── send-reply.ts          # Channel-agnostic reply dispatch
│   ├── acknowledge-message.ts # Message acknowledgment (typing indicator)
│   ├── heartbeat.ts           # Cron-based memory maintenance (every 30 min)
│   └── failure-handler.ts     # Global error handler with notifications
└── channels/
    ├── telegram/              # Telegram channel implementation
    └── slack/                 # Slack channel implementation
workspace/
├── SOUL.md                    # Agent personality and behavioral guidelines
├── USER.md                    # User information (name, timezone, preferences)
├── MEMORY.md                  # Long-term memory (agent-writable)
├── memory/                    # Daily logs (YYYY-MM-DD.md, auto-managed)
└── sessions/                  # JSONL conversation files

Inngest Functions

Function	Purpose	Execution
`agent-handle-message`	Main agent loop	Singleton per chat, cancel on new message
`acknowledge-message`	Show "typing..." immediately	No retries (best effort)
`send-reply`	Format and send response	3 retries, channel dispatch
`agent-heartbeat`	Distill daily logs into long-term memory	Cron (every 30 min)
`global-failure-handler`	Catch errors, notify user	Triggered by `inngest/function.failed`

Components

Agent Loop (Core)

The think → act → observe cycle in agent-loop.ts:

Think: step.run("think") — LLM call via pi-ai
Act: if tools requested, each tool runs as step.run("tool-<name>")
Observe: tool results fed back into conversation
Repeat: until LLM responds with text (no tools) or max iterations hit

Each iteration is automatically indexed by Inngest (think:0, think:1, etc.).
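
The cycle above can be sketched in a few lines. This is a minimal illustration, not Utah's agent-loop.ts: the Step interface models only a run() method, and callLlm, execTool, Message, and the in-memory fakeStep are hypothetical stand-ins for pi-ai and the Inngest step API.

```typescript
// Minimal stand-in for a durable step API (assumption: only run() is modeled).
interface Step {
  run<T>(id: string, fn: () => Promise<T>): Promise<T>;
}

type ToolCall = { name: string; args: Record<string, unknown> };
type LlmReply = { text?: string; toolCalls: ToolCall[] };
type Message = { role: "user" | "assistant" | "toolResult"; content: string };

// Hypothetical signatures; the real project calls pi-ai here.
type CallLlm = (history: Message[]) => Promise<LlmReply>;
type ExecTool = (call: ToolCall) => Promise<string>;

async function agentLoop(
  step: Step,
  callLlm: CallLlm,
  execTool: ExecTool,
  history: Message[],
  maxIterations = 10,
): Promise<string> {
  for (let i = 0; i < maxIterations; i++) {
    // Think: one LLM call per iteration, wrapped in a step.
    const reply = await step.run("think", () => callLlm(history));
    if (reply.toolCalls.length === 0) return reply.text ?? "";

    history.push({ role: "assistant", content: JSON.stringify(reply.toolCalls) });

    // Act: each requested tool runs as its own step.
    for (const call of reply.toolCalls) {
      const result = await step.run(`tool-${call.name}`, () => execTool(call));
      // Observe: feed the result back into the conversation.
      history.push({ role: "toolResult", content: result });
    }
  }
  return "(max iterations reached)";
}

// In-memory step runner that only records step IDs, for local experimentation.
const stepLog: string[] = [];
const fakeStep: Step = {
  run: async (id, fn) => {
    stepLog.push(id);
    return fn();
  },
};
```

Passing the step object in as a parameter keeps the loop testable without a running worker; in a real function the step object would come from the Inngest handler arguments.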

Tools (from pi-coding-agent + custom)

Tool	Source	Purpose
`read`	pi-coding-agent	Read files with offset/limit, binary detection, truncation
`edit`	pi-coding-agent	Exact text match + replace (surgical edits)
`write`	pi-coding-agent	Create/overwrite files with directory creation
`bash`	pi-coding-agent	Shell execution with timeout + output truncation
`grep`	pi-coding-agent	Regex search respecting .gitignore
`find`	pi-coding-agent	Glob-based file discovery respecting .gitignore
`ls`	pi-coding-agent	Directory listing with tree display
`remember`	custom (Utah)	Save note to today's daily log
`web_fetch`	custom (Utah)	Fetch URL and return text
`delegate_task`	custom (Utah)	Delegate to sub-agent (blocking, isolated context)
`delegate_async_task`	custom (Utah)	Delegate to sub-agent (non-blocking)

Memory System (Two-Tier)

Daily logs (workspace/memory/YYYY-MM-DD.md): append-only notes via remember tool
Long-term memory (workspace/MEMORY.md): curated summary distilled by heartbeat
Heartbeat: cron every 30 min — checks if enough content accumulated, then distills
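
As a rough sketch of the first tier, the helpers below compute the daily-log path and format one appended note. Only the workspace/memory/YYYY-MM-DD.md pattern comes from the description above; the function names, the UTC date choice, and the note format are assumptions made for illustration.

```typescript
// Hypothetical helpers; the source only states the path pattern and that notes are append-only.
function dailyLogPath(date: Date, workspaceDir = "workspace"): string {
  // toISOString() is always UTC, so the YYYY-MM-DD prefix is the UTC date.
  const day = date.toISOString().slice(0, 10);
  return `${workspaceDir}/memory/${day}.md`;
}

function formatNote(note: string, date: Date): string {
  const time = date.toISOString().slice(11, 16); // HH:MM in UTC
  return `- [${time}] ${note.trim()}\n`;
}
```

A remember tool built this way would append formatNote(...) to the file at dailyLogPath(...), for example with appendFile from node:fs/promises, so earlier notes are never rewritten.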

Context Injection

Workspace files injected into system prompt at every turn:

workspace/SOUL.md — agent personality + behavioral guidelines
workspace/USER.md — user info (name, timezone, preferences)
workspace/MEMORY.md — curated long-term memory

Context Compaction

Token estimation: chars/4 heuristic
Threshold: 80% of configured max (150K default)
LLM summarization: old messages summarized into checkpoint (goals, progress, decisions, next steps)
Recent messages preserved: ~20K tokens verbatim
Persisted: compacted session replaces JSONL file
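
The arithmetic behind these rules is small enough to sketch. The numbers below are taken from the list above, reading 150K as the configured maximum so that compaction triggers above 120K estimated tokens; the names and the split strategy are assumptions, and the summarization call itself is left out.

```typescript
type Msg = { role: string; content: string };

// Values taken from the text above; the names are hypothetical.
const MAX_TOKENS = 150_000;
const COMPACT_AT = 0.8;
const KEEP_RECENT_TOKENS = 20_000;

// chars/4 heuristic for token estimation.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function needsCompaction(messages: Msg[]): boolean {
  const total = messages.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  return total > MAX_TOKENS * COMPACT_AT;
}

// Split into old messages (to summarize) and a recent tail (kept verbatim).
function splitForCompaction(messages: Msg[], keepTokens = KEEP_RECENT_TOKENS) {
  let budget = keepTokens;
  let cut = messages.length;
  // Walk backwards from the newest message until the budget is spent.
  while (cut > 0) {
    const cost = estimateTokens(messages[cut - 1].content);
    if (cost > budget) break;
    budget -= cost;
    cut--;
  }
  return { toSummarize: messages.slice(0, cut), recent: messages.slice(cut) };
}
```

The toSummarize half would be handed to the LLM for a checkpoint summary, and the summary plus the recent half would then be written back as the new session file.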

Context Pruning (per-turn)

Two-tier:

Soft trim: tool results over 4K chars → head+tail trim (first 1,500 + last 1,500 chars)
Hard clear: total old tool content over 50K chars → replace entirely with placeholder

Channel Adapters

Telegram: fully automated setup (bot token only)
Slack: Slack app + Event Subscriptions configuration

Each channel implements ChannelHandler interface (sendReply, acknowledge, setup).
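
A possible shape for that interface is sketched below. The three method names come from the sentence above; the parameter lists, return types, the MemoryChannel class, and the dispatchReply helper are assumptions added for illustration.

```typescript
// Method names come from the text; parameter and return types are assumptions.
interface ChannelHandler {
  setup(): Promise<void>;
  acknowledge(chatId: string): Promise<void>;
  sendReply(chatId: string, text: string): Promise<void>;
}

// Toy in-memory channel, useful for exercising dispatch without network calls.
class MemoryChannel implements ChannelHandler {
  readonly sent: Array<{ chatId: string; text: string }> = [];
  async setup(): Promise<void> {}
  async acknowledge(_chatId: string): Promise<void> {}
  async sendReply(chatId: string, text: string): Promise<void> {
    this.sent.push({ chatId, text });
  }
}

// Channel-agnostic dispatch: look up the handler by channel name.
const channels: Record<string, ChannelHandler> = { memory: new MemoryChannel() };

async function dispatchReply(channel: string, chatId: string, text: string): Promise<void> {
  const handler = channels[channel];
  if (!handler) throw new Error(`Unknown channel: ${channel}`);
  await handler.sendReply(chatId, text);
}
```

Keeping the agent code dependent only on the interface is what would let a new channel be added without touching the loop.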

Prompts

Prompt Architecture

The system prompt is assembled from workspace markdown files at every turn:

workspace/SOUL.md — agent personality + behavioral guidelines
workspace/USER.md — user info
workspace/MEMORY.md — long-term memory

Plus budget warnings when iterations are running low.
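
A minimal sketch of this assembly, assuming the file contents have already been read from disk. The section headers, the function name, and the rule of warning when three or fewer iterations remain are all hypothetical; the source does not state the exact wording or threshold.

```typescript
// Hypothetical shape: file contents are passed in so the function stays pure.
type Workspace = { soul: string; user: string; memory: string };

function buildSystemPrompt(ws: Workspace, iteration: number, maxIterations: number): string {
  const sections = [
    `# SOUL.md\n${ws.soul.trim()}`,
    `# USER.md\n${ws.user.trim()}`,
    `# MEMORY.md\n${ws.memory.trim()}`,
  ];
  const remaining = maxIterations - iteration;
  // Assumed rule: warn once three or fewer iterations remain.
  if (remaining <= 3) {
    sections.push(`Warning: only ${remaining} iteration(s) left. Wrap up and reply.`);
  }
  return sections.join("\n\n");
}
```

Rebuilding the prompt on every turn means an edit to SOUL.md or a fresh MEMORY.md distillation takes effect on the next model call without restarting the worker.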

Verbatim Excerpt 1 — Agent Loop Docstring (agent-loop.ts)

/**
 * Agent Loop — the core think → act → observe cycle, powered by pi-ai.
 *
 * Each iteration:
 * 1. Call the LLM via pi-ai's complete() with conversation history + tools
 * 2. If the LLM wants tools, validate args with pi-ai and execute as Inngest steps
 * 3. Feed results back into the conversation
 * 4. Repeat until the LLM responds with text (no tools) or max iterations
 *
 * Every LLM call and tool execution is an Inngest step —
 * giving you durability, retries, and observability for free.
 *
 * pi-ai differences from raw Anthropic API:
 * - Unified Message/Tool types that work across providers
 * - TypeBox schemas for tool parameters (validated at runtime)
 * - Content blocks use "toolCall" / "toolResult" instead of "tool_use" / "tool_result"
 */

Technique: The agent loop is itself structured as a documented behavioral contract: the docstring specifies exactly what the loop does, what the provider abstraction changes, and what Inngest adds. This is the architecture-as-documentation pattern.

Verbatim Excerpt 2 — Context Pruning Constants

const PRUNING = {
  keepLastAssistantTurns: 3,
  softTrim: {
    maxChars: 4000,
    headChars: 1500,
    tailChars: 1500,
  },
  hardClear: {
    threshold: 50_000,
    placeholder: "[Tool result cleared — old context]",
  },
} as const;

Technique: Explicit numeric constants for context management — not a prompt engineering technique per se, but the [Tool result cleared — old context] placeholder string IS a model-facing prompt: it tells the LLM that historical tool content was removed and why, preventing the model from being confused by the gap.
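
One way these constants could drive pruning is sketched below. The functions take a config object with the same softTrim and hardClear fields as the excerpt, so the PRUNING constant could be passed in directly. The function names, the "..." separator, and the choice to measure the total before trimming are assumptions, not Utah's confirmed logic.

```typescript
type PruningConfig = {
  softTrim: { maxChars: number; headChars: number; tailChars: number };
  hardClear: { threshold: number; placeholder: string };
};

// Soft trim: keep the head and tail of one oversized tool result.
function softTrim(result: string, cfg: PruningConfig["softTrim"]): string {
  if (result.length <= cfg.maxChars) return result;
  const head = result.slice(0, cfg.headChars);
  const tail = result.slice(result.length - cfg.tailChars);
  return `${head}\n...\n${tail}`;
}

// Hard clear: if old tool output is still huge in total, drop all of it.
// Assumption: the total is measured on the untrimmed results.
function pruneOldToolResults(oldResults: string[], cfg: PruningConfig): string[] {
  const total = oldResults.reduce((sum, r) => sum + r.length, 0);
  if (total > cfg.hardClear.threshold) {
    return oldResults.map(() => cfg.hardClear.placeholder);
  }
  return oldResults.map((r) => softTrim(r, cfg.softTrim));
}
```

Only results older than the last few assistant turns would be passed in as oldResults, which is presumably what keepLastAssistantTurns controls in the excerpt.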

Workspace Files as Prompts

The workspace/SOUL.md file is the primary behavioral prompt:

Agent personality
Behavioral guidelines
Tone and boundaries
Capabilities description

Users edit this file to customize their agent. The file content is injected verbatim into the system prompt at every turn. This is the "CLAUDE.md for personal agents" pattern.

Uniqueness

differs_from_seeds

Utah has no close seed equivalent — it occupies a unique position as an event-driven, Inngest-durable personal agent with Slack/Telegram channels and no web UI. The closest seed is claude-conductor (markdown workspace files, simple behavioral configuration) but Utah is a full TypeScript runtime with durable execution guarantees. The decisive differentiator is Inngest durability: every step.run() is checkpointed, retried on failure, and idempotent — something no seed implements. The singleton concurrency pattern (cancelOn: newMessageEvent) elegantly solves the "user sends follow-up while previous response is still generating" race condition. The workspace files (SOUL.md, USER.md, MEMORY.md) are functionally similar to agent-os's standards/ markdown, but Utah's files are actively read/written by the agent and maintained by a cron heartbeat. Unlike all seeds, Utah requires a third-party cloud service (Inngest) as part of its core architecture — the "always-on" boundary is not Utah's process but Inngest's managed event infrastructure.

Positioning

Utah is positioned as a reference harness for Inngest — demonstrating how to build a personal AI agent where the hard problems (durability, retries, concurrency) are solved by a managed workflow platform rather than application code.

Observable Failure Modes

Inngest Cloud dependency: Free tier has limits; cost scales with usage; no self-hosted Inngest option mentioned in Utah's docs
Slack/Telegram only: No web chat UI, no email, no other channels without adding them
No TDD, no spec workflow: Purely reactive conversational agent — not a coding workflow tool
116 stars: Minimal community; reference project more than production harness
Node.js 23+ requirement: Native TypeScript strip-types requires Node 23; not compatible with older Node versions
No approval gates: Fully autonomous — bash tool runs commands without confirmation
pi-ai dependency: Relies on @mariozechner/pi-ai and @mariozechner/pi-coding-agent — third-party libraries with unclear maintenance guarantees

Workflow

Setup Workflow (once)

Step	Description	Artifact
1. Create Inngest account	Get Event Key + Signing Key from app.inngest.com	Keys
2. Clone + configure	`cp .env.example .env` and fill in LLM + Inngest keys	.env file
3. Configure channel	Add Telegram token or Slack app credentials	Channel creds
4. Start worker	`npm start`	Worker connected via WebSocket to Inngest Cloud
5. Channel webhook auto-setup	`setup.ts` auto-configures webhooks + transforms	Inngest webhook registered

Per-Message Workflow

Step	Description	Inngest behavior
1. Message arrives	Via Telegram/Slack	Inngest receives webhook event
2. Event transforms	Plain JS transform converts channel payload to `agent.message.received`	Typed event
3. `acknowledge-message`	Shows typing indicator	Best-effort, no retry
4. `agent-handle-message`	Singleton check: if another run active for this chat, cancel it	Singleton enforcement
5. Agent loop	Think → Act → Observe until done	Each step durable via Inngest
6. `send-reply`	Format + send response to channel	3 retries
7. Failure case	`global-failure-handler` triggers on `inngest/function.failed`	User notified
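
Step 2 of the table can be illustrated with a transform for a Telegram update. Inngest webhook transforms are plain functions that return an event with a name and data; the Telegram fields used here follow the Bot API update shape, while the data field names (channel, chatId, messageId, text) are assumptions rather than Utah's actual event schema. The type annotations are for readability only, since a deployed transform is plain JavaScript.

```typescript
// Subset of a Telegram Bot API update; only the fields used here.
type TelegramUpdate = {
  update_id: number;
  message?: { message_id: number; text?: string; chat: { id: number } };
};

type AgentEvent = { name: string; data: Record<string, unknown> };

// The data field names below are assumptions, not Utah's actual event schema.
function transform(evt: TelegramUpdate): AgentEvent {
  return {
    name: "agent.message.received",
    data: {
      channel: "telegram",
      chatId: String(evt.message?.chat.id ?? ""),
      messageId: evt.message?.message_id,
      text: evt.message?.text ?? "",
    },
  };
}
```

Normalizing every channel into one event name is what would let a single agent function serve both Slack and Telegram.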

Memory Maintenance Workflow (cron)

Every 30 minutes:

agent-heartbeat checks daily logs
If enough content accumulated → distills into MEMORY.md
Daily logs pruned after 30-day retention period
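
The two decisions the heartbeat makes can be sketched as pure functions. The 30-day retention figure comes from the text; the function names and the 2,000-character distillation threshold are hypothetical, since the source only says "enough content".

```typescript
// Returns the daily-log file names older than the retention window.
function expiredLogs(fileNames: string[], today: Date, retentionDays = 30): string[] {
  const cutoff = new Date(today.getTime() - retentionDays * 24 * 60 * 60 * 1000)
    .toISOString()
    .slice(0, 10);
  return fileNames.filter((name) => {
    const match = /^(\d{4}-\d{2}-\d{2})\.md$/.exec(name);
    // ISO dates sort lexicographically, so string comparison is enough.
    return match !== null && match[1] < cutoff;
  });
}

// Assumed trigger: distill once the undistilled log text passes a size threshold.
function shouldDistill(undistilledChars: number, minChars = 2000): boolean {
  return undistilledChars >= minChars;
}
```

Files that do not match the YYYY-MM-DD.md pattern are ignored, so stray notes in the memory directory would not be deleted by accident.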

Approval Gates

None. Utah is fully autonomous — no human-in-the-loop approval gates in the default implementation.

Phase-to-Artifact Map

Phase	Artifact
Agent loop	Conversation stored in JSONL session file
Memory distillation	Updated workspace/MEMORY.md
Context compaction	Compacted JSONL session file
Reply	Message sent back to Slack/Telegram

Memory & Context

Two-Tier Memory System

Tier 1: Daily Logs

Path: workspace/memory/YYYY-MM-DD.md
Mechanism: Agent writes via remember tool during conversations
Content: Append-only notes — decisions, facts, user preferences, task outcomes
Retention: 30 days (configurable)

Tier 2: Long-Term Memory

Path: workspace/MEMORY.md
Mechanism: Distilled from daily logs by agent-heartbeat Inngest function (cron every 30 min)
Content: Curated summary of what's important to remember
Agent-writable: Agent can update MEMORY.md directly via write tool

Context Compaction

Automatic when conversation size exceeds 80% of configured max (150K tokens):

Token estimation via chars/4 heuristic
LLM-powered summarization of old messages into structured checkpoint:
- Goals
- Progress made
- Decisions taken
- Next steps
Recent ~20K tokens preserved verbatim
Compacted session replaces JSONL file on disk

Runs as an Inngest step (step.run("compact")) — durable and retryable.

Context Pruning (per-turn)

Two-tier pruning on each model call:

Soft trim: Tool results > 4K chars → head(1,500) + "..." + tail(1,500)
Hard clear: Total old tool content > 50K chars → all old results replaced with placeholder
Budget warnings: System message injected when iterations are running low

Session Persistence

JSONL files in workspace/sessions/ — gitignored. Resume is implicit via Inngest's singleton concurrency (the same chat ID always routes to the same function run state).

Cross-Session Handoff

Yes — Inngest checkpoints mean if the worker restarts, the in-flight agent loop resumes from the last completed step. This is the key durability feature: the session survives worker restarts.

State Files

workspace/SOUL.md — agent personality (human-edited)
workspace/USER.md — user info (human-edited)
workspace/MEMORY.md — long-term memory (agent-writable)
workspace/memory/YYYY-MM-DD.md — daily logs
workspace/sessions/ — JSONL conversation files (gitignored)

Orchestration

Multi-Agent Support

Yes — via delegate_task (blocking) and delegate_async_task (non-blocking) tools. Sub-agents get isolated contexts but access the same workspace and tools.

Orchestration Pattern

Sequential by default. delegate_task and delegate_async_task enable task delegation, but there is no formal parallel fan-out or multi-agent consensus.

Execution Mode

Event-driven — the primary execution model:

Incoming Slack/Telegram message → Inngest event
agent-handle-message Inngest function fires
Agent loop runs as durable steps
Reply sent back to channel

Cron — agent-heartbeat runs every 30 minutes for memory maintenance.

Durability (Key Differentiator)

Every Inngest step (step.run()) is:

Checkpointed: If worker dies, Inngest resumes from last completed step
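Retried: If the LLM API fails, Inngest retries automatically (3 retries for send-reply; agent-handle-message uses Inngest's default, which is 4 retries unless the function overrides it)
Idempotent on replay: A completed step's stored result is reused rather than re-executed, and repeated step IDs are auto-indexed (think:0, think:1, etc.)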
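
The checkpoint-and-replay idea can be modeled with a toy runner. This is a simplified, synchronous illustration of the concept, not Inngest's implementation: ToyStepRunner and its checkpoint map are hypothetical, and the key format follows the think:0, think:1 notation used above.

```typescript
// Toy model of checkpointed steps: results are stored by indexed step ID,
// and a replay reuses stored results instead of calling fn again.
class ToyStepRunner {
  private counts = new Map<string, number>();
  private checkpoints: Map<string, unknown>;

  constructor(checkpoints: Map<string, unknown>) {
    this.checkpoints = checkpoints;
  }

  run<T>(id: string, fn: () => T): T {
    const n = this.counts.get(id) ?? 0;
    this.counts.set(id, n + 1);
    const key = `${id}:${n}`; // repeated IDs become think:0, think:1, ...
    if (this.checkpoints.has(key)) return this.checkpoints.get(key) as T;
    const result = fn();
    this.checkpoints.set(key, result);
    return result;
  }
}
```

Creating a second runner over the same checkpoint map simulates a worker restart: the function body runs again from the top, but steps that already completed return their saved results, so only unfinished work is executed.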

Singleton Concurrency

cancelOn: newMessageEvent. When a new message arrives for the same chat, the current run is cancelled and a new one starts. This prevents stale responses from completing after the user has already sent a follow-up.
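
The effect of this policy can be modeled with a small per-chat registry. In Utah the behavior is declared through Inngest function configuration rather than written by hand, so the Run type, activeRuns map, and startRun function below are purely hypothetical illustrations of the semantics.

```typescript
type Run = { chatId: string; cancelled: boolean };

// At most one active run per chat; a new message cancels the run in flight.
const activeRuns = new Map<string, Run>();

function startRun(chatId: string): Run {
  const previous = activeRuns.get(chatId);
  if (previous) previous.cancelled = true; // stale run must not send its reply
  const run: Run = { chatId, cancelled: false };
  activeRuns.set(chatId, run);
  return run;
}
```

A run would check its cancelled flag before each step and before replying. Runs for different chats never affect each other because the registry is keyed by chat ID.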

Isolation Mechanism

None beyond Inngest's function-level isolation. All files accessed from the same workspace directory.

Multi-Model

Yes — pi-ai provides unified interface across Anthropic, OpenAI, Google. Model configured via environment variable. No per-task model routing.

Consensus Mechanism

None. Single agent, no multi-model consensus.

Infrastructure Dependency

Inngest Cloud account required — free tier available. The durability, retries, and singleton execution come from Inngest's managed infrastructure, not from Utah's code. This is the "always-on" boundary: Inngest Cloud is the durable execution layer that Utah delegates to.

UI / CLI Surface

CLI Binary

None. Utah is a Node.js application started with npm start or npm run dev.

User Interface

None — Utah is a headless agent accessed entirely through IM channels (Slack, Telegram). There is no web dashboard, no chat UI, no admin panel.

Channel Surfaces

Slack: Users interact via Slack workspace messages
Telegram: Users interact via Telegram bot messages

Local Development

# Production mode (connects to Inngest Cloud via WebSocket)
npm start

# Development mode (uses local Inngest dev server)
npx inngest-cli@latest dev &
npm run dev

No ngrok needed — connect() establishes a WebSocket to Inngest Cloud.

Inngest Dashboard

The Inngest Cloud dashboard (app.inngest.com) provides:

Function run history
Step-level execution logs
Retry status
Event stream

This is the observability surface — not a Utah-specific dashboard.

Configuration

Only .env file — no web configuration interface:

ANTHROPIC_API_KEY=sk-ant-...
INNGEST_EVENT_KEY=...
INNGEST_SIGNING_KEY=signkey-prod-...
# + channel-specific vars (bot token, signing secrets)

Workspace Files

Human-edited files in workspace/:

workspace/SOUL.md — agent personality
workspace/USER.md — user info
workspace/MEMORY.md — long-term memory (also agent-writable)

These are the only "configuration UI" — text files edited manually.

Related frameworks

same archetype · same primary tool · same memory type

BMAD-METHOD ★ 48k

A4 Markdown scaffold

Provides a full agile delivery lifecycle with named expert-persona AI collaborators that elicit the human's best thinking rather…

Agent OS ★ 4.6k

A4 Markdown scaffold

Extracts implicit codebase conventions into token-efficient markdown standards files and injects them selectively into AI agent…

Claude Conductor ★ 367

A4 Markdown scaffold

Gives Claude Code a persistent, cross-linked, auto-analyzed documentation system so it retains codebase context across sessions.

Spec-Driver (Greenfield Spec-Driven Development) ★ 25

A4 Markdown scaffold

Prevents spec rot in AI-assisted development by making implementation changes flow back into evergreen, authoritative specs via…

Anthropic Knowledge Work Plugins ★ 16k

A4 Markdown scaffold

Role-specialized plugin bundles with live MCP connectors that turn Claude into a domain expert for enterprise knowledge workers.

Codex Integration for Claude Code (skill-codex) ★ 1.3k

A4 Markdown scaffold

Single Claude Code skill that handles Codex CLI invocation correctly (stdin blocking, thinking token suppression, session resume)…

Distribution

Type: standalone-repo
License: Apache-2.0
Install: clone-and-configure
Version: unknown (active)

Surfaces

CLI binary: No
CLI subcmds: 0
Local UI: No
Tech stack: none (headless, channel-based only)

Components

Commands: 0
Skills: 0
Subagents: 0
Hooks: 0
MCP servers: 0
MCP tools: 0
Scripts: 5
Templates: 3

Workflow

Phases: 10
Approval gates: 0
Spec format: none
Spec storage: none
Delta or full: none

Orchestration

Multi-agent: Yes
Pattern: sequential
Isolation: none
Consensus: none
Prompt chaining: Yes

Multi-model

Multi-model: Yes
BYOK: Yes
Modal: text+vision

Execution

Mode: event-driven
Crash recovery: Yes
Compaction: Yes
Session handoff: Yes
Streaming: No

Memory

Type: file-based
Persistence: project
Search: none
State files: 5 files

Quality

TDD: No
TDD mechanism: none
Self-review: none

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: Yes
Audit format: jsonl
Replay: Yes

Tools

Primary: slack
Targets: 3
Portability: low

Signals

Stars: 116
Last commit: 2026-05-18
Maintainer: active
Quality score: 5/10