
GolemBot

golembot · 0xranx/golembot · ★ 285 · last commit 2026-05-09

Bridges any coding agent (Claude Code, Cursor, Codex, OpenCode) to IM channels and HTTP APIs, giving agents external reach without framework lock-in.

Best when: GolemBot has only two concepts, assistant directory and Skills; adding abstractions like Tool, Blueprint, or Pipeline would recreate the AI framework proble…
Skip if: you need GolemBot to manage the context window or dispatch tools itself, or you want to declare skills in config (the skills/ directory is the single source of truth)
vs seeds
openspec adds spec management while GolemBot adds only channel routing. Unlike superp…
Primitive shape 15 total
Commands 8 Skills 7
00

Summary

GolemBot — Summary

GolemBot is a TypeScript npm package (golembot) that gives any coding agent (Cursor, Claude Code, OpenCode, Codex) a "body" by connecting it to messaging channels (Slack, Telegram, Discord, Feishu, DingTalk, WeCom, WeChat) and exposing a programmable HTTP API for product embedding. It works by wrapping the underlying coding agent CLI as a subprocess: GolemBot handles channel message parsing, routing to the agent engine, streaming responses back, and session persistence, but delegates all intelligence to the agent. The key insight is provider routing: a single provider config block can redirect any engine through OpenRouter, MiniMax, DeepSeek, or SiliconFlow without code changes, enabling, for example, Codex on DeepSeek or Claude Code through OpenRouter in one command. GolemBot bundles access to 13,000+ community skills from the ClawHub ecosystem with one-command install. The closest seed analog is openspec (cross-tool, multi-entry-point design), but GolemBot's value proposition is external-channel bridging rather than spec management: it is an agent-to-IM gateway, not a coding workflow orchestrator.

01

Overview

GolemBot — Overview

Origin

Published on npm as golembot v0.47.1 by 0xranx. 285 GitHub stars; last commit 2026-05-09. The project emerged from the practical problem of coding agents being stuck in a terminal window; the author's insight was that the agent is already "smart enough" and what it needs is a communication channel.

Philosophy

"Any Agent × Any Provider × Anywhere"

"Cursor, Claude Code, OpenCode, Codex — these Coding Agents can already write code, run scripts, analyze data, and reason through complex tasks. But they're stuck in an IDE or a terminal window. GolemBot gives them a body."

The framework deliberately avoids implementing AI logic:

"GolemBot does not manage context window, dispatch tools, reason, or set session TTL. All 'intelligent' behavior is delegated to the underlying Coding Agent."

From CLAUDE.md (hard architectural constraints):

"Don't add new core concepts — The framework has only two concepts: assistant directory + Skill." "Don't declare Skills in config — The skills/ directory is the single source of truth."

Key Differentiators

  1. Channel-first: Primary value is IM channel bridging (7 channels: Slack, Telegram, Discord, Feishu, DingTalk, WeCom, WeChat)
  2. Provider routing: Decouple the agent engine from its LLM provider without code changes
  3. ClawHub integration: One command to install from 13,000+ community skills
  4. Minimal abstractions: Only two concepts — assistant directory (golem.yaml) and Skills

Architecture Philosophy (from CLAUDE.md)

  • Never put CLI logic in core library — cli.ts is a thin shell
  • StreamEvent interface identical across all engines — switching requires zero code changes
  • chat() calls with same sessionKey must be serialized (KeyedMutex); different sessionKeys run in parallel
02

Architecture

GolemBot — Architecture

Distribution

  • Type: npm-package
  • Binary: golembot (via bin: { golembot: './dist/cli.js' })
  • Version: 0.47.1
  • Language: TypeScript (compiled to dist/)

Install Methods

npm install -g golembot
# Then:
mkdir my-bot && cd my-bot
golembot onboard    # guided setup
# Or manually:
golembot init -e claude-code -n my-bot
golembot run        # REPL
golembot gateway    # IM + HTTP service + Dashboard

Required Runtime

  • Node.js >= 18
  • One or more coding agent CLIs installed: claude (Claude Code), agent (Cursor), opencode, or codex

Directory Structure (per assistant)

my-bot/
├── golem.yaml          # Engine, name, and infrastructure config (NOT skills)
├── skills/             # Markdown skills — single source of truth
│   ├── general/        # Built-in skill templates
│   ├── im-adapter/     # IM channel-specific skills
│   ├── escalation/     # Escalation handling
│   ├── kb-guide/       # Knowledge base guidance
│   ├── message-push/   # Push notification skills
│   ├── multi-bot/      # Multi-bot coordination
│   └── task-manager/   # Task management skills
└── AGENTS.md           # Auto-generated from workspace

golem.yaml Schema (minimal)

name: cloudsync-support-bot
engine: opencode     # one of: cursor | claude-code | opencode | codex

Provider Routing Config (in golem.yaml)

provider:
  type: openrouter    # or minimax, deepseek, siliconflow
  api_key: ${OPENROUTER_API_KEY}
  model: anthropic/claude-3.5-sonnet

Engine Implementations

All engines implement the AgentEngine interface with a single invoke(prompt, opts): AsyncIterable<StreamEvent> method:

  • ClaudeCodeEngine: ~/.local/bin/claude -p <prompt> --output-format stream-json --verbose --dangerously-skip-permissions
  • CursorEngine: ~/.local/bin/agent --output-format stream-json --stream-partial-output --force --trust --sandbox disabled
  • OpenCodeEngine: opencode run <prompt> --format json
  • CodexEngine: codex exec --json --full-auto --skip-git-repo-check [--model X] <prompt>
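
The shared contract can be illustrated with a minimal sketch. The StreamEvent variants below use only the fields this page mentions (a text event with content, a done event with costUsd); the InvokeOpts field names and the EchoEngine stand-in are hypothetical and are not taken from the GolemBot source.

```typescript
// Hypothetical event shape: only `text` + `content` and `done` + `costUsd`
// are mentioned in this document; the real union may have more variants.
type StreamEvent =
  | { type: 'text'; content: string }
  | { type: 'done'; costUsd?: number };

// Hypothetical options; these field names are illustrative.
interface InvokeOpts {
  sessionId?: string;
  imagePaths?: string[];
}

interface AgentEngine {
  invoke(prompt: string, opts: InvokeOpts): AsyncIterable<StreamEvent>;
}

// A stand-in engine that echoes the prompt instead of spawning a CLI.
class EchoEngine implements AgentEngine {
  async *invoke(prompt: string, _opts: InvokeOpts): AsyncIterable<StreamEvent> {
    for (const word of prompt.split(' ')) {
      yield { type: 'text', content: word + ' ' };
    }
    yield { type: 'done', costUsd: 0 };
  }
}

// Engine-agnostic consumer: it depends only on the AgentEngine interface.
async function collectText(engine: AgentEngine, prompt: string): Promise<string> {
  let out = '';
  for await (const event of engine.invoke(prompt, {})) {
    if (event.type === 'text') out += event.content;
  }
  return out.trim();
}
```

Because collectText depends only on the interface, swapping EchoEngine for another implementation would not require changes on the consumer side, which is the property the identical StreamEvent contract is meant to give.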

Channel Architecture

Slack / Telegram / Discord / Feishu / DingTalk / WeCom / WeChat / HTTP API
  Custom Adapters (email, GitHub Issues, ...)
          │
          ▼
  Gateway Service (Channel adapters + HTTP service)
          │
  createAssistant() → GolemBot runtime
          │
  ┌───────┬───────┬───────┐
  ▼       ▼       ▼       ▼
Cursor  Claude  OpenCode  Codex
        Code
  ↕ Provider Routing (OpenRouter, MiniMax, ...)

Target AI Tools

Cursor, Claude Code, OpenCode, Codex — any of the 4 supported engines.

03

Components

GolemBot — Components

CLI Commands (golembot binary)

| Command | Purpose |
|---|---|
| golembot init -e <engine> -n <name> | Initialize a new assistant directory with golem.yaml |
| golembot onboard | Guided interactive setup (recommended for new users) |
| golembot run | Start REPL conversation with the agent |
| golembot gateway | Start IM channels + HTTP service + Dashboard |
| golembot fleet ls | List all running bot instances |
| golembot fleet serve | Aggregate multiple bots into a Fleet Dashboard |
| golembot skill search <query> | Browse 13,000+ ClawHub skills |
| golembot skill install <skill> | Install a skill from ClawHub |

Engine Implementations (src/engines/)

| File | Engine | Notes |
|---|---|---|
| claude-code.ts | Claude Code | --verbose required for intermediate events; uses ANTHROPIC_API_KEY or OAuth |
| codex.ts | OpenAI Codex | Responses API required for custom provider routing |
| cursor.ts | Cursor | Uses CURSOR_API_KEY; segmentAccum dedup for partial output |
| opencode.ts | OpenCode | step_finish events accumulated; done on process close |
| provider-env.ts | Provider routing | Maps provider config to engine-specific env vars |

Channel Adapters (src/channels/)

| File | Channel |
|---|---|
| slack.ts | Slack workspace |
| telegram.ts / telegram-format.ts | Telegram bots |
| discord.ts | Discord servers |
| feishu.ts / feishu-format.ts | Feishu/Lark |
| dingtalk.ts | DingTalk |
| wecom.ts | WeCom (WeChat Work) |
| weixin.ts | WeChat (personal) |

Core Modules (src/)

| File | Purpose |
|---|---|
| index.ts | Public API, concurrency locks (KeyedMutex), orchestration |
| workspace.ts | golem.yaml read/write, skills scanning, AGENTS.md generation |
| session.ts | Session persistence indexed by sessionKey |
| server.ts | HTTP service, SSE, bearer auth |
| gateway.ts | HTTP API + IM channel orchestration |
| scheduler.ts | Built-in cron scheduler for recurring tasks |
| dashboard.ts | Web dashboard component |
| fleet.ts | Fleet Dashboard aggregation |
| registry.ts | ClawHub skill registry integration |

Skill Templates (templates/)

| Directory | Skill Type |
|---|---|
| code-reviewer/ | Code review assistant |
| customer-support/ | Customer support bot |
| data-analyst/ | Data analysis assistant |
| meeting-notes/ | Meeting notes processing |
| ops-assistant/ | Operations assistant |
| research/ | Research assistant |

Skills Directory (skills/)

Skills are plain Markdown files in skills/. The skills/ directory is the single source of truth — golem.yaml only configures engine and infrastructure:

| Directory | Purpose |
|---|---|
| escalation/ | Escalation handling flows |
| general/ | General-purpose skills |
| im-adapter/ | IM channel behavior customization |
| kb-guide/ | Knowledge base guidance skills |
| message-push/ | Push notification templates |
| multi-bot/ | Multi-bot coordination skills |
| task-manager/ | Task management flows |

Embedding API (5 lines)

import { createAssistant } from 'golembot';
const bot = createAssistant({ dir: './my-agent' });
for await (const event of bot.chat('Analyze last month sales data')) {
  if (event.type === 'text') process.stdout.write(event.content);
}

Session Management

Sessions are keyed by sessionKey. Same key → serialized (KeyedMutex). Different keys → parallel execution. Session IDs are passed to agent engines for --resume / --session / exec resume continuity.
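
One common way to get this behavior is to chain promises per key. The following is a minimal sketch under that assumption; the class body and the run method name are hypothetical and are not GolemBot's actual KeyedMutex implementation.

```typescript
// Hypothetical KeyedMutex: chains promises per key so same-key tasks run one
// at a time, while tasks under different keys proceed independently.
class KeyedMutex {
  private tails = new Map<string, Promise<void>>();

  run<T>(key: string, task: () => Promise<T>): Promise<T> {
    const prev = this.tails.get(key) ?? Promise.resolve();
    // Start the task only after the previous same-key task has settled.
    const result = prev.then(task);
    // The tail swallows errors so one failed task does not block the queue.
    const tail: Promise<void> = result.then(
      () => undefined,
      () => undefined,
    );
    this.tails.set(key, tail);
    // Drop the map entry once the queue for this key has drained.
    tail.then(() => {
      if (this.tails.get(key) === tail) this.tails.delete(key);
    });
    return result;
  }
}
```

With this shape, two chat() calls for the same conversation would be wrapped in run() with the same key and execute in arrival order, while calls for other conversations are not delayed.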

05

Prompts

GolemBot — Prompts

CLAUDE.md — Hard Architectural Constraints

## Architecture Hard Constraints

When modifying code under `src/`, the following constraints must not be violated.

### Things You Must Never Do

1. **Don't do what the Agent should do** — GolemBot does not manage context window, dispatch 
   tools, reason, or set session TTL. All "intelligent" behavior is delegated to the 
   underlying Coding Agent.
2. **Don't add new core concepts** — The framework has only two concepts: assistant directory 
   + Skill. Do not introduce Tool, Blueprint, Registry, Pipeline, or other abstractions.
3. **Don't put CLI logic in the core library** — `cli.ts` is a thin shell; it only parses 
   arguments and formats output. All business logic must live in `index.ts` / `workspace.ts` 
   / `engine.ts` / `session.ts` / `server.ts`.
4. **Don't declare Skills in config** — The `skills/` directory is the single source of 
   truth. `golem.yaml` only configures engine, name, and infrastructure settings.
5. **Process invocation is engine-owned** — All engines use `child_process.spawn`. Do not 
   assume invocation style outside the engine.

Prompting technique: Hard negation list (anti-patterns). This is an unusual pattern — the CLAUDE.md enforces architectural invariants by listing what must never be added. It's a "negative specification" — defining the framework by what it refuses to be.


Engine Notes — Claude Code

From CLAUDE.md (verbatim):

### Claude Code

- Binary: `~/.local/bin/claude`
- Flags: `-p <prompt> --output-format stream-json --verbose --dangerously-skip-permissions`
- `--verbose` is required for intermediate stream events
- Auth: `claude auth login` or `ANTHROPIC_API_KEY` env var

Prompting technique: Exact CLI invocation spec with rationale. Prevents agents from guessing flags.


Engine Notes — Codex

From CLAUDE.md (verbatim):

### Codex

- Binary: `codex` (npm: `@openai/codex`)
- Flags: `exec --json --full-auto --skip-git-repo-check [--model X] <prompt>`
- Resume: `exec resume --json --full-auto --skip-git-repo-check [--model X] <thread_id> <prompt>`

Prompting technique: Separate invocation pattern for new session vs resume — the agent learns both paths.


AGENTS.md (Auto-generated by workspace.ts)

GolemBot auto-generates an AGENTS.md file in the assistant directory by scanning the skills/ folder. This file lists all available skills and their descriptions, giving the underlying coding agent awareness of its skill inventory. The agent is expected to read this at session start.

Prompting technique: Auto-generated capability manifest — the agent learns what it can do from a file, not from hardcoded instructions. This keeps the skill inventory dynamic.
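
A manifest of this kind could be rendered roughly as follows. This is a sketch only: the SkillDoc shape, the rule for picking a description, and the output layout are all assumptions, since the page does not show the format workspace.ts actually emits.

```typescript
// Hypothetical input: one entry per skill, with its raw Markdown body.
interface SkillDoc {
  name: string;     // e.g. the directory name under skills/
  markdown: string; // raw Markdown content of the skill
}

// Use the first non-empty, non-heading line as a one-line description.
function describeSkill(markdown: string): string {
  for (const line of markdown.split('\n')) {
    const trimmed = line.trim();
    if (trimmed !== '' && !trimmed.startsWith('#')) return trimmed;
  }
  return '(no description)';
}

// Render a manifest listing every skill, sorted by name for stable output.
function renderAgentsMd(skills: SkillDoc[]): string {
  const lines = ['# Available Skills', ''];
  const sorted = [...skills].sort((a, b) => a.name.localeCompare(b.name));
  for (const skill of sorted) {
    lines.push(`- ${skill.name}: ${describeSkill(skill.markdown)}`);
  }
  return lines.join('\n') + '\n';
}
```

Keeping the renderer a pure function of the scanned skills means the manifest can be regenerated on every invocation without tracking what changed.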

09

Uniqueness

GolemBot — Uniqueness

differs_from_seeds

GolemBot has no close analog among the 11 seeds. The closest is openspec, in that both aim for cross-tool portability (multiple agent engines, multiple entry points), but openspec adds a spec management layer while GolemBot deliberately adds nothing beyond channel bridging. GolemBot is architecturally opposite to frameworks like superpowers or BMAD-METHOD: those inject behavior into the agent, while GolemBot just routes the agent's output to new destinations. The fleet/gateway pattern is unique in the corpus: none of the seeds address connecting coding agents to IM channels or providing multi-bot fleet monitoring. The provider-routing feature (same engine, different LLM provider, zero code changes) is also novel: taskmaster-ai has multi-model role assignment, but GolemBot's provider routing is about infrastructure routing rather than task-role mapping.

Distinctive Positioning

  1. Minimal abstraction philosophy: Two concepts only (directory + Skills) and an active commitment to not growing that surface. The CLAUDE.md explicitly bans adding Tool, Blueprint, Registry, or Pipeline abstractions.
  2. IM channel bridging: First-class multi-channel support (7 IM channels) for connecting coding agents to team communication. No other seed framework addresses this.
  3. Provider routing without framework changes: The provider config block decouples engine from provider — you can run Claude Code through OpenRouter without modifying any framework code.
  4. ClawHub ecosystem: Access to 13,000+ community skills, making GolemBot the entry point to the largest community skill library referenced in this batch.
  5. Embedding API: createAssistant() with 5 lines of code — the only framework in this batch designed for embedding agent capabilities into existing Node.js products.

Observable Failure Modes

  • Provider lock-in for Codex: Custom Codex providers must support the Responses API, which many providers don't — routing Codex through /chat/completions providers silently fails
  • No coordination between Fleet instances: Fleet is monitoring only — multiple GolemBot instances don't share session state or coordinate on tasks
  • Skills are static at invocation: Skills are read from disk at each chat() call; dynamic skill updates require process restart
  • Channel auth complexity: 7 different IM platforms each require separate bot setup and token management
  • No built-in memory: GolemBot is stateless beyond session ID mapping — agents lose context if the session ID mapping is lost
04

Workflow

GolemBot — Workflow

Setup Phases

| Phase | Command | Artifact |
|---|---|---|
| 1. Initialize | golembot init -e claude-code -n my-bot | golem.yaml, skills/ directory |
| 2. Configure channels | Edit golem.yaml (add Slack/Telegram/Discord tokens) | Channel config in golem.yaml |
| 3. Start gateway | golembot gateway | HTTP service + IM listeners + Dashboard on port (auto) |
| 4. Skills install | golembot skill install <name> | Skill Markdown files in skills/ |
| 5. Fleet monitoring | golembot fleet ls | Running instance registry |

Request Flow

  1. User sends message to IM channel (or HTTP endpoint)
  2. Channel adapter receives and normalizes to GolemBot message format
  3. Gateway routes to createAssistant() with message + sessionKey
  4. KeyedMutex serializes same-session calls
  5. workspace.ts reads golem.yaml, scans skills/ directory, generates AGENTS.md
  6. engine.ts selects engine implementation based on golem.yaml#engine
  7. Engine spawns agent CLI subprocess with prompt + skill paths + provider config
  8. StreamEvent responses streamed back through gateway to channel adapter
  9. Channel adapter formats and sends response back to user
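
Steps 2, 3, 8, and 9 of the flow above can be sketched as a small handler. Everything named here is an assumption made for illustration: the InboundMessage fields, the sessionKey format, and a chat() signature that accepts a sessionKey option are not confirmed by this page.

```typescript
// Hypothetical normalized message; field names are illustrative.
interface InboundMessage {
  channel: string; // e.g. 'slack', 'telegram'
  chatId: string;  // conversation identifier on that channel
  text: string;
}

type StreamEvent = { type: 'text'; content: string } | { type: 'done' };

// Minimal assistant surface assumed for this sketch.
interface Assistant {
  chat(text: string, opts: { sessionKey: string }): AsyncIterable<StreamEvent>;
}

// One session per channel + chat, so separate conversations can run in parallel.
function sessionKeyFor(msg: InboundMessage): string {
  return `${msg.channel}:${msg.chatId}`;
}

// Route the message to the assistant, gather streamed text, and hand the
// result to the channel adapter's reply function.
async function handleMessage(
  assistant: Assistant,
  msg: InboundMessage,
  reply: (text: string) => Promise<void>,
): Promise<void> {
  let buffer = '';
  const stream = assistant.chat(msg.text, { sessionKey: sessionKeyFor(msg) });
  for await (const event of stream) {
    if (event.type === 'text') buffer += event.content;
  }
  await reply(buffer);
}
```

This version buffers the whole answer before replying; an adapter for a channel that supports message editing could instead forward partial text as events arrive.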

REPL Mode

golembot run provides an interactive terminal conversation with the agent, identical to the SDK API but with terminal formatting.

Scheduled Tasks

golembot gateway includes a built-in cron scheduler. Tasks configured in golem.yaml run on schedule (e.g., "daily standups, dependency audits, test reports pushed to IM").

Approval Gates

None — GolemBot delegates all decision-making to the underlying coding agent. The /stop command (in REPL or IM) and POST /abort HTTP endpoint allow cancellation without clearing session history.

Multi-modal Support

Image messages from IM channels are saved to disk, and the agent then reads and analyzes them. All 7 channels support this. The saved imagePaths are passed to the engine via InvokeOpts for native multimodal handling.

06

Memory Context

GolemBot — Memory & Context

Session Persistence

Sessions are stored in-process keyed by sessionKey (a string identifier provided by the caller). The session.ts module handles:

  • Session ID mapping: sessionKey → agent-native session ID (Codex thread ID, Claude Code session ID, etc.)
  • Resume continuity: session IDs are passed to engine invocations for --resume / --session / exec resume

Memory Type

File-based — the assistant directory (golem.yaml, skills/) serves as the persistent memory of the assistant configuration. Session conversation history lives inside the underlying coding agent's native session storage (Claude Code's ~/.claude/projects/, Codex's thread store, etc.).

GolemBot itself does not maintain a memory store — it is stateless at the coordination layer. Memory is delegated to the underlying agent engine.

Cross-Session Handoff

Enabled via the agent engines' native resume mechanisms:

  • Claude Code: --resume <session-id>
  • Cursor: --resume
  • OpenCode: --session <session-id>
  • Codex: exec resume <thread_id> <prompt>

The session.ts module maps sessionKey → engine session ID and stores this mapping for resume purposes.
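
The mapping and the resume forms can be sketched together. The flags below are the ones quoted on this page; the argument ordering, the in-memory Map, and the function names are assumptions. Cursor is left out because the page does not show where its prompt or session ID are placed.

```typescript
type EngineName = 'claude-code' | 'opencode' | 'codex';

// sessionKey -> engine-native session ID (kept in memory for this sketch).
const sessionIds = new Map<string, string>();

// Build CLI arguments for one turn. The flags are those quoted in this
// document; their exact ordering in a real invocation is an assumption.
function buildArgs(engine: EngineName, prompt: string, sessionId?: string): string[] {
  if (engine === 'claude-code') {
    const base = ['-p', prompt, '--output-format', 'stream-json', '--verbose',
      '--dangerously-skip-permissions'];
    return sessionId ? [...base, '--resume', sessionId] : base;
  }
  if (engine === 'opencode') {
    const base = ['run', prompt, '--format', 'json'];
    return sessionId ? [...base, '--session', sessionId] : base;
  }
  // Codex uses a separate `exec resume` form that takes the thread ID.
  const flags = ['--json', '--full-auto', '--skip-git-repo-check'];
  return sessionId
    ? ['exec', 'resume', ...flags, sessionId, prompt]
    : ['exec', ...flags, prompt];
}

// Look up the stored session ID (if any) and build the matching arguments.
function argsForTurn(engine: EngineName, sessionKey: string, prompt: string): string[] {
  return buildArgs(engine, prompt, sessionIds.get(sessionKey));
}
```

A first turn for a given sessionKey produces the plain invocation; once the engine reports its session or thread ID and it is stored in the map, later turns use the resume form.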

Context Compaction

Not handled by GolemBot — delegated to the underlying coding agent's own compaction mechanisms.

State Files

my-bot/
├── golem.yaml           # Assistant configuration (persisted across restarts)
├── skills/              # Skill Markdown files (read at each invocation)
└── AGENTS.md            # Auto-generated skill inventory (regenerated from skills/)

No database, no additional state files. The philosophy is radical simplicity: GolemBot has two concepts (assistant directory + Skills) and no persistence beyond what the underlying agent natively provides.

07

Orchestration

GolemBot — Orchestration

Multi-Agent Pattern

GolemBot is primarily single-agent per assistant instance. Multiple concurrent instances (Fleet mode) run independent bots that don't coordinate — Fleet Dashboard is for monitoring, not coordination. Within a single instance, different sessionKeys run in parallel (different users, different conversations), but all route to the same engine configuration.

Multi-Model: Provider Routing

This is GolemBot's key multi-model contribution — not different models for different roles, but the same engine pointing at a different LLM provider:

| Engine | Default Provider | Can be re-routed to |
|---|---|---|
| Claude Code | Anthropic | OpenRouter (any model) |
| Codex | OpenAI | Custom provider (must support Responses API) |
| OpenCode | Provider-dependent | Any supported provider |
| Cursor | Cursor API | OpenRouter |

Configuration (in golem.yaml):

provider:
  type: openrouter
  api_key: ${OPENROUTER_API_KEY}
  model: anthropic/claude-3.5-sonnet

This enables "4 engines × any provider" — the provider-env.ts module maps the config to the right environment variables for each engine.

Note: Codex requires Responses API — providers supporting only /chat/completions will fail.
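
A rough sketch of what such a mapping step might involve is shown below. It covers only expansion of ${VAR} references and selection of a per-engine key variable. The variable names ANTHROPIC_API_KEY and CURSOR_API_KEY appear on this page, but using them as the routing target is an assumption, and base-URL handling is omitted because the page does not describe it.

```typescript
interface ProviderConfig {
  type: string;
  api_key: string;
  model: string;
}

type EngineName = 'claude-code' | 'cursor';

// Key variables named in this document; treating them as the routing target
// is an assumption of this sketch, not GolemBot's confirmed mapping.
const KEY_VAR: Record<EngineName, string> = {
  'claude-code': 'ANTHROPIC_API_KEY',
  cursor: 'CURSOR_API_KEY',
};

// Replace ${NAME} references with values from the given environment.
function expandEnvRefs(value: string, env: Record<string, string | undefined>): string {
  return value.replace(/\$\{(\w+)\}/g, (_match: string, name: string) => env[name] ?? '');
}

// Produce the extra environment for the engine subprocess plus the model name.
function providerEnv(
  engine: EngineName,
  provider: ProviderConfig,
  env: Record<string, string | undefined>,
): { env: Record<string, string>; model: string } {
  return {
    env: { [KEY_VAR[engine]]: expandEnvRefs(provider.api_key, env) },
    model: provider.model,
  };
}
```

Keeping the per-engine differences in one lookup table is what would let the same provider block serve several engines without touching engine code.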

Execution Mode

Interactive-loop for REPL mode; event-driven for gateway mode (responds to incoming IM messages).

Isolation

None — each sessionKey runs in the same Node.js process. Concurrency within a session is controlled by KeyedMutex (serializes calls with the same key; parallel calls with different keys).

Fleet Mode

golembot fleet serve aggregates multiple independent golembot gateway instances into a single Fleet Dashboard. This is a monitoring aggregation pattern, not true multi-agent coordination.

Scheduled Tasks

Built-in cron scheduler in scheduler.ts — runs recurring tasks (e.g., "daily standups") on schedule, pushing results to IM channels.

Crash Recovery

No explicit crash recovery — session IDs enable resume if the process restarts and the session ID is preserved. The task-store.ts module may provide additional task state persistence.

08

UI CLI Surface

GolemBot — UI & CLI Surface

CLI Binary: golembot

Full-featured CLI (bin: { golembot: './dist/cli.js' }):

| Command | Description |
|---|---|
| golembot init | Create assistant directory with golem.yaml |
| golembot onboard | Guided interactive setup |
| golembot run | Terminal REPL conversation |
| golembot gateway | Start IM channels + HTTP + Dashboard |
| golembot fleet ls | List running bots |
| golembot fleet serve | Fleet Dashboard server |
| golembot skill search | Search ClawHub skills |
| golembot skill install | Install a ClawHub skill |
| golembot doctor | Health check for dependencies |

Web Dashboard

Every golembot gateway instance includes a built-in web dashboard showing:

  • Real-time metrics
  • Channel status (Slack/Telegram/Discord/etc. connection status)
  • Quick-test console
  • Active session count

Fleet Dashboard (golembot fleet serve) aggregates all running bots into a single view.

IM Channels (7)

The gateway connects coding agents to:

  1. Slack
  2. Telegram
  3. Discord
  4. Feishu / Lark
  5. DingTalk
  6. WeCom (WeChat Work)
  7. WeChat (personal)

Channels with richer formatting have dedicated format modules (slack-format.ts, telegram-format.ts, feishu-format.ts) that handle platform-specific message rendering.

HTTP API

golembot gateway exposes an HTTP service with SSE endpoints for embedding into products:

  • POST /chat — send a message, stream response events
  • POST /abort — cancel current task
  • SSE endpoint for real-time streaming

Bearer authentication supported.
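
A client for these endpoints might look roughly like the sketch below. The request body shape ({ message }), the assumption that each SSE data line carries one JSON event, and both function names are hypothetical; only the POST /chat path and bearer authentication come from this page.

```typescript
interface ChatRequest {
  url: string;
  method: 'POST';
  headers: { 'Content-Type': string; Authorization: string };
  body: string;
}

// Build the request for POST /chat with a bearer token.
function buildChatRequest(baseUrl: string, token: string, message: string): ChatRequest {
  return {
    url: `${baseUrl}/chat`,
    method: 'POST',
    headers: { 'Content-Type': 'application/json', Authorization: `Bearer ${token}` },
    body: JSON.stringify({ message }), // hypothetical body shape
  };
}

// Parse a complete SSE payload into the JSON bodies of its `data:` lines.
// Events are separated by a blank line, as in the SSE wire format.
function parseSse(payload: string): unknown[] {
  const events: unknown[] = [];
  for (const block of payload.split(/\r?\n\r?\n/)) {
    const data = block
      .split(/\r?\n/)
      .filter((line) => line.startsWith('data:'))
      .map((line) => line.slice(5).trimStart())
      .join('\n');
    if (data !== '') events.push(JSON.parse(data));
  }
  return events;
}
```

On Node.js 18 or later, the request object can be passed to the built-in fetch and the response text fed to parseSse. A production client would parse the stream incrementally rather than buffering the whole response.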

IDE Integration

None — GolemBot is a channel bridge, not an IDE extension. It connects agents to external communication channels rather than extending the IDE experience.

Observability

  • Dashboard shows real-time metrics and channel status
  • debug-events.ts module for detailed stream event debugging
  • Cost tracking via costUsd field in StreamEvent.done events
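
As a small illustration of the cost-tracking point, a caller could total costUsd across a stream as sketched here. The event union is an assumption that includes only the fields mentioned on this page, and costUsd is treated as optional.

```typescript
type StreamEvent =
  | { type: 'text'; content: string }
  | { type: 'done'; costUsd?: number };

// Sum the reported cost of every `done` event in a stream.
async function totalCost(stream: AsyncIterable<StreamEvent>): Promise<number> {
  let total = 0;
  for await (const event of stream) {
    if (event.type === 'done') total += event.costUsd ?? 0;
  }
  return total;
}
```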
