Polka Codes

polka-codes · polka-codes/polka-codes · ★ 16 · last commit 2026-05-22

TypeScript CLI that provides composable AI coding subcommands with multi-provider routing, checkpoint/resume, and a TypeScript API for meta-workflow scripting.

Best when: the TypeScript scripting API (import { code, commit }) lets developers compose AI commands programmatically as meta-workflows.
vs seeds: spec-kit (Archetype 2: paired commands/skills) but as a standalone CLI binary rather than a Claude Code plugin. Polka Codes adds…
Primitive shape: 14 total (Commands 11, Subagents 2, MCP tools 1)
00

Summary

Polka Codes — Summary

Polka Codes is a TypeScript/Bun CLI tool and multi-agent framework, distributed as the @polka-codes/cli npm package, with dedicated subcommands for planning, coding, fixing, reviewing, committing, and creating PRs. It is backed by a multi-agent architecture (Architect + Coder), a SKILL.md skill system, checkpoint/resume capability, and support for 7 LLM providers with multi-model routing.

Problem it solves: Developers need a programmable, composable AI coding workflow that integrates with their existing git + GitHub toolchain, supports multiple LLM providers simultaneously, and allows defining custom TypeScript automation scripts alongside AI commands.

Distinctive trait: AGPL-licensed TypeScript CLI with 800+ tests, checkpoint/resume for long-running tasks, YAML-based workflow definitions for dynamic multi-step AI pipelines, and a skills system compatible with the Claude Code .claude/skills/ convention.

Target audience: TypeScript/Bun developers who want a polished CLI for AI-assisted development with GitHub PR integration, multi-provider support, and custom scripting via .polkacodes.yml.

Production-readiness: Active development (16 stars, last pushed May 2026, AGPL-3.0).

Differs from seeds: Closest to spec-kit (Archetype 2 — mirror commands) but as a standalone CLI rather than a Claude Code plugin. Polka Codes ships a full binary CLI (polka) rather than slash commands, supports genuine multi-model routing unlike spec-kit's single-tool design, and adds checkpoint/resume and dynamic YAML workflow definitions not present in any seed.

01

Overview

Polka Codes — Overview

Origin

polka-codes organization (independent), released as a TypeScript monorepo with Bun as the runtime. Published to npm as @polka-codes/cli. AGPL-3.0 licensed. Active development as of May 2026.

Philosophy

Polka Codes is built on a "programmable AI workflow" model — the CLI is designed to be composable, testable, and extensible. With 800+ tests using bun:test and snapshot testing, the framework treats AI workflow code with the same rigor as production software.

The .polkacodes.yml configuration allows defining custom scripts (TypeScript or shell) that compose polka commands programmatically, enabling automation pipelines beyond one-shot AI interactions.

Key Manifesto-style statements (from README)

"Multi-Agent System: Specialized AI agents (Architect, Coder, etc.) collaborate on complex tasks"

"Checkpoint & Resume: Save and restore work state with the checkpoints and continue commands"

"Multiple AI Providers: Anthropic Claude, Google Vertex, DeepSeek, OpenAI, OpenRouter, Ollama"

"MCP Support: Consume tools from external MCP servers and expose workflows via MCP server"

Design Choices

  • Bun runtime for performance (TypeScript-native)
  • YAML-based dynamic workflow definitions for reusable multi-step pipelines
  • Checkpoint system persists agent state for resume across sessions
  • Review loop with --loop flag: re-reviews its own changes until clean
  • GitHub Actions integration for PR/issue mention automation
  • Skills follow the .claude/skills/ convention and are also discovered in ~/.claude/skills/ and in npm plugin skill packages
02

Architecture

Polka Codes — Architecture

Distribution

npm package @polka-codes/cli. TypeScript monorepo with Bun runtime.

Install

npm install -g @polka-codes/cli
# or
bun add -g @polka-codes/cli
# or
npx @polka-codes/cli "your task description"

Monorepo Packages

packages/
├── core/           # AI services, workflow engine, agents, tools
│   └── src/
│       ├── agents/         # Agent definitions
│       ├── workflow/       # WorkflowFn, BaseWorkflowContext, DynamicWorkflow
│       ├── tools/          # File ops, search, memory, todo, skills
│       ├── skills/         # Skill discovery + loading
│       └── memory/         # readMemory, updateMemory, listMemoryTopics
├── cli/            # CLI interface (Commander.js)
│   └── bin/trellis.js
├── cli-shared/     # Shared utilities
├── github/         # GitHub integration (PR/issue via gh CLI)
└── runner/         # Agent runner service

Config Files

  • .polkacodes.yml — project configuration (scripts, commands, excludeFiles)
  • ci.polkacodes.yml — CI-specific configuration
  • .claude/skills/ — project skill definitions (SKILL.md format)
  • ~/.claude/skills/ — personal skills

Required Runtime

  • Node.js >= 18 or Bun >= 1.0
  • gh CLI (optional, for PR/review features)

Target AI Tools

Standalone CLI (not Claude Code plugin). Supports: Anthropic Claude, OpenAI, Google Vertex, DeepSeek, OpenRouter, Ollama.

Key Architectural Patterns

  • WorkflowFn<TInput, TOutput, TTools> — typed workflow function
  • step function — named execution unit with retry and caching
  • DynamicWorkflow — YAML-defined multi-step AI pipelines with sub-workflow calls
  • ToolRegistry — type-safe tool registry
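To make the "step function with retry and caching" pattern concrete, here is a minimal sketch. It is not the @polka-codes/core implementation: createStepRunner, StepOptions, and the retries option are hypothetical names chosen for illustration, and the sketch is synchronous for brevity where a real workflow step would most likely be asynchronous.

```typescript
// Hypothetical sketch: names and signatures are assumptions, not the real API.
type StepOptions = { retries?: number };

function createStepRunner() {
  // Results are cached by step name for the lifetime of this runner.
  const cache = new Map<string, unknown>();

  // Runs `fn` under `name`. A cached result short-circuits execution;
  // a throwing `fn` is retried up to `retries` additional times.
  function step<T>(name: string, fn: () => T, options: StepOptions = {}): T {
    if (cache.has(name)) {
      return cache.get(name) as T;
    }
    const maxAttempts = (options.retries ?? 0) + 1;
    let lastError: unknown;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        const result = fn();
        cache.set(name, result);
        return result;
      } catch (error) {
        lastError = error;
      }
    }
    throw lastError;
  }

  return step;
}

// Usage: a flaky step that fails twice before succeeding.
const step = createStepRunner();
let calls = 0;
const value = step(
  "flaky",
  () => {
    calls += 1;
    if (calls < 3) throw new Error("transient failure");
    return "ok";
  },
  { retries: 3 },
);

// The second call with the same name is served from the cache.
const cached = step("flaky", () => "never runs");
```

Keying the cache by step name is what would let a resumed run skip work that already completed, which is presumably why naming each step matters in this design.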
03

Components

Polka Codes — Components

CLI Commands (binary: polka)

| Command | Purpose |
| --- | --- |
| polka code | Plan + implement feature using Architect → Coder agent pipeline |
| polka plan | Create implementation plan, save to markdown file |
| polka fix | Auto-fix failing tests/commands iteratively |
| polka review | AI-powered code review of local changes or GitHub PR |
| polka commit | Generate AI commit message for staged changes |
| polka pr | Create GitHub PR with AI-generated title/description |
| polka workflow | Run custom YAML workflow file |
| polka checkpoints | List available work checkpoints for resume |
| polka continue | Resume work from a checkpoint |
| polka run | Execute custom scripts from .polkacodes.yml |
| polka init | Initialize project configuration |
| polka <task> | Default: AI determines best workflow for task description |

Agents

| Agent | Purpose |
| --- | --- |
| Architect | Creates implementation plans for polka code |
| Coder | Implements code from Architect's plan |

Tools (core/src/tools/)

| Tool | Purpose |
| --- | --- |
| readFile, writeToFile, replaceInFile, removeFile | File operations |
| search, searchFiles, listFiles | Search and discovery |
| executeCommand | Shell command execution |
| askFollowupQuestion | Human-in-loop clarification |
| fetchUrl | URL fetching |
| readMemory, updateMemory, listMemoryTopics | Per-topic memory store |
| getTodoItem, updateTodoItem, listTodoItems | Todo tracking |
| loadSkill, listSkills | Skill loading from SKILL.md files |

Skills System (SKILL.md format)

Skills are markdown files with YAML front matter:

---
name: react-component-generator
description: Generate React components
allowed-tools: [readFile, writeToFile]
---
# React Component Generator
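As an illustration of how a loader might read this format, the sketch below splits a SKILL.md string into front matter and body. It is a deliberately minimal parser, not Polka's loader: parseSkill and SkillMeta are hypothetical names, and only the flat key: value and [a, b] forms shown above are handled (a real loader would presumably use a full YAML parser).

```typescript
// Hypothetical sketch: handles only the flat front matter shown above.
type SkillMeta = { name: string; description: string; allowedTools: string[] };

function parseSkill(markdown: string): { meta: SkillMeta; body: string } {
  // Front matter is the block between the first two `---` lines.
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(markdown);
  if (!match) throw new Error("SKILL.md is missing YAML front matter");
  const [, header = "", body = ""] = match;

  // Collect `key: value` pairs, splitting on the first colon only.
  const fields = new Map<string, string>();
  for (const line of header.split(/\r?\n/)) {
    const colon = line.indexOf(":");
    if (colon === -1) continue;
    fields.set(line.slice(0, colon).trim(), line.slice(colon + 1).trim());
  }

  // Turn `[readFile, writeToFile]` into ["readFile", "writeToFile"].
  const allowedTools = (fields.get("allowed-tools") ?? "")
    .replace(/^\[|\]$/g, "")
    .split(",")
    .map((tool) => tool.trim())
    .filter((tool) => tool.length > 0);

  return {
    meta: {
      name: fields.get("name") ?? "",
      description: fields.get("description") ?? "",
      allowedTools,
    },
    body: body.trim(),
  };
}

// Usage with the example skill from this section.
const exampleSkill = [
  "---",
  "name: react-component-generator",
  "description: Generate React components",
  "allowed-tools: [readFile, writeToFile]",
  "---",
  "# React Component Generator",
].join("\n");
const skill = parseSkill(exampleSkill);
```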

Skill discovery priority:

  1. .claude/skills/ (project, git-tracked)
  2. ~/.claude/skills/ (personal)
  3. node_modules/@polka-codes/skill-*/ (npm plugin skills)
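The priority order can be read as a first-match lookup. The sketch below assumes that a skill found in a higher-priority location shadows a same-named skill further down the list, and that each skill lives at <root>/<name>/SKILL.md. Both are assumptions for illustration, as are the names resolveSkill and SkillSource and the example package skill-example.

```typescript
// Hypothetical sketch: shadowing behaviour and path layout are assumptions.
type SkillSource = { root: string; skills: string[] };

// Returns the path of the first source, in priority order, that provides `name`.
function resolveSkill(name: string, sources: SkillSource[]): string | undefined {
  for (const source of sources) {
    if (source.skills.includes(name)) {
      return `${source.root}${name}/SKILL.md`;
    }
  }
  return undefined;
}

// Sources listed in the documented priority order.
const skillSources: SkillSource[] = [
  { root: ".claude/skills/", skills: ["react-component-generator"] },
  { root: "~/.claude/skills/", skills: ["react-component-generator", "sql-helper"] },
  { root: "node_modules/@polka-codes/skill-example/", skills: ["changelog"] },
];

const projectHit = resolveSkill("react-component-generator", skillSources);
const personalHit = resolveSkill("sql-helper", skillSources);
const missing = resolveSkill("does-not-exist", skillSources);
```

Here the project copy of react-component-generator wins over the personal one, which matches the intent of keeping project skills git-tracked and authoritative.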

Dynamic Workflows (YAML)

YAML-based workflow definitions with:

  • AI-agent-executed steps
  • Sub-workflow calls via runWorkflow
  • State management across steps
  • Script references to TypeScript automation files
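The YAML schema itself is not reproduced in this profile, so the sketch below models the listed ideas directly in TypeScript: steps, shared state, and sub-workflow calls. Every type and field name here (Step, Workflow, run, the runWorkflow field) is a hypothetical stand-in, and the run callback substitutes for an AI-executed step.

```typescript
// Hypothetical sketch of the execution model, not Polka's YAML schema.
type State = Record<string, unknown>;
type Step =
  | { id: string; run: (state: State) => State } // stand-in for an AI-executed step
  | { id: string; runWorkflow: string }; // sub-workflow call by name
type Workflow = { steps: Step[] };

// Executes steps in order, threading state through each one and
// recursing into sub-workflows. No cycle detection is attempted.
function runWorkflow(
  name: string,
  workflows: Record<string, Workflow>,
  state: State = {},
  trace: string[] = [],
): { state: State; trace: string[] } {
  const workflow = workflows[name];
  if (!workflow) throw new Error(`Unknown workflow: ${name}`);
  let current = state;
  for (const step of workflow.steps) {
    trace.push(`${name}.${step.id}`);
    if ("runWorkflow" in step) {
      current = runWorkflow(step.runWorkflow, workflows, current, trace).state;
    } else {
      current = step.run(current);
    }
  }
  return { state: current, trace };
}

// Usage: "main" plans, then delegates verification to "checks".
const workflows: Record<string, Workflow> = {
  main: {
    steps: [
      { id: "plan", run: (s) => ({ ...s, plan: "add login form" }) },
      { id: "verify", runWorkflow: "checks" },
    ],
  },
  checks: {
    steps: [{ id: "test", run: (s) => ({ ...s, tested: true }) }],
  },
};
const workflowResult = runWorkflow("main", workflows);
```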

GitHub Integration

  • PR creation via gh CLI
  • PR review with --pr <number> flag
  • Issue mention automation via GitHub Actions
05

Prompts

Polka Codes — Prompts

Excerpt 1: AGENTS.md — SKILL.md Format (verbatim)

File: AGENTS.md

Technique: Convention specification. Defines the skill file format that both humans and AI tools (Claude Code, Polka) read.

SKILL.md Format:

```yaml
---
name: react-component-generator
description: Generate React components
allowed-tools: [readFile, writeToFile]
---

# React Component Generator
```

Excerpt 2: .polkacodes.yml — Script Definition Format (verbatim from README)

Technique: YAML-based workflow definition allowing TypeScript automation scripts to compose Polka CLI commands programmatically.

```yaml
scripts:
  # Simple shell command
  test: bun test

  # Command with description
  lint:
    command: bun run lint
    description: Run linter

  # TypeScript script
  deploy:
    script: .polka-scripts/deploy.ts
    description: Deploy to production
    timeout: 300000  # 5 minutes
    permissions:
      fs: write
      network: true
```

Excerpt 3: TypeScript Script Template (verbatim from README)

Technique: API composition. Polka exposes a TypeScript API so custom scripts can call code(), commit(), etc. programmatically, enabling meta-workflows.

```typescript
import { code, commit } from '@polka-codes/cli'

export async function main(args: string[]) {
  console.log('Running script: my-script')
  console.log('Arguments:', args)

  // Your automation here
  // await code({ task: 'Add feature', interactive: false })
  // await commit({ all: true, context: 'Feature complete' })

  console.log('Script completed successfully')
}

if (import.meta.main) {
  main(process.argv.slice(2))
}
```
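To show what a meta-workflow built on this template could look like, the sketch below loops over several tasks and commits after each one. So that it runs without the real package or an LLM, code and commit are injected through a hypothetical PolkaApi interface and replaced by recording stubs. The option shapes (task, interactive, all, context) are inferred from the commented-out calls in the template, and implementAll is an illustrative name.

```typescript
// Hypothetical option shapes, inferred from the template's commented calls.
type CodeOptions = { task: string; interactive: boolean };
type CommitOptions = { all: boolean; context: string };
type PolkaApi = {
  code: (options: CodeOptions) => Promise<void>;
  commit: (options: CommitOptions) => Promise<void>;
};

// Implements each task in turn and commits after every one,
// returning the number of tasks that completed.
async function implementAll(api: PolkaApi, tasks: string[]): Promise<number> {
  let done = 0;
  for (const task of tasks) {
    await api.code({ task, interactive: false });
    await api.commit({ all: true, context: task });
    done += 1;
  }
  return done;
}

// Recording stubs stand in for the functions exported by @polka-codes/cli.
const callLog: string[] = [];
const stubApi: PolkaApi = {
  code: async ({ task }) => {
    callLog.push(`code:${task}`);
  },
  commit: async ({ context }) => {
    callLog.push(`commit:${context}`);
  },
};

const finished = implementAll(stubApi, ["Add feature", "Add tests"]);
```

In a real script, the imported code and commit functions would be passed in place of stubApi. Injecting them also makes the orchestration logic testable in isolation.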
09

Uniqueness

Polka Codes — Uniqueness

Differs from Seeds

Closest to spec-kit (Archetype 2 — each command has a matching skill/workflow) but as a standalone CLI (polka binary) rather than a Claude Code plugin. Polka Codes' delta over spec-kit: genuine multi-provider LLM routing (7 providers vs spec-kit's single tool), checkpoint/resume for long-running work, YAML dynamic workflow definitions, MCP dual-mode (consuming and exposing), and TypeScript scripting API (import { code, commit } from '@polka-codes/cli'). Compared to taskmaster-ai, Polka generates plan files rather than a JSON task store, and adds review/fix/commit automation absent from taskmaster. The AGPL license is more restrictive than MIT/Apache used by most seeds.

Positioning

Polka Codes positions as a "powerful TypeScript-based AI coding assistant framework" — emphasizing composability and testing rigor (800+ tests) over convention-over-configuration simplicity.

Distinctive Opinion

The TypeScript scripting API is Polka's most opinionated feature: by letting users import code() and commit() as TypeScript functions, it enables meta-workflows where AI commands are composed programmatically — a capability no seed framework offers.

Observable Failure Modes

  1. AGPL license: May discourage enterprise adoption or commercial use without careful review
  2. Low adoption: 16 stars suggest limited community validation
  3. Multi-provider complexity: Supporting 7 LLM providers increases maintenance surface
  4. No autonomous hooks: Must be invoked explicitly from CLI; no agent-mode continuous operation
  5. Bun-first runtime: Bun is the primary runtime, which may add friction in Node.js-only environments even though Node.js >= 18 is listed as supported
04

Workflow

Polka Codes — Workflow

Standard Development Workflow

| Step | Command | Artifact |
| --- | --- | --- |
| 1. Plan | polka plan "feature description" | feature.plan.md |
| 2. Implement | polka code --file feature.plan.md | Modified source files |
| 3. Fix issues | polka fix "bun test" | Fixed source files |
| 4. Review | polka review | Review feedback (+ auto-applied fixes) |
| 5. Commit | polka commit -a | Git commit with AI message |
| 6. PR | polka pr | GitHub PR |

Typical Commands

polka plan --plan-file auth.plan.md "Implement JWT-based auth"
polka code --file auth.plan.md
polka fix "bun test"
polka review --loop 3   # re-review until clean
polka commit -a
polka pr

Agent Pipeline (for polka code)

User prompt → Architect (creates plan) → Coder (implements) → Fix workflow (auto-run)
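One way to picture this pipeline is as typed stages where each stage's output type is the next stage's input type. The sketch below is purely illustrative: Plan, Change, Stage, and pipe are hypothetical names, and the three stages are deterministic stubs standing in for LLM-backed agents.

```typescript
// Hypothetical sketch: stubbed stages, not Polka's agent implementations.
type Plan = { steps: string[] };
type Change = { file: string; summary: string };
type Stage<I, O> = (input: I) => O;

// Composes two stages; the compiler rejects mismatched hand-offs.
function pipe<A, B, C>(first: Stage<A, B>, second: Stage<B, C>): Stage<A, C> {
  return (input) => second(first(input));
}

// Architect: turns a prompt into a plan.
const architect: Stage<string, Plan> = (prompt) => ({
  steps: [`design: ${prompt}`, `implement: ${prompt}`],
});

// Coder: turns each plan step into a change.
const coder: Stage<Plan, Change[]> = (plan) =>
  plan.steps.map((step, index) => ({
    file: `src/change-${index + 1}.ts`,
    summary: step,
  }));

// Fix workflow: post-processes the changes (here it only marks them verified).
const fixer: Stage<Change[], Change[]> = (changes) =>
  changes.map((change) => ({ ...change, summary: `${change.summary} (verified)` }));

const polkaCode = pipe(pipe(architect, coder), fixer);
const pipelineChanges = polkaCode("Add JWT auth");
```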

Approval Gates

  • polka review output: human reviews AI feedback before applying
  • --yes flag: auto-apply review fixes without confirmation
  • polka code --preview: show diff before applying (preview gate)
  • Fix loop: bounded by --loop N or bail conditions
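The bounded loop behind --loop N can be sketched as "review, apply fixes, repeat until clean or until N iterations are used up". The function below is an assumption about that control flow, not Polka's code: reviewLoop, ReviewResult, and the callback signatures are hypothetical.

```typescript
// Hypothetical sketch of a bounded review/fix loop.
type ReviewResult = { issues: string[] };

// Reviews up to `maxIterations` times, applying fixes after each
// non-clean review. Stops early as soon as a review reports no issues.
function reviewLoop(
  review: () => ReviewResult,
  applyFixes: (issues: string[]) => void,
  maxIterations: number,
): { iterations: number; clean: boolean } {
  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    const { issues } = review();
    if (issues.length === 0) {
      return { iterations: iteration, clean: true };
    }
    applyFixes(issues);
  }
  // Budget exhausted: the last fixes were applied but not re-reviewed.
  return { iterations: maxIterations, clean: false };
}

// Usage: each pass fixes one of two outstanding issues.
let outstanding = ["unused import", "missing null check"];
const withBudget = reviewLoop(
  () => ({ issues: [...outstanding] }),
  () => {
    outstanding = outstanding.slice(1);
  },
  3,
);

// The same scenario with a smaller budget ends without a clean review.
let outstandingAgain = ["unused import", "missing null check"];
const tooSmall = reviewLoop(
  () => ({ issues: [...outstandingAgain] }),
  () => {
    outstandingAgain = outstandingAgain.slice(1);
  },
  2,
);
```

The explicit bound is what keeps a self-reviewing agent from looping indefinitely when each fix introduces a new finding.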

Checkpoint/Resume

polka checkpoints         # list saved checkpoints
polka continue            # resume from last checkpoint

Checkpoints save agent state including completed/failed/pending tasks.
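A checkpoint of this kind can be modelled as a serialisable record of task statuses. The sketch below is an assumed shape, not Polka's on-disk format: Checkpoint, saveCheckpoint, loadCheckpoint, and tasksToResume are hypothetical names, and retrying failed tasks on resume is an assumption.

```typescript
// Hypothetical checkpoint shape; the real on-disk format is not documented here.
type TaskStatus = "completed" | "failed" | "pending";
type Checkpoint = { id: string; tasks: Record<string, TaskStatus> };

// Serialises a checkpoint so it can be written to disk.
function saveCheckpoint(checkpoint: Checkpoint): string {
  return JSON.stringify(checkpoint);
}

// Restores a checkpoint from its serialised form (no validation in this sketch).
function loadCheckpoint(raw: string): Checkpoint {
  return JSON.parse(raw) as Checkpoint;
}

// On resume, skip completed tasks. Failed tasks are retried (an assumption).
function tasksToResume(checkpoint: Checkpoint): string[] {
  return Object.keys(checkpoint.tasks).filter(
    (name) => checkpoint.tasks[name] !== "completed",
  );
}

// Usage: round-trip a checkpoint and compute the remaining work.
const savedCheckpoint = saveCheckpoint({
  id: "example-checkpoint",
  tasks: {
    "add-model": "completed",
    "add-routes": "failed",
    "add-tests": "pending",
  },
});
const restored = loadCheckpoint(savedCheckpoint);
const remaining = tasksToResume(restored);
```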

Artifacts Per Phase

| Phase | Artifact |
| --- | --- |
| Planning | .plan.md file |
| Implementation | Modified source files |
| Fix | Source files (fixed) |
| Review | Review report (JSON via --json) |
| Commit | Git commit |
| PR | GitHub PR |

Custom Workflow Mode

YAML workflow files define multi-step pipelines that can be reused:

polka workflow -f my-workflow.workflow
06

Memory & Context

Polka Codes — Memory & Context

Memory Tools

Built-in memory tools in core/src/tools/memory/:

  • readMemory(topic) — retrieve stored memory for a topic
  • updateMemory(topic, content) — write/update memory for a topic
  • listMemoryTopics() — list all stored topics

Memory is per-topic and file-based, stored in the project directory.
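The three memory tools amount to a small key-value interface keyed by topic. The sketch below mirrors the tool names but is otherwise an assumption: createMemoryStore is a hypothetical factory, and a Map stands in for the files in the project directory that the real tools are described as using.

```typescript
// Hypothetical sketch: an in-memory Map stands in for file-based storage.
function createMemoryStore() {
  const topics = new Map<string, string>();
  return {
    // Returns the stored content for a topic, or undefined if none exists.
    readMemory: (topic: string): string | undefined => topics.get(topic),
    // Creates or overwrites the content stored under a topic.
    updateMemory: (topic: string, content: string): void => {
      topics.set(topic, content);
    },
    // Lists all known topics in alphabetical order.
    listMemoryTopics: (): string[] => Array.from(topics.keys()).sort(),
  };
}

// Usage: record two topics, overwrite one, then read them back.
const memory = createMemoryStore();
memory.updateMemory("testing", "Run bun test before committing");
memory.updateMemory("architecture", "Monorepo with core, cli, github, runner");
memory.updateMemory("testing", "Run bun test and update snapshots");

const topicList = memory.listMemoryTopics();
const testingNote = memory.readMemory("testing");
const unknownNote = memory.readMemory("deployment");
```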

Todo Tools

  • getTodoItem, updateTodoItem, listTodoItems — task tracking within a session

Checkpoint System

Polka's most distinctive memory feature: polka checkpoints lists saved work states including completed, failed, and pending tasks. polka continue resumes from the last checkpoint. This allows long-running multi-file implementations to be paused and resumed.

Cross-Session Handoff

Yes — via checkpoint files. Session state (tasks + progress) is persisted to disk.

Context Compaction

Unknown — no explicit compaction mechanism documented for the main agent loop.

Memory Persistence

Project-scoped (file-based in project directory).

Skills as Memory

The skills system (SKILL.md files) provides a form of reusable behavioral memory — skills encode best practices that can be loaded on-demand rather than repeated in every prompt.

07

Orchestration

Polka Codes — Orchestration

Multi-Agent

Yes. polka code uses an Architect → Coder pipeline. Dynamic workflows can dispatch sub-workflows.

Orchestration Pattern

Sequential (Architect → Coder). Dynamic workflows can create task-decomposition-tree patterns via runWorkflow sub-calls.

Execution Mode

One-shot (each CLI command is a discrete invocation). Bounded iteration is available via polka fix --loop N and polka review --loop N.

Isolation Mechanism

None (edits in-place). Git branching expected but not automated by the framework.

Multi-Model

Yes. Supports 7 providers: Anthropic Claude, OpenAI, Google Vertex, DeepSeek, OpenRouter, Ollama, local models. Provider selection via .polkacodes.yml or environment variables.

Model Role Mapping

Not explicitly role-differentiated per agent in published documentation; all agents use the configured primary model. Multi-model is primarily provider diversity, not per-role routing.

Supports BYOK

Yes — all providers via API keys.

MCP Support

Yes — both consuming MCP servers (external tools) and exposing workflows as an MCP server.

Consensus Mechanism

None.

Prompt Chaining

Yes. polka plan output (.plan.md) is consumed by polka code --file. One stage's output is the next stage's explicit input.

Crash Recovery

Yes — checkpoint system. polka checkpoints + polka continue.

GitHub Actions Integration

Polka can be triggered by GitHub Actions on PR/issue mentions, enabling event-driven workflow execution in CI.

08

UI / CLI Surface

Polka Codes — UI / CLI Surface

CLI Binary

Yes. Binary name: polka (also npx @polka-codes/cli).

Subcommands: code, plan, fix, review, commit, pr, workflow, checkpoints, continue, run, init + default (bare task description).

Thin wrapper: No. Polka has its own runtime with full agent orchestration.

Local UI

None (no web dashboard or TUI).

IDE Integration

None dedicated. Works from any terminal.

MCP Server Exposure

Yes — Polka can expose its workflows as an MCP server, allowing other tools (Claude Code, etc.) to consume Polka workflows as MCP tools.

JSON Output Mode

polka review --json produces machine-readable review output for programmatic use in CI pipelines.

GitHub Integration

  • polka pr creates GitHub PRs via gh CLI
  • polka review --pr 123 reviews GitHub PRs
  • GitHub Actions integration for automation

Observability

  • Checkpoint list shows completed/failed/pending tasks
  • --verbose flags on various commands
  • JSON output mode for CI integration

Test Suite

800+ tests using bun:test with snapshot testing; the framework's own code is well covered by tests.
