Polka Codes

polka-codes · polka-codes/polka-codes · ★ 16 · last commit 2026-05-22

TypeScript CLI that provides composable AI coding subcommands with multi-provider routing, checkpoint/resume, and a TypeScript API for meta-workflow scripting.

Best when: TypeScript scripting API (import { code, commit }) lets developers compose AI commands programmatically as meta-workflows.

vs seeds

spec-kit (Archetype 2, paired commands/skills) but as a standalone CLI binary rather than a Claude Code plugin. Polka Codes adds…

Primitive shape 14 total

Commands 11 Subagents 2 MCP tools 1

Summary

Polka Codes is a TypeScript/Bun CLI tool and multi-agent framework distributed as the @polka-codes/cli npm package. It provides dedicated subcommands for planning, coding, fixing, reviewing, committing, and creating PRs, backed by a multi-agent architecture (Architect + Coder), a SKILL.md skill system, checkpoint/resume capability, and support for 7 LLM providers with multi-model routing.

Problem it solves: Developers need a programmable, composable AI coding workflow that integrates with their existing git + GitHub toolchain, supports multiple LLM providers simultaneously, and allows defining custom TypeScript automation scripts alongside AI commands.

Distinctive trait: AGPL-licensed TypeScript CLI with 800+ tests, checkpoint/resume for long-running tasks, YAML-based workflow definitions for dynamic multi-step AI pipelines, and a skills system compatible with the Claude Code .claude/skills/ convention.

Target audience: TypeScript/Bun developers who want a polished CLI for AI-assisted development with GitHub PR integration, multi-provider support, and custom scripting via .polkacodes.yml.

Production-readiness: Active development (16 stars, last pushed May 2026, AGPL-3.0).

Differs from seeds: Closest to spec-kit (Archetype 2 — mirror commands) but as a standalone CLI rather than a Claude Code plugin. Polka Codes ships a full binary CLI (polka) rather than slash commands, supports genuine multi-model routing unlike spec-kit's single-tool design, and adds checkpoint/resume and dynamic YAML workflow definitions not present in any seed.

Overview

Origin

polka-codes organization (independent), released as a TypeScript monorepo with Bun as the runtime. Published to npm as @polka-codes/cli. AGPL-3.0 licensed. Active development as of May 2026.

Philosophy

Polka Codes is built on a "programmable AI workflow" model — the CLI is designed to be composable, testable, and extensible. With 800+ tests using bun:test and snapshot testing, the framework treats AI workflow code with the same rigor as production software.

The .polkacodes.yml configuration allows defining custom scripts (TypeScript or shell) that compose polka commands programmatically, enabling automation pipelines beyond one-shot AI interactions.

Key Manifesto-style statements (from README)

"Multi-Agent System: Specialized AI agents (Architect, Coder, etc.) collaborate on complex tasks"

"Checkpoint & Resume: Save and restore work state with the checkpoints and continue commands"

"Multiple AI Providers: Anthropic Claude, Google Vertex, DeepSeek, OpenAI, OpenRouter, Ollama"

"MCP Support: Consume tools from external MCP servers and expose workflows via MCP server"

Design Choices

Bun runtime for performance (TypeScript-native)
YAML-based dynamic workflow definitions for reusable multi-step pipelines
Checkpoint system persists agent state for resume across sessions
Review loop with --loop flag: re-reviews its own changes until clean
GitHub Actions integration for PR/issue mention automation
Skills follow the .claude/skills/ convention and are also loaded from ~/.claude/skills/ and npm plugin packages

Architecture

Distribution

npm package @polka-codes/cli. TypeScript monorepo with Bun runtime.

Install

npm install -g @polka-codes/cli
# or
bun add -g @polka-codes/cli
# or
npx @polka-codes/cli "your task description"

Monorepo Packages

packages/
├── core/           # AI services, workflow engine, agents, tools
│   └── src/
│       ├── agents/         # Agent definitions
│       ├── workflow/       # WorkflowFn, BaseWorkflowContext, DynamicWorkflow
│       ├── tools/          # File ops, search, memory, todo, skills
│       ├── skills/         # Skill discovery + loading
│       └── memory/         # readMemory, updateMemory, listMemoryTopics
├── cli/            # CLI interface (Commander.js)
│   └── bin/trellis.js
├── cli-shared/     # Shared utilities
├── github/         # GitHub integration (PR/issue via gh CLI)
└── runner/         # Agent runner service

Config Files

.polkacodes.yml — project configuration (scripts, commands, excludeFiles)
ci.polkacodes.yml — CI-specific configuration
.claude/skills/ — project skill definitions (SKILL.md format)
~/.claude/skills/ — personal skills

Required Runtime

Node.js >= 18 or Bun >= 1.0
gh CLI (optional, for PR/review features)

Target AI Tools

Standalone CLI (not Claude Code plugin). Supports: Anthropic Claude, OpenAI, Google Vertex, DeepSeek, OpenRouter, Ollama.

Key Architectural Patterns

WorkflowFn<TInput, TOutput, TTools> — typed workflow function
step function — named execution unit with retry and caching
DynamicWorkflow — YAML-defined multi-step AI pipelines with sub-workflow calls
ToolRegistry — type-safe tool registry
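
The first two patterns can be sketched as follows. This is a minimal illustration that assumes a step is a named async unit whose result is cached by name and retried on failure; the `WorkflowContext`, `StepFn`, and `createStep` names, and the exact shape of `WorkflowFn`, are hypothetical and are not taken from the @polka-codes/core source.

```typescript
// Hypothetical shapes for illustration; not the actual @polka-codes/core types.
type StepFn = <T>(name: string, run: () => Promise<T>) => Promise<T>

interface WorkflowContext<TTools> {
  tools: TTools
  step: StepFn
}

type WorkflowFn<TInput, TOutput, TTools> = (
  input: TInput,
  ctx: WorkflowContext<TTools>,
) => Promise<TOutput>

// Builds a step function that caches results by step name and retries failures.
function createStep(maxAttempts = 3): StepFn {
  const cache = new Map<string, unknown>()
  return async <T>(name: string, run: () => Promise<T>): Promise<T> => {
    if (cache.has(name)) return cache.get(name) as T
    let lastError: unknown
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        const result = await run()
        cache.set(name, result)
        return result
      } catch (error) {
        lastError = error // remember the failure and try again
      }
    }
    throw lastError
  }
}

// Example workflow: the tool set is a plain object with one typed function.
type Tools = { readFile: (path: string) => Promise<string> }

const countLines: WorkflowFn<{ path: string }, number, Tools> = async (input, ctx) => {
  const text = await ctx.step('read', () => ctx.tools.readFile(input.path))
  return ctx.step('count', async () => text.split('\n').length)
}
```

Typing the tool set as a generic parameter means a workflow that calls a tool it was not given fails at compile time rather than at run time, which is presumably the motivation for the `TTools` parameter.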

Components

CLI Commands (binary: `polka`)

Command	Purpose
`polka code`	Plan + implement feature using Architect → Coder agent pipeline
`polka plan`	Create implementation plan, save to markdown file
`polka fix`	Auto-fix failing tests/commands iteratively
`polka review`	AI-powered code review of local changes or GitHub PR
`polka commit`	Generate AI commit message for staged changes
`polka pr`	Create GitHub PR with AI-generated title/description
`polka workflow`	Run custom YAML workflow file
`polka checkpoints`	List available work checkpoints for resume
`polka continue`	Resume work from a checkpoint
`polka run`	Execute custom scripts from `.polkacodes.yml`
`polka init`	Initialize project configuration
`polka <task>`	Default: AI determines best workflow for task description

Agents

Agent	Purpose
Architect	Creates implementation plans for `polka code`
Coder	Implements code from Architect's plan

Tools (core/src/tools/)

Tool	Purpose
`readFile`, `writeToFile`, `replaceInFile`, `removeFile`	File operations
`search`, `searchFiles`, `listFiles`	Search and discovery
`executeCommand`	Shell command execution
`askFollowupQuestion`	Human-in-loop clarification
`fetchUrl`	URL fetching
`readMemory`, `updateMemory`, `listMemoryTopics`	Per-topic memory store
`getTodoItem`, `updateTodoItem`, `listTodoItems`	Todo tracking
`loadSkill`, `listSkills`	Skill loading from SKILL.md files

Skills System (SKILL.md format)

Skills are markdown files with YAML front matter:

---
name: react-component-generator
description: Generate React components
allowed-tools: [readFile, writeToFile]
---
# React Component Generator

Skill discovery priority:

.claude/skills/ (project, git-tracked)
~/.claude/skills/ (personal)
node_modules/@polka-codes/skill-*/ (npm plugin skills)
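
A rough sketch of how a loader might read the front matter and apply the priority order above. It is illustrative only: the parser handles just the `key: value` and inline-list forms shown in the example (a real loader would presumably use a YAML library), the `SkillMeta`, `parseSkill`, and `resolveSkills` names are hypothetical, and "the higher-priority source wins on a name clash" is an assumption about what the priority order means.

```typescript
// Illustrative only: a tiny front-matter reader, not Polka's actual loader.
interface SkillMeta {
  name: string
  description: string
  allowedTools: string[]
}

function parseSkill(markdown: string): SkillMeta {
  const match = /^---\n([\s\S]*?)\n---/.exec(markdown)
  if (!match) throw new Error('SKILL.md is missing YAML front matter')
  const body = match[1] ?? ''
  const fields = new Map<string, string>()
  for (const line of body.split('\n')) {
    const idx = line.indexOf(':')
    if (idx > 0) fields.set(line.slice(0, idx).trim(), line.slice(idx + 1).trim())
  }
  // Handles only the inline list form: [a, b]
  const allowedTools = (fields.get('allowed-tools') ?? '')
    .replace(/^\[|\]$/g, '')
    .split(',')
    .map((tool) => tool.trim())
    .filter((tool) => tool.length > 0)
  return {
    name: fields.get('name') ?? '',
    description: fields.get('description') ?? '',
    allowedTools,
  }
}

// Sources in the priority order listed above; earlier entries win on name clashes.
const SOURCE_PRIORITY = ['project', 'personal', 'plugin'] as const
type Source = (typeof SOURCE_PRIORITY)[number]

function resolveSkills(found: { source: Source; meta: SkillMeta }[]): Map<string, Source> {
  const resolved = new Map<string, Source>()
  for (const source of SOURCE_PRIORITY) {
    for (const entry of found) {
      if (entry.source === source && !resolved.has(entry.meta.name)) {
        resolved.set(entry.meta.name, source)
      }
    }
  }
  return resolved
}
```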

Dynamic Workflows (YAML)

YAML-based workflow definitions with:

AI-agent-executed steps
Sub-workflow calls via runWorkflow
State management across steps
Script references to TypeScript automation files
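
The interaction between agent steps, sub-workflow calls, and shared state can be pictured with the sketch below. The step shapes are hypothetical (the real YAML schema is not shown in this profile), the agent is passed in as a plain function, and the depth limit is an added safeguard rather than a documented behavior; only the `runWorkflow` name comes from the list above.

```typescript
// Hypothetical step shapes; the real YAML schema may differ.
type Step =
  | { kind: 'agent'; id: string; prompt: string }
  | { kind: 'runWorkflow'; workflow: string }

type WorkflowDefs = Record<string, Step[]>
type State = Record<string, string>
type Agent = (prompt: string, state: State) => Promise<string>

// Executes steps in order; a runWorkflow step recurses into the named sub-workflow.
// All steps read and write the same state object.
async function runWorkflow(
  defs: WorkflowDefs,
  name: string,
  state: State,
  agent: Agent,
  depth = 0,
): Promise<State> {
  if (depth > 10) throw new Error('sub-workflow nesting too deep')
  const steps = defs[name]
  if (!steps) throw new Error(`unknown workflow: ${name}`)
  for (const step of steps) {
    if (step.kind === 'agent') {
      state[step.id] = await agent(step.prompt, state)
    } else {
      await runWorkflow(defs, step.workflow, state, agent, depth + 1)
    }
  }
  return state
}
```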

GitHub Integration

PR creation via gh CLI
PR review with --pr <number> flag
Issue mention automation via GitHub Actions

Prompts

Excerpt 1: AGENTS.md — SKILL.md Format (verbatim)

File: AGENTS.md

Technique: Convention specification. Defines the skill file format that both humans and AI tools (Claude Code, Polka) read.

**SKILL.md Format**:
```yaml
---
name: react-component-generator
description: Generate React components
allowed-tools: [readFile, writeToFile]
---

# React Component Generator
```

Excerpt 2: .polkacodes.yml — Script Definition Format (verbatim from README)

Technique: YAML-based workflow definition allowing TypeScript automation scripts to compose Polka CLI commands programmatically.

```yaml
scripts:
  # Simple shell command
  test: bun test

  # Command with description
  lint:
    command: bun run lint
    description: Run linter

  # TypeScript script
  deploy:
    script: .polka-scripts/deploy.ts
    description: Deploy to production
    timeout: 300000  # 5 minutes
    permissions:
      fs: write
      network: true
```

Excerpt 3: TypeScript Script Template (verbatim from README)

Technique: API composition — Polka exposes a TypeScript API so custom scripts can call code(), commit(), etc. programmatically, enabling meta-workflows.

import { code, commit } from '@polka-codes/cli'

export async function main(args: string[]) {
  console.log('Running script: my-script')
  console.log('Arguments:', args)

  // Your automation here
  // await code({ task: 'Add feature', interactive: false })
  // await commit({ all: true, context: 'Feature complete' })

  console.log('Script completed successfully')
}

if (import.meta.main) {
  main(process.argv.slice(2))
}

Uniqueness

Differs from Seeds

Closest to spec-kit (Archetype 2 — each command has a matching skill/workflow) but as a standalone CLI (polka binary) rather than a Claude Code plugin. Polka Codes' delta over spec-kit: genuine multi-provider LLM routing (7 providers vs spec-kit's single tool), checkpoint/resume for long-running work, YAML dynamic workflow definitions, MCP dual-mode (consuming and exposing), and TypeScript scripting API (import { code, commit } from '@polka-codes/cli'). Compared to taskmaster-ai, Polka generates plan files rather than a JSON task store, and adds review/fix/commit automation absent from taskmaster. The AGPL license is more restrictive than MIT/Apache used by most seeds.

Positioning

Polka Codes positions itself as a "powerful TypeScript-based AI coding assistant framework", emphasizing composability and testing rigor (800+ tests) over convention-over-configuration simplicity.

Distinctive Opinion

The TypeScript scripting API is Polka's most opinionated feature: by letting users import code() and commit() as TypeScript functions, it enables meta-workflows where AI commands are composed programmatically — a capability no seed framework offers.

Observable Failure Modes

AGPL license: May discourage enterprise adoption or commercial use without careful review
Low adoption: 16 stars suggest limited community validation
Multi-provider complexity: Supporting 7 LLM providers increases maintenance surface
No autonomous hooks: Must be invoked explicitly from CLI; no agent-mode continuous operation
Bun requirement: Bun as primary runtime limits adoption in Node.js-only environments

Workflow

Standard Development Workflow

Step	Command	Artifact
1. Plan	`polka plan "feature description"`	`feature.plan.md`
2. Implement	`polka code --file feature.plan.md`	Modified source files
3. Fix issues	`polka fix "bun test"`	Fixed source files
4. Review	`polka review`	Review feedback (+ auto-applied fixes)
5. Commit	`polka commit -a`	Git commit with AI message
6. PR	`polka pr`	GitHub PR

Typical Commands

polka plan --plan-file auth.plan.md "Implement JWT-based auth"
polka code --file auth.plan.md
polka fix "bun test"
polka review --loop 3   # re-review until clean
polka commit -a
polka pr

Agent Pipeline (for `polka code`)

User prompt → Architect (creates plan) → Coder (implements) → Fix workflow (auto-run)

Approval Gates

polka review output: human reviews AI feedback before applying
--yes flag: auto-apply review fixes without confirmation
polka code --preview: show diff before applying (preview gate)
Fix loop: bounded by --loop N or bail conditions
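
The bounded loop behind `--loop N` can be pictured roughly as below. This is a sketch under the assumption that each iteration reports whether the result is clean and that the loop stops at the first clean result or after N attempts; `runBounded` and `LoopResult` are hypothetical names, not part of the Polka API.

```typescript
// Hypothetical helper: runs `attempt` until it reports clean or the bound is hit.
interface LoopResult {
  clean: boolean
  iterations: number
}

async function runBounded(
  maxLoops: number,
  attempt: (iteration: number) => Promise<boolean>,
): Promise<LoopResult> {
  for (let i = 1; i <= maxLoops; i++) {
    // Stop early as soon as an iteration reports a clean result.
    if (await attempt(i)) return { clean: true, iterations: i }
  }
  // The bound was reached without a clean result.
  return { clean: false, iterations: maxLoops }
}
```

The hard bound is what keeps an automated review or fix cycle from running indefinitely when the model cannot converge on a clean result.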

Checkpoint/Resume

polka checkpoints         # list saved checkpoints
polka continue            # resume from last checkpoint

Checkpoints save agent state including completed/failed/pending tasks.
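
A minimal sketch of what such a checkpoint could look like and how a resume might pick up remaining work. The `Checkpoint` shape, the JSON serialization, and the choice to retry failed tasks on resume are all assumptions for illustration; the on-disk format Polka actually uses is not described in this profile.

```typescript
// Hypothetical checkpoint shape; the real on-disk format is not documented here.
type TaskStatus = 'completed' | 'failed' | 'pending'

interface Checkpoint {
  id: string
  savedAt: string // ISO 8601 timestamp
  tasks: { name: string; status: TaskStatus }[]
}

function saveCheckpoint(checkpoint: Checkpoint): string {
  return JSON.stringify(checkpoint)
}

function loadCheckpoint(raw: string): Checkpoint {
  return JSON.parse(raw) as Checkpoint
}

// Picks the most recent checkpoint; ISO timestamps sort lexicographically.
function latestCheckpoint(checkpoints: Checkpoint[]): Checkpoint | undefined {
  return [...checkpoints].sort((a, b) => b.savedAt.localeCompare(a.savedAt))[0]
}

// Assumption: on resume, failed tasks are retried along with pending ones.
function tasksToResume(checkpoint: Checkpoint): string[] {
  return checkpoint.tasks
    .filter((task) => task.status !== 'completed')
    .map((task) => task.name)
}
```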

Artifacts Per Phase

Phase	Artifact
Planning	`.plan.md` file
Implementation	Modified source files
Fix	Source files (fixed)
Review	Review report (JSON via `--json`)
Commit	Git commit
PR	GitHub PR

Custom Workflow Mode

YAML workflow files define multi-step pipelines that can be reused:

polka workflow -f my-workflow.workflow

Memory & Context

Memory Tools

Built-in memory tools in core/src/tools/memory/:

readMemory(topic) — retrieve stored memory for a topic
updateMemory(topic, content) — write/update memory for a topic
listMemoryTopics() — list all stored topics

Memory is per-topic, file-based, stored in project directory.

Todo Tools

getTodoItem, updateTodoItem, listTodoItems — task tracking within a session

Checkpoint System

Polka's most distinctive memory feature: polka checkpoints lists saved work states including completed, failed, and pending tasks. polka continue resumes from the last checkpoint. This allows long-running multi-file implementations to be paused and resumed.

Cross-Session Handoff

Yes — via checkpoint files. Session state (tasks + progress) is persisted to disk.

Context Compaction

Unknown — no explicit compaction mechanism documented for the main agent loop.

Memory Persistence

Project-scoped (file-based in project directory).

Skills as Memory

The skills system (SKILL.md files) provides a form of reusable behavioral memory — skills encode best practices that can be loaded on-demand rather than repeated in every prompt.

Orchestration

Multi-Agent

Yes. polka code uses an Architect → Coder pipeline. Dynamic workflows can dispatch sub-workflows.

Orchestration Pattern

Sequential (Architect → Coder). Dynamic workflows can create task-decomposition-tree patterns via runWorkflow sub-calls.

Execution Mode

One-shot (each CLI command is a discrete invocation). Continuous loop available via polka fix --loop N and polka review --loop N.

Isolation Mechanism

None (edits in-place). Git branching expected but not automated by the framework.

Multi-Model

Yes. Supports 7 providers: Anthropic Claude, OpenAI, Google Vertex, DeepSeek, OpenRouter, Ollama, local models. Provider selection via .polkacodes.yml or environment variables.

Model Role Mapping

Not explicitly role-differentiated per agent in published documentation; all agents use the configured primary model. Multi-model is primarily provider diversity, not per-role routing.

Supports BYOK

Yes — all providers via API keys.

MCP Support

Yes — both consuming MCP servers (external tools) and exposing workflows as an MCP server.

Consensus Mechanism

None.

Prompt Chaining

Yes. polka plan output (.plan.md) is consumed by polka code --file. One stage's output is the next stage's explicit input.

Crash Recovery

Yes — checkpoint system. polka checkpoints + polka continue.

GitHub Actions Integration

Polka can be triggered by GitHub Actions on PR/issue mentions, enabling event-driven workflow execution in CI.

UI / CLI Surface

CLI Binary

Yes. Binary name: polka (also npx @polka-codes/cli).

Subcommands: code, plan, fix, review, commit, pr, workflow, checkpoints, continue, run, init + default (bare task description).

Is thin wrapper: No — own runtime with full agent orchestration.

Local UI

None (no web dashboard or TUI).

IDE Integration

None dedicated. Works from any terminal.

MCP Server Exposure

Yes — Polka can expose its workflows as an MCP server, allowing other tools (Claude Code, etc.) to consume Polka workflows as MCP tools.

JSON Output Mode

polka review --json produces machine-readable review output for programmatic use in CI pipelines.

GitHub Integration

polka pr creates GitHub PRs via gh CLI
polka review --pr 123 reviews GitHub PRs
GitHub Actions integration for automation

Observability

Checkpoint list shows completed/failed/pending tasks
--verbose flags on various commands
JSON output mode for CI integration

Test Suite

800+ tests using bun:test with snapshot testing; the framework's own code is well covered by tests.

Related frameworks

same archetype · same primary tool · same memory type

Claude-Flow / Ruflo ★ 55k

A6 Multi-agent orchestrator

Eliminates single-agent context limits and sequential bottlenecks by orchestrating fault-tolerant swarms of specialized AI agents…

Hermes Agent (NousResearch) ★ 168k

A6 Multi-agent orchestrator

Self-improving personal AI agent with closed learning loop, 7 terminal backends, and messaging gateway — not tied to any AI…

OpenCode ★ 165k

A6 Multi-agent orchestrator

Terminal-first AI coding agent with multi-model routing, native desktop app, and a typed .opencode/ configuration system for…

OpenHands ★ 75k

A6 Multi-agent orchestrator

Open-source AI software development platform (open-source Devin alternative) with Docker sandbox isolation, 77.6% SWE-bench…

DeerFlow ★ 70k

A6 Multi-agent orchestrator

Long-horizon superagent that researches, codes, and creates by orchestrating parallel sub-agents with isolated contexts in Docker…

oh-my-openagent (omo) ★ 60k

A6 Multi-agent orchestrator

Multi-provider AI agent orchestration for OpenCode: escape vendor lock-in by routing Sisyphus (Claude/Kimi/GLM) and Hephaestus…

Distribution

Type: npm-package
License: AGPL-3.0
Install: npm-install

Surfaces

CLI binary: polka
CLI subcmds: 11
Local UI: No

Components

Commands: 11
Skills: 0
Subagents: 2
Hooks: 0
MCP servers: 1
Scripts: 0
Templates: 0

Workflow

Phases: 6
Approval gates: 2
Spec format: markdown
Spec storage: flat-files
Delta or full: whole-file

Orchestration

Multi-agent: Yes
Pattern: sequential
Isolation: none
Consensus: none
Prompt chaining: Yes

Multi-model

Multi-model: Yes
BYOK: Yes
Modal: text

Execution

Mode: one-shot
Crash recovery: Yes
Session handoff: Yes

Memory

Type: file-based
Persistence: project
Search: none
State files: 2 files

Quality

TDD: Optional
TDD mechanism: none
Validators: 2
Self-review: inline-self

Git / Observability

Auto commit: Yes
Auto PR: Yes
Auto merge: No
Worktree/feat: No
Audit log: Yes
Audit format: structured-md
Replay: Yes

Tools

Primary: standalone-cli
Targets: 3
Portability: medium

Signals

Stars: 16
Last commit: 2026-05-22
Contributors: 4
Maintainer: active
Quality score: 5.9/10
