flow-next · gmickel/flow-next · ★ 615 · last commit 2026-05-25

Enforce spec-first task hygiene with fresh-context workers, source-tagged capture, and cross-model review gates to prevent context bleed and hallucinated requirements.

Best when: No standalone tasks; every work unit belongs to a spec; cross-model review (different vendor model) gates every implementation.
Skip if: Standalone tasks without a parent spec; worker subagents spawning further sub-workers.
vs seeds: bmad-method in persona-md workers, plan-first philosophy, and Ralph autonomous mode, but adds four distinct capabilities absent from…
Primitive shape: 49 total (Skills 23, Subagents 21, Hooks 5)
00

Summary

Flow-Next — Summary

Flow-Next (615 stars) is a spec-first AI workflow plugin by Gordon Mickel that ships 23 skills, 21 subagent workers, and a bundled flowctl CLI binary for the complete lifecycle from idea capture to autonomous overnight execution. Its primary architecture rule is that every unit of work must belong to a spec (.flow/specs/fn-N-slug.md) before tasks can be created — there are no standalone tasks. The flow-next-work skill executes each task by spawning a fresh-context worker subagent per task, preventing context bleed between implementation units. Cross-model review gates are baked into the loop: flow-next-impl-review and flow-next-plan-review can use RepoPrompt, Codex, or Copilot as the reviewing model. The autonomous "Ralph" mode (flow-next-ralph-init) scaffolds a local loop under scripts/ralph/ that runs overnight with multi-model review gates and auto-block on stuck tasks. First-class support spans Claude Code, OpenAI Codex, and Factory Droid; a community port covers OpenCode. Compared to seeds, flow-next is closest to BMAD-METHOD (persona-md workers, plan-first) but adds a bundled CLI binary (flowctl) for all task state operations, a formal spec-first ID system, decay-aware project memory, and Ralph autonomous mode — capabilities absent from BMAD.

01

Overview

Flow-Next — Overview

Origin

Created by Gordon Mickel (mickel.tech), independent developer. First published under the flow-next name; v1.0 renamed the core primitive from "epic" to "spec" across the entire surface. The framework evolved from a simpler planning plugin into a multi-runtime, multi-model workflow system.

Philosophy

"Plan-first AI workflow. Zero external dependencies."

"Spec-first. Every unit of work belongs to a spec fn-N. Tasks fn-N.M inherit context." "Fresh-context workers. Each task runs in its own subagent. No token bleed between tasks." "Cross-model reviews. A different model (RepoPrompt / Codex / Copilot) gates every implementation." "R-IDs frozen at handover. Acceptance criteria numbered once, never renumbered."

The system is designed for context hygiene: the planner (main agent) never implements, the worker (subagent) starts clean, and the reviewer uses a different model from the implementer.

Key Design Decisions

  1. No standalone tasks — Every task must have a parent spec, even one-offs. This forces context anchoring.
  2. Bundled CLI (flowctl) — All task state is read/written via the bundled CLI, not by the agent reading files directly. The agent cannot see state without calling flowctl, which enforces a clean separation between narrative (markdown) and metadata (JSON).
  3. R-IDs — Acceptance criteria get stable numeric IDs at spec creation, never renumbered. Review gates reference these IDs to verify coverage.
  4. Source-tagged capture — Every acceptance criterion in a captured spec is tagged [user], [paraphrase], or [inferred] to prevent hallucinated requirements.
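
Decisions 3 and 4 can be illustrated with a small parser. This is a minimal sketch under stated assumptions, not flow-next's implementation: the line format `R<n> [tag] text` and the names `Criterion`, `parse_criteria`, and `needs_confirmation` are hypothetical. Only the three tag names and the rule that R-IDs are never renumbered come from the description above.

```python
import re
from dataclasses import dataclass

# Hypothetical line format: "R1 [user] Login works with GitHub OAuth".
# Only the tag names ([user], [paraphrase], [inferred]) come from the text above.
CRITERION_RE = re.compile(r"^R(\d+)\s+\[(user|paraphrase|inferred)\]\s+(.+)$")


@dataclass(frozen=True)
class Criterion:
    rid: int      # stable R-ID, assigned once and never renumbered
    source: str   # "user", "paraphrase", or "inferred"
    text: str


def parse_criteria(lines):
    """Parse criterion lines, rejecting untagged lines and duplicate R-IDs."""
    seen = set()
    parsed = []
    for line in lines:
        match = CRITERION_RE.match(line.strip())
        if match is None:
            raise ValueError(f"missing R-ID or source tag: {line!r}")
        rid = int(match.group(1))
        if rid in seen:
            raise ValueError(f"duplicate R-ID: R{rid}")
        seen.add(rid)
        parsed.append(Criterion(rid, match.group(2), match.group(3)))
    return parsed


def needs_confirmation(criteria):
    """Return the criteria the user never stated directly."""
    return [c for c in criteria if c.source == "inferred"]
```

A read-back step could present the result of `needs_confirmation` to the user first, since those are the criteria most likely to be hallucinated.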

Version

v1.1.11 (at analysis date 2026-05-25)

02

Architecture

Flow-Next — Architecture

Distribution

  • Type: Claude Code plugin (also Codex, Factory Droid, OpenCode community port)
  • Plugin manifest: plugins/flow-next/.claude-plugin/plugin.json
  • Version: 1.1.11

Installation

| Runtime | Method |
| --- | --- |
| Claude Code | /plugin marketplace add https://github.com/gmickel/flow-next, then /plugin install flow-next, then /flow-next:setup |
| OpenAI Codex | git clone + ./scripts/install-codex.sh flow-next (merges 21 agents + hooks into ~/.codex/config.toml) |
| Factory Droid | droid plugin marketplace add ... → TUI install |
| OpenCode | Community port (separate repo) |

Directory Tree

gmickel/flow-next/
├── .claude-plugin/marketplace.json
├── .agents/plugins/                    # Global agent mount
├── CLAUDE.md / AGENTS.md              # Runtime instructions
├── plugins/flow-next/
│   ├── .claude-plugin/plugin.json     # Plugin manifest (v1.1.11)
│   ├── .codex-plugin/                 # Codex-specific
│   ├── agents/                        # 21 subagent persona files (.md)
│   ├── skills/                        # 23 skill directories (SKILL.md)
│   ├── hooks/hooks.json               # Ralph workflow guards
│   ├── scripts/                       # flowctl CLI + ralph scripts
│   ├── templates/                     # Spec/task markdown templates
│   └── commands/                      # Alias commands
├── .flow/                             # Runtime state (generated in target project)
│   ├── specs/fn-N-slug.{md,json}
│   ├── tasks/fn-N-slug.M.{md,json}
│   ├── memory/bug/ + knowledge/
│   └── config.json
├── flow-next-tui/                     # Optional TUI (separate npm: @gmickel/flow-next-tui)
└── scripts/ralph/                     # Autonomous overnight loop scripts

Required Runtime

  • Claude Code (primary), Codex CLI, or Factory Droid
  • Python (for ralph hook script)
  • No external services (zero external dependencies)

Config Files

  • .flow/config.json — memory enabled flag, review backend setting
  • .flow/meta.json — schema version, next spec ID
  • plugins/flow-next/.claude-plugin/plugin.json — plugin version
03

Components

Flow-Next — Components

Skills (23 user-invocable)

| Skill | Purpose |
| --- | --- |
| flow-next-strategy | Write STRATEGY.md: target problem, approach, users, metrics |
| flow-next-prospect | Generate ranked candidate ideas grounded in the repo |
| flow-next-capture | Synthesize conversation → spec (source-tagged, mandatory read-back) |
| flow-next-interview | Deep spec refinement with lead-with-recommendation + confidence tiers |
| flow-next-plan | Research codebase → create spec + dependency-ordered tasks |
| flow-next-work | Execute tasks: re-anchor per task + worker subagent + review gates |
| flow-next-impl-review | Cross-model implementation review (RepoPrompt/Codex/Copilot) |
| flow-next-plan-review | Cross-model plan review |
| flow-next-spec-completion-review | Verify combined implementation matches spec (renamed from epic-review in 1.0) |
| flow-next-make-pr | Render cognitive-aid PR body (9 input streams) and open via gh |
| flow-next-resolve-pr | Resolve GitHub PR review threads (fetch → triage → fix → reply → resolve) |
| flow-next-audit | Agent-native review of .flow/memory/ entries (Keep/Update/Consolidate/Replace/Delete) |
| flow-next-memory-migrate | Lift legacy flat memory files into categorized schema |
| flow-next-prime | 8-pillar agent-readiness assessment (48 criteria, parallel scouts, GitHub API) |
| flow-next-ralph-init | Scaffold autonomous loop (scripts/ralph/) |
| flow-next-sync | Manually trigger plan-sync after drift |
| flow-next-setup | Initialize project, migrate if needed |
| flow-next-deps | Manage task dependency declarations |
| flow-next-export-context | Export spec/task context for handoff |
| flow-next-rp-explorer | RepoPrompt-specific explorer integration |
| flow-next-worktree-kit | Worktree management utilities |
| flow-next | Meta-skill (entry point dispatcher) |
| browser | Browser-based testing skill |

Agents (21 subagent workers)

| Agent | Purpose |
| --- | --- |
| worker | Task implementation worker, spawned by flow-next-work for each task |
| build-scout | Analyze build system and CI configuration |
| claude-md-scout | Analyze CLAUDE.md / AGENTS.md for project standards |
| context-scout | Gather cross-cutting project context |
| docs-gap-scout | Identify documentation gaps |
| docs-scout | Research existing documentation |
| env-scout | Analyze environment variables and configuration |
| flow-gap-analyst | Identify gaps in spec coverage |
| github-scout | Query GitHub API for issues/PRs/workflows |
| memory-scout | Search .flow/memory/ for relevant past learnings |
| observability-scout | Analyze monitoring and logging |
| plan-sync | Sync plan when implementation diverges from spec |
| pr-comment-resolver | Address individual PR review comments |
| practice-scout | Research project practices and conventions |
| quality-auditor | Audit implementation quality |
| repo-scout | Analyze repo structure and dependencies |
| security-scout | Identify security concerns |
| spec-scout | Analyze existing specs for context |
| testing-scout | Analyze test coverage and patterns |
| tooling-scout | Research tooling decisions |
| workflow-scout | Analyze workflow patterns |

Hooks (5 entries across 4 event types, via hooks.json)

Ralph workflow guards — active only when FLOW_RALPH=1:

  • PreToolUse (Bash|Execute matcher) — ralph-guard.py
  • PreToolUse (Edit|Write matcher) — ralph-guard.py
  • PostToolUse (Bash|Execute) — ralph-guard.py
  • Stop — ralph-guard.py
  • SubagentStop — ralph-guard.py

Scripts

  • plugins/flow-next/scripts/flowctl (bundled CLI, bash/Python; handles all task state operations)
  • scripts/ralph/ralph.sh — Autonomous loop runner
  • scripts/ralph/hooks/ralph-guard.py — Hook guard for Ralph mode
  • scripts/install-codex.sh — Codex platform installer

Templates

Located in plugins/flow-next/templates/: spec templates, task templates, memory entry templates

05

Prompts

Flow-Next — Prompts

Verbatim Excerpt 1: flow-next-plan SKILL.md (Plan Role + No-Implementation Rule)

---
name: flow-next-plan
description: Create structured build plans from feature requests or Flow IDs...
user-invocable: false
---

## The Golden Rule: No Implementation Code

**Plans are specs, not implementations.** Do NOT write the code that will be implemented.

### Code IS allowed:
- **Signatures/interfaces** (what, not how): `function validate(input: string): Result`
- **Patterns from this repo** (with file:line ref): "Follow pattern at `src/auth.ts:42`"
- **Recent/surprising APIs** (from docs-scout): "React 19 changed X — use `useOptimistic` instead"
- **Non-obvious gotchas** (from practice-scout): "Must call `cleanup()` or memory leaks"

### Code is FORBIDDEN:
- Complete function implementations
- Full class/module bodies
- "Here's what you'll write" blocks
- Copy-paste ready snippets (>10 lines)

**Why:** Implementation happens in `/flow-next:work` with fresh context. Writing it here wastes tokens in planning, review, AND implementation — then causes drift when the implementer does it differently anyway.

Prompting technique: Explicit "code IS allowed / FORBIDDEN" table with forbidden items listed as concrete examples. The "Why" rationale section makes the rule self-enforcing: the model understands the token-efficiency reason, not just the rule.


Verbatim Excerpt 2: worker agent (Re-Anchor Phase)

---
name: worker
description: Task implementation worker. Spawned by flow-next-work to implement a single task...
model: inherit
disallowedTools: Task
color: "#3B82F6"
---

## Phase 1: Re-anchor (CRITICAL - DO NOT SKIP)

Use the FLOWCTL path and IDs from your prompt:

```bash
# 1. Read task and parent specs (substitute actual values)
<FLOWCTL> show <TASK_ID> --json
<FLOWCTL> cat <TASK_ID>
<FLOWCTL> show <SPEC_ID> --json
<FLOWCTL> cat <SPEC_ID>

# 2. Check git state
git status
git log -5 --oneline

# 3. Check memory system
<FLOWCTL> config get memory.enabled --json
```

If memory.enabled is true, query relevant memory via the CLI (not by reading files directly)...


Prompting technique: "Re-anchor" pattern forces every worker to re-read full spec context before implementing. The disallowedTools: Task declaration prevents worker subagents from spawning further sub-subagents (depth control). CLI-mediated state access (<FLOWCTL> show) enforces the metadata/narrative separation at the prompt level.


Verbatim Excerpt 3: hooks.json (Ralph Guards)

```json
{
  "description": "Ralph workflow guards - only active when FLOW_RALPH=1 and ralph-init has been run",
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash|Execute",
        "hooks": [{
          "type": "command",
          "command": "[ ! -f scripts/ralph/hooks/ralph-guard.py ] || scripts/ralph/hooks/ralph-guard.py",
          "timeout": 5
        }]
      }
    ],
    "Stop": [
      {
        "hooks": [{
          "type": "command",
          "command": "[ ! -f scripts/ralph/hooks/ralph-guard.py ] || scripts/ralph/hooks/ralph-guard.py",
          "timeout": 5
        }]
      }
    ]
  }
}
```

Design pattern: Guard-via-file-existence — the hook only activates if ralph-guard.py exists (i.e., after ralph-init has run). Non-intrusive for users who don't use Ralph. The guard reads FLOW_RALPH=1 env var at runtime.
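
The two-stage gating described above can be modelled in a few lines. This is a sketch under stated assumptions, not the contents of ralph-guard.py (which this document does not show): `hook_decision` and its return labels are hypothetical names, and the behaviour of the real guard when FLOW_RALPH is unset is inferred from the "only active when FLOW_RALPH=1" description.

```python
def hook_decision(script_exists, env):
    """Model the shell command `[ ! -f guard ] || guard` plus the env check.

    script_exists: whether scripts/ralph/hooks/ralph-guard.py is present.
    env: mapping of environment variables (for example os.environ).
    """
    if not script_exists:
        # `[ ! -f guard ]` succeeds, so `||` never runs the guard at all.
        return "noop"
    if env.get("FLOW_RALPH") != "1":
        # The guard script runs but stays inactive outside Ralph mode (assumed).
        return "allow"
    # Ralph mode: the guard applies its checks to the tool call.
    return "enforce"
```

Separating the file check from the environment check keeps the hook harmless in projects where flow-next-ralph-init was never run.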

09

Uniqueness

Flow-Next — Uniqueness

Differs from Seeds

Closest seed: BMAD-METHOD — both use persona-md workers, plan-first philosophy, and support autonomous loop mode (Ralph). Key architectural deltas: (1) flow-next ships a bundled flowctl CLI binary as the exclusive task state interface — agents cannot read/write state without it, enforcing strict separation of metadata from narrative; BMAD has no equivalent CLI layer; (2) flow-next's source-tagged capture ([user], [paraphrase], [inferred]) prevents hallucinated requirements at the spec creation gate; (3) flow-next's cross-model review gates explicitly route to a different vendor's model (RepoPrompt/Codex/Copilot) rather than the same model doing self-review; (4) the decay-aware categorized memory (bug/ + knowledge/ tracks with audit + consolidate operations) is more sophisticated than BMAD's flat CLAUDE.md memory. Compared to taskmaster-ai (the other bundled-CLI seed), flow-next's CLI is exclusively state-management rather than a user-facing entry point.

Positioning

Flow-next is the most carefully engineered context-hygiene framework in this batch. The disallowedTools: Task constraint on workers, the re-anchor protocol, the cross-model review, and the source-tagged capture all address documented failure modes in AI coding agents (context bleed, hallucinated requirements, single-model blind spots).

Observable Failure Modes

  1. Flowctl path dependency: Skills reference ${CLAUDE_PLUGIN_ROOT}/scripts/flowctl. If the plugin root path changes or the env var is unavailable, all state operations break.
  2. Codex platform complexity: The install-codex.sh path requires merging 21 agents into ~/.codex/config.toml, which is brittle if Codex's config format changes.
  3. spec-completion-review deprecation: flow-next-spec-completion-review was renamed from epic-review in v1.0 with a "soft-removal target 2.0.0" note, signaling active churn in the skill surface.
  4. Cross-model review availability: If the user doesn't have RepoPrompt, Codex, or Copilot available, the cross-model review gate falls back to the main model, defeating the adversarial review intent.
  5. Memory growth: Without regular flow-next-audit runs, .flow/memory/ accumulates stale entries that pollute future searches. The audit skill is manual, not automatic.

04

Workflow

Flow-Next — Workflow

Main Flow

[prospect?] → [capture | interview?] → [plan] → [work] → [impl-review] → [make-pr] → [resolve-pr?]
                                                    ↑___(NEEDS_WORK)_____|

Phases + Artifacts

| Phase | Skill | Artifact |
| --- | --- | --- |
| Strategy | flow-next-strategy | STRATEGY.md |
| Prospect (opt) | flow-next-prospect | Ranked candidate list |
| Capture | flow-next-capture | .flow/specs/fn-N-slug.md + .json (source-tagged) |
| Interview (opt) | flow-next-interview | Enriched spec |
| Plan | flow-next-plan | Spec updated with dependency-ordered tasks fn-N-slug.M |
| Work | flow-next-work | Commits per task with evidence sections; tasks updated via flowctl done |
| Impl Review | flow-next-impl-review | Review report (SHIP or NEEDS_WORK verdict) |
| Make PR | flow-next-make-pr | GitHub PR with 9-stream cognitive-aid body |
| Resolve PR | flow-next-resolve-pr | PR threads resolved via GraphQL |

Approval Gates

| Gate | Condition |
| --- | --- |
| Spec read-back | capture requires mandatory user acknowledgment of source-tagged R-IDs |
| Plan review | flow-next-plan-review provides APPROVE/NEEDS_WORK gate before work starts |
| Implementation review | flow-next-impl-review SHIP/NEEDS_WORK gate after each task batch |
| Spec completion review | flow-next-spec-completion-review verifies all R-IDs covered |
| Ralph autonomous mode | Ralph loops with auto-block on stuck tasks; human can interrupt at any time |

The Worker Subagent Loop

For each task, flow-next-work spawns a worker subagent:

  1. Worker reads spec + task via flowctl show/cat
  2. Worker checks git state and memory
  3. Worker reads all required investigation files
  4. Worker implements, commits with evidence
  5. Worker calls flowctl done to record completion
  6. Optional cross-model review gate
  7. Parent resumes with next unblocked task

Ralph Autonomous Mode

flow-next-ralph-init scaffolds scripts/ralph/:

  • ralph.sh — Main loop: fetch next task → spawn worker → review → repeat
  • Ralph guard hooks prevent direct tool use by main agent during autonomous run
  • Auto-blocks stuck tasks after N failures
  • Multi-model review gates (different model from implementer)
  • Runs overnight with fresh context per iteration
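
The loop and its auto-block rule can be sketched as follows. This is an illustrative model, not ralph.sh: `run_loop` and `attempt` are hypothetical names, `attempt` stands in for "spawn worker, then review", and the default threshold of 3 is a placeholder because the document only says "N".

```python
def run_loop(task_ids, attempt, max_failures=3):
    """Run tasks until each one is done or blocked.

    attempt(task_id) returns "SHIP" or "NEEDS_WORK" (the review verdicts
    named in this document). A task is blocked once it has failed more
    than max_failures times.
    """
    failures = {tid: 0 for tid in task_ids}
    outcome = {}
    pending = list(task_ids)
    while pending:
        tid = pending.pop(0)
        if attempt(tid) == "SHIP":
            outcome[tid] = "done"
            continue
        failures[tid] += 1
        if failures[tid] > max_failures:
            outcome[tid] = "blocked"   # stop retrying and move on
        else:
            pending.append(tid)        # retry later, in a fresh iteration
    return outcome
```

Because a blocked task leaves the queue, one stuck task cannot consume the whole overnight run.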
06

Memory Context

Flow-Next — Memory & Context

State Storage

Type: File-based (markdown + JSON, split by responsibility)
Persistence: Project-scoped
Location: .flow/ directory in target project

File Structure

| Path | Content |
| --- | --- |
| .flow/specs/fn-N-slug.md | Spec narrative (plan, scope, R-IDs, acceptance criteria) |
| .flow/specs/fn-N-slug.json | Spec metadata (id, title, status, deps, branch_name) |
| .flow/tasks/fn-N-slug.M.md | Task description + done summary + evidence |
| .flow/tasks/fn-N-slug.M.json | Task metadata (id, status, priority, deps, assignee) |
| .flow/memory/bug/<category>/ | Failure/defect learnings (build-errors, test-failures, runtime-errors...) |
| .flow/memory/knowledge/<category>/ | Patterns/decisions (architecture-patterns, conventions, tooling-decisions...) |
| .flow/config.json | Project settings (memory enabled, review backend) |
| .flow/meta.json | Schema version, next spec ID counter |
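
The naming scheme above implies a direct mapping from an ID to its markdown/JSON pair. The following is a sketch of that mapping only: `state_paths` is a hypothetical helper, the slug character set is an assumption, and flowctl's real resolution logic may differ.

```python
import re
from pathlib import PurePosixPath

# Matches "fn-1-add-oauth" (spec) and "fn-1-add-oauth.3" (task).
# The allowed slug characters are an assumption.
ID_RE = re.compile(r"^fn-\d+-[a-z0-9-]+(?:\.(\d+))?$")


def state_paths(flow_id, root=".flow"):
    """Map a spec or task ID to its (markdown, json) file pair."""
    match = ID_RE.match(flow_id)
    if match is None:
        raise ValueError(f"not a flow-next ID: {flow_id!r}")
    is_task = match.group(1) is not None
    folder = PurePosixPath(root) / ("tasks" if is_task else "specs")
    return folder / f"{flow_id}.md", folder / f"{flow_id}.json"
```

In practice agents are expected to go through flowctl instead of opening these files; the sketch only makes the layout concrete.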

Metadata / Narrative Split

Design principle: JSON = metadata (plumbing), Markdown = narrative (content). The agent reads markdown for context; flowctl reads JSON for operations. This decouples schema evolution from prose changes.

Memory System (Decay-Aware)

.flow/memory/ is organized into two tracks and multiple categories:

  • bug track: build-errors, test-failures, runtime-errors, performance, security, integration, data, ui
  • knowledge track: architecture-patterns, conventions, tooling-decisions, workflow, best-practices

Workers query memory via flowctl memory search "<keyword>" — not by reading files directly. The flow-next-audit skill reviews each memory entry (Keep / Update / Consolidate / Replace / Delete) to prevent accumulation of stale entries. Legacy flat files (pitfalls.md, conventions.md) are accessible via flowctl memory list with track=legacy.

Cross-Session Handoff

Specs and tasks persist across sessions as committed files. Any new session running flowctl show fn-1-add-oauth gets the full context. Workers re-anchor to spec state at the start of every task, making them session-independent.

Context Compaction

Re-anchoring pattern in worker: each task worker starts with a fresh context window, reads spec + task + memory from disk via flowctl, then implements. This is by design — the worker never accumulates context from previous tasks in the run.

The TUI

An optional standalone TUI (@gmickel/flow-next-tui, installed via bun add -g) provides a terminal UI for monitoring Ralph's autonomous runs.

07

Orchestration

Flow-Next — Orchestration

Multi-Agent Pattern

Pattern: Task-decomposition tree with sequential dependency ordering
The flow-next-plan skill creates a dependency graph of tasks fn-N.M where each task declares blockers. flow-next-work executes tasks in dependency order, spawning one worker subagent per task.

Parallelism: Independent tasks (no blockers) can run in parallel via Claude Code's native parallel Task spawning.
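
Dependency-ordered execution reduces to repeatedly asking which tasks have no unfinished blockers. The sketch below assumes a simplified in-memory shape: the `deps` and `status` field names appear in the task metadata described in this document, but the status values "todo" and "done" and the function `ready_tasks` are hypothetical.

```python
def ready_tasks(tasks):
    """Return IDs of tasks whose dependencies are all done.

    tasks: dict mapping task ID -> {"status": str, "deps": [task IDs]}.
    Results are ordered by the numeric task suffix (the M in fn-N-slug.M).
    """
    done = {tid for tid, task in tasks.items() if task["status"] == "done"}
    ready = [
        tid for tid, task in tasks.items()
        if task["status"] == "todo" and all(dep in done for dep in task["deps"])
    ]
    return sorted(ready, key=lambda tid: int(tid.rsplit(".", 1)[1]))
```

Every ID returned in one call is independent of the others, so those tasks are the candidates for parallel worker spawning.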

Worker Subagent

  • Definition format: persona-md (worker.md with YAML front-matter)
  • Spawn mechanism: Claude Code Task tool (from within flow-next-work skill)
  • Isolation: Fresh context per task — each worker starts with zero task history from prior workers
  • Constraint: disallowedTools: Task prevents workers from spawning sub-workers (depth=1 only)
  • Re-anchor protocol: Phase 1 of every worker reads full spec + task context via flowctl before any implementation

Cross-Model Review Gates

Three distinct review skills designed to use a different model than the implementer:

| Skill | Review Model Options |
| --- | --- |
| flow-next-impl-review | RepoPrompt, OpenAI Codex, GitHub Copilot |
| flow-next-plan-review | Same options |
| flow-next-spec-completion-review | Same options |

The review backend is configured in .flow/config.json and queried via flowctl review-backend. This is the most explicit multi-model routing in the batch — not routing different classes of tasks to different models, but routing the same work to a different vendor's model for adversarial review.
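
The routing rule, including its degenerate case, can be stated as a short function. This is a sketch with hypothetical names: the backend identifier strings, `KNOWN_BACKENDS`, and `choose_reviewer` are illustrative and are not flowctl's actual config values.

```python
# Hypothetical identifiers for the three review backends named above.
KNOWN_BACKENDS = ("repoprompt", "codex", "copilot")


def choose_reviewer(configured, implementer):
    """Return (reviewer, is_cross_model).

    configured: backend read from project config, or None if unset.
    implementer: identifier of the model that wrote the code.
    """
    if configured in KNOWN_BACKENDS and configured != implementer:
        return configured, True
    # No usable backend: the implementing model reviews its own work,
    # which loses the adversarial benefit of a second vendor's model.
    return implementer, False
```

Surfacing the second tuple element lets a caller warn when a review was not actually cross-model.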

Isolation Mechanism

Fresh context per task (primary), with optional git worktree support via flow-next-worktree-kit for parallel branch development.

Autonomous Mode (Ralph)

  • flow-next-ralph-init scaffolds scripts/ralph/ralph.sh
  • Ralph runs continuously: fetch next task → spawn worker → review gate → loop
  • FLOW_RALPH=1 env var activates the hook guards
  • Auto-blocks tasks that fail more than N times (moves on, doesn't retry indefinitely)
  • Execution mode: continuous-ralph (the only framework in this batch with a named continuous loop mode equivalent to BMAD's Ralph)

Execution Mode

Interactive-loop (normal), continuous-ralph (autonomous overnight)

Auto-Validators

flow-next-impl-review runs quality validation post-implementation. The worker's Phase 1 baseline check runs existing tests/lints before starting work. No automatic post-hook test runner found — validation is prompted within worker instructions.

08

Ui Cli Surface

Flow-Next — UI, CLI & Surface

Bundled CLI: flowctl

Location: plugins/flow-next/scripts/flowctl
Distribution: Bundled with the plugin — explicitly NOT installed globally. Skills must reference it as ${CLAUDE_PLUGIN_ROOT}/scripts/flowctl.

Key subcommands:

| Command | Purpose |
| --- | --- |
| flowctl show <id> --json | Get spec/task metadata as JSON |
| flowctl cat <id> | Get spec/task narrative markdown |
| flowctl done <task-id> | Mark task complete, append done summary + evidence |
| flowctl review-backend | Get configured cross-model review backend |
| flowctl memory list --json | List all memory entries with metadata |
| flowctl memory search "<term>" --json | Search memory by keyword |
| flowctl memory read <entry-id> | Read specific memory entry |
| flowctl config get <key> --json | Get config value |
| flowctl init | Initialize .flow/ directory |
| flowctl migrate-rename --yes | Migrate epic→spec rename (transactional with rollback) |

The flowctl CLI is the single source of truth for task state — agents cannot modify task state without going through it.
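
A caller outside the agent prompt could wrap the CLI as shown below. This is a minimal sketch, assuming the `show <id> --json` subcommand prints a single JSON document on stdout (the output shape is not documented here); `flowctl_argv` and `show` are hypothetical helper names.

```python
import json
import os
import subprocess


def flowctl_argv(*args, env=None):
    """Build the argv for the bundled flowctl, located via CLAUDE_PLUGIN_ROOT."""
    env = os.environ if env is None else env
    root = env.get("CLAUDE_PLUGIN_ROOT")
    if not root:
        raise RuntimeError("CLAUDE_PLUGIN_ROOT is not set; flowctl cannot be located")
    return [f"{root}/scripts/flowctl", *args]


def show(flow_id, env=None):
    """Run `flowctl show <id> --json` and decode its output (shape assumed)."""
    result = subprocess.run(
        flowctl_argv("show", flow_id, "--json", env=env),
        capture_output=True,
        text=True,
        check=True,  # raise if flowctl exits non-zero
    )
    return json.loads(result.stdout)
```

Failing fast when the environment variable is missing makes the plugin-root dependency visible instead of producing a confusing "file not found" later.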

Optional TUI: flow-next-tui

Package: @gmickel/flow-next-tui
Install: bun add -g @gmickel/flow-next-tui
Purpose: Terminal UI for monitoring Ralph's autonomous overnight runs
Type: Terminal TUI (not a web dashboard)

Website

  • flow-next.dev — narrative documentation site
  • Per-skill deep-dive pages

IDE Integration

Works within Claude Code, Codex, and Factory Droid native chat interfaces. No standalone IDE extension.

GitHub Integration

flow-next-make-pr calls gh pr create with a 9-stream cognitive-aid body.
flow-next-resolve-pr uses GitHub GraphQL API to reply to and resolve review threads.
flow-next-prime uses GitHub API to assess agent-readiness across 8 pillars (48 criteria).
