Batch 32 — Verification, Review Surfaces, Eval Marketplaces + Secondary Isolation
Roster (7)
| slug | stars | distribution | cli_binary | local_ui | orchestration | multi_model | tier |
|---|---|---|---|---|---|---|---|
| vet-imbue | 385 | cli-tool (PyPI) | vet | none | none | yes (BYOK) | A |
| clearwing | 982 | standalone-repo | clearwing (20 subcmds) | web-dashboard + TUI | hierarchical | yes (4 roles) | A |
| eval-marketplace | 22 | claude-plugin (deprecated) | none | none | sequential | no | B |
| aiignore-cli | 8 | npm-package | aiignore (4 subcmds) | none | none | no (no LLM) | A |
| crit-review | 350 | cli-tool (Go binary) | crit (10 subcmds) | web-dashboard (local) | sequential | no | A |
| sd0x-dev-flow | 157 | claude-plugin | none | none | hierarchical | yes (opus subagent) | A |
| mirrord | 5,089 | cli-tool + vscode-ext | mirrord (5 subcmds) | vscode-extension | none | no | A |
Intra-batch Patterns
All 7 frameworks address the verification/trust problem, but at different levels of the stack and through architecturally distinct mechanisms. Vet and Clearwing both use LLMs to review artifacts (code diffs for Vet, source code scanned for vulnerabilities for Clearwing), but Vet is lightweight and cross-agent while Clearwing is a deep security research platform. Crit and sd0x-dev-flow both enforce review gates, one human and one mechanical: crit does so through a browser UI requiring explicit human action, while sd0x-dev-flow does so through hooks that parse stdout sentinel markers. Three frameworks (vet-imbue, aiignore-cli, crit-review) achieve high cross-tool portability (7–13 agent integrations) by using plain-file protocols instead of agent-specific APIs. mirrord is architecturally orthogonal to all the others: it operates at the syscall/OS level rather than the prompt/file level.
On the "human-in-loop vs machine-checkable" comparison axis from the batch brief, crit is pure human-in-loop (browser gate, no automation) and vet is pure machine-checkable (exit code 10 = issues found, CI-compatible). sd0x-dev-flow sits between them: hooks mechanically enforce that the human-readable review runs, but the review itself is agent-generated. eval-marketplace and clearwing both use LLM analysis to generate human-readable reports that humans then act on.
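The machine-checkable end of that axis can be illustrated with a small CI gate. The following is a minimal sketch, not vet's actual interface: only the meaning of exit code 10 comes from the notes above, while the `GateResult` names, the treatment of 0 as clean, and the treatment of every other code as a tool error are assumptions made for illustration. The reviewer command is passed in by the caller, so no real CLI flags are implied.

```python
import subprocess
import sys
from enum import Enum
from typing import Sequence


class GateResult(Enum):
    PASS = "pass"
    ISSUES_FOUND = "issues_found"
    TOOL_ERROR = "tool_error"


# Exit code 10 = "issues found" is taken from the batch notes.
# Assumption: 0 means clean, and any other code means the tool itself failed.
ISSUES_FOUND_EXIT_CODE = 10


def classify_exit_code(code: int) -> GateResult:
    """Map a reviewer process exit code onto a CI gate decision."""
    if code == 0:
        return GateResult.PASS
    if code == ISSUES_FOUND_EXIT_CODE:
        return GateResult.ISSUES_FOUND
    return GateResult.TOOL_ERROR


def run_gate(command: Sequence[str]) -> GateResult:
    """Run a reviewer command and classify its exit code."""
    completed = subprocess.run(list(command), capture_output=True, text=True)
    return classify_exit_code(completed.returncode)


if __name__ == "__main__":
    # Hypothetical stand-in for a real reviewer: a child process that exits 10.
    fake_reviewer = [sys.executable, "-c", "import sys; sys.exit(10)"]
    print(run_gate(fake_reviewer).value)
```

Separating "issues found" from "tool error" lets a pipeline fail the build on review findings while treating a crashed reviewer as an infrastructure problem, a distinction that a pure human-in-loop gate such as crit's browser UI does not need.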
Most Interesting Finds
- sd0x-dev-flow: The sentinel-driven state machine (`✅ Ready` / `⛔ Blocked` stdout markers parsed by PostToolUse hooks) is a genuinely novel harness engineering primitive. No other framework in this corpus implements review-gate state as parseable stdout rather than file state or LLM memory. The explicit naming of 10 canonical harness sub-problems with code evidence for each makes this the clearest reference implementation of harness engineering theory encountered in Phase B.
- crit-review: The `PermissionRequest`/`ExitPlanMode` hook with a 4-day timeout is architecturally significant: it is the only example in this corpus of hooking Claude Code's plan-mode exit as a mandatory human review gate (rather than a Stop gate or PostToolUse gate). The persistent round-to-round diff with `drifted: true` detection for stale comments is production-quality engineering that other review tools lack.
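To make the sentinel idea concrete, here is a minimal sketch of how a hook might derive review-gate state from stdout. It is not sd0x-dev-flow's code: the two marker strings come from the notes above, while the `GateState` names, the requirement that a marker start a line, and the "last marker wins" rule are assumptions made for illustration.

```python
import re
from enum import Enum
from typing import Optional


class GateState(Enum):
    PENDING = "pending"
    READY = "ready"
    BLOCKED = "blocked"


# Marker strings are from the batch notes.
# Assumption: a sentinel only counts when it begins a line of output.
SENTINEL_PATTERN = re.compile(r"^\s*(✅ Ready|⛔ Blocked)", re.MULTILINE)


def parse_sentinel(stdout: str) -> Optional[GateState]:
    """Return the state signalled by the last sentinel in stdout, or None."""
    matches = SENTINEL_PATTERN.findall(stdout)
    if not matches:
        return None
    # Assumption: when several markers appear, the most recent one wins.
    return GateState.READY if matches[-1] == "✅ Ready" else GateState.BLOCKED


def next_state(current: GateState, stdout: str) -> GateState:
    """Advance the gate; output without a sentinel leaves the state unchanged."""
    signalled = parse_sentinel(stdout)
    return current if signalled is None else signalled


if __name__ == "__main__":
    state = GateState.PENDING
    state = next_state(state, "running review...\n⛔ Blocked: 2 findings\n")
    print(state.value)
    state = next_state(state, "re-running review...\n✅ Ready\n")
    print(state.value)
```

Because the state is recomputed from tool output on each call, nothing has to be persisted to disk or remembered by the model, which is the property the notes single out as novel.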
Items Written as Tier C
None. All 7 frameworks had sufficient public material for full 11-file reports.
Cross-references Discovered
- vet-imbue installs skills into `.claude/`, `.codex/`, `.opencode/`, `.agents/` simultaneously, the same multi-harness install pattern used by aiignore-cli (which generates ignore files for 9 tools simultaneously). Both are "do the research once, apply across all tools" frameworks.
- crit-review and sd0x-dev-flow both enforce review gates, but through opposite mechanisms: crit requires a human browser click while sd0x-dev-flow requires sentinel stdout markers. Neither references the other.
- eval-marketplace is deprecated in favor of `jeredblu-marketplace`; the canonical successor repo was not in this batch but should be flagged for a future analysis.
- mirrord's Agent Skills (in `metalbear-co/skills`) follow the same SKILL.md format as superpowers (seed: Archetype 1), confirming that the Agent Skills standard has been adopted beyond the frameworks that invented it.
- clearwing explicitly positions itself as an open-source reimplementation of Anthropic's internal Glasswing tool, a direct named relationship to a closed-source Anthropic product.
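The multi-harness install pattern from the first cross-reference can be sketched as a single write fanned out to several agent directories. This is an illustration only, not vet-imbue's installer: the four directory names come from the notes above, while the `skills/<name>/SKILL.md` layout beneath each one, the function name, and the example skill are assumptions.

```python
import tempfile
from pathlib import Path
from typing import List

# Directory names are from the batch notes.
# Assumption: each harness reads skills from <dir>/skills/<name>/SKILL.md.
HARNESS_DIRS = [".claude", ".codex", ".opencode", ".agents"]


def install_skill(project_root: Path, skill_name: str, skill_body: str) -> List[Path]:
    """Write the same skill file into every harness directory and return the paths."""
    written = []
    for harness in HARNESS_DIRS:
        target = project_root / harness / "skills" / skill_name / "SKILL.md"
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(skill_body, encoding="utf-8")
        written.append(target)
    return written


if __name__ == "__main__":
    # Use a throwaway directory so the sketch never touches a real project.
    with tempfile.TemporaryDirectory() as tmp:
        for path in install_skill(Path(tmp), "example-review", "# Example skill\n"):
            print(path.relative_to(tmp))
```

Keeping the protocol at the level of plain files is what makes this portable: adding support for another agent amounts to one more entry in the directory list rather than a new API integration.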