RUBRIC-v3 — The evaluation schema
Every framework in the corpus was analyzed against RUBRIC-v3, a 25-dimension engineering schema. It deliberately excludes pricing, sentiment, and closed-source signals — those are derivable noise, not architectural differentiators. The schema was finalized in Phase B and extended with 5 new dimensions in Phase D (cross-vendor model routing, compiled enforcement, self-evolving plugins, token-efficient encoding, LLM-evaluated hooks).
What changed from v2
- Dropped from v2: pricing_model, closed_source_ide, security_incidents, token_cost_concerns, reddit_sentiment, hn_sentiment.
- Added in v3: cli_binary, local_ui, multi_model, orchestration_pattern, isolation_mechanism, scripts_vs_hooks, subagent_definition_format, auto_validators, git_automation, audit_log, execution_mode, cross_tool_portability.
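The v2 to v3 change can be sketched as a key-level migration. This is a minimal illustration that assumes each framework's metrics are held as a flat dict keyed by field name. The function name `migrate_metrics` and the dict layout are hypothetical and not part of the rubric; only the field names come from the lists above.

```python
# Field names copied from the v2 -> v3 change lists.
V2_DROPPED = {
    "pricing_model", "closed_source_ide", "security_incidents",
    "token_cost_concerns", "reddit_sentiment", "hn_sentiment",
}
V3_ADDED = {
    "cli_binary", "local_ui", "multi_model", "orchestration_pattern",
    "isolation_mechanism", "scripts_vs_hooks", "subagent_definition_format",
    "auto_validators", "git_automation", "audit_log", "execution_mode",
    "cross_tool_portability",
}


def migrate_metrics(v2_metrics: dict) -> tuple[dict, list[str]]:
    """Drop v2-only keys and list the v3 keys that still need values.

    Assumption: metrics are a flat dict keyed by field name.
    """
    kept = {k: v for k, v in v2_metrics.items() if k not in V2_DROPPED}
    todo = sorted(V3_ADDED.difference(kept))
    return kept, todo
```

Calling `migrate_metrics({"pricing_model": "free", "commands": 3})` returns the record without `pricing_model`, plus a sorted list of all twelve v3 fields that still need to be filled in.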
Output structure per framework (11 files)
Each analysis lives under spec-system/<slug>/ with 11 files:
| # | File | Content focus |
|---|---|---|
| 1 | 00-summary.md | 5–7 sentence elevator pitch |
| 2 | 01-overview.md | Origin, philosophy, manifesto-style quotes verbatim |
| 3 | 02-architecture.md | Distribution, install, dir tree, deps, target AI tools |
| 4 | 03-components.md | Every named primitive (cmd / skill / agent / hook / mcp / script / template) |
| 5 | 04-workflow.md | Phases + artifacts per phase + approval gates |
| 6 | 05-prompts.md | Verbatim excerpts from ≥2 key prompt files |
| 7 | 06-memory-context.md | State storage, persistence, compaction, handoffs |
| 8 | 07-orchestration.md | Multi-agent pattern, multi-model routing, isolation, execution mode |
| 9 | 08-ui-cli-surface.md | Dedicated CLI binary, local UI/dashboard, IDE integration, observability |
| 10 | 09-uniqueness.md | differs_from_seeds paragraph + positioning + failure modes |
| 11 | METRICS.yaml | Machine-readable, v3 schema exactly |
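The layout in the table above lends itself to a simple completeness check. The sketch below assumes the analyses sit on a local filesystem under `spec-system/<slug>/`. The helper name `missing_files` is hypothetical; the file names are taken from the table.

```python
from pathlib import Path

# File names taken from the output-structure table.
EXPECTED_FILES = [
    "00-summary.md",
    "01-overview.md",
    "02-architecture.md",
    "03-components.md",
    "04-workflow.md",
    "05-prompts.md",
    "06-memory-context.md",
    "07-orchestration.md",
    "08-ui-cli-surface.md",
    "09-uniqueness.md",
    "METRICS.yaml",
]


def missing_files(analysis_dir: Path) -> list[str]:
    """Return the expected file names that are absent from analysis_dir."""
    return [name for name in EXPECTED_FILES if not (analysis_dir / name).is_file()]
```

For a single framework, `missing_files(Path("spec-system") / slug)` returns an empty list when all 11 files are present, where `slug` stands for that framework's directory name.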
The 25 engineering dimensions
- distribution_type, install_complexity, required_runtime, config_files
- cli_binary (exists, name, subcommands), local_ui (type, port, stack, features)
- commands, skills, subagents, hooks, scripts, mcp_servers, templates
- workflow_phases, approval_gates, spec_format, delta_or_whole_file_specs
- orchestration_pattern, isolation_mechanism, max_concurrent_agents, prompt_chaining_pattern
- multi_model, model_role_mapping, supports_byok, locked_to_model
- execution_mode, crash_recovery, context_compaction_handling, cross_session_handoff
- memory_type, memory_persistence, state_files, search_mechanism
- tdd_enforced, tdd_enforcement_mechanism, auto_validators, self_review_pattern
- commits_automatically, creates_pr_automatically, merges_automatically, worktree_per_feature
- audit_log, audit_log_format, replay_capability
- target_tools, primary_tool, cross_tool_portability
What "good" looks like
A great v3 report lets a reader answer these questions without reading prose:
- Does it ship a CLI? What's the binary? (cli_binary.name)
- Does it have a local dashboard? What port and stack? (local_ui)
- How many distinct prompts does it actually contain? (commands.count + skills.count + subagents.count)
- Can multiple agents run? With what coordination? (orchestration_pattern)
- Does it route different models for different jobs? (model_role_mapping)
- What does it auto-validate before saying done? (auto_validators)
- Where does state live? (memory_type, state_files)
- Which AI tools does it work with? (target_tools, cross_tool_portability)
- What does it do automatically with git? (commits_automatically, etc.)
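A few of these lookups can be sketched against a parsed metrics record. The example assumes METRICS.yaml has already been loaded into a Python dict. The nesting for `cli_binary.name` and the `.count` fields follows the dotted paths above, while the shape of every other field is an assumption. The record is hypothetical: its values are invented for illustration and describe no framework in the corpus.

```python
def prompt_count(metrics: dict) -> int:
    """Sum commands.count + skills.count + subagents.count."""
    return sum(metrics[key]["count"] for key in ("commands", "skills", "subagents"))


def quick_answers(metrics: dict) -> dict:
    """Collect the fields that answer the questions above (assumed shapes)."""
    return {
        "cli": metrics["cli_binary"]["name"],
        "dashboard": metrics["local_ui"],
        "prompts": prompt_count(metrics),
        "coordination": metrics["orchestration_pattern"],
        "validators": metrics["auto_validators"],
        "state": (metrics["memory_type"], metrics["state_files"]),
        "auto_commit": metrics["commits_automatically"],
    }


# Hypothetical record with invented values, for illustration only.
example = {
    "cli_binary": {"exists": True, "name": "examplectl", "subcommands": ["init", "run"]},
    "local_ui": {"type": "web", "port": 3000, "stack": "example-stack", "features": []},
    "commands": {"count": 4},
    "skills": {"count": 3},
    "subagents": {"count": 2},
    "orchestration_pattern": "sequential",
    "auto_validators": ["lint", "tests"],
    "memory_type": "file",
    "state_files": ["STATE.md"],
    "commits_automatically": False,
}

print(prompt_count(example))  # 9
```

Reading the answers through one function such as `quick_answers` keeps the comparison across frameworks mechanical, which is the point of keeping these fields machine-readable.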