Cosine

cosine · cosine.sh/

Primitive shape
No installable primitives
00

Summary

Cosine — Summary

Cosine is a UK-based AI coding company (cosine.sh) whose primary product is Lumen — a family of production-first coding models optimized for maintainability, minimal diffs, and real engineering work in enterprise and niche languages (C, R, Matlab, Fortran, Verilog, Rust). The product has three surfaces: Cosine Desktop (native app for active engineering with multi-window workflows and visible agent state), Cosine Cloud (parallel cloud execution for long-running tasks), and Cosine CLI (cos binary, installable via Homebrew). An enterprise air-gapped deployment option exists. Cosine received UK government backing as part of a £675M sovereign AI strategy in April 2026.

Cosine's distinctive architectural bet is "production-first" models: trained with an 8-step data pipeline that produces verifiable training trajectories, with behavioral RL on "scope discipline, honesty, evidence, and plan calibration." The benchmarks (Niche-Bench, Vibe-Bench, Slop-Bench) are proprietary custom benchmarks testing characteristics mainstream benchmarks miss.

Closest seed: kiro. Both are closed-source commercial IDE/agent products with proprietary model integration. Cosine differs by running its own first-party models (the Lumen family) rather than routing to third-party LLMs, and by being cloud-hybrid rather than local-only.

01

Overview

Cosine — Overview

Origin

UK-based AI company. Tagline: "The copilot era is over." Received UK Government backing in April 2026 as part of a £675M sovereign AI fund. The company has a news presence including coverage in Wired and Tech.eu.

Philosophy

From homepage: "Hand off complex coding tasks without sacrificing maintainability or visibility."

Cosine positions itself against the "copilot era" — interactive line-by-line AI completion — and toward autonomous task delegation. The Lumen model family is explicitly described as "production-first" with three training goals:

  1. Eliminating Slop: "Trained to avoid duplication, dead code, and unnecessary complexity. Optimized for minimal diffs that match your codebase's style."
  2. Beyond Mainstream Languages: "Post-trained for enterprise and niche languages including C, R, Matlab, Fortran, Verilog, and Rust."
  3. Perfecting Vibe: "Behavioral RL on scope discipline, honesty, evidence, and plan calibration. Less filler, more clarity."

Model Benchmarks

Cosine publishes three custom benchmarks, plus a cost metric:

  • Niche-Bench: Performance on rare/enterprise languages
  • Vibe-Bench: Behavioral quality (scope, honesty, calibration)
  • Slop-Bench: Code quality / anti-slop metric
  • Cost per successful task: Lumen Outpost shows competitive cost efficiency

Lumen Outpost benchmark results (from homepage):

  • Niche-Bench: 53.9% (vs GPT-5.4: 44.9%, Gemini 3.1 Pro: 42.4%)
  • Vibe-Bench: 31.9% (vs GPT-5.5: 29.4%, GPT-5.4: 27.9%)
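
To make the size of these gaps easier to compare, the short sketch below computes Lumen Outpost's lead over each listed model in percentage points. The scores are copied from the homepage figures quoted above. The `RESULTS` table and the `margins` helper are illustrative names, not anything Cosine publishes, and the numbers remain vendor-reported results on vendor-defined benchmarks.

```python
# Scores (percent) copied from the homepage figures quoted above.
RESULTS = {
    "Niche-Bench": {"Lumen Outpost": 53.9, "GPT-5.4": 44.9, "Gemini 3.1 Pro": 42.4},
    "Vibe-Bench": {"Lumen Outpost": 31.9, "GPT-5.5": 29.4, "GPT-5.4": 27.9},
}


def margins(benchmark, model="Lumen Outpost"):
    """Return the lead of `model` over every other model, in percentage points."""
    scores = RESULTS[benchmark]
    return {
        other: round(scores[model] - score, 1)
        for other, score in scores.items()
        if other != model
    }


for name in RESULTS:
    print(name, margins(name))
```

On these figures the Niche-Bench lead (9.0 and 11.5 points) is considerably wider than the Vibe-Bench lead (2.5 and 4.0 points).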

Enterprise Deployment Options

  1. Public cloud (standard): Fully managed, fastest onboarding
  2. Dedicated tenant (managed single-tenant): Private environment, enterprise controls
  3. Air-gapped (fully isolated): Runs inside customer's security perimeter, no external dependencies

02

Architecture

Cosine — Architecture

Distribution

  • Type: SaaS + desktop app + CLI (proprietary)
  • Models: Lumen family (proprietary, first-party)
  • Closed-source — no public GitHub repository

Surfaces (3)

  1. Cosine Desktop — native desktop app (macOS, Windows, Linux)

    • Multi-window workflows
    • Visible agent state
    • Reviewable changes
    • Bridge to cloud execution
  2. Cosine Cloud — cloud execution backend

    • Run work in parallel
    • Long-running tasks
    • Shared workspace for teams (engineers, PMs, stakeholders)
    • Plan/review/ship interface
  3. Cosine CLI (cos binary)

    • Install: brew install CosineAI/tap/cos (macOS)
    • Also: Linux and Windows installers
    • Terminal-native workflow
    • Local-to-remote execution handoff
    • MCP tool access from terminal
    • Multi-agent orchestration from CLI

Enterprise Architecture

  • Public cloud: Standard managed service
  • Dedicated tenant: Private cloud environment with enterprise controls
  • Air-gapped: Customer's own infrastructure, no external dependencies, option to fine-tune/post-train on internal codebases

Lumen Model Family

  • "Production-first coding models built for real engineering work"
  • 8-step data pipeline generating verifiable training trajectories
  • Behavioral RL training
  • "Lumen Outpost" is the model variant benchmarked publicly

Config Files

Unknown — closed-source, no public documentation of file conventions

03

Components

Cosine — Components

Note on Limited Public Documentation

Cosine is closed-source with no public SDK or API documentation (as of analysis date). The following is derived from the public marketing site.

CLI Tool (cos)

  • Binary: cos
  • Install: brew install CosineAI/tap/cos
  • Capabilities (from marketing): local-to-remote execution, MCP tool access, project context, multi-agent orchestration
  • Subcommands: unknown
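
Because no subcommands are documented, the only thing that can be checked safely is whether the binary is installed. The sketch below does that with Python's standard `shutil.which`; it never invokes `cos` and assumes nothing about its flags. The `find_cli` helper name is illustrative.

```python
import shutil
from typing import Optional


def find_cli(binary: str = "cos") -> Optional[str]:
    """Return the absolute path of `binary` if it is on PATH, else None."""
    return shutil.which(binary)


path = find_cli()
if path:
    print(f"cos found at {path}")
else:
    # Install command as quoted from the marketing site (macOS / Homebrew).
    print("cos not found; documented install: brew install CosineAI/tap/cos")
```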

Desktop Application

  • Available for macOS (download from cosine.sh), Windows, Linux
  • Features (from marketing):
    • Multi-window workflows for active engineering
    • Visible agent state (transparency during execution)
    • Reviewable changes (diff view)
    • Bridge to cloud execution (hand off local work to cloud)

Cloud Platform

  • Session types: Parallel sessions for independent tasks
  • Collaboration: Shared workspace for engineers, PMs, stakeholders
  • Long-running tasks: Cloud keeps going when local session ends

Lumen Models

  • Lumen Outpost — the benchmarked variant (may be the production model name)
  • Training: 8-step data pipeline, behavioral RL
  • Specializations: niche languages (C, R, Matlab, Fortran, Verilog, Rust)

MCP Integration

  • CLI supports MCP tool access (mentioned in marketing copy)
  • Unknown which MCP servers are bundled vs. user-provided

VS Code Extension

  • A VS Code extension is mentioned in the marketing copy under "Work where you already are"
  • No public extension listing found

05

Prompts

Cosine — Prompts

Note

Cosine is closed-source. No internal system prompts or configuration files are publicly available.

Model Training Philosophy (from the public homepage)

The following are the documented model training objectives for the Lumen family — these represent the closest public articulation of Cosine's "prompt engineering" at the model level:

From cosine.sh homepage:

"Eliminating Slop: Trained to avoid duplication, dead code, and unnecessary complexity. Optimized for minimal diffs that match your codebase's style and stay maintainable."

"Beyond Mainstream Languages: Post-trained for enterprise and niche languages including C, R, Matlab, Fortran, Verilog, and Rust. Our 8-step data pipeline turns real production code into verifiable training trajectories."

"Perfecting Vibe: Behavioral RL on scope discipline, honesty, evidence, and plan calibration. Less filler, more clarity, the judgment of a competent colleague."

Technique analysis: These read as behavioral RL training objectives rather than system prompt instructions. Cosine's stated approach is to train these properties into the model weights rather than prompt for them at inference time. The "judgment of a competent colleague" framing suggests persona-level behavioral training.

Benchmark Methodology (proxies for prompting priorities)

The existence of proprietary benchmarks (Niche-Bench, Vibe-Bench, Slop-Bench) suggests Cosine uses these as training signal proxies. The "Cost per successful task" metric implies an efficiency-optimized evaluation criterion rather than pure quality maximization.
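
Cosine does not publish a formula for "Cost per successful task". One common reading, assumed here, is total spend across all attempts divided by the number of attempts that succeeded, so failed runs raise the figure. The sketch below illustrates that reading only. The `TaskRun` fields and the sample costs are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaskRun:
    cost_usd: float   # hypothetical: total spend for this attempt
    succeeded: bool   # hypothetical: whether the attempt passed its checks


def cost_per_successful_task(runs: List[TaskRun]) -> float:
    """Total spend over all runs divided by the number of successful runs.

    Failed runs still add to the numerator, so an unreliable model is
    penalized even when each individual attempt is cheap.
    """
    successes = sum(1 for run in runs if run.succeeded)
    if successes == 0:
        raise ValueError("no successful tasks; metric is undefined")
    return sum(run.cost_usd for run in runs) / successes


# Made-up sample: three attempts, two of which succeed.
runs = [TaskRun(0.40, True), TaskRun(0.60, False), TaskRun(0.50, True)]
print(cost_per_successful_task(runs))
```

Under this definition a cheaper model with a lower success rate can still come out more expensive per delivered task, which is why the metric differs from raw per-token pricing.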

09

Uniqueness

Cosine — Uniqueness

Differs From Seeds

Closest to kiro (both closed-source, commercial IDE/agent products), but Cosine's distinguishing architectural bet is having first-party proprietary models (Lumen family) rather than routing to OpenAI/Anthropic/Google. No seed framework builds its own models. The "production-first" training philosophy (minimal diffs, slop elimination, niche language post-training) is a model-level differentiation that no framework in the corpus — open or closed — attempts. The three-surface strategy (CLI + Desktop + Cloud with local→cloud handoff) is similar to Factory Droid but Cosine adds the proprietary model layer.

Distinctive Opinion

The copilot era is over. Production-grade AI coding requires models trained specifically on real engineering outcomes (maintainability, niche languages, scope discipline) rather than benchmark performance on mainstream tasks.

Positioning

  • UK sovereign AI strategy backing — unique government relationship in the corpus
  • Primary competitors: Devin, Factory Droid (cloud coding agents), Cursor (IDE)
  • Key differentiator vs. all others: first-party Lumen models trained on production-code trajectories

Observable Failure Modes (implied from positioning)

  • Closed-source model dependency — customers can't audit model behavior
  • "Air-gapped" deployment requires Cosine to maintain separate model infrastructure per customer
  • Niche language benchmarks may not generalize to all enterprise codebases

Cross-References

  • Factory Droid — similar three-surface strategy (CLI + desktop app + cloud)
  • Devin — similar cloud execution model for long-running tasks
  • UK Government sovereign AI strategy — analogous government backing to US DoD/DARPA programs

04

Workflow

Cosine — Workflow

High-Level Workflow (from marketing)

1. Desktop: Start session in Cosine app
   - Multi-window for active engineering
   - Agent state visible during execution
   - Review changes before applying
   
2. Cloud: Hand off to parallel cloud execution
   - Long-running tasks continue in cloud
   - Team collaborates on plan/review/ship
   - Multiple parallel sessions

3. CLI: Terminal-native workflow
   - cos <task>
   - Local execution with remote handoff option
   - MCP tools available from terminal

Planning Workflow

  • Cosine appears to have a collaborative planning feature where team members (engineers, PMs, stakeholders) can view and coordinate on work in the cloud workspace

Approval Gates

Unknown; not publicly documented. Marketing copy mentions "reviewable changes" in Desktop, which suggests a human review step.

Typical Use Cases (from marketing)

  • "Hand off complex coding tasks without sacrificing maintainability or visibility"
  • Task delegation for multi-session parallel work
  • Long-running migrations/refactors handed to cloud

Session Handoff

  • Local (Desktop) → Cloud handoff for long-running work
  • CLI → Cloud handoff implied

06

Memory Context

Cosine — Memory & Context

Unknown — Cosine is closed-source and has not published documentation about its context management, memory, or state storage mechanisms.

Inferred from Marketing

  • Project context: The CLI is described as providing "project context" from the terminal, suggesting some form of codebase indexing
  • Cloud persistence: Long-running cloud sessions imply session state is persisted server-side
  • Team collaboration: Shared cloud workspace implies some shared state/context for team members

Cross-Session Handoff

Undocumented, but likely supported: marketing copy says cloud execution persists beyond the local session, though no technical documentation confirms the mechanism.

07

Orchestration

Cosine — Orchestration

Largely unknown — closed-source.

Inferred from Marketing Copy

  • Parallel execution: "Run work in parallel" in Cloud description implies multi-agent or multi-session parallel dispatch
  • Local→Cloud handoff: Sessions can be handed off from Desktop/CLI to Cloud for long-running execution
  • Multi-agent orchestration: CLI marketing mentions "multi-agent orchestration from the command line"
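
Since the dispatch mechanism is unpublished, the following is only a generic sketch of what "parallel sessions for independent tasks" usually means: independent units of work submitted to a bounded worker pool, with results collected in submission order. It uses Python's standard `concurrent.futures`. The names `run_session` and `dispatch` are hypothetical, and nothing here reflects Cosine's actual implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def run_session(task: str) -> str:
    """Stand-in for one long-running agent session (hypothetical)."""
    return f"done: {task}"


def dispatch(tasks: List[str], max_parallel: int = 4) -> List[str]:
    """Run independent tasks concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run_session, tasks))


print(dispatch(["migrate module A", "refactor module B"]))
```

The sketch assumes the tasks share no state. Sessions that touch the same files would need isolation, which the Isolation entry below lists as unknown.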

Execution Mode

Interactive-loop + background-daemon (inferred) — Desktop is interactive; Cloud supports long-running background tasks

Isolation

Unknown — likely cloud sandboxing for cloud execution; local for Desktop/CLI

08

UI CLI Surface

Cosine — UI & CLI Surface

CLI (cos)

  • Binary: cos
  • Install: brew install CosineAI/tap/cos (macOS)
  • Windows/Linux: Separate installers (docs at cosine.sh)
  • Capabilities: Terminal-native workflow, local-to-remote handoff, MCP tools, multi-agent orchestration

Desktop Application

  • Platform: macOS, Windows, Linux
  • Download: cosine.sh (direct download)
  • Features: Multi-window workflows, visible agent state, reviewable changes, cloud bridge

Cloud Web Interface

  • URL: cosine.sh (login at cloud endpoint)
  • Features: Parallel sessions, team collaboration (engineers + PMs + stakeholders), plan/review/ship

VS Code Extension

  • Mentioned as "work where you already are" surface
  • No public extension listing confirmed

Documentation

  • Docs URL: cosine.sh/docs (mentioned in marketing)
  • Content not publicly accessible without login
