Open Cowork

open-cowork · OpenCoworkAI/open-cowork · ★ 1.4k · last commit 2026-05-25

Primitive shape 5 total

Skills 5

Summary

Open Cowork — Summary

Open Cowork is a free, open-source AI agent desktop application for Windows and macOS that wraps Claude Code (and other AI models) in a GUI with one-click installation, so no coding is required. It is a community-maintained, open-source counterpart to Anthropic's Claude Cowork product. Key capabilities: VM-level sandbox isolation (WSL2 on Windows, Lima on macOS), a built-in Skills system for PPTX/DOCX/XLSX/PDF document generation, MCP integration for browser/Notion/desktop app connectivity, GUI automation via computer use, and remote control via Feishu (Lark) and Slack messaging bridges. Built on Electron + React/TypeScript, it ships as signed installers for both platforms (or via Homebrew on macOS). The .claude/skills/ directory ships five ready-made skills (pptx, docx, pdf, xlsx, skill-creator), each packaged as a SKILL.md with support scripts under a proprietary license.

differs_from_seeds: Open Cowork is closest to agent-os in that it wraps an existing AI coding tool (Claude Code) for non-technical users, but it goes well beyond agent-os's markdown-scaffold approach. It is a full Electron desktop application with a Feishu/Slack remote control bridge, VM isolation, and a Skills library with embedded document-generation toolchains. Unlike the seeds, Open Cowork targets knowledge workers (document creation, folder organization) rather than software developers. Its distinctive feature is the Feishu/Slack remote control bridge: no seed offers an equivalent way to assign tasks asynchronously from a mobile device.

Overview

Open Cowork — Overview

Origin

Open Cowork is maintained by OpenCoworkAI on GitHub. It emerged as an open-source counterpart to Anthropic's Claude Cowork product, a desktop AI agent that works on local files and apps. The project targets non-technical users on Windows and macOS who want AI assistance without any CLI or coding knowledge.

Philosophy

From the README:

"Open Cowork is a free, open-source AI agent desktop application for Windows and macOS. It wraps Claude Code, OpenAI, Gemini, DeepSeek, and other AI models into a user-friendly GUI with one-click installation — no coding required."

Key positioning versus named competitors:

Product	MCP & Skills	Remote Control	GUI Operation
Claude Cowork	✓	✗	✗
OpenClaw	✓	✓	✗
Open Cowork	✓	✓	✓

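The distinctive claim is that Open Cowork is the only one of the three to offer all three capabilities. Per the table, it adds remote control via IM (Feishu/Lark, Slack) and GUI automation (computer use) to what Claude Cowork offers, and GUI automation to what OpenClaw offers.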

Target Users

Knowledge workers, not developers:

Folder organization and cleanup
Generate PPT/Word/Excel from existing files
GUI automation of desktop apps
Remote task assignment via Feishu/Slack from mobile

Security Philosophy

Sandbox is multi-level:

Basic: Path Guard — file ops restricted to chosen workspace folder
Enhanced (Windows): WSL2 — all Bash commands run in an isolated Linux VM
Enhanced (macOS): Lima — commands run in Ubuntu VM with /Users mounted
Fallback to native with path restrictions if no VM is available
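
The Path Guard idea (file operations confined to the chosen workspace folder) can be illustrated with a short Python sketch. This is not Open Cowork's implementation; the function names and the choice to raise `PermissionError` are assumptions made for illustration. It only shows the general technique of resolving a path and checking that it stays under the workspace root.

```python
from pathlib import Path


def resolve_in_workspace(workspace: Path, candidate: Path) -> Path:
    """Resolve candidate against the workspace and reject anything outside it.

    Hypothetical helper: the name and the error type are illustrative only.
    """
    root = workspace.resolve()
    # Joining with an absolute candidate yields the candidate itself;
    # resolve() then normalises ".." segments and follows symlinks.
    target = (root / candidate).resolve()
    if target != root and root not in target.parents:
        raise PermissionError(f"{candidate} is outside the workspace")
    return target


def is_inside_workspace(workspace: Path, candidate: Path) -> bool:
    """Boolean convenience wrapper around resolve_in_workspace."""
    try:
        resolve_in_workspace(workspace, candidate)
    except PermissionError:
        return False
    return True
```

Comparing resolved paths by ancestry, rather than by string prefix, avoids treating a sibling folder such as `workspace2` as part of `workspace`.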

Model Support

Multi-model: Claude (Anthropic), OpenAI-compatible APIs, GLM (Zhipu), MiniMax, Kimi — configured via API key + base URL + model name in Settings.

Architecture

Open Cowork — Architecture

Distribution

Type: Desktop application (Electron)
Platforms: Windows (.exe), macOS (.dmg, Apple Silicon + Intel)
License: MIT
Install options:
- macOS: brew tap OpenCoworkAI/tap && brew install --cask --no-quarantine open-cowork
- Direct installer download from GitHub Releases
- Build from source: npm install && npm run rebuild && npm run dev

Technology Stack

Frontend: Electron + React + TypeScript + Vite + Tailwind CSS
Backend: Embedded Node.js process (Electron main process)
AI layer: Wraps Claude Code CLI (primary) + direct API calls for other models
VM isolation: WSL2 (Windows) or Lima (macOS) — auto-detected, auto-managed
Skills: .claude/skills/<name>/SKILL.md format loaded by Claude Code

Directory Structure

open-cowork/
├── src/              # Electron main process + React renderer
├── .claude/          # Claude Code skills directory
│   └── skills/
│       ├── pptx/     # PowerPoint skill (html2pptx workflow)
│       ├── docx/     # Word document skill
│       ├── pdf/      # PDF generation/conversion
│       ├── xlsx/     # Excel spreadsheet skill
│       └── skill-creator/  # Skill for creating new skills
├── scripts/          # Build + install scripts
├── docs/             # Documentation
├── electron-builder.yml  # Desktop installer config
├── package.json      # npm dependencies
└── vite.config.ts    # Frontend build config

AI Model Routing

Claude Code (default): Anthropic, OpenRouter, Zhipu, MiniMax, Kimi
Direct API: OpenAI-compatible endpoints
GUI operation: Gemini-3-Pro recommended (superior computer-use capabilities)
Feishu/Lark bridge: asynchronous remote control

VM Isolation

Windows: WSL2 auto-detected; all Bash commands routed to Linux VM; workspace bidirectionally synced
macOS: Lima auto-detected; creates/manages claude-sandbox Lima VM; /Users mounted inside
Fallback: Native execution with path-based workspace restriction
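
The selection order described above (WSL2 on Windows, Lima on macOS, otherwise path-restricted native execution) can be sketched as a small decision function. The enum, the function names, and the PATH-based detection are assumptions for illustration. In particular, finding a `wsl` or `limactl` executable on PATH is only a rough heuristic and does not prove that a usable VM exists.

```python
import shutil
import sys
from enum import Enum


class Isolation(Enum):
    WSL2 = "wsl2"
    LIMA = "lima"
    PATH_GUARD = "path-guard"


def choose_isolation(platform: str, wsl_available: bool, lima_available: bool) -> Isolation:
    """Pick the strongest isolation mode the host supports.

    `platform` uses sys.platform-style values ("win32", "darwin").
    """
    if platform == "win32" and wsl_available:
        return Isolation.WSL2
    if platform == "darwin" and lima_available:
        return Isolation.LIMA
    # No VM backend detected: native execution with a workspace path restriction.
    return Isolation.PATH_GUARD


def detect_isolation() -> Isolation:
    """Heuristic detection for the current host (assumption: a CLI on PATH)."""
    return choose_isolation(
        sys.platform,
        shutil.which("wsl") is not None,
        shutil.which("limactl") is not None,
    )
```

Keeping the decision separate from the detection makes the fallback order easy to test without a Windows or macOS host.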

Components

Open Cowork — Components

Skills (`.claude/skills/`)

Five built-in skills, each a directory with a SKILL.md and support scripts:

Skill	Purpose
`pptx`	PowerPoint creation/editing via html2pptx pipeline + OOXML direct editing
`docx`	Word document creation and manipulation
`pdf`	PDF generation and conversion
`xlsx`	Excel spreadsheet creation and editing
`skill-creator`	Meta-skill for creating new custom skills

All skills ship with YAML frontmatter (name, description, license). The pptx skill includes:

html2pptx.md — full html2pptx workflow documentation
css.md — CSS styling reference for presentations
ooxml.md — OOXML direct editing for complex cases
ooxml/scripts/ — helper Python scripts (unpack.py, etc.)

Note: Skills are licensed as Proprietary (with full terms in LICENSE.txt per skill), not MIT.

MCP Connectors

Configured by user in Settings UI:

Browser connector
Notion connector
Custom third-party MCP servers

Remote Control Bridge

Feishu (Lark): Event subscription bridge — incoming messages trigger agent tasks
Slack: Remote control via Slack workspace messages

GUI Automation

Computer use via model inference (screenshot → click/type). Recommended model: Gemini-3-Pro.

Real-Time Trace Panel

Shows AI reasoning and tool execution live in the UI sidebar.

Settings UI

Graphical configuration for:

API key + base URL + model
Workspace folder selection
MCP connectors
VM isolation mode
Skills management (custom skill creation/deletion)

Commands

No slash-commands. The interface is chat-based (user types natural language prompts).

Hooks

No Claude Code lifecycle hooks configured. The app wraps Claude Code but does not inject settings.json hooks.

Scripts

Build scripts in scripts/ (npm build, installer packaging). No agent-lifecycle scripts.

Prompts

Open Cowork — Prompts

Prompt Architecture

Open Cowork's behavioral "prompts" live in the SKILL.md files under .claude/skills/. These are loaded by Claude Code when a task matches the skill's trigger conditions.

Verbatim Excerpt 1 — PPTX SKILL.md Header

---
name: pptx
description: "Presentation creation, editing, and analysis. When Claude needs to work with presentations (.pptx files) for: (1) Creating new presentations, (2) Modifying or editing content, (3) Working with layouts, (4) Adding comments or speaker notes, or any other presentation tasks"
license: Proprietary. LICENSE.txt has complete terms
---

# PPTX creation, editing, and analysis

## Overview

Create, edit, or analyze the contents of .pptx files when requested. A .pptx file is essentially a ZIP archive containing XML files and other resources. Different tools and workflows are available for different tasks.

## CRITICAL: Read All Documentation First

**Before starting any presentation task**, read ALL relevant documentation files completely to understand the full workflow:

1. **For creating new presentations**: Read [`html2pptx.md`](html2pptx.md) and [`css.md`](css.md) in their entirety
2. **For editing existing presentations**: Read [`ooxml.md`](ooxml.md) in its entirety
3. **For template-based creation**: Read the relevant sections of this file plus [`css.md`](css.md)

**NEVER set any range limits when reading these files.** Understanding the complete workflow, constraints, and best practices before starting is essential for producing high-quality presentations. Partial knowledge leads to errors, inconsistent styling, and visual defects that require rework.

Technique: Front-loads a hard "CRITICAL" gate before any action, forcing the agent to read the full documentation before proceeding. Similar to superpowers' "Iron Law" pattern, the gate pre-empts the agent's likely rationalization for skimming by stating the cost up front: "Partial knowledge leads to errors, inconsistent styling, and visual defects that require rework".

Verbatim Excerpt 2 — PPTX XML Access Instructions

## Raw XML access

Use raw XML access for: comments, speaker notes, slide layouts, animations, design elements, and complex formatting. To access these features, unpack a presentation and read its raw XML contents.

#### Unpacking a file

`python3 ooxml/scripts/unpack.py <office_file> <output_dir>`

#### Key file structures

- `ppt/presentation.xml` - Main presentation metadata and slide references
- `ppt/slides/slide{N}.xml` - Individual slide contents
- `ppt/notesSlides/notesSlide{N}.xml` - Speaker notes for each slide
- `ppt/slideLayouts/` - Layout templates for slides
- `ppt/theme/theme1.xml` - Theme and styling information

Technique: Tool-path specification. The skill tells the model exactly which script to call and which file paths to inspect, turning an ambiguous "edit this presentation" request into a repeatable, step-by-step workflow. Spelling out tools and paths at this level is characteristic of production-grade skill design.
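
Because a .pptx file is a ZIP archive, the file layout listed in the excerpt can be inspected with the Python standard library alone. The sketch below is not the skill's `unpack.py`, whose contents are not shown here. `list_slide_parts` is a hypothetical helper that only lists the `ppt/slides/slide{N}.xml` parts.

```python
import re
import zipfile
from typing import BinaryIO, List, Union

# Matches slide parts such as "ppt/slides/slide12.xml".
SLIDE_PART = re.compile(r"^ppt/slides/slide(\d+)\.xml$")


def list_slide_parts(pptx: Union[str, BinaryIO]) -> List[str]:
    """Return slide XML part names sorted by the number in the file name.

    File-name order is not necessarily presentation order, which is
    recorded in ppt/presentation.xml.
    """
    with zipfile.ZipFile(pptx) as archive:
        numbered = []
        for name in archive.namelist():
            match = SLIDE_PART.match(name)
            if match:
                numbered.append((int(match.group(1)), name))
    return [name for _, name in sorted(numbered)]
```

Sorting on the parsed integer keeps `slide10.xml` after `slide2.xml`, which a plain string sort would not.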

Skill Trigger

Skills are triggered by Claude Code's autonomous skill-matching — when the task description matches the SKILL.md description field, Claude Code reads the full SKILL.md and follows its instructions. No explicit /pptx command required.
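
The matching itself is done by the model, so it cannot be reduced to code. What can be sketched is the metadata a host would gather for it: the `name` and `description` fields from each SKILL.md. The parser below is deliberately minimal and is an assumption, not Claude Code's loader. It handles only flat `key: value` frontmatter like the PPTX header shown earlier, not general YAML.

```python
from pathlib import Path
from typing import Dict, List


def parse_frontmatter(text: str) -> Dict[str, str]:
    """Parse flat `key: value` lines between the leading '---' markers.

    Minimal on purpose: no nested YAML, lists, or multi-line values.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip('"')
    return {}  # closing '---' was never found


def discover_skills(skills_root: Path) -> List[Dict[str, str]]:
    """Collect name/description pairs from <skills_root>/<name>/SKILL.md."""
    catalogue = []
    for skill_file in sorted(skills_root.glob("*/SKILL.md")):
        meta = parse_frontmatter(skill_file.read_text(encoding="utf-8"))
        if "name" in meta and "description" in meta:
            catalogue.append({"name": meta["name"], "description": meta["description"]})
    return catalogue
```

For example, `discover_skills(Path(".claude/skills"))` would return one entry per skill directory that has a readable SKILL.md with both fields.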

Uniqueness

Open Cowork — Uniqueness

differs_from_seeds

Open Cowork is closest to agent-os in that it wraps Claude Code for less-technical users with an install-and-go philosophy. However, agent-os is a bare bash script that installs markdown templates, while Open Cowork is a full Electron desktop application with VM isolation, a visual Trace Panel, and an IM remote control bridge. The Feishu/Slack remote control bridge has no counterpart in any seed: Open Cowork is the only entry in this batch or among the seeds that lets users assign tasks from a mobile IM app and receive results without being at their desktop. The proprietary-licensed SKILL.md packages (pptx, docx, pdf, xlsx) form a production-quality document generation toolkit that no seed provides. Compared to superpowers (a skills-only framework injected via a Claude Code plugin), Open Cowork adds a full GUI layer, VM isolation, and IM bridges, none of which superpowers has. Open Cowork is also the only framework in this batch explicitly targeting knowledge workers (PPT creation, folder cleanup) rather than software developers.

Positioning

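Open Cowork positions itself explicitly as the open-source Claude Cowork plus the two features its README comparison table marks as missing from Claude Cowork: remote IM control and GUI automation (computer use). It also bundles a Skills library, although the same table credits Claude Cowork with MCP and Skills support.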

Observable Failure Modes

Proprietary skill licenses: The PPTX/DOCX/XLSX/PDF skills ship with proprietary licenses (not MIT). Users cannot freely modify or redistribute the tool scripts.
VM dependency for sandboxing: Meaningful isolation requires WSL2 or Lima — not available by default on all machines, especially enterprise-locked Windows environments.
Single-model per session: Model selection is global, not per-task. Using Gemini for GUI work and Claude for coding requires switching settings manually.
No session persistence: No JSONL session store. Closing the app loses conversation history.
Feishu/Slack bridge complexity: The README documents the feature but setup requires external service configuration (bot tokens, event subscriptions) with no guided wizard in the current version.

Workflow

Open Cowork — Workflow

User Workflow

Step	Description	Artifact
1. Install	Download installer or brew install	Desktop app running
2. Configure	Enter API key, model, base URL in Settings	Config saved
3. Select workspace	Choose folder where AI is allowed to work	Workspace path
4. Prompt	Type task in chat input (can drag & drop files/images)	Task message
5. Agent executes	Claude Code (or chosen model) runs with selected tools	Tool call trace
6. VM isolation	Bash commands routed through WSL2/Lima if available	Isolated execution
7. Result	Files written to workspace, document downloaded, or remote reply sent	Output files

Remote Control Workflow (Feishu/Slack)

1. Connect Feishu/Lark or Slack bridge in Settings
2. Send message from mobile/web to bot
3. Agent processes task asynchronously
4. Reply delivered back to IM channel
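
The message-in, reply-out loop above can be sketched with an asyncio queue. Everything named here is hypothetical: `IncomingMessage`, `run_bridge`, and the `run_agent` and `send_reply` callbacks stand in for the real Feishu or Slack client and the agent session, neither of which is documented in this entry.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass
class IncomingMessage:
    channel: str  # hypothetical IM channel identifier
    text: str     # the task, as typed by the user


async def run_bridge(
    inbox: asyncio.Queue,
    run_agent: Callable[[str], Awaitable[str]],
    send_reply: Callable[[str, str], Awaitable[None]],
) -> None:
    """Process queued messages one at a time until a None sentinel arrives."""
    while True:
        message = await inbox.get()
        if message is None:
            break
        try:
            result = await run_agent(message.text)
        except Exception as exc:
            # Report failures to the sender instead of stopping the bridge.
            result = f"Task failed: {exc}"
        await send_reply(message.channel, result)
```

Processing one message at a time mirrors the single-agent, sequential execution model described under Orchestration. A real bridge would also need authentication and event verification, which this sketch omits.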

Document Generation Workflow (via PPTX skill)

1. User asks: "Create a PowerPoint from my financial_report.csv"
2. Agent triggers the PPTX skill (matched by the SKILL.md description field)
3. The skill's html2pptx pipeline runs: CSV → HTML → PPTX via the skill's support scripts
4. PPTX file written to workspace
5. User downloads file from UI
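
As an illustration of the first hop only (CSV → HTML), the sketch below renders CSV text as an HTML table using the Python standard library. It is an assumption about what such an intermediate step could look like, not the skill's html2pptx code, and `csv_to_html_table` is a hypothetical name. The HTML → PPTX conversion is not reproduced here.

```python
import csv
import html
import io


def csv_to_html_table(csv_text: str) -> str:
    """Render CSV text as an HTML table; the first row becomes the header."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return "<table></table>"
    header, *body = rows
    # Escape every cell so data containing "<" or "&" cannot break the markup.
    head = "".join(f"<th>{html.escape(cell)}</th>" for cell in header)
    lines = [f"<tr>{head}</tr>"]
    for row in body:
        cells = "".join(f"<td>{html.escape(cell)}</td>" for cell in row)
        lines.append(f"<tr>{cells}</tr>")
    return "<table>\n" + "\n".join(lines) + "\n</table>"
```

In practice the agent would usually design slide-specific HTML rather than a raw table, so this shows the data flow rather than the styling.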

Approval Gates

No formal approval gates. The chat interface is the only interaction point. VM isolation provides a safety boundary instead of user approval prompts.

Spec Format

None. Tasks are natural language chat messages, not structured specs.

Memory & Context

Open Cowork — Memory & Context

Memory Type

File-based — workspace folder is the persistent state. Claude Code's own context management handles in-session memory.

Context Compaction

Handled by Claude Code's built-in compaction (if using Claude Code backend) or by the model's native context limits (if using direct API mode).

Cross-Session State

Files written to the workspace persist across sessions
No framework-level session store (no JSONL sessions, no SQLite)
The IM bridge (Feishu/Slack) creates conversational history within the IM platform

Remote Context Injection

Open Cowork injects workspace context into prompts via the Claude Code CLAUDE.md mechanism. The skills system adds skill-specific instructions to the context when triggered.

Real-Time Trace

The Trace Panel in the UI shows live reasoning and tool execution. This is observability, not memory — not persisted.

No Explicit Memory Architecture

Open Cowork does not ship a memory system. It relies entirely on:

Files in the workspace folder for persistent state
The IM platform's conversation history for cross-device continuity
Claude Code's native context management for in-session state

Orchestration

Open Cowork — Orchestration

Multi-Agent Support

No explicit multi-agent orchestration. Open Cowork is a single-agent desktop application — one Claude Code session per workspace.

Execution Mode

Interactive loop — the user sends a message, the agent responds. This is not a background daemon; the agent runs while the app is open and a task is active.

Isolation Mechanism

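VM (virtual machine): WSL2 on Windows, Lima on macOS. All Bash commands execute inside an isolated Linux VM. This is stronger than a git worktree or a sandbox API: commands run in a separate Linux guest rather than as processes on the host OS, although the workspace itself remains shared with the guest (synced on Windows, mounted under /Users on macOS).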

Fallback: Path Guard (file operations restricted to workspace directory) when no VM is available.

Multi-Model

Yes. Different models can be configured for different use cases, with one active at a time:

Claude/Anthropic: coding, general tasks
GLM/MiniMax/Kimi: cost-optimized Chinese models
Gemini-3-Pro: recommended for GUI automation (computer use)
OpenAI-compatible: fallback

Model selection is per-session in Settings, not per-task.

Orchestration Pattern

None for multi-agent. Single agent, sequential tool execution.

Remote Control Bridge

The Feishu/Slack bridge is event-driven: incoming IM messages trigger agent sessions. For the remote control use case, the execution mode is therefore closer to event-driven than to the interactive loop described above.

Consensus Mechanism

None.

UI / CLI Surface

Open Cowork — UI / CLI Surface

Desktop Application

Full Electron desktop app — the primary surface.

Feature	Description
Chat interface	Natural language task entry with drag & drop file/image support
Real-time Trace Panel	Live view of AI reasoning and tool calls
Settings UI	API key, model, base URL, MCP connectors, workspace, VM mode
Skills management	Install/delete custom skills
MCP connector UI	Add/configure MCP servers (browser, Notion, custom)

Install

macOS: brew tap OpenCoworkAI/tap && brew install --cask --no-quarantine open-cowork
Windows/macOS: Direct .exe/.dmg installer from GitHub Releases
Build from source: npm install && npm run rebuild && npm run dev

No CLI Binary

No dedicated CLI binary. The app is GUI-first.

Remote Control Surfaces

Feishu (Lark): IM bridge — assign tasks via Feishu messages, receive results back
Slack: Same pattern via Slack workspace integration

Port / Web Dashboard

No local web dashboard. The Electron app is the full UI. There is no localhost:NNNN web surface.

Multimodal Input

Drag and drop files and images directly into chat. Vision-capable models (Gemini-3-Pro, Claude 3.x) can process images.

Build System

Electron + Vite + React + TypeScript
electron-builder.yml for signed platform installers
npm run build produces .exe (Windows) and .dmg (macOS)

Related frameworks

same archetype · same primary tool · same memory type

Goose (Block/AAIF) ★ 46k

A12 UI passthrough

General-purpose AI agent (not just code) with security-first tool inspection, recipe-based shareable configurations, and 15+ LLM…

Vibe Kanban ★ 27k

A12 UI passthrough

Eliminate the overhead of planning, switching between agent terminals, and reviewing diffs by providing a single web dashboard…

1Code ★ 5.5k

A12 UI passthrough

Cursor-like desktop experience for Claude Code and Codex with cloud background agents, event-driven automations, and a full…

Crystal (stravu) ★ 3.1k

A12 UI passthrough

Manage multiple parallel AI coding sessions in isolated git worktrees from a single desktop GUI.

Maestro (RunMaestro) ★ 3.0k

A12 UI passthrough

Orchestrate unlimited parallel AI agent sessions with a keyboard-first desktop app including Group Chat coordination and Auto Run…

AgentsMesh ★ 2.1k

A12 UI passthrough

Multi-tenant workforce platform that gives every team member a squad of AI coding agents coordinated through channels, pod…

Distribution

Type: desktop-app
License: MIT
Install: one-liner
Version: unknown (active development, no semver tag in README)

Surfaces

CLI binary: No
CLI subcmds: 0
Local UI: desktop-app
Tech stack: Electron, React, TypeScript, Vite, Tailwind CSS

Components

Commands: 0
Skills: 5
Subagents: 0
Hooks: 0
MCP servers: 0
MCP tools: 0
Scripts: 3
Templates: 5

Workflow

Phases: 7
Approval gates: 0
Spec format: none
Spec storage: none
Delta or full: none

Orchestration

Multi-agent: No
Pattern: none
Max concurrent: 1
Isolation: microvm
Consensus: none
Prompt chaining: No

Multi-model

Multi-model: Yes
BYOK: Yes
Modal: text+vision

Execution

Mode: interactive-loop
Crash recovery: No
Compaction: Yes
Session handoff: No
Streaming: Yes

Memory

Type: file-based
Persistence: project
Search: none
State files: 1 file

Quality

TDD: No
TDD mechanism: none
Self-review: none

Git / Observability

Auto commit: No
Auto PR: No
Auto merge: No
Worktree/feat: No
Audit log: No
Audit format: none
Replay: No

Tools

Primary: claude-code
Targets: 6
Portability: medium

Signals

Stars: 1.4k
Last commit: 2026-05-25
Maintainer: active
Quality score: 2.5/10
