Open Cowork

open-cowork · OpenCoworkAI/open-cowork · ★ 1.4k · last commit 2026-05-25

Primitive shape 5 total
Skills 5
00

Summary

Open Cowork — Summary

Open Cowork is a free, open-source AI agent desktop application for Windows and macOS that wraps Claude Code (and other AI models) in a user-friendly GUI with one-click installation — no coding required. It is the community-maintained open-source implementation of Anthropic's Claude Cowork product. Key capabilities: VM-level sandbox isolation (WSL2 on Windows, Lima on macOS), a built-in Skills system for PPTX/DOCX/XLSX/PDF document generation, MCP integration for browser/Notion/desktop app connectivity, GUI automation via computer use, and remote control via Feishu (Lark) and Slack messaging bridges. Built on Electron + React/TypeScript, it ships as signed installers for both platforms (or via Homebrew on macOS). The .claude/skills/ directory ships five ready-made skills (pptx, docx, pdf, xlsx, skill-creator) each packaged as a SKILL.md with proprietary-licensed support scripts.

differs_from_seeds: Open Cowork is closest to agent-os in that it wraps an existing AI coding tool (Claude Code) for non-technical users, but Open Cowork goes far beyond agent-os's markdown-scaffold approach. It is a full Electron desktop application with a Feishu/Slack remote control bridge, VM isolation, and a Skills library with embedded document-generation tool chains. Unlike all seeds, Open Cowork targets knowledge workers (document creation, folder organization) rather than software developers; its distinctive position is the Slack/IM remote control bridge that makes it the only seed-era framework enabling asynchronous mobile tasking.

01

Overview

Open Cowork — Overview

Origin

Open Cowork is maintained by OpenCoworkAI on GitHub. It emerged as the open-source implementation of Anthropic's Claude Cowork product — a desktop AI agent that works on local files and apps. The project targets non-technical users on Windows and macOS who want AI assistance without any CLI or coding knowledge.

Philosophy

From the README:

"Open Cowork is a free, open-source AI agent desktop application for Windows and macOS. It wraps Claude Code, OpenAI, Gemini, DeepSeek, and other AI models into a user-friendly GUI with one-click installation — no coding required."

Key positioning versus named competitors:

Comparison columns: MCP & Skills | Remote Control | GUI Operation
Products compared: Claude Cowork, OpenClaw, OpenCowork

The distinctive claim is that Open Cowork adds GUI automation (computer use) and remote control via IM (Feishu/Lark, Slack) on top of what Claude Cowork and OpenClaw offer.

Target Users

Knowledge workers, not developers:

  • Folder organization and cleanup
  • Generate PPT/Word/Excel from existing files
  • GUI automation of desktop apps
  • Remote task assignment via Feishu/Slack from mobile

Security Philosophy

The sandbox has a basic level and an enhanced level, with a fallback between them:

  1. Basic: Path Guard. File operations are restricted to the chosen workspace folder.
  2. Enhanced (Windows): WSL2. All Bash commands run in an isolated Linux VM.
  3. Enhanced (macOS): Lima. Commands run in an Ubuntu VM with /Users mounted.
  4. Fallback: if no VM is available, commands run natively under the Path Guard restrictions.
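
The Path Guard level can be pictured as a containment check on every file path before an operation runs. The sketch below is illustrative only: the function name `is_inside_workspace` is hypothetical, and the source does not show how Open Cowork actually implements the check.

```python
from pathlib import Path


def is_inside_workspace(workspace: str, target: str) -> bool:
    """Return True if target resolves to a location inside workspace."""
    root = Path(workspace).resolve()
    # Relative targets are joined to the workspace; resolve() then
    # normalises ".." segments and follows symlinks where they exist.
    candidate = (root / target).resolve()
    return candidate == root or root in candidate.parents


print(is_inside_workspace("/tmp/work", "notes/report.docx"))
print(is_inside_workspace("/tmp/work", "../secrets.txt"))
```

Resolving before comparing is the important design choice here: a plain string-prefix test would accept paths such as `/tmp/work/../secrets.txt`.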

Model Support

Multi-model: Claude (Anthropic), OpenAI-compatible APIs, GLM (Zhipu), MiniMax, Kimi — configured via API key + base URL + model name in Settings.

02

Architecture

Open Cowork — Architecture

Distribution

  • Type: Desktop application (Electron)
  • Platforms: Windows (.exe), macOS (.dmg, Apple Silicon + Intel)
  • License: MIT
  • Install options:
    • macOS: brew tap OpenCoworkAI/tap && brew install --cask --no-quarantine open-cowork
    • Direct installer download from GitHub Releases
    • Build from source: npm install && npm run rebuild && npm run dev

Technology Stack

  • Frontend: Electron + React + TypeScript + Vite + Tailwind CSS
  • Backend: Embedded Node.js process (Electron main process)
  • AI layer: Wraps Claude Code CLI (primary) + direct API calls for other models
  • VM isolation: WSL2 (Windows) or Lima (macOS) — auto-detected, auto-managed
  • Skills: .claude/skills/<name>/SKILL.md format loaded by Claude Code

Directory Structure

open-cowork/
├── src/              # Electron main process + React renderer
├── .claude/          # Claude Code skills directory
│   └── skills/
│       ├── pptx/     # PowerPoint skill (html2pptx workflow)
│       ├── docx/     # Word document skill
│       ├── pdf/      # PDF generation/conversion
│       ├── xlsx/     # Excel spreadsheet skill
│       └── skill-creator/  # Skill for creating new skills
├── scripts/          # Build + install scripts
├── docs/             # Documentation
├── electron-builder.yml  # Desktop installer config
├── package.json      # npm dependencies
└── vite.config.ts    # Frontend build config

AI Model Routing

  • Claude Code (default): Anthropic, OpenRouter, Zhipu, MiniMax, Kimi
  • Direct API: OpenAI-compatible endpoints
  • GUI operation: Gemini-3-Pro recommended (superior computer-use capabilities)
  • Feishu/Lark and Slack bridges: asynchronous remote control (a task entry point rather than a model route)
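
The routing list above amounts to a lookup from the configured provider to one of two backends. A minimal sketch, assuming hypothetical names (`ModelSettings`, `pick_route`, and the lowercase provider identifiers are not taken from the project):

```python
from dataclasses import dataclass

# Providers the text lists under the Claude Code route. Anything else is
# treated here as an OpenAI-compatible endpoint called directly (an assumption).
CLAUDE_CODE_PROVIDERS = {"anthropic", "openrouter", "zhipu", "minimax", "kimi"}


@dataclass(frozen=True)
class ModelSettings:
    provider: str  # hypothetical identifier, e.g. "anthropic"
    api_key: str
    base_url: str
    model: str


def pick_route(settings: ModelSettings) -> str:
    """Return 'claude-code' or 'direct-api' for the configured provider."""
    if settings.provider.lower() in CLAUDE_CODE_PROVIDERS:
        return "claude-code"
    return "direct-api"


settings = ModelSettings("Kimi", "sk-placeholder", "https://example.invalid/v1", "some-model")
print(pick_route(settings))
```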

VM Isolation

  • Windows: WSL2 auto-detected; all Bash commands routed to Linux VM; workspace bidirectionally synced
  • macOS: Lima auto-detected; creates/manages claude-sandbox Lima VM; /Users mounted inside
  • Fallback: Native execution with path-based workspace restriction
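
To make the three cases concrete, the sketch below picks a backend and builds the argument list that would run a Bash command on it. It is an assumption-laden illustration: the function names are hypothetical, detection by looking for `wsl.exe` or `limactl` on the PATH is a simplification (finding `wsl.exe` does not prove a WSL2 distribution is installed), and the app's real detection logic is not shown in the source. Only the instance name `claude-sandbox` comes from the text.

```python
import platform
import shutil
from typing import List, Optional


def detect_backend(system: Optional[str] = None) -> str:
    """Pick 'wsl2', 'lima', or 'native' from the host OS and tools on PATH."""
    system = system or platform.system()
    if system == "Windows" and shutil.which("wsl.exe"):
        return "wsl2"
    if system == "Darwin" and shutil.which("limactl"):
        return "lima"
    return "native"


def wrap_command(backend: str, command: str, instance: str = "claude-sandbox") -> List[str]:
    """Build the argv that would run a Bash command on the chosen backend."""
    if backend == "wsl2":
        # `wsl.exe -e` executes the given command inside the default distribution.
        return ["wsl.exe", "-e", "bash", "-lc", command]
    if backend == "lima":
        # `limactl shell INSTANCE COMMAND...` runs a command inside a Lima VM.
        return ["limactl", "shell", instance, "bash", "-lc", command]
    # Native fallback: path restrictions would have to be enforced separately.
    return ["bash", "-lc", command]


print(wrap_command("lima", "ls -la"))
```

Returning an argv list rather than a shell string keeps the user's command as a single argument, which avoids a second round of quoting on the host.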

03

Components

Open Cowork — Components

Skills (.claude/skills/)

Five built-in skills, each a directory with a SKILL.md and support scripts:

Skill | Purpose
pptx | PowerPoint creation/editing via html2pptx pipeline + OOXML direct editing
docx | Word document creation and manipulation
pdf | PDF generation and conversion
xlsx | Excel spreadsheet creation and editing
skill-creator | Meta-skill for creating new custom skills

All skills ship with YAML frontmatter (name, description, license). The pptx skill includes:

  • html2pptx.md — full html2pptx workflow documentation
  • css.md — CSS styling reference for presentations
  • ooxml.md — OOXML direct editing for complex cases
  • ooxml/scripts/ — helper Python scripts (unpack.py, etc.)

Note: Skills are licensed as Proprietary (with full terms in LICENSE.txt per skill), not MIT.
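
For readers unfamiliar with the frontmatter convention, the sketch below pulls the three fields out of a SKILL.md header. It is a deliberately small stand-in that handles only flat `key: value` lines; a real YAML parser would be the safer choice, and `parse_frontmatter` is a hypothetical helper, not something the project ships.

```python
from typing import Dict


def parse_frontmatter(text: str) -> Dict[str, str]:
    """Extract flat key: value pairs from a leading '---' delimited block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter ends the block
            return fields
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip('"')
    return {}  # no closing delimiter: treat the header as malformed


sample = """---
name: pptx
description: "Presentation creation, editing, and analysis."
license: Proprietary. LICENSE.txt has complete terms
---

# PPTX creation, editing, and analysis
"""
meta = parse_frontmatter(sample)
print(meta["name"], "|", meta["license"])
```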

MCP Connectors

Configured by user in Settings UI:

  • Browser connector
  • Notion connector
  • Custom third-party MCP servers

Remote Control Bridge

  • Feishu (Lark): Event subscription bridge — incoming messages trigger agent tasks
  • Slack: Remote control via Slack workspace messages

GUI Automation

Computer use via model inference (screenshot → click/type). Recommended model: Gemini-3-Pro.

Real-Time Trace Panel

Shows AI reasoning and tool execution live in the UI sidebar.

Settings UI

Graphical configuration for:

  • API key + base URL + model
  • Workspace folder selection
  • MCP connectors
  • VM isolation mode
  • Skills management (custom skill creation/deletion)

Commands

No slash-commands. The interface is chat-based (user types natural language prompts).

Hooks

No Claude Code lifecycle hooks configured. The app wraps Claude Code but does not inject settings.json hooks.

Scripts

Build scripts in scripts/ (npm build, installer packaging). No agent-lifecycle scripts.

05

Prompts

Open Cowork — Prompts

Prompt Architecture

Open Cowork's behavioral "prompts" live in the SKILL.md files under .claude/skills/. These are loaded by Claude Code when a task matches the skill's trigger conditions.

Verbatim Excerpt 1 — PPTX SKILL.md Header

---
name: pptx
description: "Presentation creation, editing, and analysis. When Claude needs to work with presentations (.pptx files) for: (1) Creating new presentations, (2) Modifying or editing content, (3) Working with layouts, (4) Adding comments or speaker notes, or any other presentation tasks"
license: Proprietary. LICENSE.txt has complete terms
---

# PPTX creation, editing, and analysis

## Overview

Create, edit, or analyze the contents of .pptx files when requested. A .pptx file is essentially a ZIP archive containing XML files and other resources. Different tools and workflows are available for different tasks.

## CRITICAL: Read All Documentation First

**Before starting any presentation task**, read ALL relevant documentation files completely to understand the full workflow:

1. **For creating new presentations**: Read [`html2pptx.md`](html2pptx.md) and [`css.md`](css.md) in their entirety
2. **For editing existing presentations**: Read [`ooxml.md`](ooxml.md) in its entirety
3. **For template-based creation**: Read the relevant sections of this file plus [`css.md`](css.md)

**NEVER set any range limits when reading these files.** Understanding the complete workflow, constraints, and best practices before starting is essential for producing high-quality presentations. Partial knowledge leads to errors, inconsistent styling, and visual defects that require rework.

Technique: Front-loads a hard "CRITICAL" gate before any action — forces the agent to read full documentation before proceeding. Similar to superpowers' "Iron Law" pattern (pre-built rationalization: "Partial knowledge leads to errors, inconsistent styling, and visual defects that require rework").

Verbatim Excerpt 2 — PPTX XML Access Instructions

## Raw XML access

Use raw XML access for: comments, speaker notes, slide layouts, animations, design elements, and complex formatting. To access these features, unpack a presentation and read its raw XML contents.

#### Unpacking a file

`python3 ooxml/scripts/unpack.py <office_file> <output_dir>`

#### Key file structures

- `ppt/presentation.xml` - Main presentation metadata and slide references
- `ppt/slides/slide{N}.xml` - Individual slide contents
- `ppt/notesSlides/notesSlide{N}.xml` - Speaker notes for each slide
- `ppt/slideLayouts/` - Layout templates for slides
- `ppt/theme/theme1.xml` - Theme and styling information

Technique: Tool-path specification — tells the model exactly which scripts to call and what file paths to check. Converts ambiguous "edit this presentation" into a deterministic workflow. This is a production-grade skill design.
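
Because a .pptx file is a ZIP archive, the file layout quoted above can be inspected with nothing more than the standard library. The sketch below is not a replacement for the skill's `unpack.py` (whose behavior is not shown in the source); it only lists slide parts, and it builds a tiny stand-in archive so that it runs without a real deck.

```python
import io
import re
import zipfile
from typing import List


def list_slide_parts(pptx_bytes: bytes) -> List[str]:
    """Return slide XML part names from a .pptx archive, ordered by slide number."""
    pattern = re.compile(r"ppt/slides/slide(\d+)\.xml")
    with zipfile.ZipFile(io.BytesIO(pptx_bytes)) as archive:
        matches = [(pattern.fullmatch(name), name) for name in archive.namelist()]
    # Sort numerically so slide10 comes after slide2.
    slides = [(int(m.group(1)), name) for m, name in matches if m]
    return [name for _, name in sorted(slides)]


# Stand-in archive with placeholder XML, for demonstration only.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    for part in ("ppt/presentation.xml", "ppt/slides/slide10.xml",
                 "ppt/slides/slide2.xml", "ppt/notesSlides/notesSlide2.xml"):
        archive.writestr(part, "<xml/>")

print(list_slide_parts(buffer.getvalue()))
```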

Skill Trigger

Skills are triggered by Claude Code's autonomous skill-matching — when the task description matches the SKILL.md description field, Claude Code reads the full SKILL.md and follows its instructions. No explicit /pptx command required.

09

Uniqueness

Open Cowork — Uniqueness

differs_from_seeds

Open Cowork is closest to agent-os in that it wraps Claude Code for less-technical users with an install-and-go philosophy. However, agent-os is a bare bash script that installs markdown templates, while Open Cowork is a full Electron desktop application with VM isolation, a visual Trace Panel, and an IM remote control bridge. The Feishu/Slack remote control bridge is architecturally absent from every seed — it is the only entry in this batch (or the seeds) that lets users assign tasks from a mobile IM app and receive results without being at their desktop. The proprietary-licensed SKILL.md packages (pptx, docx, pdf, xlsx) are a production-quality document generation toolkit absent from all seeds. Compared to superpowers (skills-only framework injected via Claude Code plugin), Open Cowork adds a full GUI layer, VM isolation, and IM bridges that superpowers completely lacks. Open Cowork is also the only framework in this batch explicitly targeting knowledge workers (PPT creation, folder cleanup) rather than software developers.

Positioning

Open Cowork positions itself explicitly as the open-source Claude Cowork. Consistent with the Overview, its claimed additions are remote IM control and GUI automation (computer use), delivered alongside a bundled Skills library.

Observable Failure Modes

  • Proprietary skill licenses: The PPTX/DOCX/XLSX/PDF skills ship with proprietary licenses (not MIT). Users cannot freely modify or redistribute the tool scripts.
  • VM dependency for sandboxing: Meaningful isolation requires WSL2 or Lima — not available by default on all machines, especially enterprise-locked Windows environments.
  • Single-model per session: Model selection is global, not per-task. Using Gemini for GUI work and Claude for coding requires switching settings manually.
  • No session persistence: No JSONL session store. Closing the app loses conversation history.
  • Feishu/Slack bridge complexity: The README documents the feature but setup requires external service configuration (bot tokens, event subscriptions) with no guided wizard in the current version.

04

Workflow

Open Cowork — Workflow

User Workflow

Step | Description | Artifact
1. Install | Download installer or brew install | Desktop app running
2. Configure | Enter API key, model, base URL in Settings | Config saved
3. Select workspace | Choose folder where AI is allowed to work | Workspace path
4. Prompt | Type task in chat input (can drag & drop files/images) | Task message
5. Agent executes | Claude Code (or chosen model) runs with selected tools | Tool call trace
6. VM isolation | Bash commands routed through WSL2/Lima if available | Isolated execution
7. Result | Files written to workspace, document downloaded, or remote reply sent | Output files

Remote Control Workflow (Feishu/Slack)

  1. Connect Feishu/Lark or Slack bridge in Settings
  2. Send message from mobile/web to bot
  3. Agent processes task asynchronously
  4. Reply delivered back to IM channel
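
The four steps above follow a familiar queue-and-reply shape. The sketch below models it with `asyncio`, using fake stand-ins for the agent and for the IM client. Every name in it (`run_bridge`, `fake_agent`, `fake_reply`, the channel label) is hypothetical, and no real Feishu or Slack API is called.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple


async def run_bridge(
    inbox: asyncio.Queue,
    run_task: Callable[[str], Awaitable[str]],
    send_reply: Callable[[str, str], Awaitable[None]],
) -> None:
    """Consume (channel, text) messages until a None sentinel arrives."""
    while True:
        item = await inbox.get()
        if item is None:  # sentinel: stop the bridge loop
            break
        channel, text = item
        result = await run_task(text)      # step 3: agent processes the task
        await send_reply(channel, result)  # step 4: reply goes back to the channel


async def demo() -> List[Tuple[str, str]]:
    sent: List[Tuple[str, str]] = []

    async def fake_agent(prompt: str) -> str:
        return f"done: {prompt}"

    async def fake_reply(channel: str, body: str) -> None:
        sent.append((channel, body))

    inbox: asyncio.Queue = asyncio.Queue()
    await inbox.put(("ops-channel", "tidy the Downloads folder"))  # step 2
    await inbox.put(None)
    await run_bridge(inbox, fake_agent, fake_reply)
    return sent


print(asyncio.run(demo()))
```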

Document Generation Workflow (via PPTX skill)

  1. User asks: "Create a PowerPoint from my financial_report.csv"
  2. Agent triggers PPTX skill (matched by SKILL.md description field)
  3. Skill's html2pptx pipeline runs: CSV → HTML → PPTX via the skill's support scripts
  4. PPTX file written to workspace
  5. User downloads file from UI
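
As a rough picture of the first stage in step 3, the sketch below turns CSV text into an HTML fragment that a later stage could convert to a slide. This is an illustration under assumptions: in practice the agent writes the HTML itself following html2pptx.md, and `csv_to_slide_html` is a hypothetical helper that is not part of the skill.

```python
import csv
import html
import io


def csv_to_slide_html(csv_text: str, title: str) -> str:
    """Render CSV text as an HTML fragment with a heading and a table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("CSV input is empty")
    header, body = rows[0], rows[1:]
    # Escape every cell so that data containing markup cannot break the HTML.
    head_html = "".join(f"<th>{html.escape(cell)}</th>" for cell in header)
    body_html = "".join(
        "<tr>" + "".join(f"<td>{html.escape(cell)}</td>" for cell in row) + "</tr>"
        for row in body
    )
    return (
        f"<h1>{html.escape(title)}</h1>"
        f"<table><thead><tr>{head_html}</tr></thead><tbody>{body_html}</tbody></table>"
    )


print(csv_to_slide_html("quarter,revenue\nQ1,120\nQ2,<n/a>\n", "Financial report"))
```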

Approval Gates

No formal approval gates. The chat interface is the only interaction point. VM isolation provides a safety boundary instead of user approval prompts.

Spec Format

None. Tasks are natural language chat messages, not structured specs.

06

Memory & Context

Open Cowork — Memory & Context

Memory Type

File-based — workspace folder is the persistent state. Claude Code's own context management handles in-session memory.

Context Compaction

Handled by Claude Code's built-in compaction (if using Claude Code backend) or by the model's native context limits (if using direct API mode).

Cross-Session State

  • Files written to the workspace persist across sessions
  • No framework-level session store (no JSONL sessions, no SQLite)
  • The IM bridge (Feishu/Slack) creates conversational history within the IM platform

Context Injection

Open Cowork injects workspace context into prompts via the Claude Code CLAUDE.md mechanism. The skills system adds skill-specific instructions to the context when triggered.

Real-Time Trace

The Trace Panel in the UI shows live reasoning and tool execution. This is observability, not memory — not persisted.

No Explicit Memory Architecture

Open Cowork does not ship a memory system. It relies entirely on:

  1. Files in the workspace folder for persistent state
  2. The IM platform's conversation history for cross-device continuity
  3. Claude Code's native context management for in-session state

07

Orchestration

Open Cowork — Orchestration

Multi-Agent Support

No explicit multi-agent orchestration. Open Cowork is a single-agent desktop application — one Claude Code session per workspace.

Execution Mode

Interactive loop — the user sends a message, the agent responds. This is not a background daemon; the agent runs while the app is open and a task is active.

Isolation Mechanism

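VM (virtual machine): WSL2 on Windows, Lima on macOS. All Bash commands execute inside an isolated Linux VM. This is stronger than a git worktree or a sandbox API: commands run under a separate guest kernel rather than as restricted processes on the host. The workspace is still shared with the VM (synced on Windows, /Users mounted on macOS), so files there remain reachable from inside it.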

Fallback: Path Guard (file operations restricted to workspace directory) when no VM is available.

Multi-Model

Yes, one model at a time. Different models can be configured for different use cases:

  • Claude/Anthropic: coding, general tasks
  • GLM/MiniMax/Kimi: cost-optimized Chinese models
  • Gemini-3-Pro: recommended for GUI automation (computer use)
  • OpenAI-compatible: fallback

Model selection is per-session in Settings, not per-task.

Orchestration Pattern

None for multi-agent. Single agent, sequential tool execution.

Remote Control Bridge

The Feishu/Slack bridge is event-driven: each incoming IM message triggers an agent session. Remote control is therefore the one use case where execution is event-driven rather than an interactive loop.

Consensus Mechanism

None.

08

UI / CLI Surface

Open Cowork — UI / CLI Surface

Desktop Application

Full Electron desktop app — the primary surface.

Feature | Description
Chat interface | Natural language task entry with drag & drop file/image support
Real-time Trace Panel | Live view of AI reasoning and tool calls
Settings UI | API key, model, base URL, MCP connectors, workspace, VM mode
Skills management | Install/delete custom skills
MCP connector UI | Add/configure MCP servers (browser, Notion, custom)

Install

  • macOS: brew tap OpenCoworkAI/tap && brew install --cask --no-quarantine open-cowork
  • Windows/macOS: Direct .exe/.dmg installer from GitHub Releases
  • Build from source: npm install && npm run rebuild && npm run dev

No CLI Binary

No dedicated CLI binary. The app is GUI-first.

Remote Control Surfaces

  • Feishu (Lark): IM bridge — assign tasks via Feishu messages, receive results back
  • Slack: Same pattern via Slack workspace integration

Port / Web Dashboard

No local web dashboard. The Electron app is the full UI. There is no localhost:NNNN web surface.

Multimodal Input

Drag and drop files and images directly into chat. Vision-capable models (Gemini-3-Pro, Claude 3.x) can process images.

Build System

  • Electron + Vite + React + TypeScript
  • electron-builder.yml for signed platform installers
  • npm run build produces .exe (Windows) and .dmg (macOS)
