computer/docs/Developer-Guide.md at afce3b9e66b704fc25857c7ec0c57f72d2fc817f

mirror of https://github.com/trycua/computer.git synced 2025-12-31 02:19:58 -06:00

f-trycua afce3b9e66 Add dev container, fix lints

2025-03-19 23:28:38 +01:00

Developer Guide

Project Structure

The project is organized as a monorepo with these main packages:

libs/core/ - Base package with telemetry support
libs/computer/ - Computer-use interface (CUI) library
libs/agent/ - AI agent library with multi-provider support
libs/som/ - Set-of-Mark parser
libs/computer-server/ - Server component that runs inside the VM
libs/lume/ - Lume CLI
libs/pylume/ - Python bindings for Lume

Each package declares its own dependencies, managed through PDM. The build script installs them into a shared virtual environment (see Package Virtual Environments).

Local Development Setup

Install Lume CLI:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh)"

Clone the repository:

git clone https://github.com/trycua/cua.git
cd cua

Create a .env.local file in the root directory with your API keys:

# Required for Anthropic provider
ANTHROPIC_API_KEY=your_anthropic_key_here

# Required for OpenAI provider
OPENAI_API_KEY=your_openai_key_here
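Before running the build or the examples, it can help to confirm that the file parses and that the keys you need are set. The following is a minimal sketch, not part of the repository: load_env_file and missing_keys are hypothetical helper names, and the parser only handles plain KEY=VALUE lines and # comments. Each key is only needed for its own provider, so pass a shorter `required` tuple if you use just one.

```python
from pathlib import Path


def load_env_file(path):
    """Parse KEY=VALUE lines from a dotenv-style file into a dict.

    Blank lines, comment lines starting with '#', and lines without '='
    are skipped. Surrounding quotes on values are removed.
    """
    values = {}
    for raw_line in Path(path).read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=("ANTHROPIC_API_KEY", "OPENAI_API_KEY")):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env.local")
    if env_path.exists():
        missing = missing_keys(load_env_file(env_path))
        print("Missing keys:", ", ".join(missing) if missing else "none")
    else:
        print(".env.local not found in the current directory")
```

Run it from the repository root so that the relative path resolves to the same .env.local the guide describes.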

Run the build script to set up all packages:

./scripts/build.sh

This will:

Create a virtual environment for the project
Install all packages in development mode
Set up the correct Python path
Install development tools

Open the workspace in VS Code or Cursor:

# For Cua Python development
code .vscode/py.code-workspace

# For Lume (Swift) development
code .vscode/lume.code-workspace

Using the workspace file is strongly recommended as it:

Sets up correct Python environments for each package
Configures proper import paths
Enables debugging configurations
Maintains consistent settings across packages

Docker Development Environment

As an alternative to running directly on your host machine, you can use Docker for development. This approach has several advantages:

Ensures a consistent development environment across different machines
Isolates dependencies from your host system
Works well for cross-platform development
Avoids conflicts with existing Python installations

Prerequisites

Docker installed on your machine
A Lume server running on your host on port 3000 (start it with lume serve)

Setup and Usage

Build the development Docker image:

./scripts/run-docker-dev.sh build

Run an example in the container:

./scripts/run-docker-dev.sh run computer_examples.py

Get an interactive shell in the container:

./scripts/run-docker-dev.sh run --interactive

Stop any running containers:

./scripts/run-docker-dev.sh stop

How it Works

The Docker development environment:

Installs all required Python dependencies in the container
Mounts your source code from the host at runtime
Automatically configures the connection to use host.docker.internal:3000 for accessing the Lume server on your host machine
Preserves your code changes without requiring rebuilds (source code is mounted as a volume)

Note: The Docker container doesn't include the macOS-specific Lume executable. Instead, it connects to the Lume server running on your host machine via host.docker.internal:3000. Make sure to start the Lume server on your host before running examples in the container.
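If an example fails to connect, a quick TCP check can tell you whether the Lume server is reachable from where the code runs. This is a minimal sketch using only the standard library; lume_endpoint and is_reachable are hypothetical helpers, and using localhost outside the container is an assumption. It only verifies that the port accepts connections, not that the service behind it is Lume.

```python
import socket


def lume_endpoint(in_docker):
    """Return the (host, port) pair to try for the Lume server.

    host.docker.internal and port 3000 come from this guide; falling
    back to localhost outside the container is an assumption.
    """
    host = "host.docker.internal" if in_docker else "localhost"
    return host, 3000


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


if __name__ == "__main__":
    host, port = lume_endpoint(in_docker=True)
    status = "reachable" if is_reachable(host, port) else "not reachable"
    print(f"Lume server at {host}:{port} is {status}")
```

If the check reports "not reachable" inside the container, confirm that lume serve is running on the host first.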

Cleanup and Reset

If you need to clean up the non-Docker environment and start fresh:

./scripts/cleanup.sh

This will:

Remove all virtual environments
Clean Python cache files and directories
Remove build artifacts
Clean PDM-related files
Reset environment configurations
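To preview what a cleanup of this kind would touch before deleting anything, you could list the candidates first. The sketch below is a dry run only and is not derived from cleanup.sh: the pattern list is an assumption based on the categories above, and find_cleanup_targets is a hypothetical helper.

```python
from pathlib import Path

# Assumed patterns; the real cleanup.sh may target a different set.
CLEANUP_PATTERNS = (
    ".venv",        # virtual environments
    "__pycache__",  # Python cache directories
    "*.pyc",        # compiled Python files
    "dist",         # build artifacts
    "*.egg-info",   # build artifacts
    ".pdm-python",  # PDM interpreter pointer file
    ".pdm-build",   # PDM build directory
)


def find_cleanup_targets(root, patterns=CLEANUP_PATTERNS):
    """Return sorted paths under root whose name matches any pattern."""
    root = Path(root)
    matches = set()
    for pattern in patterns:
        # rglob searches root and all of its subdirectories.
        matches.update(root.rglob(pattern))
    return sorted(matches)


if __name__ == "__main__":
    # Dry run: print the candidates, delete nothing.
    for path in find_cleanup_targets("."):
        print(path)
```

Nothing is removed here; the script only prints paths so you can compare them with what you expect the cleanup to delete.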

Package Virtual Environments

The build script creates a shared virtual environment for all packages. The workspace configuration automatically handles import paths with the correct Python path settings.

Running Examples

The Python workspace includes launch configurations for all packages:

"Run Computer Examples" - Runs computer examples
"Run Computer API Server" - Runs the computer-server
"Run Omni Agent Examples" - Runs agent examples
"SOM" configurations - Various settings for running SOM

To run examples:

Open the workspace file (.vscode/py.code-workspace)
Press F5 or use the Run/Debug view
Select the desired configuration

The workspace also includes compound launch configurations:

"Run Computer Examples + Server" - Runs both the Computer Examples and Server simultaneously

Release and Publishing Process

This monorepo contains multiple Python packages that can be published to PyPI. They depend on one another as follows, listed in dependency order:

pylume - Base package for VM management
cua-computer - Computer control interface (depends on pylume)
cua-som - Parser for UI elements (independent, formerly omniparser)
cua-agent - AI agent (depends on cua-computer and optionally cua-som)
computer-server - Server component installed on the sandbox
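The ordering above can be derived mechanically from the dependency relationships with a topological sort. This is an illustrative sketch, not the logic used by the publishing workflows: DEPENDENCIES and publish_order are hypothetical names, cua-som is treated as a hard dependency of cua-agent even though the guide calls it optional, and computer-server is assumed to have no in-repo dependencies because none are listed. It requires Python 3.9 or later for graphlib.

```python
from graphlib import TopologicalSorter

# Each package maps to the set of packages it depends on.
# Assumptions: cua-som counts as a dependency of cua-agent (the guide
# calls it optional), and computer-server has no in-repo dependencies.
DEPENDENCIES = {
    "pylume": set(),
    "cua-computer": {"pylume"},
    "cua-som": set(),
    "cua-agent": {"cua-computer", "cua-som"},
    "computer-server": set(),
}


def publish_order(dependencies):
    """Return package names so that dependencies come before dependents."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    print(" -> ".join(publish_order(DEPENDENCIES)))
```

Packages with no relationship to each other may appear in any relative order; only the constraint that a dependency precedes its dependents is guaranteed.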

Workflow Structure

The publishing process is managed by these GitHub workflow files:

Package-specific workflows:
- .github/workflows/publish-pylume.yml
- .github/workflows/publish-computer.yml
- .github/workflows/publish-som.yml
- .github/workflows/publish-agent.yml
- .github/workflows/publish-computer-server.yml

Coordinator workflow:
- .github/workflows/publish-all.yml - Manages global releases and manual selections

Version Management

Special Considerations for Pylume

The pylume package requires special handling as it incorporates the binary executable from the lume repository:

When releasing pylume, ensure the version matches a corresponding release in the lume repository
The workflow automatically downloads the matching lume binary and includes it in the pylume package
If you need to release a new version of pylume, make sure to coordinate with a matching lume release
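A pre-release check for the version-matching requirement could look like the sketch below. It is only an illustration: normalize_version and versions_match are hypothetical helpers, and the idea that lume release tags may carry a leading "v" is an assumption, not something this guide states.

```python
def normalize_version(tag):
    """Strip surrounding whitespace and an optional leading 'v'."""
    tag = tag.strip()
    return tag[1:] if tag.lower().startswith("v") else tag


def versions_match(pylume_version, lume_release_tag):
    """Return True if both strings name the same version."""
    return normalize_version(pylume_version) == normalize_version(lume_release_tag)


if __name__ == "__main__":
    # Example values only; substitute the versions you intend to release.
    pylume_version = "0.1.0"
    lume_release_tag = "v0.1.0"
    if versions_match(pylume_version, lume_release_tag):
        print("pylume and lume versions match")
    else:
        print("Version mismatch: coordinate a matching lume release first")
```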

Development Workspaces

This monorepo includes multiple VS Code workspace configurations to optimize the development experience based on which components you're working with:

Available Workspace Files

py.code-workspace: For Python package development (Computer, Agent, SOM, etc.)
lume.code-workspace: For Swift-based Lume development

To open a specific workspace:

# For Python development
code .vscode/py.code-workspace

# For Lume (Swift) development
code .vscode/lume.code-workspace
