Developer Guide
Project Structure
The project is organized as a monorepo with these main packages:
- libs/core/ - Base package with telemetry support
- libs/computer/ - Computer-use interface (CUI) library
- libs/agent/ - AI agent library with multi-provider support
- libs/som/ - Set-of-Mark parser
- libs/computer-server/ - Server component for VM
- libs/lume/ - Lume CLI
- libs/pylume/ - Python bindings for Lume
Each package has its own virtual environment and dependencies, managed through PDM.
Local Development Setup
- Install Lume CLI:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh)"
- Clone the repository:
git clone https://github.com/trycua/cua.git
cd cua
- Create a .env.local file in the root directory with your API keys:
# Required for Anthropic provider
ANTHROPIC_API_KEY=your_anthropic_key_here
# Required for OpenAI provider
OPENAI_API_KEY=your_openai_key_here
- Run the build script to set up all packages:
./scripts/build.sh
This will:
- Create a virtual environment for the project
- Install all packages in development mode
- Set up the correct Python path
- Install development tools
- Open the workspace in VSCode or Cursor:
# For Cua Python development
code .vscode/py.code-workspace
# For Lume (Swift) development
code .vscode/lume.code-workspace
Using the workspace file is strongly recommended as it:
- Sets up correct Python environments for each package
- Configures proper import paths
- Enables debugging configurations
- Maintains consistent settings across packages
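The .env.local file created in the setup steps above is a plain list of KEY=value lines with # comments. This guide does not describe how the project's scripts load it, so the following is only a minimal Python sketch for checking that the file contains the expected keys before running examples. The function names parse_env_file and missing_keys are hypothetical and not part of the repository.

```python
from pathlib import Path


def parse_env_file(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(env_path: Path, required: list[str]) -> list[str]:
    """Return the required keys that are absent or empty in the file."""
    values = parse_env_file(env_path.read_text()) if env_path.exists() else {}
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    # The key names are the two shown in the setup step above.
    print(missing_keys(Path(".env.local"), ["ANTHROPIC_API_KEY", "OPENAI_API_KEY"]))
```

Run from the repository root, this prints an empty list when both keys are set. Only the key for the provider you actually use needs a value.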
Docker Development Environment
As an alternative to running directly on your host machine, you can use Docker for development. This approach has several advantages:
- Ensures a consistent development environment across different machines
- Isolates dependencies from your host system
- Works well for cross-platform development
- Avoids conflicts with existing Python installations
Prerequisites
- Docker installed on your machine
- Lume server running on your host (port 3000):
lume serve
Setup and Usage
- Build the development Docker image:
./scripts/run-docker-dev.sh build
- Run an example in the container:
./scripts/run-docker-dev.sh run computer_examples.py
- Get an interactive shell in the container:
./scripts/run-docker-dev.sh run --interactive
- Stop any running containers:
./scripts/run-docker-dev.sh stop
How it Works
The Docker development environment:
- Installs all required Python dependencies in the container
- Mounts your source code from the host at runtime
- Automatically configures the connection to use host.docker.internal:3000 for accessing the Lume server on your host machine
- Preserves your code changes without requiring rebuilds (source code is mounted as a volume)
Note: The Docker container doesn't include the macOS-specific Lume executable. Instead, it connects to the Lume server running on your host machine via host.docker.internal:3000. Make sure to start the Lume server on your host before running examples in the container.
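Because the container depends on the Lume server on the host, it can help to confirm that the port is reachable before running an example. The sketch below only tests whether a TCP connection to host.docker.internal:3000 can be opened. It does not call any Lume API, and the function name lume_reachable is hypothetical.

```python
import socket


def lume_reachable(host: str = "host.docker.internal", port: int = 3000,
                   timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Calling lume_reachable() inside the container returns False when `lume serve` is not running on the host or when host.docker.internal does not resolve. A True result only shows that something is listening on the port.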
Cleanup and Reset
If you need to clean up the non-Docker environment and start fresh:
./scripts/cleanup.sh
This will:
- Remove all virtual environments
- Clean Python cache files and directories
- Remove build artifacts
- Clean PDM-related files
- Reset environment configurations
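To preview what a cleanup of this kind would touch before running it, a dry-run listing can be useful. The following Python sketch is not the contents of cleanup.sh. The directory names it looks for are assumptions based on common Python and PDM conventions, and the function name find_cleanup_targets is hypothetical.

```python
from pathlib import Path

# Assumed directory names; the real cleanup.sh may target a different set.
CLEANUP_DIR_NAMES = {".venv", "__pycache__", ".pytest_cache", "dist", "build", ".pdm-build"}


def find_cleanup_targets(root: Path) -> list[Path]:
    """List directories under root whose name is in CLEANUP_DIR_NAMES."""
    targets: list[Path] = []
    # Sorting puts each parent directory before its children.
    for path in sorted(root.rglob("*")):
        if not path.is_dir() or path.name not in CLEANUP_DIR_NAMES:
            continue
        # Skip matches nested inside an already selected directory.
        if any(parent in targets for parent in path.parents):
            continue
        targets.append(path)
    return targets
```

The function only lists directories and deletes nothing, so it is safe to run against the repository root to see what a cleanup would affect.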
Package Virtual Environments
The build script creates a shared virtual environment for all packages. The workspace configuration automatically handles import paths with the correct Python path settings.
Running Examples
The Python workspace includes launch configurations for all packages:
- "Run Computer Examples" - Runs computer examples
- "Run Computer API Server" - Runs the computer-server
- "Run Omni Agent Examples" - Runs agent examples
- "SOM" configurations - Various settings for running SOM
To run examples:
- Open the workspace file (.vscode/py.code-workspace)
- Press F5 or use the Run/Debug view
- Select the desired configuration
The workspace also includes compound launch configurations:
- "Run Computer Examples + Server" - Runs both the Computer Examples and Server simultaneously
Release and Publishing Process
This monorepo contains multiple Python packages that can be published to PyPI. The packages have dependencies on each other in the following order:
- pylume - Base package for VM management
- cua-computer - Computer control interface (depends on pylume)
- cua-som - Parser for UI elements (independent, formerly omniparser)
- cua-agent - AI agent (depends on cua-computer and optionally cua-som)
- computer-server - Server component installed on the sandbox
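The dependency order above can be derived mechanically with a topological sort. The sketch below encodes the listed dependencies as a Python dictionary and uses the standard-library graphlib module. It illustrates the ordering only and is not how the publishing workflows are implemented. The optional cua-som dependency of cua-agent is treated as required here, and computer-server is assumed to have no in-repo dependencies because none are listed.

```python
from graphlib import TopologicalSorter

# Each package maps to the packages it depends on, as listed above.
# cua-som is included for cua-agent even though that dependency is optional.
DEPENDENCIES = {
    "pylume": set(),
    "cua-computer": {"pylume"},
    "cua-som": set(),
    "cua-agent": {"cua-computer", "cua-som"},
    "computer-server": set(),
}


def release_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return the packages so that every dependency precedes its dependents."""
    return list(TopologicalSorter(dependencies).static_order())
```

Any order returned by release_order(DEPENDENCIES) places pylume before cua-computer, and both cua-computer and cua-som before cua-agent. Independent packages may appear in any position relative to each other.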
Workflow Structure
The publishing process is managed by these GitHub workflow files:
- Package-specific workflows:
  - .github/workflows/publish-pylume.yml
  - .github/workflows/publish-computer.yml
  - .github/workflows/publish-som.yml
  - .github/workflows/publish-agent.yml
  - .github/workflows/publish-computer-server.yml
- Coordinator workflow:
  - .github/workflows/publish-all.yml - Manages global releases and manual selections
Version Management
Special Considerations for Pylume
The pylume package requires special handling as it incorporates the binary executable from the lume repository:
- When releasing pylume, ensure the version matches a corresponding release in the lume repository
- The workflow automatically downloads the matching lume binary and includes it in the pylume package
- If you need to release a new version of pylume, make sure to coordinate with a matching lume release
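The version match described above can be checked by hand before triggering a release. The following Python sketch compares a pylume version string with a lume release tag. It assumes that tags may carry a leading "v", which this guide does not confirm, and the function names are hypothetical. The workflow's own check, if it has one, may work differently.

```python
def normalize_version(tag: str) -> str:
    """Strip surrounding whitespace and an optional leading 'v' from a tag."""
    tag = tag.strip()
    return tag[1:] if tag[:1] in ("v", "V") else tag


def versions_match(pylume_version: str, lume_release_tag: str) -> bool:
    """Return True when both strings name the same version."""
    return normalize_version(pylume_version) == normalize_version(lume_release_tag)
```

The comparison is a plain string match after normalization, so it does not treat "1.0" and "1.0.0" as equal.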
Development Workspaces
This monorepo includes multiple VS Code workspace configurations to optimize the development experience based on which components you're working with:
Available Workspace Files
- py.code-workspace: For Python package development (Computer, Agent, SOM, etc.)
- lume.code-workspace: For Swift-based Lume development
To open a specific workspace:
# For Python development
code .vscode/py.code-workspace
# For Lume (Swift) development
code .vscode/lume.code-workspace