Merge branch 'main' into feature/agent/uitars-mlx

This commit is contained in:
Dillon DuPont
2025-05-10 17:12:12 -04:00
3 changed files with 473 additions and 157 deletions

README.md

@@ -5,188 +5,245 @@
<img alt="Cua logo" height="150" src="img/logo_black.png">
</picture>
<!-- <h1>Cua</h1> -->
[![Python](https://img.shields.io/badge/Python-333333?logo=python&logoColor=white&labelColor=333333)](#)
[![Swift](https://img.shields.io/badge/Swift-F05138?logo=swift&logoColor=white)](#)
[![macOS](https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0)](#)
[![Discord](https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white)](https://discord.com/invite/mVnXXpdE85)
</div>
**TL;DR**: **c/ua** (pronounced "koo-ah", short for Computer-Use Agent) is a framework that enables AI agents to control full operating systems within high-performance, lightweight virtual containers. It delivers up to 97% native speed on Apple Silicon and works with any vision language model.
## What is c/ua?
**c/ua** offers two primary capabilities in a single integrated framework:
1. **High-Performance Virtualization** - Create and run macOS/Linux virtual machines on Apple Silicon with near-native performance (up to 97% of native speed) using the **Lume CLI**, built on Apple's `Virtualization.Framework`.
<div align="center">
<video src="https://github.com/user-attachments/assets/06e1974f-8f73-477d-b18a-715d83148e45" width="800" controls></video></div>
2. **Computer-Use Interface & Agent** - A framework that allows AI systems to observe and control these virtual environments - interacting with applications, browsing the web, writing code, and performing complex workflows.
## Why Use c/ua?

- **Security & Isolation**: Run AI agents in fully isolated virtual environments instead of giving them access to your main system
- **Performance**: [Near-native performance](https://browser.geekbench.com/v6/cpu/compare/11283746?baseline=11102709) on Apple Silicon
- **Flexibility**: Run macOS or Linux environments with the same framework
- **Reproducibility**: Create consistent, deterministic environments for AI agent workflows
- **LLM Integration**: Built-in support for connecting to various LLM providers

# 🚀 Quick Start

Get started with a Computer-Use Agent UI and a VM with a single command:

```bash
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/scripts/playground.sh)"
```

This script will:

- Install the Lume CLI for VM management (if needed)
- Pull the latest macOS CUA image (if needed)
- Set up a Python environment and install or update the required packages
- Launch the Computer-Use Agent UI

#### Supported [Agent Loops](https://github.com/trycua/cua/blob/main/libs/agent/README.md#agent-loops)

- [UITARS-1.5](https://github.com/trycua/cua/blob/main/libs/agent/README.md#agent-loops) - Run locally on Apple Silicon with MLX, or use cloud providers
- [OpenAI CUA](https://github.com/trycua/cua/blob/main/libs/agent/README.md#agent-loops) - Use OpenAI's Computer-Use Preview model
- [Anthropic CUA](https://github.com/trycua/cua/blob/main/libs/agent/README.md#agent-loops) - Use Anthropic's Computer-Use capabilities
- [OmniParser](https://github.com/trycua/cua/blob/main/libs/agent/README.md#agent-loops) - Control the UI with [Set-of-Marks prompting](https://som-gpt4v.github.io/) using any vision model

### System Requirements

- Mac with Apple Silicon (M1/M2/M3/M4 series)
- macOS 15 (Sequoia) or newer
- Python 3.10+ (required for the Computer, Agent, and MCP libraries). We recommend using Conda (or Anaconda) to create a dedicated Python environment.
- Disk space for VM images (30GB+ recommended)
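The hardware and Python requirements above can be self-checked before installing anything. The sketch below is illustrative only and not part of the repo; the macOS 15 check is omitted since it requires shelling out to `sw_vers`:

```python
import platform
import sys

def unmet_requirements(system: str, machine: str, py_version: tuple) -> list:
    """Return a list of unmet c/ua prerequisites (empty list means good to go)."""
    problems = []
    # Apple Silicon Macs report "Darwin" / "arm64"
    if system != "Darwin" or machine != "arm64":
        problems.append("requires an Apple Silicon Mac (M1/M2/M3/M4)")
    # The Python libraries need 3.10 or newer
    if py_version < (3, 10):
        problems.append("requires Python 3.10+")
    return problems

# Check the current host
issues = unmet_requirements(platform.system(), platform.machine(), sys.version_info[:2])
print(issues if issues else "Basic requirements met")
```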
# 💻 For Developers

### Step 1: Install Lume CLI

Lume CLI manages high-performance macOS/Linux VMs with near-native speed on Apple Silicon.

```bash
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh)"
```

Optionally, if you don't want Lume to run as a background service:

```bash
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh) --no-background-service"
```

**Note:** If you choose this option, you'll need to manually start the Lume API service whenever needed by running `lume serve` in your terminal.

For Lume usage instructions, refer to the [Lume documentation](./libs/lume/README.md).

### Step 2: Pull the macOS CUA Image

```bash
lume pull macos-sequoia-cua:latest
```

The macOS CUA image contains the default Mac apps and the Computer Server for easy automation.

### Step 3: Install the Python SDK

```bash
pip install cua-computer "cua-agent[all]"
```

Alternatively, see the [Developer Guide](./docs/Developer-Guide.md) for building from source.

### Step 4: Use in Your Code

```python
import asyncio

from computer import Computer
from agent import ComputerAgent, LLM

async def main():
    # Start a local macOS VM with a 1024x768 display
    async with Computer(os_type="macos", display="1024x768") as computer:
        # Example: direct control of the macOS VM with Computer
        await computer.interface.left_click(100, 200)
        await computer.interface.type_text("Hello, world!")
        screenshot_bytes = await computer.interface.screenshot()

        # Example: create and run an agent locally using mlx-community/UI-TARS-1.5-7B-6bit
        agent = ComputerAgent(
            computer=computer,
            loop="UITARS",
            model=LLM(provider="MLX", name="mlx-community/UI-TARS-1.5-7B-6bit")
        )
        async for result in agent.run("Find the trycua/cua repository on GitHub and follow the quick start guide"):
            print(result)

asyncio.run(main())
```

For ready-to-use examples, check out our [Notebooks](./notebooks/) collection.

Optionally, you can use the Agent with a Gradio UI:

```python
from utils import load_dotenv_files
load_dotenv_files()

from agent.ui.gradio.app import create_gradio_ui

app = create_gradio_ui()
app.launch(share=False)
```

### Option 3: Build from Source (Nightly)

If you want to contribute to the project or need the latest nightly features:

```bash
# Clone the repository
git clone https://github.com/trycua/cua.git
cd cua

# Open the project in VSCode
code ./.vscode/py.code-workspace

# Build the project
./scripts/build.sh
```

See our [Developer Guide](./docs/Developer-Guide.md) for more information.

### Lume CLI Reference

```bash
# Install Lume CLI
curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh | bash

# List all VMs
lume ls

# Pull a VM image
lume pull macos-sequoia-cua:latest

# Create a new VM
lume create my-vm --os macos --cpu 4 --memory 8GB --disk-size 50GB

# Run a VM (creates and starts it if it doesn't exist)
lume run macos-sequoia-cua:latest

# Stop a VM
lume stop macos-sequoia-cua_latest

# Delete a VM
lume delete macos-sequoia-cua_latest
```

## Monorepo Libraries

| Library | Description | Installation | Version |
|---------|-------------|--------------|---------|
| [**Lume**](./libs/lume/README.md) | CLI for running macOS/Linux VMs with near-native performance using Apple's `Virtualization.Framework`. | [![Download](https://img.shields.io/badge/Download-333333?style=for-the-badge&logo=github&logoColor=white)](https://github.com/trycua/cua/releases/latest/download/lume.pkg.tar.gz) | [![GitHub release](https://img.shields.io/github/v/release/trycua/cua?color=333333)](https://github.com/trycua/cua/releases) |
| [**Computer**](./libs/computer/README.md) | Computer-Use Interface (CUI) framework for interacting with macOS/Linux sandboxes | `pip install cua-computer` | [![PyPI](https://img.shields.io/pypi/v/cua-computer?color=333333)](https://pypi.org/project/cua-computer/) |
| [**Agent**](./libs/agent/README.md) | Computer-Use Agent (CUA) framework for running agentic workflows in dedicated macOS/Linux sandboxes | `pip install cua-agent` | [![PyPI](https://img.shields.io/pypi/v/cua-agent?color=333333)](https://pypi.org/project/cua-agent/) |

## Docs

For the best onboarding experience with the packages in this monorepo, we recommend starting with the [Computer](./libs/computer/README.md) documentation to cover the core functionality of the Computer sandbox, then exploring the [Agent](./libs/agent/README.md) documentation to understand Cua's AI agent capabilities, and finally working through the Notebook examples.

- [Lume](./libs/lume/README.md)
- [Computer](./libs/computer/README.md)
- [Agent](./libs/agent/README.md)
- [Notebooks](./notebooks/)

For advanced container-like virtualization, check out [Lumier](./libs/lumier/README.md) - a Docker interface for macOS and Linux VMs.
## Resources
- [How to use the MCP Server with Claude Desktop or other MCP clients](./libs/mcp-server/README.md) - One of the easiest ways to get started with C/ua
- [How to use OpenAI Computer-Use, Anthropic, OmniParser, or UI-TARS for your Computer-Use Agent](./libs/agent/README.md)
- [How to use Lume CLI for managing desktops](./libs/lume/README.md)
- [Training Computer-Use Models: Collecting Human Trajectories with C/ua (Part 1)](https://www.trycua.com/blog/training-computer-use-models-trajectories-1)
- [Build Your Own Operator on macOS (Part 1)](https://www.trycua.com/blog/build-your-own-operator-on-macos-1)
## Modules
| Module | Description | Installation |
|--------|-------------|---------------|
| [**Lume**](./libs/lume/README.md) | VM management for macOS/Linux using Apple's Virtualization.Framework | `curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh \| bash` |
| [**Computer**](./libs/computer/README.md) | Interface for controlling virtual machines | `pip install cua-computer` |
| [**Agent**](./libs/agent/README.md) | AI agent framework for automating tasks | `pip install cua-agent` |
| [**MCP Server**](./libs/mcp-server/README.md) | MCP server for using CUA with Claude Desktop | `pip install cua-mcp-server` |
| [**SOM**](./libs/som/README.md) | Set-of-Marks prompting library for Agent | `pip install cua-som` |
| [**PyLume**](./libs/pylume/README.md) | Python bindings for Lume | `pip install pylume` |
| [**Computer Server**](./libs/computer-server/README.md) | Server component for Computer | `pip install cua-computer-server` |
| [**Core**](./libs/core/README.md) | Core utilities | `pip install cua-core` |
## Computer Interface Reference
For complete examples, see [computer_examples.py](./examples/computer_examples.py) or [computer_nb.ipynb](./notebooks/computer_nb.ipynb)
```python
# Mouse Actions
await computer.interface.left_click(x, y) # Left click at coordinates
await computer.interface.right_click(x, y) # Right click at coordinates
await computer.interface.double_click(x, y) # Double click at coordinates
await computer.interface.move_cursor(x, y) # Move cursor to coordinates
await computer.interface.drag_to(x, y, duration) # Drag to coordinates
await computer.interface.get_cursor_position() # Get current cursor position
# Keyboard Actions
await computer.interface.type_text("Hello") # Type text
await computer.interface.press_key("enter") # Press a single key
await computer.interface.hotkey("command", "c") # Press key combination
# Screen Actions
await computer.interface.screenshot() # Take a screenshot
await computer.interface.get_screen_size() # Get screen dimensions
# Clipboard Actions
await computer.interface.set_clipboard(text) # Set clipboard content
await computer.interface.copy_to_clipboard() # Get clipboard content
# File System Operations
await computer.interface.file_exists(path) # Check if file exists
await computer.interface.directory_exists(path) # Check if directory exists
await computer.interface.run_command(cmd) # Run shell command
# Accessibility
await computer.interface.get_accessibility_tree() # Get accessibility tree
```
## ComputerAgent Reference
For complete examples, see [agent_examples.py](./examples/agent_examples.py) or [agent_nb.ipynb](./notebooks/agent_nb.ipynb)
```python
# Import necessary components
from agent import ComputerAgent, LLM, AgentLoop, LLMProvider
# UI-TARS-1.5 agent for local execution with MLX
ComputerAgent(loop=AgentLoop.UITARS, model=LLM(provider=LLMProvider.MLX, name="mlx-community/UI-TARS-1.5-7B-6bit"))
# OpenAI Computer-Use agent using OPENAI_API_KEY
ComputerAgent(loop=AgentLoop.OPENAI, model=LLM(provider=LLMProvider.OPENAI, name="computer-use-preview"))
# Anthropic Claude agent using ANTHROPIC_API_KEY
ComputerAgent(loop=AgentLoop.ANTHROPIC, model=LLM(provider=LLMProvider.ANTHROPIC))
# OmniParser loop for UI control using Set-of-Marks (SOM) prompting and any vision LLM
ComputerAgent(loop=AgentLoop.OMNI, model=LLM(provider=LLMProvider.OLLAMA, name="gemma3:12b-it-q4_K_M"))
# OpenRouter example using OAICOMPAT provider
ComputerAgent(
loop=AgentLoop.OMNI,
model=LLM(
provider=LLMProvider.OAICOMPAT,
name="openai/gpt-4o-mini",
provider_base_url="https://openrouter.ai/api/v1"
),
api_key="your-openrouter-api-key"
)
```
## Demos
Check out these demos of the Computer-Use Agent in action, and share your most impressive ones in Cua's [Discord community](https://discord.com/invite/mVnXXpdE85)!
<details open>
<summary><b>MCP Server: Work with Claude Desktop and Tableau</b></summary>
<br>
<div align="center">
<video src="https://github.com/user-attachments/assets/9f573547-5149-493e-9a72-396f3cff29df" width="800" controls></video>
</div>
</details>
<details>
<summary><b>AI-Gradio: Multi-app workflow with browser, VS Code and terminal</b></summary>
<br>
<div align="center">
<video src="https://github.com/user-attachments/assets/723a115d-1a07-4c8e-b517-88fbdf53ed0f" width="800" controls></video>
</div>
</details>
<details>
<summary><b>Notebook: Fix GitHub issue in Cursor</b></summary>
<br>
<div align="center">
<video src="https://github.com/user-attachments/assets/f67f0107-a1e1-46dc-aa9f-0146eb077077" width="800" controls></video>
</div>
</details>
## Accessory Libraries

| Library | Description | Installation | Version |
|---------|-------------|--------------|---------|
| [**Core**](./libs/core/README.md) | Core functionality and utilities used by other Cua packages | `pip install cua-core` | [![PyPI](https://img.shields.io/pypi/v/cua-core?color=333333)](https://pypi.org/project/cua-core/) |
| [**PyLume**](./libs/pylume/README.md) | Python bindings for Lume | `pip install pylume` | [![PyPI](https://img.shields.io/pypi/v/pylume?color=333333)](https://pypi.org/project/pylume/) |
| [**Computer Server**](./libs/computer-server/README.md) | Server component for the Computer-Use Interface (CUI) framework | `pip install cua-computer-server` | [![PyPI](https://img.shields.io/pypi/v/cua-computer-server?color=333333)](https://pypi.org/project/cua-computer-server/) |
| [**SOM**](./libs/som/README.md) | Set-of-Marks prompting library for Agent | `pip install cua-som` | [![PyPI](https://img.shields.io/pypi/v/cua-som?color=333333)](https://pypi.org/project/cua-som/) |

## Community

Join our [Discord community](https://discord.com/invite/mVnXXpdE85) to discuss ideas, get assistance, or share your demos!
## License
@@ -194,11 +251,17 @@

Cua is open-sourced under the MIT License - see the [LICENSE](LICENSE) file for details.
Microsoft's OmniParser, which is used in this project, is licensed under the Creative Commons Attribution 4.0 International License (CC-BY-4.0) - see the [OmniParser LICENSE](https://github.com/microsoft/OmniParser/blob/master/LICENSE) file for details.
## Contributing

We welcome and greatly appreciate contributions to Cua! Whether you're improving documentation, adding new features, fixing bugs, or adding new VM images, your efforts help make Cua better for everyone. For detailed instructions, please refer to our [Contributing Guidelines](CONTRIBUTING.md).
## Trademarks
Apple, macOS, and Apple Silicon are trademarks of Apple Inc. Ubuntu and Canonical are registered trademarks of Canonical Ltd. Microsoft is a registered trademark of Microsoft Corporation. This project is not affiliated with, endorsed by, or sponsored by Apple Inc., Canonical Ltd., or Microsoft Corporation.
## Stargazers
Thank you to all our supporters!
[![Stargazers over time](https://starchart.cc/trycua/cua.svg?variant=adaptive)](https://starchart.cc/trycua/cua)


@@ -494,6 +494,83 @@ def create_gradio_ui(
"Open Safari, search for 'macOS automation tools', and save the first three results as bookmarks",
"Configure SSH keys and set up a connection to a remote server",
]
# Function to generate Python code based on configuration and tasks
def generate_python_code(agent_loop_choice, provider, model_name, tasks, provider_url, recent_images=3, save_trajectory=True):
    """Generate Python code for the current configuration and tasks.

    Args:
        agent_loop_choice: The agent loop type (e.g., UITARS, OPENAI, ANTHROPIC, OMNI)
        provider: The provider type (e.g., OPENAI, ANTHROPIC, OLLAMA, OAICOMPAT)
        model_name: The model name
        tasks: List of tasks to execute
        provider_url: The provider base URL for OAICOMPAT providers
        recent_images: Number of recent images to keep in context
        save_trajectory: Whether to save the agent trajectory

    Returns:
        Formatted Python code as a string
    """
    # Format the tasks as a Python list
    tasks_str = ""
    for task in tasks:
        if task and task.strip():
            tasks_str += f'            "{task}",\n'

    # Create the Python code template
    code = f'''import asyncio
from computer import Computer
from agent import ComputerAgent, LLM, AgentLoop, LLMProvider

async def main():
    async with Computer() as macos_computer:
        agent = ComputerAgent(
            computer=macos_computer,
            loop=AgentLoop.{agent_loop_choice},
            only_n_most_recent_images={recent_images},
            save_trajectory={save_trajectory},'''

    # Add the model configuration based on provider
    if provider == LLMProvider.OAICOMPAT:
        code += f'''
            model=LLM(
                provider=LLMProvider.OAICOMPAT,
                name="{model_name}",
                provider_base_url="{provider_url}"
            )'''
    code += """
        )
"""

    # Add tasks section if there are tasks
    if tasks_str:
        code += f'''
        # Prompts for the computer-use agent
        tasks = [
{tasks_str.rstrip()}
        ]

        for task in tasks:
            print(f"Executing task: {{task}}")
            async for result in agent.run(task):
                print(result)'''
    else:
        # If no tasks, just add a placeholder for a single task
        code += f'''
        # Execute a single task
        task = "Search for information about CUA on GitHub"
        print(f"Executing task: {{task}}")
        async for result in agent.run(task):
            print(result)'''

    # Add the main block
    code += '''

if __name__ == "__main__":
    asyncio.run(main())'''

    return code
# Function to update model choices based on agent loop selection
def update_model_choices(loop):
@@ -551,50 +628,20 @@ def create_gradio_ui(
"""
)
# Add installation prerequisites as a collapsible section
with gr.Accordion("Prerequisites & Installation", open=False):
gr.Markdown(
"""
## Prerequisites
Before using the Computer-Use Agent, you need to set up the Lume daemon and pull the macOS VM image.
### 1. Install Lume daemon
While a lume binary is included with Computer, we recommend installing the standalone version with brew, and starting the lume daemon service:
```bash
sudo /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh)"
```
### 2. Start the Lume daemon service
In a separate terminal:
```bash
lume serve
```
### 3. Pull the pre-built macOS image
```bash
lume pull macos-sequoia-cua:latest
```
Initial download requires 80GB storage, but reduces to ~30GB after first run due to macOS's sparse file system.
VMs are stored in `~/.lume`, and locally cached images are stored in `~/.lume/cache`.
### 4. Test the sandbox
```bash
lume run macos-sequoia-cua:latest
```
For more detailed instructions, visit the [CUA GitHub repository](https://github.com/trycua/cua).
"""
# Add accordion for Python code
with gr.Accordion("Python Code", open=False):
code_display = gr.Code(
language="python",
value=generate_python_code(
initial_loop,
LLMProvider.OPENAI,
"gpt-4o",
[],
"https://openrouter.ai/api/v1"
),
interactive=False,
)
with gr.Accordion("Configuration", open=True):
# Configuration options
agent_loop = gr.Dropdown(
@@ -657,6 +704,7 @@ def create_gradio_ui(
info="Number of recent images to keep in context",
interactive=True,
)
# Right column for chat interface
with gr.Column(scale=2):
@@ -914,6 +962,62 @@ def create_gradio_ui(
queue=False, # Process immediately without queueing
)
    # Function to update the code display based on configuration and chat history
    def update_code_display(agent_loop, model_choice_val, custom_model_val, chat_history, provider_base_url, recent_images_val, save_trajectory_val):
        # Extract messages from chat history
        messages = []
        if chat_history:
            for msg in chat_history:
                if msg.get("role") == "user":
                    messages.append(msg.get("content", ""))

        # Determine provider and model name based on selection
        model_string = custom_model_val if model_choice_val == "Custom model..." else model_choice_val
        provider, model_name, _ = get_provider_and_model(model_string, agent_loop)

        # Generate and return the code
        return generate_python_code(
            agent_loop,
            provider,
            model_name,
            messages,
            provider_base_url,
            recent_images_val,
            save_trajectory_val
        )

    # Update code display when configuration changes
    agent_loop.change(
        update_code_display,
        inputs=[agent_loop, model_choice, custom_model, chatbot_history, provider_base_url, recent_images, save_trajectory],
        outputs=[code_display]
    )
    model_choice.change(
        update_code_display,
        inputs=[agent_loop, model_choice, custom_model, chatbot_history, provider_base_url, recent_images, save_trajectory],
        outputs=[code_display]
    )
    custom_model.change(
        update_code_display,
        inputs=[agent_loop, model_choice, custom_model, chatbot_history, provider_base_url, recent_images, save_trajectory],
        outputs=[code_display]
    )
    chatbot_history.change(
        update_code_display,
        inputs=[agent_loop, model_choice, custom_model, chatbot_history, provider_base_url, recent_images, save_trajectory],
        outputs=[code_display]
    )
    recent_images.change(
        update_code_display,
        inputs=[agent_loop, model_choice, custom_model, chatbot_history, provider_base_url, recent_images, save_trajectory],
        outputs=[code_display]
    )
    save_trajectory.change(
        update_code_display,
        inputs=[agent_loop, model_choice, custom_model, chatbot_history, provider_base_url, recent_images, save_trajectory],
        outputs=[code_display]
    )
return demo
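Every `.change` registration above funnels into the same `update_code_display` regenerator. Independent of Gradio, the fan-in pattern can be sketched standalone (`Signal` below is a hypothetical stand-in for a UI component, not a Gradio API):

```python
class Signal:
    """Tiny stand-in for a UI component's .change() event (not a Gradio API)."""
    def __init__(self):
        self._handlers = []

    def change(self, handler):
        self._handlers.append(handler)

    def emit(self, value):
        for handler in self._handlers:
            handler(value)

# One shared output that every trigger regenerates
code_display = []

def update_code_display(value):
    code_display.append(f"# regenerated after change: {value}")

# Wire several inputs to the same handler, as the Gradio UI does
agent_loop, model_choice, recent_images = Signal(), Signal(), Signal()
for component in (agent_loop, model_choice, recent_images):
    component.change(update_code_display)

agent_loop.emit("UITARS")
model_choice.emit("gpt-4o")
print(code_display)
```

Any configuration change then triggers exactly one regeneration of the shared display, which is why all six Gradio registrations can share a single callback.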

scripts/playground.sh (new executable file)

@@ -0,0 +1,149 @@
#!/bin/bash
set -e
echo "🚀 Setting up CUA playground environment..."
# Check for Apple Silicon Mac
if [[ $(uname -s) != "Darwin" || $(uname -m) != "arm64" ]]; then
  echo "❌ This script requires an Apple Silicon Mac (M1/M2/M3/M4)."
  exit 1
fi

# Check for macOS 15 (Sequoia) or newer
OSVERSION=$(sw_vers -productVersion)
if [[ $(echo "$OSVERSION 15.0" | tr " " "\n" | sort -V | head -n 1) != "15.0" ]]; then
  echo "❌ This script requires macOS 15 (Sequoia) or newer. You have $OSVERSION."
  exit 1
fi
# Create a temporary directory for our work
TMP_DIR=$(mktemp -d)
cd "$TMP_DIR"
# Function to clean up on exit
cleanup() {
  cd ~
  rm -rf "$TMP_DIR"
}
trap cleanup EXIT

# Install Lume if not already installed
if ! command -v lume &> /dev/null; then
  echo "📦 Installing Lume CLI..."
  curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh | bash

  # Add lume to PATH for this session if it's not already there
  if ! command -v lume &> /dev/null; then
    export PATH="$PATH:$HOME/.lume/bin"
  fi
fi
# Pull the macOS CUA image if not already present
if ! lume ls | grep -q "macos-sequoia-cua"; then
  # Check available disk space
  IMAGE_SIZE_GB=30
  AVAILABLE_SPACE_KB=$(df -k "$HOME" | tail -1 | awk '{print $4}')
  AVAILABLE_SPACE_GB=$((AVAILABLE_SPACE_KB / 1024 / 1024))

  echo "📊 The macOS CUA image will use approximately ${IMAGE_SIZE_GB}GB of disk space."
  echo "   You currently have ${AVAILABLE_SPACE_GB}GB available on your system."

  # Prompt for confirmation
  read -p "   Continue? [y]/n: " CONTINUE
  CONTINUE=${CONTINUE:-y}

  if [[ $CONTINUE =~ ^[Yy]$ ]]; then
    echo "📥 Pulling macOS CUA image (this may take a while)..."
    lume pull macos-sequoia-cua:latest
  else
    echo "❌ Installation cancelled."
    exit 1
  fi
fi
# Create a Python virtual environment
echo "🐍 Setting up Python environment..."
PYTHON_CMD="python3"

# Check if Python 3.11+ is available
PYTHON_VERSION=$($PYTHON_CMD --version 2>&1 | cut -d" " -f2)
PYTHON_MAJOR=$(echo "$PYTHON_VERSION" | cut -d. -f1)
PYTHON_MINOR=$(echo "$PYTHON_VERSION" | cut -d. -f2)

if [ "$PYTHON_MAJOR" -lt 3 ] || ([ "$PYTHON_MAJOR" -eq 3 ] && [ "$PYTHON_MINOR" -lt 11 ]); then
  echo "❌ Python 3.11+ is required. You have $PYTHON_VERSION."
  echo "   Please install Python 3.11+ and try again."
  exit 1
fi

# Create a virtual environment
VENV_DIR="$HOME/.cua-venv"
if [ ! -d "$VENV_DIR" ]; then
  $PYTHON_CMD -m venv "$VENV_DIR"
fi
# Activate the virtual environment
source "$VENV_DIR/bin/activate"
# Install required packages
echo "📦 Installing CUA packages..."
pip install -U pip
pip install cua-computer cua-agent[all]
# Create a simple demo script
DEMO_DIR="$HOME/.cua-demo"
mkdir -p "$DEMO_DIR"
cat > "$DEMO_DIR/run_demo.py" << 'EOF'
import asyncio
import os
from computer import Computer
from agent import ComputerAgent, LLM, AgentLoop, LLMProvider
from agent.ui.gradio.app import create_gradio_ui
# Try to load API keys from environment
api_key = os.environ.get("OPENAI_API_KEY", "")
if not api_key:
    print("\n⚠️ No OpenAI API key found. You'll need to provide one in the UI.")
# Launch the Gradio UI and open it in the browser
app = create_gradio_ui()
app.launch(share=False, inbrowser=True)
EOF
# Create a convenience script to run the demo
cat > "$DEMO_DIR/start_demo.sh" << EOF
#!/bin/bash
source "$VENV_DIR/bin/activate"
cd "$DEMO_DIR"
python run_demo.py
EOF
chmod +x "$DEMO_DIR/start_demo.sh"
echo "✅ Setup complete!"
echo "🖥️ You can start the CUA playground by running: $DEMO_DIR/start_demo.sh"
# Check if the VM is running
echo "🔍 Checking if the macOS CUA VM is running..."
VM_RUNNING=$(lume ls | grep "macos-sequoia-cua" | grep "running" || echo "")

if [ -z "$VM_RUNNING" ]; then
  echo "🚀 Starting the macOS CUA VM in the background..."
  lume run macos-sequoia-cua:latest &

  # Wait a moment for the VM to initialize
  sleep 5
  echo "✅ VM started successfully."
else
  echo "✅ macOS CUA VM is already running."
fi

# Ask if the user wants to start the demo now
echo
read -p "Would you like to start the CUA playground now? (y/n) " -n 1 -r
echo

if [[ $REPLY =~ ^[Yy]$ ]]; then
  echo "🚀 Starting the CUA playground..."
  echo ""
  "$DEMO_DIR/start_demo.sh"
fi
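The macOS gate near the top of the script leans on `sort -V` to compare dotted version strings numerically rather than lexically. The same comparison can be sketched in Python (illustrative only, not part of the script):

```python
def version_at_least(version: str, minimum: str) -> bool:
    """Compare dotted version strings numerically, mirroring the `sort -V` check."""
    def parse(v: str) -> list:
        # "15.4.1" -> [15, 4, 1]; list comparison is elementwise, like sort -V
        return [int(part) for part in v.split(".")]
    return parse(version) >= parse(minimum)

print(version_at_least("15.4.1", "15.0"))  # True: Sequoia or newer passes
print(version_at_least("14.7", "15.0"))    # False: older macOS is rejected
```

A plain string comparison would wrongly rank "9.1" above "15.0", which is exactly why the script pipes through `sort -V` instead of comparing with `<`.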