Move references from readme.md to docs. Update lucide. Move guides from readme.md to docs.

This commit is contained in:
Morgan Dean
2025-07-04 14:53:13 -07:00
parent a11d86fb16
commit 7ed96d5ff6
15 changed files with 386 additions and 298 deletions
+12 -243
View File
@@ -47,146 +47,25 @@
</details>
</details><br/>
# 🚀 Quick Start with a Computer-Use Agent UI
# 🚀 Quick Start
**Need to automate desktop tasks? Launch the Computer-Use Agent UI with a single command.**
Read our guide on getting started with a Computer-Use Agent:
[Computer-Use Agent Quickstart](https://docs.trycua.com/home/guides/usage-guide)
### Option 1: Fully-managed install with Docker (recommended)
Get started using C/ua services on your machine:
[C/ua Usage Guide](https://docs.trycua.com/home/guides/cua-usage-guide)
*Docker-based guided install for quick use*
Set up a development environment with the Dev Container:
[Dev Container Setup](https://docs.trycua.com/home/guides/dev-container-setup)
**macOS/Linux/Windows (via WSL):**
## Lume
```bash
# Requires Docker
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/scripts/playground-docker.sh)"
```
This script will guide you through setup using Docker containers and launch the Computer-Use Agent UI.
---
### Option 2: [Dev Container](./.devcontainer/README.md)
*Best for contributors and development*
This repository includes a [Dev Container](./.devcontainer/README.md) configuration that simplifies setup to a few steps:
1. **Install the Dev Containers extension ([VS Code](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers) or [WindSurf](https://docs.windsurf.com/windsurf/advanced#dev-containers-beta))**
2. **Open the repository in the Dev Container:**
- Press `Ctrl+Shift+P` (or `⌘+Shift+P` on macOS)
- Select `Dev Containers: Clone Repository in Container Volume...` and paste the repository URL: `https://github.com/trycua/cua.git` (if not cloned) or `Dev Containers: Open Folder in Container...` (if git cloned).
> **Note**: On WindSurf, the post install hook might not run automatically. If so, run `/bin/bash .devcontainer/post-install.sh` manually.
3. **Open the VS Code workspace:** Once the post-install.sh is done running, open the `.vscode/py.code-workspace` workspace and press ![Open Workspace](https://github.com/user-attachments/assets/923bdd43-8c8f-4060-8d78-75bfa302b48c)
.
4. **Run the Agent UI example:** Click ![Run Agent UI](https://github.com/user-attachments/assets/7a61ef34-4b22-4dab-9864-f86bf83e290b)
to start the Gradio UI. If prompted to install **debugpy (Python Debugger)** to enable remote debugging, select 'Yes' to proceed.
5. **Access the Gradio UI:** The Gradio UI will be available at `http://localhost:7860` and will automatically forward to your host machine.
---
### Option 3: PyPI
*Direct Python package installation*
```bash
# conda create -yn cua python==3.12
pip install -U "cua-computer[all]" "cua-agent[all]"
python -m agent.ui # Start the agent UI
```
Or check out the [Usage Guide](#-usage-guide) to learn how to use our Python SDK in your own code.
---
## Supported [Agent Loops](https://github.com/trycua/cua/blob/main/libs/python/agent/README.md#agent-loops)
- [UITARS-1.5](https://github.com/trycua/cua/blob/main/libs/python/agent/README.md#agent-loops) - Run locally on Apple Silicon with MLX, or use cloud providers
- [OpenAI CUA](https://github.com/trycua/cua/blob/main/libs/python/agent/README.md#agent-loops) - Use OpenAI's Computer-Use Preview model
- [Anthropic CUA](https://github.com/trycua/cua/blob/main/libs/python/agent/README.md#agent-loops) - Use Anthropic's Computer-Use capabilities
- [OmniParser-v2.0](https://github.com/trycua/cua/blob/main/libs/python/agent/README.md#agent-loops) - Control UI with [Set-of-Marks prompting](https://som-gpt4v.github.io/) using any vision model
## 🖥️ Compatibility
For detailed compatibility information including host OS support, VM emulation capabilities, and model provider compatibility, see the [Compatibility Matrix](./COMPATIBILITY.md).
<br/>
<br/>
# 🐍 Usage Guide
Follow these steps to use C/ua in your own Python code. See [Developer Guide](./docs/Developer-Guide.md) for building from source.
### Step 1: Install Lume CLI
```bash
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh)"
```
Lume CLI manages high-performance macOS/Linux VMs with near-native speed on Apple Silicon.
### Step 2: Pull the macOS CUA Image
```bash
lume pull macos-sequoia-cua:latest
```
The macOS CUA image contains the default Mac apps and the Computer Server for easy automation.
### Step 3: Install Python SDK
```bash
pip install "cua-computer[all]" "cua-agent[all]"
```
### Step 4: Use in Your Code
```python
from computer import Computer
from agent import ComputerAgent, LLM
async def main():
# Start a local macOS VM
computer = Computer(os_type="macos")
await computer.run()
# Or with C/ua Cloud Container
computer = Computer(
os_type="linux",
api_key="your_cua_api_key_here",
name="your_container_name_here"
)
# Example: Direct control of a macOS VM with Computer
await computer.interface.left_click(100, 200)
await computer.interface.type_text("Hello, world!")
screenshot_bytes = await computer.interface.screenshot()
# Example: Create and run an agent locally using mlx-community/UI-TARS-1.5-7B-6bit
agent = ComputerAgent(
computer=computer,
loop="uitars",
model=LLM(provider="mlxvlm", name="mlx-community/UI-TARS-1.5-7B-6bit")
)
async for result in agent.run("Find the trycua/cua repository on GitHub and follow the quick start guide"):
print(result)
if __name__ == "__main__":
asyncio.run(main())
```
For ready-to-use examples, check out our [Notebooks](./notebooks/) collection.
### Lume CLI Reference
For managing and creating virtual machines on macOS, check out [Lume](./libs/lume/README.md).
```bash
# Install Lume CLI and background service
curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh | bash
# List all VMs
lume ls
# Pull a VM image
lume pull macos-sequoia-cua:latest
@@ -198,12 +77,9 @@ lume run macos-sequoia-cua:latest
# Stop a VM
lume stop macos-sequoia-cua_latest
# Delete a VM
lume delete macos-sequoia-cua_latest
```
### Lumier CLI Reference
## Lumier
For advanced container-like virtualization, check out [Lumier](./libs/lumier/README.md) - a Docker interface for macOS and Linux VMs.
@@ -226,7 +102,7 @@ docker run -it --rm \
trycua/lumier:latest
```
## Resources
# Resources
- [How to use the MCP Server with Claude Desktop or other MCP clients](./libs/python/mcp-server/README.md) - One of the easiest ways to get started with C/ua
- [How to use OpenAI Computer-Use, Anthropic, OmniParser, or UI-TARS for your Computer-Use Agent](./libs/python/agent/README.md)
@@ -234,7 +110,7 @@ docker run -it --rm \
- [Training Computer-Use Models: Collecting Human Trajectories with C/ua (Part 1)](https://www.trycua.com/blog/training-computer-use-models-trajectories-1)
- [Build Your Own Operator on macOS (Part 1)](https://www.trycua.com/blog/build-your-own-operator-on-macos-1)
## Modules
# Modules
| Module | Description | Installation |
|--------|-------------|---------------|
@@ -249,113 +125,6 @@ docker run -it --rm \
| [**Core (Python)**](./libs/python/core/README.md) | Python Core utilities | `pip install cua-core` |
| [**Core (Typescript)**](./libs/typescript/core/README.md) | Typescript Core utilities | `npm install @trycua/core` |
## Computer Interface Reference
For complete examples, see [computer_examples.py](./examples/computer_examples.py) or [computer_nb.ipynb](./notebooks/computer_nb.ipynb)
```python
# Shell Actions
result = await computer.interface.run_command(cmd) # Run shell command
# result.stdout, result.stderr, result.returncode
# Mouse Actions
await computer.interface.left_click(x, y) # Left click at coordinates
await computer.interface.right_click(x, y) # Right click at coordinates
await computer.interface.double_click(x, y) # Double click at coordinates
await computer.interface.move_cursor(x, y) # Move cursor to coordinates
await computer.interface.drag_to(x, y, duration) # Drag to coordinates
await computer.interface.get_cursor_position() # Get current cursor position
await computer.interface.mouse_down(x, y, button="left") # Press and hold a mouse button
await computer.interface.mouse_up(x, y, button="left") # Release a mouse button
# Keyboard Actions
await computer.interface.type_text("Hello") # Type text
await computer.interface.press_key("enter") # Press a single key
await computer.interface.hotkey("command", "c") # Press key combination
await computer.interface.key_down("command") # Press and hold a key
await computer.interface.key_up("command") # Release a key
# Scrolling Actions
await computer.interface.scroll(x, y) # Scroll the mouse wheel
await computer.interface.scroll_down(clicks) # Scroll down
await computer.interface.scroll_up(clicks) # Scroll up
# Screen Actions
await computer.interface.screenshot() # Take a screenshot
await computer.interface.get_screen_size() # Get screen dimensions
# Clipboard Actions
await computer.interface.set_clipboard(text) # Set clipboard content
await computer.interface.copy_to_clipboard() # Get clipboard content
# File System Operations
await computer.interface.file_exists(path) # Check if file exists
await computer.interface.directory_exists(path) # Check if directory exists
await computer.interface.read_text(path, encoding="utf-8") # Read file content
await computer.interface.write_text(path, content, encoding="utf-8") # Write file content
await computer.interface.read_bytes(path) # Read file content as bytes
await computer.interface.write_bytes(path, content) # Write file content as bytes
await computer.interface.delete_file(path) # Delete file
await computer.interface.create_dir(path) # Create directory
await computer.interface.delete_dir(path) # Delete directory
await computer.interface.list_dir(path) # List directory contents
# Accessibility
await computer.interface.get_accessibility_tree() # Get accessibility tree
# Python Virtual Environment Operations
await computer.venv_install("demo_venv", ["requests", "macos-pyxa"]) # Install packages in a virtual environment
await computer.venv_cmd("demo_venv", "python -c 'import requests; print(requests.get(`https://httpbin.org/ip`).json())'") # Run a shell command in a virtual environment
await computer.venv_exec("demo_venv", python_function_or_code, *args, **kwargs) # Run a Python function in a virtual environment and return the result / raise an exception
# Example: Use sandboxed functions to execute code in a C/ua Container
from computer.helpers import sandboxed
@sandboxed("demo_venv")
def greet_and_print(name):
"""Get the HTML of the current Safari tab"""
import PyXA
safari = PyXA.Application("Safari")
html = safari.current_document.source()
print(f"Hello from inside the container, {name}!")
return {"greeted": name, "safari_html": html}
# When a @sandboxed function is called, it will execute in the container
result = await greet_and_print("C/ua")
# Result: {"greeted": "C/ua", "safari_html": "<html>...</html>"}
# stdout and stderr are also captured and printed / raised
print("Result from sandboxed function:", result)
```
## ComputerAgent Reference
For complete examples, see [agent_examples.py](./examples/agent_examples.py) or [agent_nb.ipynb](./notebooks/agent_nb.ipynb)
```python
# Import necessary components
from agent import ComputerAgent, LLM, AgentLoop, LLMProvider
# UI-TARS-1.5 agent for local execution with MLX
ComputerAgent(loop=AgentLoop.UITARS, model=LLM(provider=LLMProvider.MLXVLM, name="mlx-community/UI-TARS-1.5-7B-6bit"))
# OpenAI Computer-Use agent using OPENAI_API_KEY
ComputerAgent(loop=AgentLoop.OPENAI, model=LLM(provider=LLMProvider.OPENAI, name="computer-use-preview"))
# Anthropic Claude agent using ANTHROPIC_API_KEY
ComputerAgent(loop=AgentLoop.ANTHROPIC, model=LLM(provider=LLMProvider.ANTHROPIC))
# OmniParser loop for UI control using Set-of-Marks (SOM) prompting and any vision LLM
ComputerAgent(loop=AgentLoop.OMNI, model=LLM(provider=LLMProvider.OLLAMA, name="gemma3:12b-it-q4_K_M"))
# OpenRouter example using OAICOMPAT provider
ComputerAgent(
loop=AgentLoop.OMNI,
model=LLM(
provider=LLMProvider.OAICOMPAT,
name="openai/gpt-4o-mini",
provider_base_url="https://openrouter.ai/api/v1"
),
api_key="your-openrouter-api-key"
)
```
## Community
Join our [Discord community](https://discord.com/invite/mVnXXpdE85) to discuss ideas, get assistance, or share your demos!
+24 -1
View File
@@ -11,4 +11,27 @@ The Agent library provides programmatic interfaces for AI agent interactions.
## API Documentation
Coming soon.
```python
# Import necessary components
from agent import ComputerAgent, LLM, AgentLoop, LLMProvider
# UI-TARS-1.5 agent for local execution with MLX
ComputerAgent(loop=AgentLoop.UITARS, model=LLM(provider=LLMProvider.MLXVLM, name="mlx-community/UI-TARS-1.5-7B-6bit"))
# OpenAI Computer-Use agent using OPENAI_API_KEY
ComputerAgent(loop=AgentLoop.OPENAI, model=LLM(provider=LLMProvider.OPENAI, name="computer-use-preview"))
# Anthropic Claude agent using ANTHROPIC_API_KEY
ComputerAgent(loop=AgentLoop.ANTHROPIC, model=LLM(provider=LLMProvider.ANTHROPIC))
# OmniParser loop for UI control using Set-of-Marks (SOM) prompting and any vision LLM
ComputerAgent(loop=AgentLoop.OMNI, model=LLM(provider=LLMProvider.OLLAMA, name="gemma3:12b-it-q4_K_M"))
# OpenRouter example using OAICOMPAT provider
ComputerAgent(
loop=AgentLoop.OMNI,
model=LLM(
provider=LLMProvider.OAICOMPAT,
name="openai/gpt-4o-mini",
provider_base_url="https://openrouter.ai/api/v1"
),
api_key="your-openrouter-api-key"
)
```
+74 -2
View File
@@ -8,6 +8,78 @@ The Computer API reference documentation is currently under development.
The Computer library provides programmatic interfaces for computer automation and control.
## API Documentation
## Reference
Coming soon.
```python
# Shell Actions
result = await computer.interface.run_command(cmd) # Run shell command
# result.stdout, result.stderr, result.returncode
# Mouse Actions
await computer.interface.left_click(x, y) # Left click at coordinates
await computer.interface.right_click(x, y) # Right click at coordinates
await computer.interface.double_click(x, y) # Double click at coordinates
await computer.interface.move_cursor(x, y) # Move cursor to coordinates
await computer.interface.drag_to(x, y, duration) # Drag to coordinates
await computer.interface.get_cursor_position() # Get current cursor position
await computer.interface.mouse_down(x, y, button="left") # Press and hold a mouse button
await computer.interface.mouse_up(x, y, button="left") # Release a mouse button
# Keyboard Actions
await computer.interface.type_text("Hello") # Type text
await computer.interface.press_key("enter") # Press a single key
await computer.interface.hotkey("command", "c") # Press key combination
await computer.interface.key_down("command") # Press and hold a key
await computer.interface.key_up("command") # Release a key
# Scrolling Actions
await computer.interface.scroll(x, y) # Scroll the mouse wheel
await computer.interface.scroll_down(clicks) # Scroll down
await computer.interface.scroll_up(clicks) # Scroll up
# Screen Actions
await computer.interface.screenshot() # Take a screenshot
await computer.interface.get_screen_size() # Get screen dimensions
# Clipboard Actions
await computer.interface.set_clipboard(text) # Set clipboard content
await computer.interface.copy_to_clipboard() # Get clipboard content
# File System Operations
await computer.interface.file_exists(path) # Check if file exists
await computer.interface.directory_exists(path) # Check if directory exists
await computer.interface.read_text(path, encoding="utf-8") # Read file content
await computer.interface.write_text(path, content, encoding="utf-8") # Write file content
await computer.interface.read_bytes(path) # Read file content as bytes
await computer.interface.write_bytes(path, content) # Write file content as bytes
await computer.interface.delete_file(path) # Delete file
await computer.interface.create_dir(path) # Create directory
await computer.interface.delete_dir(path) # Delete directory
await computer.interface.list_dir(path) # List directory contents
# Accessibility
await computer.interface.get_accessibility_tree() # Get accessibility tree
# Python Virtual Environment Operations
await computer.venv_install("demo_venv", ["requests", "macos-pyxa"]) # Install packages in a virtual environment
await computer.venv_cmd("demo_venv", "python -c 'import requests; print(requests.get(`https://httpbin.org/ip`).json())'") # Run a shell command in a virtual environment
await computer.venv_exec("demo_venv", python_function_or_code, *args, **kwargs) # Run a Python function in a virtual environment and return the result / raise an exception
# Example: Use sandboxed functions to execute code in a C/ua Container
from computer.helpers import sandboxed
@sandboxed("demo_venv")
def greet_and_print(name):
"""Get the HTML of the current Safari tab"""
import PyXA
safari = PyXA.Application("Safari")
html = safari.current_document.source()
print(f"Hello from inside the container, {name}!")
return {"greeted": name, "safari_html": html}
# When a @sandboxed function is called, it will execute in the container
result = await greet_and_print("C/ua")
# Result: {"greeted": "C/ua", "safari_html": "<html>...</html>"}
# stdout and stderr are also captured and printed / raised
print("Result from sandboxed function:", result)
```
@@ -1,6 +1,7 @@
---
title: Compatibility
description: Compatibility information for running c/ua services.
icon: MonitorCheck
---
# Host OS Compatibility
@@ -20,32 +20,7 @@ Run the following command to setup the Docker containers and launch the Computer
_Best for contributors and active development._
This repository includes a [Dev Container](./.devcontainer/README.md) configuration that simplifies setup to a few steps:
1. **Install the Dev Containers extension ([VSCode](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers) or [WindSurf](https://docs.windsurf.com/windsurf/advanced#dev-containers-beta))**
2. **Open the repository in the Dev Container:**
- Press `Ctrl+Shift+P` (or `⌘+Shift+P` on macOS)
- **If you have _not_ cloned the repo:**
- Select `Dev Containers: Clone Repository in Container Volume...` and paste the repository URL:
```
https://github.com/trycua/cua.git
```
- **If you have already cloned the repo:**
- Select `Dev Containers: Open Folder in Container...` and choose your local folder.
> **Note**: On WindSurf, the post install hook might not run automatically. If it doesn't, run:
>
> ```
> /bin/bash .devcontainer/post-install.sh
> ```
3. **Open the VS Code workspace:** Once the post-install.sh is done running, open the python workspace located at `.vscode/py.code-workspace`.
4. **Run the Agent UI example:** Click <img src="https://github.com/user-attachments/assets/7a61ef34-4b22-4dab-9864-f86bf83e290b" className='inline-block mt-1 mb-1 rounded-md mx-1'/>
to start the Gradio UI. If prompted to install **debugpy (Python Debugger)** for remote debugging, select 'Yes' to proceed.
5. **Access the Gradio UI:** The Gradio UI will now be accessible http://localhost:7860.
Visit the [Dev Container](./dev-container-setup) guide to use the configuration that simplifies development setup to a few steps.
## PyPI
@@ -58,7 +33,7 @@ pip install -U "cua-computer[all]" "cua-agent[all]"
python -m agent.ui # Start the agent UI
```
Or check out the [Usage Guide](#-usage-guide) to learn how to use our Python SDK in your own code.
Or check out the [Usage Guide](./cua-usage-guide) to learn how to use our Python SDK in your own code.
---
@@ -73,4 +48,4 @@ Or check out the [Usage Guide](#-usage-guide) to learn how to use our Python SDK
# Compatibility
For detailed compatibility information including host OS support, VM emulation capabilities, and model provider compatibility, see the [Compatibility Guide](./compatibility).
For detailed compatibility information including host OS support, VM emulation capabilities, and model provider compatibility, see the [Compatibility Guide](../compatibility).
@@ -0,0 +1,82 @@
---
title: Dev Container Setup
description: Learn how to set up the Dev Container configuration that simplifies the development setup.
---
## Quick Start
![Guide-Animation](https://github.com/user-attachments/assets/447eaeeb-0eec-4354-9a82-44446e202e06)
1. **Install the Dev Containers extension ([VSCode](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers) or [WindSurf](https://docs.windsurf.com/windsurf/advanced#dev-containers-beta))**
2. **Open the repository in the Dev Container:**
- Press `Ctrl+Shift+P` (or `⌘+Shift+P` on macOS)
- **If you have _not_ cloned the repo:**
- Select `Dev Containers: Clone Repository in Container Volume...` and paste the repository URL:
```
https://github.com/trycua/cua.git
```
- **If you have already cloned the repo:** - Select `Dev Containers: Open Folder in Container...` and choose your local folder.
<Callout title="Windsurf Caveats">
The post install hook might not run automatically if you're using
Windsurf. If it didn't run, execute it manually:
<pre>
<code>/bin/bash .devcontainer/post-install.sh</code>
</pre>
</Callout>
3. **Open the VS Code workspace:** Once the post-install.sh is done running, open the python workspace located at `.vscode/py.code-workspace`.
4. **Run the Agent UI example:** Click <img src="https://github.com/user-attachments/assets/7a61ef34-4b22-4dab-9864-f86bf83e290b" className='inline-block mt-1 mb-1 rounded-md mx-1'/>
to start the Gradio UI. If prompted to install **debugpy (Python Debugger)** for remote debugging, select 'Yes' to proceed.
5. **Access the Gradio UI:** The Gradio UI will now be accessible http://localhost:7860.
## What's Included
The dev container automatically:
- ✅ Sets up Python 3.11 environment
- ✅ Installs all system dependencies (build tools, OpenGL, etc.)
- ✅ Configures Python paths for all packages
- ✅ Installs Python extensions (Black, Ruff, Pylance)
- ✅ Forwards port 7860 for the Gradio web UI
- ✅ Mounts your source code for live editing
- ✅ Creates the required `.env.local` file
## Running Examples
After the container is built, you can run examples directly:
```bash
# Run the agent UI (Gradio web interface)
python examples/agent_ui_examples.py
# Run computer examples
python examples/computer_examples.py
# Run computer UI examples
python examples/computer_ui_examples.py
```
The Gradio UI will be available at `http://localhost:7860` and will automatically forward to your host machine.
## Environment Variables
You'll need to add your API keys to `.env.local`:
```bash
# Required for Anthropic provider
ANTHROPIC_API_KEY=your_anthropic_key_here
# Required for OpenAI provider
OPENAI_API_KEY=your_openai_key_here
```
## Notes
- The container connects to `host.docker.internal:7777` for Lume server communication
- All Python packages are pre-installed and configured
- Source code changes are reflected immediately (no rebuild needed)
- The container uses the same Dockerfile as the regular Docker development environment
@@ -1,6 +1,7 @@
---
title: FAQ
description: Frequently Asked Questions
description: C/ua's frequently asked questions. Find answers to the most common issues or questions when using C/ua tools.
icon: CircleQuestionMark
---
### Why a local sandbox?
@@ -132,4 +133,4 @@ Where `<PID>` is the process ID shown in the output of the `lsof` command. After
### What information does Cua track?
Cua tracks anonymized usage and error report statistics; we ascribe to Posthog's approach as detailed [here](https://posthog.com/blog/open-source-telemetry-ethical). If you would like to opt out of sending anonymized info, you can set `telemetry_enabled` to false in the Computer or Agent constructor. Check out our [Telemetry](Telemetry.md) documentation for more details.
Cua tracks anonymized usage and error report statistics; we ascribe to Posthog's approach as detailed [here](https://posthog.com/blog/open-source-telemetry-ethical). If you would like to opt out of sending anonymized info, you can set `telemetry_enabled` to false in the Computer or Agent constructor. Check out our [telemetry](./telemetry) documentation for more details.
+61 -4
View File
@@ -1,13 +1,70 @@
---
title: Home
description: c/ua Documentation Home
icon: House
---
## What is C/ua?
C/ua is a collection of libraries and tools for building Computer-Use AI agents.
C/ua is a collection of cross-platform libraries and tools for building Computer-Use AI agents.
## Quick Start
<Cards>
<Card title="Libraries" href="/libraries" />
<Card title="Cloud Service" href="/cloud" />
<Card
href="https://docs.trycua.com/home/cua/computer-use-agent-quickstart"
title="Computer-Use Agent UI">
Read our guide on getting started with a Computer-Use Agent.
</Card>
<Card
href="https://docs.trycua.com/home/cua/cua-usage-guide"
title="C/ua Usage Guide">
Get started using C/ua services on your machine.
</Card>
<Card
href="https://docs.trycua.com/home/cua/dev-container-setup"
title="Dev Container Setup">
Set up a development environment with the Dev Container.
</Card>
</Cards>
## Resources
- [How to use the MCP Server with Claude Desktop or other MCP clients](./libraries/mcp-server) - One of the easiest ways to get started with C/ua
- [How to use OpenAI Computer-Use, Anthropic, OmniParser, or UI-TARS for your Computer-Use Agent](./libraries/agent)
- [How to use Lume CLI for managing desktops](./libraries/lume)
- [Training Computer-Use Models: Collecting Human Trajectories with C/ua (Part 1)](https://www.trycua.com/blog/training-computer-use-models-trajectories-1)
- [Build Your Own Operator on macOS (Part 1)](https://www.trycua.com/blog/build-your-own-operator-on-macos-1)
## Modules
| Module | Description | Installation |
| ------------------------------------------------------ | -------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------- |
| [**Lume**](./libraries/lume.mdx) | VM management for macOS/Linux using Apple's Virtualization.Framework | `curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh \| bash` |
| [**Lumier**](./libraries/lumier.mdx) | Docker interface for macOS and Linux VMs | `docker pull trycua/lumier:latest` |
| [**Computer**](./libraries/computer.mdx) | Python Interface for controlling virtual machines | `pip install "cua-computer[all]"`<br/><br/>`npm install @trycua/computer` |
| [**Agent**](./libraries/agent.mdx) | AI agent framework for automating tasks | `pip install "cua-agent[all]"` |
| [**MCP Server**](./libraries/mcp-server.mdx) | MCP server for using CUA with Claude Desktop | `pip install cua-mcp-server` |
| [**SOM**](./libs/python/som/README.md) | Self-of-Mark library for Agent | `pip install cua-som` |
| [**Computer Server**](./libraries/computer-server.mdx) | Server component for Computer | `pip install cua-computer-server` |
| [**Core**](./libraries/core.mdx) | Python Core utilities | `pip install cua-core`<br/><br/>`npm install @trycua/core` |
## Community
Join our [Discord community](https://discord.com/invite/mVnXXpdE85) to discuss ideas, get assistance, or share your demos!
## License
Cua is open-sourced under the MIT License - see the [LICENSE](https://github.com/trycua/cua/blob/main/LICENSE.md) file for details.
Microsoft's OmniParser, which is used in this project, is licensed under the Creative Commons Attribution 4.0 International License (CC-BY-4.0) - see the [OmniParser LICENSE](https://github.com/microsoft/OmniParser/blob/master/LICENSE) file for details.
## Contributing
We welcome contributions to CUA! Please refer to our [Contributing Guidelines](https://github.com/trycua/cua/blob/main/CONTRIBUTING.md) for details.
## Trademarks
Apple, macOS, and Apple Silicon are trademarks of Apple Inc. Ubuntu and Canonical are registered trademarks of Canonical Ltd. Microsoft is a registered trademark of Microsoft Corporation. This project is not affiliated with, endorsed by, or sponsored by Apple Inc., Canonical Ltd., or Microsoft Corporation.
+15 -2
View File
@@ -10,8 +10,7 @@ title: Agent
margin: '0 auto',
width: '100%',
justifyContent: 'center',
}}
>
}}>
<a href="#">
<img
src="https://img.shields.io/badge/Python-333333?logo=python&logoColor=white&labelColor=blue"
@@ -38,6 +37,20 @@ title: Agent
</a>
</div>
<div
align="center"
style={{
display: 'flex',
margin: '0 0',
width: '100%',
justifyContent: 'center',
}}
>
<a href="/api/computer">
Reference
</a>
</div>
**cua-agent** is a general Computer-Use framework for running multi-app agentic workflows targeting macOS and Linux sandbox created with C/ua, supporting local (Ollama) and cloud model providers (OpenAI, Anthropic, Groq, DeepSeek, Qwen).
### Get started with Agent
+18 -8
View File
@@ -2,7 +2,7 @@
title: Computer
---
import { Tabs, Tab } from "fumadocs-ui/components/tabs";
import { Tabs, Tab } from 'fumadocs-ui/components/tabs';
<div
align="center"
@@ -39,8 +39,21 @@ import { Tabs, Tab } from "fumadocs-ui/components/tabs";
/>
</a>
</div>
<div
align="center"
style={{
display: 'flex',
margin: '0 0',
width: '100%',
justifyContent: 'center',
}}
>
<a href="/api/computer">
Reference
</a>
</div>
**cua-computer** is a Computer-Use Interface (CUI) framework powering Cua for interacting with local macOS and Linux sandboxes. It's PyAutoGUI-compatible and pluggable with any AI agent systems (Cua, Langchain, CrewAI, AutoGen). Computer relies on [Lume](https://github.com/trycua/lume) for creating and managing sandbox environments.
**cua-computer** is a Computer-Use Interface (CUI) framework powering Cua for interacting with local macOS and Linux sandboxes. It's PyAutoGUI-compatible and pluggable with any AI agent systems (Cua, Langchain, CrewAI, AutoGen). Computer relies on [Lume](./lume.mdx) for creating and managing sandbox environments.
<div align="center">
<img src="/img/computer.png" />
@@ -219,8 +232,7 @@ For examples, see [Computer UI Examples](https://github.com/trycua/cua/tree/main
<video
src="https://github.com/user-attachments/assets/de3c3477-62fe-413c-998d-4063e48de176"
controls
width="600"
></video>
width="600"></video>
</details>
Record yourself performing various computer tasks using the UI.
@@ -232,8 +244,7 @@ Record yourself performing various computer tasks using the UI.
<video
src="https://github.com/user-attachments/assets/5ad1df37-026a-457f-8b49-922ae805faef"
controls
width="600"
></video>
width="600"></video>
</details>
Save each task by picking a descriptive name and adding relevant tags (e.g., "office", "web-browsing", "coding").
@@ -249,8 +260,7 @@ Repeat steps 3 and 4 until you have a good amount of demonstrations covering dif
<video
src="https://github.com/user-attachments/assets/c586d460-3877-4b5f-a736-3248886d2134"
controls
width="600"
></video>
width="600"></video>
</details>
Upload your dataset to Huggingface by:
+6 -2
View File
@@ -4,9 +4,13 @@
"root": true,
"defaultOpen": true,
"pages": [
"---[House]c/ua---",
"index",
"compatibility",
"faq",
"telemetry",
"---[BookCopy]Guides---",
"...cua",
"---[Library]Libraries---",
"...libraries"
]
}
}
+81
View File
@@ -0,0 +1,81 @@
---
title: Telemetry
description: This document explains how telemetry works in CUA libraries and how you can control it.
icon: RadioTower
---
# Telemetry in CUA
CUA tracks anonymized usage and error report statistics; we ascribe to Posthog's approach as detailed [here](https://posthog.com/blog/open-source-telemetry-ethical). If you would like to opt out of sending anonymized info, you can set `telemetry_enabled` to false.
## What telemetry data we collect
CUA libraries collect minimal anonymous usage data to help improve our software. The telemetry data we collect is specifically limited to:
- Basic system information:
- Operating system (e.g., 'darwin', 'win32', 'linux')
- Python version (e.g., '3.11.0')
- Module initialization events:
- When a module (like 'computer' or 'agent') is imported
- Version of the module being used
We do NOT collect:
- Personal information
- Contents of files
- Specific text being typed
- Actual screenshots or screen contents
- User-specific identifiers
- API keys
- File contents
- Application data or content
- User interactions with the computer
- Information about files being accessed
## Controlling Telemetry
We are committed to transparency and user control over telemetry. There are two ways to control telemetry:
### 1. Environment Variable (Global Control)
Telemetry is enabled by default. To disable telemetry, set the `CUA_TELEMETRY_ENABLED` environment variable to a falsy value (`0`, `false`, `no`, or `off`):
```bash
# Disable telemetry before running your script
export CUA_TELEMETRY_ENABLED=false
# Or as part of the command
CUA_TELEMETRY_ENABLED=1 python your_script.py
```
Or from Python:
```python
import os
os.environ["CUA_TELEMETRY_ENABLED"] = "false"
```
### 2. Instance-Level Control
You can control telemetry for specific CUA instances by setting `telemetry_enabled` when creating them:
```python
# Disable telemetry for a specific Computer instance
computer = Computer(telemetry_enabled=False)
# Enable telemetry for a specific Agent instance
agent = ComputerAgent(telemetry_enabled=True)
```
You can check if telemetry is enabled for an instance:
```python
print(computer.telemetry_enabled) # Will print True or False
```
Note that telemetry settings must be configured during initialization and cannot be changed after the object is created.
## Transparency
We believe in being transparent about the data we collect. If you have any questions about our telemetry practices, please open an issue on our GitHub repository.
+1 -1
View File
@@ -12,7 +12,7 @@
"fumadocs-core": "15.5.1",
"fumadocs-mdx": "11.6.7",
"fumadocs-ui": "15.5.1",
"lucide-react": "^0.514.0",
"lucide-react": "^0.525.0",
"next": "15.3.3",
"react": "^19.1.0",
"react-dom": "^19.1.0"
+5 -5
View File
@@ -18,8 +18,8 @@ importers:
specifier: 15.5.1
version: 15.5.1(@types/react-dom@19.1.6(@types/react@19.1.8))(@types/react@19.1.8)(next@15.3.3(react-dom@19.1.0(react@19.1.0))(react@19.1.0))(react-dom@19.1.0(react@19.1.0))(react@19.1.0)(tailwindcss@4.1.10)
lucide-react:
specifier: ^0.514.0
version: 0.514.0(react@19.1.0)
specifier: ^0.525.0
version: 0.525.0(react@19.1.0)
next:
specifier: 15.3.3
version: 15.3.3(react-dom@19.1.0(react@19.1.0))(react@19.1.0)
@@ -1391,8 +1391,8 @@ packages:
resolution: {integrity: sha512-QIXZUBJUx+2zHUdQujWejBkcD9+cs94tLn0+YL8UrCh+D5sCXZ4c7LaEH48pNwRY3MLDgqUFyhlCyjJPf1WP0A==}
engines: {node: 20 || >=22}
lucide-react@0.514.0:
resolution: {integrity: sha512-HXD0OAMd+JM2xCjlwG1EGW9Nuab64dhjO3+MvdyD+pSUeOTBaVAPhQblKIYmmX4RyBYbdzW0VWnJpjJmxWGr6w==}
lucide-react@0.525.0:
resolution: {integrity: sha512-Tm1txJ2OkymCGkvwoHt33Y2JpN5xucVq1slHcgE6Lk0WjDfjgKWor5CdVER8U6DvcfMwh4M8XxmpTiyzfmfDYQ==}
peerDependencies:
react: ^16.5.1 || ^17.0.0 || ^18.0.0 || ^19.0.0
@@ -3160,7 +3160,7 @@ snapshots:
lru-cache@11.1.0: {}
lucide-react@0.514.0(react@19.1.0):
lucide-react@0.525.0(react@19.1.0):
dependencies:
react: 19.1.0