mirror of
https://github.com/trycua/lume.git
synced 2026-01-06 04:20:03 -06:00
Merge pull request #641 from trycua/f-trycua/readme-badge-colors
Update README badges with sky/emerald colors and larger logo
This commit is contained in:
24
README.md
24
README.md
@@ -1,14 +1,22 @@
|
||||
<div align="center">
|
||||
<picture>
|
||||
<source media="(prefers-color-scheme: dark)" alt="Cua logo" height="150" srcset="img/logo_white.png">
|
||||
<source media="(prefers-color-scheme: light)" alt="Cua logo" height="150" srcset="img/logo_black.png">
|
||||
<img alt="Cua logo" height="150" src="img/logo_black.png">
|
||||
</picture>
|
||||
<a href="https://cua.ai" target="_blank" rel="noopener noreferrer">
|
||||
<picture>
|
||||
<source media="(prefers-color-scheme: dark)" alt="Cua logo" width="150" srcset="img/logo_white.png">
|
||||
<source media="(prefers-color-scheme: light)" alt="Cua logo" width="150" srcset="img/logo_black.png">
|
||||
<img alt="Cua logo" width="500" src="img/logo_black.png">
|
||||
</picture>
|
||||
</a>
|
||||
|
||||
[](#)
|
||||
[](https://discord.com/invite/mVnXXpdE85)
|
||||
<br>
|
||||
<p align="center">Build and deploy AI agents that can reason, plan and act on any Computers</p>
|
||||
|
||||
<p align="center">
|
||||
<a href="https://cua.ai" target="_blank" rel="noopener noreferrer"><img src="https://img.shields.io/badge/cua.ai-0ea5e9" alt="cua.ai"></a>
|
||||
<a href="https://discord.gg/cua" target="_blank" rel="noopener noreferrer"><img src="https://img.shields.io/badge/Discord-Join%20Server-10b981?logo=discord&logoColor=white" alt="Discord"></a>
|
||||
<a href="https://x.com/trycua" target="_blank" rel="noopener noreferrer"><img src="https://img.shields.io/twitter/follow/trycua?style=social" alt="Twitter"></a>
|
||||
<a href="https://cua.ai/docs" target="_blank" rel="noopener noreferrer"><img src="https://img.shields.io/badge/Docs-0ea5e9.svg" alt="Documentation"></a>
|
||||
<br>
|
||||
<a href="https://trendshift.io/repositories/13685" target="_blank"><img src="https://trendshift.io/api/badge/repositories/13685" alt="trycua%2Fcua | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
|
||||
</p>
|
||||
|
||||
</div>
|
||||
|
||||
|
||||
@@ -4,11 +4,7 @@ description: Supported computer-using agent loops and models
|
||||
---
|
||||
|
||||
<Callout>
|
||||
A corresponding{' '}
|
||||
<a href="https://github.com/trycua/cua/blob/main/notebooks/agent_nb.ipynb" target="_blank">
|
||||
Jupyter Notebook
|
||||
</a>{' '}
|
||||
is available for this documentation.
|
||||
A corresponding <a href="https://github.com/trycua/cua/blob/main/notebooks/agent_nb.ipynb" target="_blank">Jupyter Notebook</a> is available for this documentation.
|
||||
</Callout>
|
||||
|
||||
An agent can be thought of as a loop - it generates actions, executes them, and repeats until done:
|
||||
|
||||
@@ -3,14 +3,7 @@ title: Customize ComputerAgent
|
||||
---
|
||||
|
||||
<Callout>
|
||||
A corresponding{' '}
|
||||
<a
|
||||
href="https://github.com/trycua/cua/blob/main/notebooks/customizing_computeragent.ipynb"
|
||||
target="_blank"
|
||||
>
|
||||
Jupyter Notebook
|
||||
</a>{' '}
|
||||
is available for this documentation.
|
||||
A corresponding <a href="https://github.com/trycua/cua/blob/main/notebooks/customizing_computeragent.ipynb" target="_blank">Jupyter Notebook</a> is available for this documentation.
|
||||
</Callout>
|
||||
|
||||
The `ComputerAgent` interface provides an easy proxy to any computer-using model configuration, and it is a powerful framework for extending and building your own agentic systems.
|
||||
|
||||
@@ -4,11 +4,7 @@ description: Use ComputerAgent with HUD for benchmarking and evaluation
|
||||
---
|
||||
|
||||
<Callout>
|
||||
A corresponding{' '}
|
||||
<a href="https://github.com/trycua/cua/blob/main/notebooks/eval_osworld.ipynb" target="_blank">
|
||||
Jupyter Notebook
|
||||
</a>{' '}
|
||||
is available for this documentation.
|
||||
A corresponding <a href="https://github.com/trycua/cua/blob/main/notebooks/eval_osworld.ipynb" target="_blank">Jupyter Notebook</a> is available for this documentation.
|
||||
</Callout>
|
||||
|
||||
The HUD integration allows an agent to be benchmarked using the [HUD framework](https://www.hud.so/). Through the HUD integration, the agent controls a computer inside HUD, where tests are run to evaluate the success of each task.
|
||||
|
||||
17
docs/content/docs/agent-sdk/mcp-server/index.mdx
Normal file
17
docs/content/docs/agent-sdk/mcp-server/index.mdx
Normal file
@@ -0,0 +1,17 @@
|
||||
---
|
||||
title: MCP Server
|
||||
description: Run Cua agents through Claude Desktop and other MCP clients
|
||||
---
|
||||
|
||||
The MCP Server exposes Cua agents as tools for [Model Context Protocol](https://modelcontextprotocol.io/) clients like Claude Desktop. This lets you ask Claude to perform computer tasks directly from the chat interface.
|
||||
|
||||
```bash
|
||||
pip install cua-mcp-server
|
||||
```
|
||||
|
||||
## Key Features
|
||||
|
||||
- **Claude Desktop integration** - Use Cua agents directly in Claude's chat
|
||||
- **Multi-client support** - Concurrent sessions with automatic resource management
|
||||
- **Progress reporting** - Real-time updates during task execution
|
||||
- **VM safety** - Runs in sandboxed VMs by default
|
||||
@@ -14,6 +14,7 @@
|
||||
"usage-tracking",
|
||||
"telemetry",
|
||||
"benchmarks",
|
||||
"integrations"
|
||||
"integrations",
|
||||
"mcp-server"
|
||||
]
|
||||
}
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
---
|
||||
title: Commands
|
||||
title: Command Reference
|
||||
description: Complete reference for all CUA CLI commands
|
||||
---
|
||||
|
||||
68
docs/content/docs/cli-playbook/index.mdx
Normal file
68
docs/content/docs/cli-playbook/index.mdx
Normal file
@@ -0,0 +1,68 @@
|
||||
---
|
||||
title: Getting Started
|
||||
description: Install and set up the CUA CLI
|
||||
---
|
||||
|
||||
import { Tabs, Tab } from 'fumadocs-ui/components/tabs';
|
||||
import { Callout } from 'fumadocs-ui/components/callout';
|
||||
|
||||
The Cua CLI is a command-line tool for managing your Cua cloud sandboxes. Create, start, stop, and connect to sandboxes directly from your terminal.
|
||||
|
||||
## Installation
|
||||
|
||||
<Tabs items={['macOS / Linux', 'Windows']}>
|
||||
<Tab value="macOS / Linux">
|
||||
```bash
|
||||
curl -LsSf https://cua.ai/cli/install.sh | sh
|
||||
```
|
||||
</Tab>
|
||||
<Tab value="Windows">
|
||||
```powershell
|
||||
powershell -ExecutionPolicy ByPass -c "irm https://cua.ai/cli/install.ps1 | iex"
|
||||
```
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
This installs [Bun](https://bun.sh) and the CUA CLI. Verify with:
|
||||
|
||||
```bash
|
||||
cua --help
|
||||
```
|
||||
|
||||
## Authentication
|
||||
|
||||
Login to your CUA account:
|
||||
|
||||
```bash
|
||||
# Browser-based login
|
||||
cua auth login
|
||||
|
||||
# Or with API key
|
||||
cua auth login --api-key sk-your-api-key-here
|
||||
```
|
||||
|
||||
Generate a `.env` file for your project:
|
||||
|
||||
```bash
|
||||
cua auth env
|
||||
```
|
||||
|
||||
## Quick Start
|
||||
|
||||
```bash
|
||||
# Create a sandbox
|
||||
cua create --os linux --size small --region north-america
|
||||
|
||||
# List sandboxes
|
||||
cua list
|
||||
|
||||
# Open VNC in browser
|
||||
cua vnc my-sandbox
|
||||
|
||||
# Stop a sandbox
|
||||
cua stop my-sandbox
|
||||
```
|
||||
|
||||
## Next Steps
|
||||
|
||||
- [Command Reference](/cli-playbook/commands) - Full list of available commands
|
||||
5
docs/content/docs/cli-playbook/meta.json
Normal file
5
docs/content/docs/cli-playbook/meta.json
Normal file
@@ -0,0 +1,5 @@
|
||||
{
|
||||
"title": "Cloud CLI",
|
||||
"description": "Command-line interface for CUA Cloud",
|
||||
"pages": ["index", "commands"]
|
||||
}
|
||||
@@ -5,7 +5,7 @@ description: Computer commands and interface methods
|
||||
|
||||
This page describes the set of supported **commands** you can use to control a Cua Computer directly via the Python SDK.
|
||||
|
||||
These commands map to the same actions available in the [Computer Server API Commands Reference](../libraries/computer-server/Commands), and provide low-level, async access to system operations from your agent or automation code.
|
||||
These commands map to the same actions available in the [Computer Server API Commands Reference](/computer-sdk/computer-server/Commands), and provide low-level, async access to system operations from your agent or automation code.
|
||||
|
||||
## Shell Actions
|
||||
|
||||
|
||||
15
docs/content/docs/computer-sdk/computer-server/index.mdx
Normal file
15
docs/content/docs/computer-sdk/computer-server/index.mdx
Normal file
@@ -0,0 +1,15 @@
|
||||
---
|
||||
title: Computer Server
|
||||
description: HTTP/WebSocket server for remote computer control
|
||||
---
|
||||
|
||||
The Computer Server is an HTTP and WebSocket server that runs inside each Cua sandbox (VM or container). It exposes APIs for remote computer control - allowing the Computer SDK and agents to execute actions like clicking, typing, taking screenshots, and running commands on the sandboxed environment.
|
||||
|
||||
When you use `Computer(provider_type="cloud")` or any other provider, the Computer SDK communicates with this server running inside the sandbox to execute your automation commands.
|
||||
|
||||
## Key Features
|
||||
|
||||
- **REST API** - Execute commands, take screenshots, manage files
|
||||
- **WebSocket API** - Real-time streaming for continuous interaction
|
||||
- **Cross-platform** - Runs on Linux, macOS, and Windows sandboxes
|
||||
- **Secure** - Isolated inside the sandbox environment
|
||||
4
docs/content/docs/computer-sdk/computer-server/meta.json
Normal file
4
docs/content/docs/computer-sdk/computer-server/meta.json
Normal file
@@ -0,0 +1,4 @@
|
||||
{
|
||||
"title": "Computer Server",
|
||||
"pages": ["index", "Commands", "REST-API", "WebSocket-API"]
|
||||
}
|
||||
@@ -1,5 +1,5 @@
|
||||
---
|
||||
title: Computer UI (Deprecated)
|
||||
title: Computer UI
|
||||
---
|
||||
|
||||
<Callout type="warn" title="Deprecated">
|
||||
|
||||
@@ -7,6 +7,7 @@
|
||||
"tracing-api",
|
||||
"sandboxed-python",
|
||||
"custom-computer-handlers",
|
||||
"computer-ui"
|
||||
"computer-ui",
|
||||
"computer-server"
|
||||
]
|
||||
}
|
||||
|
||||
@@ -4,14 +4,7 @@ slug: sandboxed-python
|
||||
---
|
||||
|
||||
<Callout>
|
||||
A corresponding{' '}
|
||||
<a
|
||||
href="https://github.com/trycua/cua/blob/main/examples/sandboxed_functions_examples.py"
|
||||
target="_blank"
|
||||
>
|
||||
Python example
|
||||
</a>{' '}
|
||||
is available for this documentation.
|
||||
A corresponding <a href="https://github.com/trycua/cua/blob/main/examples/sandboxed_functions_examples.py" target="_blank">Python example</a> is available for this documentation.
|
||||
</Callout>
|
||||
|
||||
You can run Python functions securely inside a sandboxed virtual environment on a remote Cua Computer. This is useful for executing untrusted user code, isolating dependencies, or providing a safe environment for automation tasks.
|
||||
|
||||
@@ -1,9 +1,9 @@
|
||||
---
|
||||
title: Computer Tracing API
|
||||
title: Tracing
|
||||
description: Record computer interactions for debugging, training, and analysis
|
||||
---
|
||||
|
||||
# Computer Tracing API
|
||||
# Tracing
|
||||
|
||||
The Computer tracing API provides a powerful way to record computer interactions for debugging, training, analysis, and compliance purposes. Inspired by Playwright's tracing functionality, it offers flexible recording options and standardized output formats.
|
||||
|
||||
|
||||
@@ -19,8 +19,6 @@ import { Code, Terminal } from 'lucide-react';
|
||||
</Card>
|
||||
</div> */}
|
||||
|
||||
---
|
||||
|
||||
## Set Up Your Computer Environment
|
||||
|
||||
Choose how you want to run your Cua computer. This will be the environment where your automated tasks will execute.
|
||||
@@ -43,7 +41,7 @@ You can run your Cua computer in the cloud (recommended for easiest setup), loca
|
||||
**Option 1: Via Website**
|
||||
|
||||
1. Navigate to **Dashboard > Sandboxes > Create Sandbox**
|
||||
2. Create a **Small** sandbox, choosing **Linux**, **Windows**, or **macOS**
|
||||
2. Create a sandbox, choosing **Linux**, **Windows**, or **macOS**
|
||||
3. Note your sandbox name
|
||||
|
||||
**Option 2: Via CLI**
|
||||
|
||||
@@ -4,55 +4,39 @@ title: Introduction
|
||||
|
||||
import { Monitor, Code, BookOpen, Zap, Bot, Boxes, Rocket } from 'lucide-react';
|
||||
|
||||
<div className="rounded-lg border bg-card text-card-foreground shadow-sm px-4 py-2 mb-6">
|
||||
Cua is an open-source framework for building **Computer-Use Agents** - AI systems that see,
|
||||
understand, and interact with desktop applications through vision and action, just like humans do.
|
||||
<div className="not-prose relative rounded-xl overflow-hidden mb-8 w-full h-40 -mt-4">
|
||||
<img src="/docs/img/bg-light.jpg" alt="Cua" className="w-full h-full object-cover block dark:hidden rounded-xl" />
|
||||
<img src="/docs/img/bg-dark.jpg" alt="Cua" className="w-full h-full object-cover hidden dark:block rounded-xl" />
|
||||
<div className="absolute inset-0 flex items-center justify-center"><div className="bg-black/50 backdrop-blur-sm rounded-lg py-3 px-5 [&_p]:m-0 [&_p]:p-0"><span className="text-white text-lg font-medium" style={{display: 'block', margin: 0, padding: 0, lineHeight: '1.2'}}>Build AI agents that see, understand, and control any computer</span></div></div>
|
||||
</div>
|
||||
|
||||
## Why Cua?
|
||||
**Cua** ("koo-ah") is an open-source framework for Computer-Use Agents - enabling AI systems to autonomously operate computers through visual understanding and action execution. Used for research, evaluation, and production deployment of desktop, browser, and mobile automation agents.
|
||||
|
||||
Cua gives you everything you need to automate any desktop application without brittle selectors or APIs.
|
||||
## What are Computer-Use Agents?
|
||||
|
||||
Some highlights include:
|
||||
Computer-Use Agents (CUAs) are AI systems that can autonomously interact with computer interfaces through visual understanding and action execution. Unlike traditional automation tools that rely on brittle selectors or APIs, CUAs use vision-language models to perceive screen content and reason about interface interactions - enabling them to adapt to UI changes and handle complex, multi-step workflows across applications.
|
||||
|
||||
- **Model flexibility** - Connect to 100+ LLM providers through liteLLM's standard interface. Use models from Anthropic, OpenAI, Google, and more - or run them locally with Ollama, Hugging Face, or MLX.
|
||||
- **Composed agents** - Mix and match grounding models with planning models for optimal performance. Use specialized models like GTA, OpenCUA, or OmniParser for UI element detection paired with powerful reasoning models like Claude or GPT-4.
|
||||
- **Cross-platform sandboxes** - Run agents safely in isolated environments. Choose from Docker containers, macOS VMs with Lume, Windows Sandbox, or deploy to Cua Cloud with production-ready infrastructure.
|
||||
- **Computer SDK** - Control any application with a PyAutoGUI-like API. Click, type, scroll, take screenshots, manage windows, read/write files - everything you need for desktop automation.
|
||||
- **Agent SDK** - Build autonomous agents with trajectory tracing, prompt caching, cost tracking, and budget controls. Test agents on industry-standard benchmarks like OSWorld-Verified with one line of code.
|
||||
- **Human-in-the-loop** - Pause agent execution and await user input or approval before continuing. Use the `human/human` model string to let humans control the agent directly.
|
||||
- **Production essentials** - Ship reliable agents with built-in PII anonymization, cost tracking, trajectory logging, and integration with observability platforms like Laminar and HUD.
|
||||
## Key Features
|
||||
|
||||
## What can you build?
|
||||
With the **Computer SDK**, you can:
|
||||
- Automate **Windows, Linux, and macOS** VMs with a consistent, pyautogui-like API
|
||||
- Create & manage VMs locally or using **Cua Cloud**
|
||||
|
||||
- RPA automation that works with any application - even legacy software without APIs.
|
||||
- Form-filling agents that handle complex multi-step web workflows.
|
||||
- Testing automation that adapts to UI changes without brittle selectors.
|
||||
- Data extraction from desktop applications and document processing.
|
||||
- Cross-application workflows that combine multiple tools and services.
|
||||
- Research agents that browse, read, and synthesize information from the web.
|
||||
With the **Agent SDK**, you can:
|
||||
- Run computer-use models with a consistent schema
|
||||
- Benchmark on **OSWorld-Verified**, **SheetBench-V2**, and **ScreenSpot**
|
||||
- Combine UI grounding models with any LLM using **composed agents**
|
||||
- Use **100+ models** via API or local inference (Claude, GPT-4, Gemini, Ollama, MLX)
|
||||
|
||||
Explore real-world examples in our [blog posts](https://cua.ai/blog).
|
||||
## Get Started
|
||||
|
||||
## Get started
|
||||
Follow the [Quickstart guide](/get-started/quickstart) for step-by-step setup with Python or TypeScript.
|
||||
|
||||
Follow the [Quickstart guide](/docs/get-started/quickstart) for step-by-step setup with Python or TypeScript.
|
||||
Check out our [tutorials](https://cua.ai/blog), [examples](https://github.com/trycua/cua/tree/main/examples), and [notebooks](https://github.com/trycua/cua/tree/main/notebooks) to start building with Cua today.
|
||||
|
||||
If you're new to computer-use agents, check out our [tutorials](https://cua.ai/blog), [examples](https://github.com/trycua/cua/tree/main/examples), and [notebooks](https://github.com/trycua/cua/tree/main/notebooks) to start building with Cua today.
|
||||
|
||||
<div className="grid grid-cols-1 md:grid-cols-2 gap-6 mt-8">
|
||||
<Card icon={<Rocket />} href="/get-started/quickstart" title="Quickstart">
|
||||
Get up and running in 3 steps with Python or TypeScript.
|
||||
</Card>
|
||||
<Card icon={<Zap />} href="/agent-sdk/agent-loops" title="Agent Loops">
|
||||
Learn how agents work and how to build your own.
|
||||
</Card>
|
||||
<Card icon={<BookOpen />} href="/computer-sdk/computers" title="Computer SDK">
|
||||
Control desktop applications with the Computer SDK.
|
||||
</Card>
|
||||
<Card icon={<Monitor />} href="/example-usecases/form-filling" title="Example Use Cases">
|
||||
See Cua in action with real-world examples.
|
||||
</Card>
|
||||
<div className="grid grid-cols-2 md:grid-cols-4 gap-2 mt-4 text-sm">
|
||||
<Card icon={<Rocket className="w-4 h-4" />} href="/get-started/quickstart" title="Quickstart" />
|
||||
<Card icon={<Zap className="w-4 h-4" />} href="/agent-sdk/agent-loops" title="Agent Loops" />
|
||||
<Card icon={<BookOpen className="w-4 h-4" />} href="/computer-sdk/computers" title="Computer SDK" />
|
||||
<Card icon={<Monitor className="w-4 h-4" />} href="/example-usecases/form-filling" title="Examples" />
|
||||
</div>
|
||||
|
||||
We can't wait to see what you build with Cua ✨
|
||||
|
||||
@@ -1,21 +0,0 @@
|
||||
---
|
||||
title: Agent
|
||||
description: Reference for the current version of the Agent library.
|
||||
pypi: cua-agent
|
||||
github:
|
||||
- https://github.com/trycua/cua/tree/main/libs/python/agent
|
||||
---
|
||||
|
||||
The Agent library provides the ComputerAgent class and tools for building AI agents that automate workflows on Cua Computers.
|
||||
|
||||
## Agent Loops
|
||||
|
||||
See the [Agent Loops](../agent-sdk/agent-loops) documentation for how agents process information and take actions.
|
||||
|
||||
## Chat History
|
||||
|
||||
See the [Chat History](../agent-sdk/chat-history) documentation for managing conversational context and turn-by-turn interactions.
|
||||
|
||||
## Callbacks
|
||||
|
||||
See the [Callbacks](../agent-sdk/callbacks) documentation for extending and customizing agent behavior with custom hooks.
|
||||
@@ -1,24 +0,0 @@
|
||||
---
|
||||
title: Computer Server
|
||||
descrption: Reference for the current version of the Computer Server library.
|
||||
pypi: cua-computer-server
|
||||
github:
|
||||
- https://github.com/trycua/cua/tree/main/libs/python/computer-server
|
||||
---
|
||||
|
||||
<Callout>
|
||||
A corresponding{' '}
|
||||
<a
|
||||
href="https://github.com/trycua/cua/blob/main/notebooks/computer_server_nb.ipynb"
|
||||
target="_blank"
|
||||
>
|
||||
Jupyter Notebook
|
||||
</a>{' '}
|
||||
is available for this documentation.
|
||||
</Callout>
|
||||
|
||||
The Computer Server API reference documentation is currently under development.
|
||||
|
||||
## Overview
|
||||
|
||||
The Computer Server provides WebSocket and REST API endpoints for remote computer control and automation.
|
||||
@@ -1,23 +0,0 @@
|
||||
---
|
||||
title: Computer
|
||||
description: Reference for the current version of the Computer library.
|
||||
pypi: cua-computer
|
||||
npm: '@trycua/computer'
|
||||
github:
|
||||
- https://github.com/trycua/cua/tree/main/libs/python/computer
|
||||
- https://github.com/trycua/cua/tree/main/libs/typescript/computer
|
||||
---
|
||||
|
||||
The Computer library provides a Computer class for controlling and automating containers running the Computer Server.
|
||||
|
||||
## Connecting to Computers
|
||||
|
||||
See the [Cua Computers](../computer-sdk/computers) documentation for how to connect to different computer types (cloud, local, or host desktop).
|
||||
|
||||
## Computer Commands
|
||||
|
||||
See the [Commands](../computer-sdk/commands) documentation for all supported commands and interface methods (Shell, Mouse, Keyboard, File System, etc.).
|
||||
|
||||
## Sandboxed Python Functions
|
||||
|
||||
See the [Sandboxed Python](../computer-sdk/sandboxed-python) documentation for running Python functions securely in isolated environments on a remote Cua Computer.
|
||||
@@ -1,13 +0,0 @@
|
||||
---
|
||||
title: Core
|
||||
description: Reference for the current version of the Core library.
|
||||
pypi: cua-core
|
||||
npm: '@trycua/core'
|
||||
github:
|
||||
- https://github.com/trycua/cua/tree/main/libs/python/core
|
||||
- https://github.com/trycua/cua/tree/main/libs/typescript/core
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
The Core library provides foundational utilities and shared functionality across the CUA ecosystem.
|
||||
@@ -1,58 +0,0 @@
|
||||
---
|
||||
title: Cua CLI
|
||||
description: Command-line interface for managing Cua cloud sandboxes and authentication
|
||||
---
|
||||
|
||||
import { Tabs, Tab } from 'fumadocs-ui/components/tabs';
|
||||
|
||||
The Cua CLI is a command-line tool that provides an intuitive interface for managing your Cua cloud sandboxes and authentication. It offers a streamlined workflow for creating, managing, and connecting to cloud sandboxes.
|
||||
|
||||
## Key Features
|
||||
|
||||
- **Authentication Management**: Secure login with browser-based OAuth flow
|
||||
- **Sandbox Lifecycle**: Create, start, stop, restart, and delete cloud sandboxes
|
||||
- **Quick Access**: Direct links to VNC and playground interfaces
|
||||
- **Cross-Platform**: Works on macOS, Linux, and Windows
|
||||
- **Environment Integration**: Automatic `.env` file generation
|
||||
|
||||
## Quick Example
|
||||
|
||||
```bash
|
||||
# Install the CLI (installs Bun + CUA CLI)
|
||||
curl -LsSf https://cua.ai/cli/install.sh | sh
|
||||
|
||||
# Login to your CUA account
|
||||
cua auth login
|
||||
|
||||
# Create a new Linux sandbox
|
||||
cua sb create --os linux --size small --region north-america
|
||||
|
||||
# List your sandboxes
|
||||
cua sb list
|
||||
```
|
||||
|
||||
## Use Cases
|
||||
|
||||
### Development Workflow
|
||||
|
||||
- Quickly spin up cloud sandboxes for testing
|
||||
- Manage multiple sandboxes across different regions
|
||||
- Integrate with CI/CD pipelines
|
||||
|
||||
### Team Collaboration
|
||||
|
||||
- Share sandbox configurations and access
|
||||
- Standardize development environments
|
||||
- Quick onboarding for new team members
|
||||
|
||||
### Automation
|
||||
|
||||
- Script sandbox provisioning and management
|
||||
- Integrate with deployment workflows
|
||||
- Automate environment setup
|
||||
|
||||
## Next Steps
|
||||
|
||||
- [Install the CLI](/libraries/cua-cli/installation)
|
||||
- [Learn about available commands](/libraries/cua-cli/commands)
|
||||
- [Get started with the quickstart guide](/get-started/quickstart#cli-quickstart)
|
||||
@@ -1,130 +0,0 @@
|
||||
---
|
||||
title: Installation
|
||||
description: Install the CUA CLI on your system
|
||||
---
|
||||
|
||||
import { Tabs, Tab } from 'fumadocs-ui/components/tabs';
|
||||
import { Callout } from 'fumadocs-ui/components/callout';
|
||||
|
||||
## Quick Install
|
||||
|
||||
The fastest way to install the CUA CLI is using our installation scripts:
|
||||
|
||||
<Tabs items={['macOS / Linux', 'Windows']}>
|
||||
<Tab value="macOS / Linux">```bash curl -LsSf https://cua.ai/cli/install.sh | sh ```</Tab>
|
||||
<Tab value="Windows">
|
||||
```powershell powershell -ExecutionPolicy ByPass -c "irm https://cua.ai/cli/install.ps1 | iex"
|
||||
```
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
These scripts will automatically:
|
||||
|
||||
1. Install [Bun](https://bun.sh) (a fast JavaScript runtime)
|
||||
2. Install the CUA CLI via `bun add -g @trycua/cli`
|
||||
|
||||
<Callout type="info">
|
||||
The installation scripts will automatically detect your system and install the appropriate binary
|
||||
to your PATH.
|
||||
</Callout>
|
||||
|
||||
## Alternative: Install with Bun
|
||||
|
||||
You can also install the CLI directly using Bun:
|
||||
|
||||
```bash
|
||||
# Install Bun if you don't have it
|
||||
curl -fsSL https://bun.sh/install | bash
|
||||
|
||||
# Install CUA CLI
|
||||
bun add -g @trycua/cli
|
||||
```
|
||||
|
||||
<Callout type="info">
|
||||
Using Bun provides faster installation and better performance compared to npm. If you don't have
|
||||
Bun installed, the first command will install it for you.
|
||||
</Callout>
|
||||
|
||||
## Verify Installation
|
||||
|
||||
After installation, verify the CLI is working:
|
||||
|
||||
```bash
|
||||
cua --help
|
||||
```
|
||||
|
||||
You should see the CLI help output with available commands.
|
||||
|
||||
## First Time Setup
|
||||
|
||||
After installation, you'll need to authenticate with your CUA account:
|
||||
|
||||
```bash
|
||||
# Login with browser-based OAuth flow
|
||||
cua auth login
|
||||
|
||||
# Or provide your API key directly
|
||||
cua auth login --api-key sk-your-api-key-here
|
||||
```
|
||||
|
||||
## Updating
|
||||
|
||||
To update to the latest version:
|
||||
|
||||
<Tabs items={['Script Install', 'npm Install']}>
|
||||
<Tab value="Script Install">
|
||||
Re-run the installation script: ```bash # macOS/Linux curl -LsSf https://cua.ai/cli/install.sh |
|
||||
sh # Windows powershell -ExecutionPolicy ByPass -c "irm https://cua.ai/cli/install.ps1 | iex"
|
||||
```
|
||||
</Tab>
|
||||
<Tab value="npm Install">```bash npm update -g @trycua/cli ```</Tab>
|
||||
</Tabs>
|
||||
|
||||
## Uninstalling
|
||||
|
||||
<Tabs items={['Script Install', 'npm Install']}>
|
||||
<Tab value="Script Install">
|
||||
Remove the binary from your PATH: ```bash # macOS/Linux rm $(which cua) # Windows # Remove from
|
||||
your PATH or delete the executable ```
|
||||
</Tab>
|
||||
<Tab value="npm Install">```bash npm uninstall -g @trycua/cli ```</Tab>
|
||||
</Tabs>
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Command Not Found
|
||||
|
||||
If you get a "command not found" error after installation:
|
||||
|
||||
1. **Check your PATH**: Make sure the installation directory is in your PATH
|
||||
2. **Restart your terminal**: Close and reopen your terminal/command prompt
|
||||
3. **Manual PATH setup**: Add the installation directory to your PATH manually
|
||||
|
||||
### Permission Issues
|
||||
|
||||
If you encounter permission issues during installation:
|
||||
|
||||
<Tabs items={['macOS / Linux', 'Windows']}>
|
||||
<Tab value="macOS / Linux">
|
||||
Try running with sudo (not recommended for the curl method): ```bash # If using npm sudo npm
|
||||
install -g @trycua/cli ```
|
||||
</Tab>
|
||||
<Tab value="Windows">
|
||||
Run PowerShell as Administrator: ```powershell # Right-click PowerShell and "Run as
|
||||
Administrator" powershell -ExecutionPolicy ByPass -c "irm https://cua.ai/cli/install.ps1 | iex"
|
||||
```
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
### Network Issues
|
||||
|
||||
If the installation script fails due to network issues:
|
||||
|
||||
1. **Check your internet connection**
|
||||
2. **Try the npm installation method instead**
|
||||
3. **Check if your firewall is blocking the download**
|
||||
|
||||
## Next Steps
|
||||
|
||||
- [Learn about CLI commands](/libraries/cua-cli/commands)
|
||||
- [Follow the quickstart guide](/get-started/quickstart#cli-quickstart)
|
||||
@@ -1,5 +0,0 @@
|
||||
{
|
||||
"title": "CLI",
|
||||
"description": "Command-line interface for CUA",
|
||||
"pages": ["index", "installation", "commands"]
|
||||
}
|
||||
@@ -1,27 +0,0 @@
|
||||
---
|
||||
title: MCP Server
|
||||
description: Reference for the current version of the MCP Server library.
|
||||
pypi: cua-mcp-server
|
||||
github:
|
||||
- https://github.com/trycua/cua/tree/main/libs/python/mcp-server
|
||||
---
|
||||
|
||||
**cua-mcp-server** is a MCP server for the Computer-Use Agent (CUA), allowing you to run CUA through Claude Desktop or other MCP clients.
|
||||
|
||||
## Features
|
||||
|
||||
- **Multi-Client Support**: Concurrent sessions with automatic resource management
|
||||
- **Progress Reporting**: Real-time progress updates during task execution
|
||||
- **Error Handling**: Robust error recovery with screenshot capture
|
||||
- **Concurrent Execution**: Run multiple tasks in parallel for improved performance
|
||||
- **Session Management**: Automatic cleanup and resource pooling
|
||||
- **LiteLLM Integration**: Support for multiple model providers
|
||||
- **VM Safety**: Default VM execution with optional host system control
|
||||
|
||||
## Quick Start
|
||||
|
||||
1. **Install**: `pip install cua-mcp-server`
|
||||
2. **Configure**: Add to your MCP client configuration
|
||||
3. **Use**: Ask Claude to perform computer tasks
|
||||
|
||||
See the [Installation](/docs/libraries/mcp-server/installation) guide for detailed setup instructions.
|
||||
@@ -1,78 +0,0 @@
|
||||
---
|
||||
title: Configuration
|
||||
---
|
||||
|
||||
### Detection Parameters
|
||||
|
||||
#### Box Threshold (0.3)
|
||||
|
||||
Controls the confidence threshold for accepting detections:
|
||||
|
||||
<img
|
||||
src="/docs/img/som_box_threshold.png"
|
||||
alt="Illustration of confidence thresholds in object detection, with a high-confidence detection accepted and a low-confidence detection rejected."
|
||||
width="500px"
|
||||
/>
|
||||
- Higher values (0.3) yield more precise but fewer detections - Lower values (0.01) catch more
|
||||
potential icons but increase false positives - Default is 0.3 for optimal precision/recall balance
|
||||
|
||||
#### IOU Threshold (0.1)
|
||||
|
||||
Controls how overlapping detections are merged:
|
||||
|
||||
<img
|
||||
src="/docs/img/som_iou_threshold.png"
|
||||
alt="Diagram showing Intersection over Union (IOU) with low overlap between two boxes kept separate and high overlap leading to merging."
|
||||
width="500px"
|
||||
/>
|
||||
- Lower values (0.1) more aggressively remove overlapping boxes - Higher values (0.5) allow more
|
||||
overlapping detections - Default is 0.1 to handle densely packed UI elements
|
||||
|
||||
### OCR Configuration
|
||||
|
||||
- **Engine**: EasyOCR
|
||||
- Primary choice for all platforms
|
||||
- Fast initialization and processing
|
||||
- Built-in English language support
|
||||
- GPU acceleration when available
|
||||
|
||||
- **Settings**:
|
||||
- Timeout: 5 seconds
|
||||
- Confidence threshold: 0.5
|
||||
- Paragraph mode: Disabled
|
||||
- Language: English only
|
||||
|
||||
## Performance
|
||||
|
||||
### Hardware Acceleration
|
||||
|
||||
#### MPS (Metal Performance Shaders)
|
||||
|
||||
- Multi-scale detection (640px, 1280px, 1920px)
|
||||
- Test-time augmentation enabled
|
||||
- Half-precision (FP16)
|
||||
- Average detection time: ~0.4s
|
||||
- Best for production use when available
|
||||
|
||||
#### CPU
|
||||
|
||||
- Single-scale detection (1280px)
|
||||
- Full-precision (FP32)
|
||||
- Average detection time: ~1.3s
|
||||
- Reliable fallback option
|
||||
|
||||
### Example Output Structure
|
||||
|
||||
```
|
||||
examples/output/
|
||||
├── {timestamp}_no_ocr/
|
||||
│ ├── annotated_images/
|
||||
│ │ └── screenshot_analyzed.png
|
||||
│ ├── screen_details.txt
|
||||
│ └── summary.json
|
||||
└── {timestamp}_ocr/
|
||||
├── annotated_images/
|
||||
│ └── screenshot_analyzed.png
|
||||
├── screen_details.txt
|
||||
└── summary.json
|
||||
```
|
||||
@@ -1,66 +0,0 @@
|
||||
---
|
||||
title: Set-of-Mark
|
||||
description: Reference for the current version of the Set-of-Mark library.
|
||||
pypi: cua-som
|
||||
github:
|
||||
- https://github.com/trycua/cua/tree/main/libs/python/som
|
||||
---
|
||||
|
||||
<Callout>
|
||||
A corresponding{' '}
|
||||
<a href="https://github.com/trycua/cua/blob/main/examples/som_examples.py" target="_blank">
|
||||
Python example
|
||||
</a>{' '}
|
||||
is available for this documentation.
|
||||
</Callout>
|
||||
|
||||
## Overview
|
||||
|
||||
The SOM library provides visual element detection and interaction capabilities. It is based on the [Set-of-Mark](https://arxiv.org/abs/2310.11441) research paper and the [OmniParser](https://github.com/microsoft/OmniParser) model.
|
||||
|
||||
## API Documentation
|
||||
|
||||
### OmniParser Class
|
||||
|
||||
```python
|
||||
class OmniParser:
|
||||
def __init__(self, device: str = "auto"):
|
||||
"""Initialize the parser with automatic device detection"""
|
||||
|
||||
def parse(
|
||||
self,
|
||||
image: PIL.Image,
|
||||
box_threshold: float = 0.3,
|
||||
iou_threshold: float = 0.1,
|
||||
use_ocr: bool = True,
|
||||
ocr_engine: str = "easyocr"
|
||||
) -> ParseResult:
|
||||
"""Parse UI elements from an image"""
|
||||
```
|
||||
|
||||
### ParseResult Object
|
||||
|
||||
```python
|
||||
@dataclass
|
||||
class ParseResult:
|
||||
elements: List[UIElement] # Detected elements
|
||||
visualized_image: PIL.Image # Annotated image
|
||||
processing_time: float # Time in seconds
|
||||
|
||||
def to_dict(self) -> dict:
|
||||
"""Convert to JSON-serializable dictionary"""
|
||||
|
||||
def filter_by_type(self, elem_type: str) -> List[UIElement]:
|
||||
"""Filter elements by type ('icon' or 'text')"""
|
||||
```
|
||||
|
||||
### UIElement
|
||||
|
||||
```python
|
||||
class UIElement(BaseModel):
|
||||
id: Optional[int] = Field(None) # Element ID (1-indexed)
|
||||
type: Literal["icon", "text"] # Element type
|
||||
bbox: BoundingBox # Bounding box coordinates { x1, y1, x2, y2 }
|
||||
interactivity: bool = Field(default=False) # Whether the element is interactive
|
||||
confidence: float = Field(default=1.0) # Detection confidence
|
||||
```
|
||||
5
docs/content/docs/macos-vm-cli-playbook/meta.json
Normal file
5
docs/content/docs/macos-vm-cli-playbook/meta.json
Normal file
@@ -0,0 +1,5 @@
|
||||
{
|
||||
"title": "macOS VM CLI",
|
||||
"description": "CLI tools for macOS virtualization",
|
||||
"pages": ["lume", "lumier"]
|
||||
}
|
||||
@@ -10,9 +10,11 @@
|
||||
"...example-usecases",
|
||||
"---[BookCopy]Computer Playbook---",
|
||||
"...computer-sdk",
|
||||
"---[BookCopy]Agent Playbook---",
|
||||
"---[Bot]Agent Playbook---",
|
||||
"...agent-sdk",
|
||||
"---[CodeXml]API Reference---",
|
||||
"...libraries"
|
||||
"---[Terminal]Cloud CLI Playbook---",
|
||||
"...cli-playbook",
|
||||
"---[Terminal]macOS VM CLI Playbook---",
|
||||
"...macos-vm-cli-playbook"
|
||||
]
|
||||
}
|
||||
|
||||
BIN
docs/public/img/bg-dark.jpg
Normal file
BIN
docs/public/img/bg-dark.jpg
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 277 KiB |
BIN
docs/public/img/bg-light.jpg
Normal file
BIN
docs/public/img/bg-light.jpg
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 418 KiB |
@@ -2,6 +2,32 @@
|
||||
@import 'fumadocs-ui/css/neutral.css';
|
||||
@import 'fumadocs-ui/css/preset.css';
|
||||
|
||||
/* Custom Sky + Emerald theme */
|
||||
@theme {
|
||||
--color-fd-primary: hsl(199, 89%, 48%); /* sky-500 */
|
||||
--color-fd-primary-foreground: hsl(0, 0%, 100%);
|
||||
--color-fd-ring: hsl(199, 89%, 48%); /* sky-500 */
|
||||
--color-fd-muted: hsl(160, 84%, 95%); /* emerald-50 */
|
||||
--color-fd-accent: hsl(152, 76%, 92%); /* emerald-100 */
|
||||
}
|
||||
|
||||
.dark {
|
||||
--color-fd-primary: hsl(199, 89%, 48%); /* sky-500 */
|
||||
--color-fd-primary-foreground: hsl(0, 0%, 100%);
|
||||
--color-fd-ring: hsl(199, 89%, 48%); /* sky-500 */
|
||||
--color-fd-muted: hsl(199, 89%, 14%); /* sky-950 */
|
||||
--color-fd-accent: hsl(199, 89%, 20%); /* sky dark */
|
||||
}
|
||||
|
||||
.dark body {
|
||||
background-image: linear-gradient(
|
||||
rgba(14, 165, 233, 0.1),
|
||||
transparent 20rem,
|
||||
transparent
|
||||
);
|
||||
background-repeat: no-repeat;
|
||||
}
|
||||
|
||||
/* Fix TOC overflow on production builds */
|
||||
#nd-toc {
|
||||
overflow-y: auto;
|
||||
|
||||
Reference in New Issue
Block a user