diff --git a/docs/content/docs/home/quickstart-cli.mdx b/docs/content/docs/home/quickstart-cli.mdx index 794d925a..5718f240 100644 --- a/docs/content/docs/home/quickstart-cli.mdx +++ b/docs/content/docs/home/quickstart-cli.mdx @@ -1,52 +1,233 @@ --- title: Quickstart (CLI) -description: Get started with the c/ua Agent CLI in 5 steps +description: Get started with the c/ua Agent CLI in 4 steps icon: Rocket --- -Get up and running with the c/ua Agent CLI in 5 simple steps. +import { Step, Steps } from 'fumadocs-ui/components/steps'; +import { Tab, Tabs } from 'fumadocs-ui/components/tabs'; +import { Accordion, Accordions } from 'fumadocs-ui/components/accordion'; -## 1. Introduction +Get up and running with the c/ua Agent CLI in 4 simple steps. + + + + +## Introduction c/ua combines Computer (interface) + Agent (AI) for automating desktop apps. The Agent CLI provides a clean terminal interface to control your remote computer using natural language commands. -## 2. Create Your First c/ua Container + + + + +## Create Your First c/ua Container 1. Go to [trycua.com/signin](https://www.trycua.com/signin) 2. Navigate to **Dashboard > Containers > Create Instance** 3. Create a **Medium, Ubuntu 22** container 4. Note your container name and API key -## 3. Install c/ua + + + + +## Install c/ua + + + + + +### Install uv + + + ```bash -pip install "cua-agent[all]" cua-computer +# Use curl to download the script and execute it with sh: +curl -LsSf https://astral.sh/uv/install.sh | sh + +# If your system doesn't have curl, you can use wget: +# wget -qO- https://astral.sh/uv/install.sh | sh ``` -## 4. 
Run the Agent CLI + + -Choose your preferred AI model and run the CLI: +```powershell +# Use irm to download the script and execute it with iex: +powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex" +``` + + + + +### Install Python 3.12 + +```bash +uv python install 3.12 +# uv will install c/ua dependencies automatically when you use --with "cua-agent[cli]" +``` + + + + + +### Install conda + + + + +```bash +mkdir -p ~/miniconda3 +curl https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh -o ~/miniconda3/miniconda.sh +bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3 +rm ~/miniconda3/miniconda.sh +source ~/miniconda3/bin/activate +``` + + + + +```bash +mkdir -p ~/miniconda3 +wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh +bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3 +rm ~/miniconda3/miniconda.sh +source ~/miniconda3/bin/activate +``` + + + + +```powershell +wget "https://repo.anaconda.com/miniconda/Miniconda3-latest-Windows-x86_64.exe" -outfile ".\miniconda.exe" +Start-Process -FilePath ".\miniconda.exe" -ArgumentList "/S" -Wait +del .\miniconda.exe +``` + + + + +### Create and activate Python 3.12 environment + +```bash +conda create -n cua python=3.12 +conda activate cua +``` + +### Install c/ua + +```bash +pip install "cua-agent[cli]" cua-computer +``` + + + + + +### Install c/ua + +```bash +pip install "cua-agent[cli]" cua-computer +``` + + + + + + + + + +## Run c/ua CLI + +Choose your preferred AI model: ### OpenAI Computer Use Preview + + + + +```bash +uv run --with "cua-agent[cli]" -m agent.cli openai/computer-use-preview +``` + + + + ```bash python -m agent.cli openai/computer-use-preview ``` + + + ### Anthropic Claude + + + + +```bash +uv run --with "cua-agent[cli]" -m agent.cli anthropic/claude-3-5-sonnet-20241022 +uv run --with "cua-agent[cli]" -m agent.cli anthropic/claude-opus-4-20250514 +uv run --with "cua-agent[cli]" -m agent.cli 
anthropic/claude-sonnet-4-20250514 +``` + + + + ```bash python -m agent.cli anthropic/claude-3-5-sonnet-20241022 python -m agent.cli anthropic/claude-opus-4-20250514 python -m agent.cli anthropic/claude-sonnet-4-20250514 ``` + + + ### Omniparser + LLMs + + + + +```bash +uv run --with "cua-agent[cli]" -m agent.cli omniparser+anthropic/claude-3-5-sonnet-20241022 +uv run --with "cua-agent[cli]" -m agent.cli omniparser+openai/gpt-4o +uv run --with "cua-agent[cli]" -m agent.cli omniparser+vertex_ai/gemini-pro +``` + + + + ```bash python -m agent.cli omniparser+anthropic/claude-3-5-sonnet-20241022 python -m agent.cli omniparser+openai/gpt-4o python -m agent.cli omniparser+vertex_ai/gemini-pro ``` + + + ### Local Models + + + + +```bash +# Hugging Face models (local) +uv run --with "cua-agent[cli]" -m agent.cli huggingface-local/ByteDance-Seed/UI-TARS-1.5-7B + +# MLX models (Apple Silicon) +uv run --with "cua-agent[cli]" -m agent.cli mlx/mlx-community/UI-TARS-1.5-7B-6bit + +# Ollama models +uv run --with "cua-agent[cli]" -m agent.cli omniparser+ollama_chat/llama3.2:latest +``` + + + + ```bash # Hugging Face models (local) python -m agent.cli huggingface-local/ByteDance-Seed/UI-TARS-1.5-7B @@ -58,7 +239,10 @@ python -m agent.cli mlx/mlx-community/UI-TARS-1.5-7B-6bit python -m agent.cli omniparser+ollama_chat/llama3.2:latest ``` -## 5. Interactive Setup + + + +### Interactive Setup If you haven't set up environment variables, the CLI will guide you through the setup: @@ -66,7 +250,7 @@ If you haven't set up environment variables, the CLI will guide you through the 2. **CUA API Key**: Enter your c/ua API key 3. **Provider API Key**: Enter your AI provider API key (OpenAI, Anthropic, etc.) -## 6. 
Start Chatting +### Start Chatting Once connected, you'll see: ``` @@ -83,6 +267,9 @@ You can ask your agent to perform actions like: - "Close the current window" - "Click on the search button" + + + --- For advanced Python usage and GUI interface, see the [Quickstart (GUI)](/docs/quickstart-ui) and [Quickstart for Developers](/docs/quickstart-devs). diff --git a/docs/content/docs/home/quickstart-devs.mdx b/docs/content/docs/home/quickstart-devs.mdx index 291c37e3..e2c0c9d7 100644 --- a/docs/content/docs/home/quickstart-devs.mdx +++ b/docs/content/docs/home/quickstart-devs.mdx @@ -4,45 +4,104 @@ description: Get started with c/ua in 5 steps icon: Rocket --- +import { Step, Steps } from 'fumadocs-ui/components/steps'; +import { Tab, Tabs } from 'fumadocs-ui/components/tabs'; + Get up and running with c/ua in 5 simple steps. -## 1. Introduction + + + +## Introduction c/ua combines Computer (interface) + Agent (AI) for automating desktop apps. Computer handles clicks/typing, Agent provides the intelligence. -## 2. Create Your First c/ua Container + + + + +## Create Your First c/ua Container 1. Go to [trycua.com/signin](https://www.trycua.com/signin) 2. Navigate to **Dashboard > Containers > Create Instance** 3. Create a **Medium, Ubuntu 22** container 4. Note your container name and API key -## 3. Install c/ua + -```bash -pip install "cua-agent[all]" cua-computer -``` + -## 4. Using Computer +## Install c/ua -```python -from computer import Computer + + + ```bash + pip install "cua-agent[all]" cua-computer + ``` + + + ```bash + npm install @trycua/computer + ``` + + -async with Computer( - os_type="linux", - provider_type="cloud", - name="your-container-name", - api_key="your-api-key" -) as computer: - # Take screenshot - screenshot = await computer.interface.screenshot() - - # Click and type - await computer.interface.left_click(100, 100) - await computer.interface.type("Hello!") -``` + -## 5. 
Using Agent + + +## Using Computer + + + + ```python + from computer import Computer + + async with Computer( + os_type="linux", + provider_type="cloud", + name="your-container-name", + api_key="your-api-key" + ) as computer: + # Take screenshot + screenshot = await computer.interface.screenshot() + + # Click and type + await computer.interface.left_click(100, 100) + await computer.interface.type("Hello!") + ``` + + + ```typescript + import { Computer, OSType } from '@trycua/computer'; + + const computer = new Computer({ + osType: OSType.LINUX, + name: "your-container-name", + apiKey: "your-api-key" + }); + + await computer.run(); + + try { + // Take screenshot + const screenshot = await computer.interface.screenshot(); + + // Click and type + await computer.interface.leftClick(100, 100); + await computer.interface.typeText("Hello!"); + } finally { + await computer.close(); + } + ``` + + + + + + + +## Using Agent ```python from agent import ComputerAgent @@ -61,6 +120,9 @@ async for result in agent.run(messages): print(item["content"][0]["text"]) ``` + + + ## Next Steps - Explore the [SDK documentation](/docs/sdk) for advanced features diff --git a/docs/content/docs/home/quickstart-ui.mdx b/docs/content/docs/home/quickstart-ui.mdx index 1ac62129..141e4ec0 100644 --- a/docs/content/docs/home/quickstart-ui.mdx +++ b/docs/content/docs/home/quickstart-ui.mdx @@ -1,35 +1,154 @@ --- title: Quickstart (GUI) -description: Get started with the c/ua Agent UI in 5 steps +description: Get started with the c/ua Agent UI in 3 steps icon: Rocket --- -Get up and running with the c/ua Agent UI in 5 simple steps. +import { Step, Steps } from 'fumadocs-ui/components/steps'; +import { Tab, Tabs } from 'fumadocs-ui/components/tabs'; +import { Accordion, Accordions } from 'fumadocs-ui/components/accordion'; -## 1. Introduction +Get up and running with the c/ua Agent UI in 3 simple steps. 
+ + + + +## Introduction c/ua combines Computer (interface) + Agent (AI) for automating desktop apps. The Agent UI provides a simple chat interface to control your remote computer using natural language. -## 2. Create Your First c/ua Container + + + + +## Create Your First c/ua Container 1. Go to [trycua.com/signin](https://www.trycua.com/signin) 2. Navigate to **Dashboard > Containers > Create Instance** 3. Create a **Medium, Ubuntu 22** container 4. Note your container name and API key -## 3. Install c/ua + + + + +## Install and Run c/ua + + + + + +### Install uv + + + ```bash -pip install "cua-agent[all]" cua-computer +# Use curl to download the script and execute it with sh: +curl -LsSf https://astral.sh/uv/install.sh | sh + +# If your system doesn't have curl, you can use wget: +# wget -qO- https://astral.sh/uv/install.sh | sh ``` -## 4. Run the Agent UI + + + +```powershell +# Use irm to download the script and execute it with iex: +powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex" +``` + + + + +### Install Python 3.12 + +```bash +uv python install 3.12 +``` + +### Run c/ua + +```bash +uv run --with "cua-agent[ui]" -m agent.ui +``` + + + + + +### Install conda + + + + +```bash +mkdir -p ~/miniconda3 +curl https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh -o ~/miniconda3/miniconda.sh +bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3 +rm ~/miniconda3/miniconda.sh +source ~/miniconda3/bin/activate +``` + + + + +```bash +mkdir -p ~/miniconda3 +wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh +bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3 +rm ~/miniconda3/miniconda.sh +source ~/miniconda3/bin/activate +``` + + + + +```powershell +wget "https://repo.anaconda.com/miniconda/Miniconda3-latest-Windows-x86_64.exe" -outfile ".\miniconda.exe" +Start-Process -FilePath ".\miniconda.exe" -ArgumentList "/S" -Wait +del .\miniconda.exe +``` + + + + +### Create 
and activate Python 3.12 environment + +```bash +conda create -n cua python=3.12 +conda activate cua +``` + +### Install and run c/ua + +```bash +pip install "cua-agent[ui]" cua-computer +python -m agent.ui +``` + + + + + +### Install c/ua + +```bash +pip install "cua-agent[ui]" cua-computer +``` + +### Run the Agent UI ```bash python -m agent.ui ``` -## 5. Start Chatting + + + + +### Start Chatting Open your browser to the displayed URL and start chatting with your computer-using agent. @@ -38,6 +157,9 @@ You can ask your agent to perform actions like: - "Take a screenshot and tell me what's on the screen" - "Type 'Hello world' into the terminal" + + + --- For advanced Python usage, see the [Quickstart for Developers](/docs/quickstart-devs).
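
The CLI invocations in the CLI quickstart all follow one of two patterns: `uv run --with "cua-agent[cli]" -m agent.cli <model>` or `python -m agent.cli <model>`. As a minimal sketch of that pattern, the helper below assembles either form for a given model string. The name `build_cli_command` and its `use_uv` parameter are hypothetical and are not part of `cua-agent`; the function only builds the argument list and does not launch anything.

```python
import shlex


def build_cli_command(model: str, use_uv: bool = True) -> list[str]:
    """Return the argv list for launching the Agent CLI with `model`.

    Hypothetical helper for illustration only; it is not part of cua-agent.
    """
    if not model.strip():
        raise ValueError("model must be a non-empty string")

    # Both forms run the same module; only the launcher differs.
    module_args = ["-m", "agent.cli", model]
    if use_uv:
        # uv resolves cua-agent[cli] on the fly via --with.
        return ["uv", "run", "--with", "cua-agent[cli]", *module_args]
    # Assumes cua-agent[cli] is already installed in the active environment.
    return ["python", *module_args]


if __name__ == "__main__":
    # Print shell-quoted commands for two model strings from the quickstart.
    for name in (
        "openai/computer-use-preview",
        "omniparser+ollama_chat/llama3.2:latest",
    ):
        print(shlex.join(build_cli_command(name)))
```

Returning a list rather than a single string keeps the extras specifier `cua-agent[cli]` intact without manual shell quoting, so the result could be passed to `subprocess.run` unchanged.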