mirror of
https://github.com/trycua/computer.git
synced 2026-01-08 06:20:00 -06:00
Merge pull request #582 from trycua/study-docs-structure
Fixes pre-launch week
@@ -4,11 +4,7 @@ description: Supported computer-using agent loops and models
|
||||
---
|
||||
|
||||
<Callout>
|
||||
A corresponding{' '}
|
||||
<a href="https://github.com/trycua/cua/blob/main/notebooks/agent_nb.ipynb" target="_blank">
|
||||
Jupyter Notebook
|
||||
</a>{' '}
|
||||
is available for this documentation.
|
||||
A corresponding <a href="https://github.com/trycua/cua/blob/main/notebooks/agent_nb.ipynb" target="_blank">Jupyter Notebook</a> is available for this documentation.
|
||||
</Callout>
|
||||
|
||||
An agent can be thought of as a loop - it generates actions, executes them, and repeats until done:
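As a library-independent illustration of that loop, here is a minimal sketch. The `propose_action` and `execute` callables, the `max_steps` cap, and the scripted stand-ins are all hypothetical; they are not part of the Cua API and only show the generate, execute, repeat shape.

```python
from typing import Callable, List, Optional


def run_agent_loop(
    propose_action: Callable[[List[dict]], Optional[str]],
    execute: Callable[[str], str],
    max_steps: int = 10,
) -> List[dict]:
    """Generate an action, execute it, record the result, repeat until done."""
    history: List[dict] = []
    for _ in range(max_steps):
        action = propose_action(history)  # hypothetical stand-in for a model call
        if action is None:                # the "model" signals that it is finished
            break
        observation = execute(action)     # hypothetical stand-in for a computer call
        history.append({"action": action, "observation": observation})
    return history


# Toy stand-ins so the sketch runs without any model or VM.
def scripted_model(history: List[dict]) -> Optional[str]:
    plan = ["open browser", "click submit"]
    return plan[len(history)] if len(history) < len(plan) else None


trace = run_agent_loop(scripted_model, lambda action: f"done: {action}")
```

The `max_steps` cap is a design choice in this sketch: it keeps a misbehaving model from looping forever.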
|
||||
|
||||
@@ -1,16 +1,9 @@
|
||||
---
|
||||
title: Customizing Your ComputerAgent
|
||||
title: Customize ComputerAgent
|
||||
---
|
||||
|
||||
<Callout>
|
||||
A corresponding{' '}
|
||||
<a
|
||||
href="https://github.com/trycua/cua/blob/main/notebooks/customizing_computeragent.ipynb"
|
||||
target="_blank"
|
||||
>
|
||||
Jupyter Notebook
|
||||
</a>{' '}
|
||||
is available for this documentation.
|
||||
A corresponding <a href="https://github.com/trycua/cua/blob/main/notebooks/customizing_computeragent.ipynb" target="_blank">Jupyter Notebook</a> is available for this documentation.
|
||||
</Callout>
|
||||
|
||||
The `ComputerAgent` interface provides an easy proxy to any computer-using model configuration, and it is a powerful framework for extending and building your own agentic systems.
|
||||
|
||||
@@ -4,11 +4,7 @@ description: Use ComputerAgent with HUD for benchmarking and evaluation
|
||||
---
|
||||
|
||||
<Callout>
|
||||
A corresponding{' '}
|
||||
<a href="https://github.com/trycua/cua/blob/main/notebooks/eval_osworld.ipynb" target="_blank">
|
||||
Jupyter Notebook
|
||||
</a>{' '}
|
||||
is available for this documentation.
|
||||
A corresponding <a href="https://github.com/trycua/cua/blob/main/notebooks/eval_osworld.ipynb" target="_blank">Jupyter Notebook</a> is available for this documentation.
|
||||
</Callout>
|
||||
|
||||
The HUD integration allows an agent to be benchmarked using the [HUD framework](https://www.hud.so/). Through the HUD integration, the agent controls a computer inside HUD, where tests are run to evaluate the success of each task.
|
||||
|
||||
@@ -10,12 +10,10 @@
|
||||
"customizing-computeragent",
|
||||
"callbacks",
|
||||
"custom-tools",
|
||||
"custom-computer-handlers",
|
||||
"prompt-caching",
|
||||
"usage-tracking",
|
||||
"telemetry",
|
||||
"benchmarks",
|
||||
"migration-guide",
|
||||
"integrations"
|
||||
]
|
||||
}
|
||||
|
||||
@@ -1,7 +1,11 @@
|
||||
---
|
||||
title: Computer UI
|
||||
title: Computer UI (Deprecated)
|
||||
---
|
||||
|
||||
<Callout type="warn" title="Deprecated">
|
||||
The Computer UI is deprecated and will be replaced with a revamped playground experience soon. We recommend using VNC or Screen Sharing for precise control of the computer instead.
|
||||
</Callout>
|
||||
|
||||
The computer module includes a Gradio UI for creating and sharing demonstration data. Its upload-to-Hugging Face feature makes it easy to build community datasets for better computer-use models.
|
||||
|
||||
```bash
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
{
|
||||
"title": "Computer SDK",
|
||||
"description": "Build computer-using agents with the Computer SDK",
|
||||
"pages": ["computers", "commands", "computer-ui", "tracing-api", "sandboxed-python"]
|
||||
"pages": ["computers", "commands", "tracing-api", "sandboxed-python", "custom-computer-handlers", "computer-ui"]
|
||||
}
|
||||
|
||||
@@ -4,14 +4,7 @@ slug: sandboxed-python
|
||||
---
|
||||
|
||||
<Callout>
|
||||
A corresponding{' '}
|
||||
<a
|
||||
href="https://github.com/trycua/cua/blob/main/examples/sandboxed_functions_examples.py"
|
||||
target="_blank"
|
||||
>
|
||||
Python example
|
||||
</a>{' '}
|
||||
is available for this documentation.
|
||||
A corresponding <a href="https://github.com/trycua/cua/blob/main/examples/sandboxed_functions_examples.py" target="_blank">Python example</a> is available for this documentation.
|
||||
</Callout>
|
||||
|
||||
You can run Python functions securely inside a sandboxed virtual environment on a remote Cua Computer. This is useful for executing untrusted user code, isolating dependencies, or providing a safe environment for automation tasks.
|
||||
|
||||
@@ -3,7 +3,7 @@ title: Form Filling
|
||||
description: Enhance and Automate Interactions Between Form Filling and Local File Systems
|
||||
---
|
||||
|
||||
import { EditableCodeBlock, EditableValue, S } from '@/components/editable-code-block';
|
||||
import { Step, Steps } from 'fumadocs-ui/components/steps';
|
||||
import { Tab, Tabs } from 'fumadocs-ui/components/tabs';
|
||||
|
||||
## Overview
|
||||
@@ -12,9 +12,17 @@ Cua can be used to automate interactions between form filling and local file sys
|
||||
|
||||
This preset use case uses [Cua Computer](/computer-sdk/computers) to interact with a web page and the local file system, and [Agent Loops](/agent-sdk/agent-loops) to run the agent in a loop with message history.
|
||||
|
||||
## Quickstart
|
||||
---
|
||||
|
||||
Create a `requirements.txt` file with the following dependencies:
|
||||
<Steps>
|
||||
|
||||
<Step>
|
||||
|
||||
### Set Up Your Environment
|
||||
|
||||
First, install the required dependencies:
|
||||
|
||||
Create a `requirements.txt` file:
|
||||
|
||||
```text
|
||||
cua-agent
|
||||
@@ -22,33 +30,32 @@ cua-computer
|
||||
python-dotenv>=1.0.0
|
||||
```
|
||||
|
||||
And install:
|
||||
Install the dependencies:
|
||||
|
||||
```bash
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
|
||||
Create a `.env` file with the following environment variables:
|
||||
Create a `.env` file with your API keys:
|
||||
|
||||
```text
|
||||
ANTHROPIC_API_KEY=your-api-key
|
||||
ANTHROPIC_API_KEY=your-anthropic-api-key
|
||||
CUA_API_KEY=sk_cua-api01...
|
||||
```
|
||||
|
||||
Select the environment you want to run the code in (_click on the underlined values in the code to edit them directly!_):
|
||||
</Step>
|
||||
|
||||
<Tabs items={['☁️ Cloud', '🐳 Docker', '🍎 Lume', '🪟 Windows Sandbox']}>
|
||||
<Tab value="☁️ Cloud">
|
||||
<Step>
|
||||
|
||||
<EditableCodeBlock
|
||||
key="cloud-tab"
|
||||
lang="python"
|
||||
defaultValues={{
|
||||
"sandbox-name": "m-linux-...",
|
||||
"api_key": "sk_cua-api01..."
|
||||
}}
|
||||
>
|
||||
{`import asyncio
|
||||
### Create Your Form Filling Script
|
||||
|
||||
Create a Python file (e.g., `form_filling.py`) and select your environment:
|
||||
|
||||
<Tabs items={['Cloud Sandbox', 'Linux on Docker', 'macOS Sandbox', 'Windows Sandbox']}>
|
||||
<Tab value="Cloud Sandbox">
|
||||
|
||||
```python
|
||||
import asyncio
|
||||
import logging
|
||||
import os
|
||||
import signal
|
||||
@@ -59,21 +66,21 @@ from computer import Computer, VMProviderType
|
||||
from dotenv import load_dotenv
|
||||
|
||||
logging.basicConfig(level=logging.INFO)
|
||||
logger = logging.getLogger(**name**)
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
def handle_sigint(sig, frame):
|
||||
print("\\n\\nExecution interrupted by user. Exiting gracefully...")
|
||||
exit(0)
|
||||
print("\n\nExecution interrupted by user. Exiting gracefully...")
|
||||
exit(0)
|
||||
|
||||
async def fill_application():
|
||||
try:
|
||||
async with Computer(
|
||||
os_type="linux",
|
||||
provider_type=VMProviderType.CLOUD,
|
||||
name="`}<EditableValue placeholder="sandbox-name" />{`",
|
||||
api_key="`}<EditableValue placeholder="api_key" />{`",
|
||||
verbosity=logging.INFO,
|
||||
) as computer:
|
||||
try:
|
||||
async with Computer(
|
||||
os_type="linux",
|
||||
provider_type=VMProviderType.CLOUD,
|
||||
name="your-sandbox-name", # Replace with your sandbox name
|
||||
api_key=os.environ["CUA_API_KEY"],
|
||||
verbosity=logging.INFO,
|
||||
) as computer:
|
||||
|
||||
agent = ComputerAgent(
|
||||
model="anthropic/claude-sonnet-4-5-20250929",
|
||||
@@ -93,7 +100,7 @@ verbosity=logging.INFO,
|
||||
history = []
|
||||
|
||||
for i, task in enumerate(tasks, 1):
|
||||
print(f"\\n[Task {i}/{len(tasks)}] {task}")
|
||||
print(f"\n[Task {i}/{len(tasks)}] {task}")
|
||||
|
||||
# Add user message to history
|
||||
history.append({"role": "user", "content": task})
|
||||
@@ -116,7 +123,7 @@ verbosity=logging.INFO,
|
||||
|
||||
print(f"✅ Task {i}/{len(tasks)} completed")
|
||||
|
||||
print("\\n🎉 All tasks completed successfully!")
|
||||
print("\n🎉 All tasks completed successfully!")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error in fill_application: {e}")
|
||||
@@ -124,18 +131,18 @@ verbosity=logging.INFO,
|
||||
raise
|
||||
|
||||
def main():
|
||||
try:
|
||||
load_dotenv()
|
||||
try:
|
||||
load_dotenv()
|
||||
|
||||
if "ANTHROPIC_API_KEY" not in os.environ:
|
||||
raise RuntimeError(
|
||||
"Please set the ANTHROPIC_API_KEY environment variable.\\n"
|
||||
"Please set the ANTHROPIC_API_KEY environment variable.\n"
|
||||
"You can add it to a .env file in the project root."
|
||||
)
|
||||
|
||||
if "CUA_API_KEY" not in os.environ:
|
||||
raise RuntimeError(
|
||||
"Please set the CUA_API_KEY environment variable.\\n"
|
||||
"Please set the CUA_API_KEY environment variable.\n"
|
||||
"You can add it to a .env file in the project root."
|
||||
)
|
||||
|
||||
@@ -147,22 +154,15 @@ load_dotenv()
|
||||
logger.error(f"Error running automation: {e}")
|
||||
traceback.print_exc()
|
||||
|
||||
if **name** == "**main**":
|
||||
main()`}
|
||||
|
||||
</EditableCodeBlock>
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
```
|
||||
|
||||
</Tab>
|
||||
<Tab value="🍎 Lume">
|
||||
<Tab value="Linux on Docker">
|
||||
|
||||
<EditableCodeBlock
|
||||
key="lume-tab"
|
||||
lang="python"
|
||||
defaultValues={{
|
||||
"sandbox-name": "macos-sequoia-cua:latest"
|
||||
}}
|
||||
>
|
||||
{`import asyncio
|
||||
```python
|
||||
import asyncio
|
||||
import logging
|
||||
import os
|
||||
import signal
|
||||
@@ -173,20 +173,20 @@ from computer import Computer, VMProviderType
|
||||
from dotenv import load_dotenv
|
||||
|
||||
logging.basicConfig(level=logging.INFO)
|
||||
logger = logging.getLogger(**name**)
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
def handle_sigint(sig, frame):
|
||||
print("\\n\\nExecution interrupted by user. Exiting gracefully...")
|
||||
exit(0)
|
||||
print("\n\nExecution interrupted by user. Exiting gracefully...")
|
||||
exit(0)
|
||||
|
||||
async def fill_application():
|
||||
try:
|
||||
async with Computer(
|
||||
os_type="macos",
|
||||
provider_type=VMProviderType.LUME,
|
||||
name="`}<EditableValue placeholder="sandbox-name" />{`",
|
||||
verbosity=logging.INFO,
|
||||
) as computer:
|
||||
try:
|
||||
async with Computer(
|
||||
os_type="linux",
|
||||
provider_type=VMProviderType.DOCKER,
|
||||
image="trycua/cua-xfce:latest", # or "trycua/cua-ubuntu:latest"
|
||||
verbosity=logging.INFO,
|
||||
) as computer:
|
||||
|
||||
agent = ComputerAgent(
|
||||
model="anthropic/claude-sonnet-4-5-20250929",
|
||||
@@ -206,7 +206,7 @@ verbosity=logging.INFO,
|
||||
history = []
|
||||
|
||||
for i, task in enumerate(tasks, 1):
|
||||
print(f"\\n[Task {i}/{len(tasks)}] {task}")
|
||||
print(f"\n[Task {i}/{len(tasks)}] {task}")
|
||||
|
||||
# Add user message to history
|
||||
history.append({"role": "user", "content": task})
|
||||
@@ -229,7 +229,7 @@ verbosity=logging.INFO,
|
||||
|
||||
print(f"✅ Task {i}/{len(tasks)} completed")
|
||||
|
||||
print("\\n🎉 All tasks completed successfully!")
|
||||
print("\n🎉 All tasks completed successfully!")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error in fill_application: {e}")
|
||||
@@ -237,12 +237,12 @@ verbosity=logging.INFO,
|
||||
raise
|
||||
|
||||
def main():
|
||||
try:
|
||||
load_dotenv()
|
||||
try:
|
||||
load_dotenv()
|
||||
|
||||
if "ANTHROPIC_API_KEY" not in os.environ:
|
||||
raise RuntimeError(
|
||||
"Please set the ANTHROPIC_API_KEY environment variable.\\n"
|
||||
"Please set the ANTHROPIC_API_KEY environment variable.\n"
|
||||
"You can add it to a .env file in the project root."
|
||||
)
|
||||
|
||||
@@ -254,20 +254,15 @@ load_dotenv()
|
||||
logger.error(f"Error running automation: {e}")
|
||||
traceback.print_exc()
|
||||
|
||||
if **name** == "**main**":
|
||||
main()`}
|
||||
|
||||
</EditableCodeBlock>
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
```
|
||||
|
||||
</Tab>
|
||||
<Tab value="🪟 Windows Sandbox">
|
||||
<Tab value="macOS Sandbox">
|
||||
|
||||
<EditableCodeBlock
|
||||
key="windows-tab"
|
||||
lang="python"
|
||||
defaultValues={{}}
|
||||
>
|
||||
{`import asyncio
|
||||
```python
|
||||
import asyncio
|
||||
import logging
|
||||
import os
|
||||
import signal
|
||||
@@ -278,19 +273,20 @@ from computer import Computer, VMProviderType
|
||||
from dotenv import load_dotenv
|
||||
|
||||
logging.basicConfig(level=logging.INFO)
|
||||
logger = logging.getLogger(**name**)
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
def handle_sigint(sig, frame):
|
||||
print("\\n\\nExecution interrupted by user. Exiting gracefully...")
|
||||
exit(0)
|
||||
print("\n\nExecution interrupted by user. Exiting gracefully...")
|
||||
exit(0)
|
||||
|
||||
async def fill_application():
|
||||
try:
|
||||
async with Computer(
|
||||
os_type="windows",
|
||||
provider_type=VMProviderType.WINDOWS_SANDBOX,
|
||||
verbosity=logging.INFO,
|
||||
) as computer:
|
||||
try:
|
||||
async with Computer(
|
||||
os_type="macos",
|
||||
provider_type=VMProviderType.LUME,
|
||||
name="macos-sequoia-cua:latest",
|
||||
verbosity=logging.INFO,
|
||||
) as computer:
|
||||
|
||||
agent = ComputerAgent(
|
||||
model="anthropic/claude-sonnet-4-5-20250929",
|
||||
@@ -310,7 +306,7 @@ verbosity=logging.INFO,
|
||||
history = []
|
||||
|
||||
for i, task in enumerate(tasks, 1):
|
||||
print(f"\\n[Task {i}/{len(tasks)}] {task}")
|
||||
print(f"\n[Task {i}/{len(tasks)}] {task}")
|
||||
|
||||
# Add user message to history
|
||||
history.append({"role": "user", "content": task})
|
||||
@@ -333,7 +329,7 @@ verbosity=logging.INFO,
|
||||
|
||||
print(f"✅ Task {i}/{len(tasks)} completed")
|
||||
|
||||
print("\\n🎉 All tasks completed successfully!")
|
||||
print("\n🎉 All tasks completed successfully!")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error in fill_application: {e}")
|
||||
@@ -341,12 +337,12 @@ verbosity=logging.INFO,
|
||||
raise
|
||||
|
||||
def main():
|
||||
try:
|
||||
load_dotenv()
|
||||
try:
|
||||
load_dotenv()
|
||||
|
||||
if "ANTHROPIC_API_KEY" not in os.environ:
|
||||
raise RuntimeError(
|
||||
"Please set the ANTHROPIC_API_KEY environment variable.\\n"
|
||||
"Please set the ANTHROPIC_API_KEY environment variable.\n"
|
||||
"You can add it to a .env file in the project root."
|
||||
)
|
||||
|
||||
@@ -358,22 +354,15 @@ load_dotenv()
|
||||
logger.error(f"Error running automation: {e}")
|
||||
traceback.print_exc()
|
||||
|
||||
if **name** == "**main**":
|
||||
main()`}
|
||||
|
||||
</EditableCodeBlock>
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
```
|
||||
|
||||
</Tab>
|
||||
<Tab value="🐳 Docker">
|
||||
<Tab value="Windows Sandbox">
|
||||
|
||||
<EditableCodeBlock
|
||||
key="docker-tab"
|
||||
lang="python"
|
||||
defaultValues={{
|
||||
"sandbox-name": "trycua/cua-ubuntu:latest"
|
||||
}}
|
||||
>
|
||||
{`import asyncio
|
||||
```python
|
||||
import asyncio
|
||||
import logging
|
||||
import os
|
||||
import signal
|
||||
@@ -384,20 +373,19 @@ from computer import Computer, VMProviderType
|
||||
from dotenv import load_dotenv
|
||||
|
||||
logging.basicConfig(level=logging.INFO)
|
||||
logger = logging.getLogger(**name**)
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
def handle_sigint(sig, frame):
|
||||
print("\\n\\nExecution interrupted by user. Exiting gracefully...")
|
||||
exit(0)
|
||||
print("\n\nExecution interrupted by user. Exiting gracefully...")
|
||||
exit(0)
|
||||
|
||||
async def fill_application():
|
||||
try:
|
||||
async with Computer(
|
||||
os_type="linux",
|
||||
provider_type=VMProviderType.DOCKER,
|
||||
name="`}<EditableValue placeholder="sandbox-name" />{`",
|
||||
verbosity=logging.INFO,
|
||||
) as computer:
|
||||
try:
|
||||
async with Computer(
|
||||
os_type="windows",
|
||||
provider_type=VMProviderType.WINDOWS_SANDBOX,
|
||||
verbosity=logging.INFO,
|
||||
) as computer:
|
||||
|
||||
agent = ComputerAgent(
|
||||
model="anthropic/claude-sonnet-4-5-20250929",
|
||||
@@ -417,7 +405,7 @@ verbosity=logging.INFO,
|
||||
history = []
|
||||
|
||||
for i, task in enumerate(tasks, 1):
|
||||
print(f"\\n[Task {i}/{len(tasks)}] {task}")
|
||||
print(f"\n[Task {i}/{len(tasks)}] {task}")
|
||||
|
||||
# Add user message to history
|
||||
history.append({"role": "user", "content": task})
|
||||
@@ -440,7 +428,7 @@ verbosity=logging.INFO,
|
||||
|
||||
print(f"✅ Task {i}/{len(tasks)} completed")
|
||||
|
||||
print("\\n🎉 All tasks completed successfully!")
|
||||
print("\n🎉 All tasks completed successfully!")
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error in fill_application: {e}")
|
||||
@@ -448,12 +436,12 @@ verbosity=logging.INFO,
|
||||
raise
|
||||
|
||||
def main():
|
||||
try:
|
||||
load_dotenv()
|
||||
try:
|
||||
load_dotenv()
|
||||
|
||||
if "ANTHROPIC_API_KEY" not in os.environ:
|
||||
raise RuntimeError(
|
||||
"Please set the ANTHROPIC_API_KEY environment variable.\\n"
|
||||
"Please set the ANTHROPIC_API_KEY environment variable.\n"
|
||||
"You can add it to a .env file in the project root."
|
||||
)
|
||||
|
||||
@@ -465,16 +453,41 @@ load_dotenv()
|
||||
logger.error(f"Error running automation: {e}")
|
||||
traceback.print_exc()
|
||||
|
||||
if **name** == "**main**":
|
||||
main()`}
|
||||
|
||||
</EditableCodeBlock>
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
```
|
||||
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
</Step>
|
||||
|
||||
<Step>
|
||||
|
||||
### Run Your Script
|
||||
|
||||
Execute your form filling automation:
|
||||
|
||||
```bash
|
||||
python form_filling.py
|
||||
```
|
||||
|
||||
The agent will:
|
||||
1. Download the PDF resume from Overleaf
|
||||
2. Extract information from the PDF
|
||||
3. Fill out the JotForm with the extracted information
|
||||
|
||||
Monitor the output to see the agent's progress through each task.
|
||||
|
||||
</Step>
|
||||
|
||||
</Steps>
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
- Learn more about [Cua computers](/computer-sdk/computers) and [computer commands](/computer-sdk/commands)
|
||||
- Read about [Agent loops](/agent-sdk/agent-loops), [tools](/agent-sdk/custom-tools), and [supported model providers](/agent-sdk/supported-model-providers/)
|
||||
- Experiment with different [Models and Providers](/agent-sdk/supported-model-providers/)
|
||||
- Join our [Discord community](https://discord.com/invite/mVnXXpdE85) for help
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
{
|
||||
"title": "Cookbook",
|
||||
"description": "Real-world examples of building with Cua",
|
||||
"pages": ["form-filling", "post-event-contact-export"]
|
||||
"pages": ["windows-app-behind-vpn", "form-filling", "post-event-contact-export"]
|
||||
}
|
||||
|
||||
File diff suppressed because it is too large
615
docs/content/docs/example-usecases/windows-app-behind-vpn.mdx
Normal file
@@ -0,0 +1,615 @@
|
||||
---
|
||||
title: Windows App behind VPN
|
||||
description: Automate legacy Windows desktop applications behind VPN with Cua
|
||||
---
|
||||
|
||||
import { Step, Steps } from 'fumadocs-ui/components/steps';
|
||||
import { Tab, Tabs } from 'fumadocs-ui/components/tabs';
|
||||
|
||||
## Overview
|
||||
|
||||
This guide demonstrates how to automate Windows desktop applications (like eGecko HR/payroll systems) that run behind a corporate VPN. This is a common enterprise scenario in which legacy desktop applications require manual data entry, report generation, or workflow execution.
|
||||
|
||||
**Use cases:**
|
||||
- HR/payroll processing (employee onboarding, payroll runs, benefits administration)
|
||||
- Desktop ERP systems behind corporate networks
|
||||
- Legacy financial applications requiring VPN access
|
||||
- Compliance reporting from on-premise systems
|
||||
|
||||
**Architecture:**
|
||||
- Client-side Cua agent (Python SDK or Playground UI)
|
||||
- Windows VM/Sandbox with VPN client configured
|
||||
- RDP/remote desktop connection to target environment
|
||||
- Desktop application automation via computer vision and UI control
|
||||
|
||||
<Callout type="info">
|
||||
**Production Deployment**: For production use, consider workflow mining and custom finetuning to create vertical-specific actions (e.g., "Run payroll", "Onboard employee") instead of generic UI automation. This provides better audit trails and higher success rates.
|
||||
</Callout>
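Before any fine-tuning, one lightweight way to approximate vertical-specific actions is a registry of named workflows that expand into the task prompts the agent receives. The sketch below is illustrative only: `WORKFLOWS`, `expand_workflow`, and the task wording are hypothetical and not part of the Cua SDK.

```python
from typing import Dict, List

# Hypothetical registry: each named action maps to ordered task prompts.
WORKFLOWS: Dict[str, List[str]] = {
    "onboard_employee": [
        "Navigate to Employee Management section",
        "Create a new employee record for {employee}",
        "Generate an onboarding report for {employee}",
    ],
}


def expand_workflow(name: str, **params: str) -> List[str]:
    """Turn a named action into the concrete task prompts for the agent."""
    if name not in WORKFLOWS:
        raise KeyError(f"Unknown workflow: {name}")
    return [step.format(**params) for step in WORKFLOWS[name]]


tasks = expand_workflow("onboard_employee", employee="Jane Doe")
```

Logging the workflow name alongside each expanded task gives an audit trail at the level of business actions rather than individual clicks.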
|
||||
|
||||
---
|
||||
|
||||
## Video Demo
|
||||
|
||||
<div className="rounded-lg border bg-card text-card-foreground shadow-sm p-4 mb-6">
|
||||
<video src="https://github.com/user-attachments/assets/8ab07646-6018-4128-87ce-53180cfea696" controls className="w-full rounded">
|
||||
Your browser does not support the video tag.
|
||||
</video>
|
||||
<div className="text-sm text-muted-foreground mt-2">
|
||||
Demo showing Cua automating an eGecko-like desktop application on Windows behind AWS VPN
|
||||
</div>
|
||||
</div>
|
||||
|
||||
---
|
||||
|
||||
<Steps>
|
||||
|
||||
<Step>
|
||||
|
||||
### Set Up Your Environment
|
||||
|
||||
Create a `requirements.txt` file listing the required dependencies:
|
||||
|
||||
```text
|
||||
cua-agent
|
||||
cua-computer
|
||||
python-dotenv>=1.0.0
|
||||
```
|
||||
|
||||
Install the dependencies:
|
||||
|
||||
```bash
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
|
||||
Create a `.env` file with your API keys:
|
||||
|
||||
```text
|
||||
ANTHROPIC_API_KEY=your-anthropic-api-key
|
||||
CUA_API_KEY=sk_cua-api01...
|
||||
CUA_SANDBOX_NAME=your-windows-sandbox
|
||||
```
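A script can check these three variables up front and fail with one clear message instead of a `KeyError` partway through. This is a minimal sketch using only the standard library; `REQUIRED_VARS`, `missing_env_vars`, and `require_env` are illustrative names, not SDK functions.

```python
import os

# Variable names taken from the .env example above.
REQUIRED_VARS = ("ANTHROPIC_API_KEY", "CUA_API_KEY", "CUA_SANDBOX_NAME")


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


def require_env(required=REQUIRED_VARS, environ=None):
    """Raise a single RuntimeError listing every missing variable."""
    missing = missing_env_vars(required, environ)
    if missing:
        raise RuntimeError(
            "Please set these environment variables (for example in .env): "
            + ", ".join(missing)
        )
```

Call `require_env()` right after `load_dotenv()` so the values from `.env` are already in `os.environ`.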
|
||||
|
||||
</Step>
|
||||
|
||||
<Step>
|
||||
|
||||
### Configure Windows Sandbox with VPN
|
||||
|
||||
<Tabs items={['Cloud Sandbox (Recommended)', 'Windows Sandbox', 'Self-Hosted VM']}>
|
||||
<Tab value="Cloud Sandbox (Recommended)">
|
||||
|
||||
For enterprise deployments, use Cua Cloud Sandbox with pre-configured VPN:
|
||||
|
||||
1. Go to [cua.ai/signin](https://cua.ai/signin)
2. Navigate to **Dashboard > Containers > Create Instance**
3. Create a **Windows** sandbox (Medium or Large for desktop apps)
4. Configure VPN settings:
   - Upload your AWS VPN Client configuration (`.ovpn` file)
   - Or configure VPN credentials directly in the dashboard
5. Note your sandbox name and API key
|
||||
|
||||
Your Windows sandbox will launch with VPN automatically connected.
|
||||
|
||||
</Tab>
|
||||
<Tab value="Windows Sandbox">
|
||||
|
||||
For local development on Windows 10 Pro/Enterprise or Windows 11:
|
||||
|
||||
1. Enable [Windows Sandbox](https://learn.microsoft.com/en-us/windows/security/application-security/application-isolation/windows-sandbox/windows-sandbox-install)
2. Install the `pywinsandbox` dependency:
   ```bash
   pip install -U git+https://github.com/karkason/pywinsandbox.git
   ```
3. Create a VPN setup script that runs on sandbox startup
4. Configure your desktop application installation within the sandbox
|
||||
|
||||
<Callout type="warn">
|
||||
**Manual VPN Setup**: Windows Sandbox requires manual VPN configuration each time it starts. For production use, consider Cloud Sandbox or self-hosted VMs with persistent VPN connections.
|
||||
</Callout>
|
||||
|
||||
</Tab>
|
||||
<Tab value="Self-Hosted VM">
|
||||
|
||||
For self-managed infrastructure:
|
||||
|
||||
1. Deploy a Windows VM on your preferred cloud (AWS, Azure, GCP)
2. Install and configure a VPN client (AWS VPN Client, OpenVPN, etc.)
3. Install the target desktop application and any dependencies
4. Install and start `cua-computer-server`:
   ```bash
   pip install cua-computer-server
   python -m computer_server
   ```
5. Configure firewall rules to allow Cua agent connections
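After opening the firewall, it can help to confirm from the client machine that the server port is reachable before starting an agent run. This is a minimal sketch using only the standard library. `can_reach` is an illustrative helper, and the host and port you pass are assumptions: use your VM's address and whichever port your server listens on.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

A `False` result points at the network path (VPN, security group, firewall rule), not the agent configuration. A `True` result only shows that something accepts TCP connections on that port.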
|
||||
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
</Step>
|
||||
|
||||
<Step>
|
||||
|
||||
### Create Your Automation Script
|
||||
|
||||
Create a Python file (e.g., `hr_automation.py`):
|
||||
|
||||
<Tabs items={['Cloud Sandbox', 'Windows Sandbox', 'Self-Hosted']}>
|
||||
<Tab value="Cloud Sandbox">
|
||||
|
||||
```python
import asyncio
import logging
import os
from agent import ComputerAgent
from computer import Computer, VMProviderType
from dotenv import load_dotenv

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

load_dotenv()

async def automate_hr_workflow():
    """
    Automate HR/payroll desktop application workflow.

    This example demonstrates:
    - Launching Windows desktop application
    - Navigating complex desktop UI
    - Data entry and form filling
    - Report generation and export
    """
    try:
        # Connect to Windows Cloud Sandbox with VPN
        async with Computer(
            os_type="windows",
            provider_type=VMProviderType.CLOUD,
            name=os.environ["CUA_SANDBOX_NAME"],
            api_key=os.environ["CUA_API_KEY"],
            verbosity=logging.INFO,
        ) as computer:

            # Configure agent with specialized instructions
            agent = ComputerAgent(
                model="anthropic/claude-sonnet-4-5-20250929",
                tools=[computer],
                only_n_most_recent_images=3,
                verbosity=logging.INFO,
                trajectory_dir="trajectories",
                use_prompt_caching=True,
                max_trajectory_budget=10.0,
                instructions="""
                You are automating a Windows desktop HR/payroll application.

                IMPORTANT GUIDELINES:
                - Always wait for windows and dialogs to fully load before interacting
                - Look for loading indicators and wait for them to disappear
                - Verify each action by checking on-screen confirmation messages
                - If a button or field is not visible, try scrolling or navigating tabs
                - Desktop apps often have nested menus - explore systematically
                - Save work frequently using File > Save or Ctrl+S
                - Before closing, always verify changes were saved

                COMMON UI PATTERNS:
                - Menu bar navigation (File, Edit, View, etc.)
                - Ribbon interfaces with tabs
                - Modal dialogs that block interaction
                - Data grids/tables for viewing records
                - Form fields with validation
                - Status bars showing operation progress
                """.strip()
            )

            # Define workflow tasks
            tasks = [
                "Launch the HR application from the desktop or start menu",
                "Log in with the credentials shown in credentials.txt on the desktop",
                "Navigate to Employee Management section",
                "Create a new employee record with information from new_hire.xlsx on desktop",
                "Verify the employee was created successfully by searching for their name",
                "Generate an onboarding report for the new employee",
                "Export the report as PDF to the desktop",
                "Log out of the application"
            ]

            history = []

            for task in tasks:
                logger.info(f"\n{'='*60}")
                logger.info(f"Task: {task}")
                logger.info(f"{'='*60}\n")

                history.append({"role": "user", "content": task})

                async for result in agent.run(history):
                    for item in result.get("output", []):
                        if item.get("type") == "message":
                            content = item.get("content", [])
                            for block in content:
                                if block.get("type") == "text":
                                    response = block.get("text", "")
                                    logger.info(f"Agent: {response}")
                                    history.append({"role": "assistant", "content": response})

                logger.info("\nTask completed. Moving to next task...\n")

            logger.info("\n" + "="*60)
            logger.info("All tasks completed successfully!")
            logger.info("="*60)

    except Exception as e:
        logger.error(f"Error during automation: {e}")
        import traceback
        traceback.print_exc()

if __name__ == "__main__":
    asyncio.run(automate_hr_workflow())
```
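The nested loop that pulls the agent's text out of each result can be factored into a small helper. This is a minimal sketch, assuming each result is a dict shaped the way the script above reads it: an `output` list of items, where message items carry a `content` list of typed blocks. `extract_text` and the sample data are illustrative and not part of the SDK.

```python
from typing import List


def extract_text(result: dict) -> List[str]:
    """Collect the text of every text block in the message items of one result."""
    texts: List[str] = []
    for item in result.get("output", []):
        if item.get("type") != "message":
            continue  # skip non-message items such as tool calls
        for block in item.get("content", []):
            if block.get("type") == "text":
                texts.append(block.get("text", ""))
    return texts


# Sample data in the assumed shape, for illustration only.
sample_result = {
    "output": [
        {"type": "computer_call", "action": {"type": "click"}},
        {"type": "message", "content": [{"type": "text", "text": "Logged in."}]},
    ]
}
messages = extract_text(sample_result)
```

Inside the `async for` loop, the body then reduces to iterating over `extract_text(result)`, logging each string, and appending it to `history`.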
|
||||
|
||||
</Tab>
|
||||
<Tab value="Windows Sandbox">
|
||||
|
||||
```python
import asyncio
import logging
import os
from agent import ComputerAgent
from computer import Computer, VMProviderType
from dotenv import load_dotenv

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

load_dotenv()

async def automate_hr_workflow():
    try:
        # Connect to Windows Sandbox
        async with Computer(
            os_type="windows",
            provider_type=VMProviderType.WINDOWS_SANDBOX,
            verbosity=logging.INFO,
        ) as computer:

            agent = ComputerAgent(
                model="anthropic/claude-sonnet-4-5-20250929",
                tools=[computer],
                only_n_most_recent_images=3,
                verbosity=logging.INFO,
                trajectory_dir="trajectories",
                use_prompt_caching=True,
                max_trajectory_budget=10.0,
                instructions="""
                You are automating a Windows desktop HR/payroll application.

                IMPORTANT GUIDELINES:
                - Always wait for windows and dialogs to fully load before interacting
                - Verify each action by checking on-screen confirmation messages
                - Desktop apps often have nested menus - explore systematically
                - Save work frequently using File > Save or Ctrl+S
                """.strip()
            )

            tasks = [
                "Launch the HR application from the desktop",
                "Log in with credentials from credentials.txt on desktop",
                "Navigate to Employee Management and create new employee from new_hire.xlsx",
                "Generate and export onboarding report as PDF",
                "Log out of the application"
            ]

            history = []

            for task in tasks:
                logger.info(f"\nTask: {task}")
                history.append({"role": "user", "content": task})

                async for result in agent.run(history):
                    for item in result.get("output", []):
                        if item.get("type") == "message":
                            content = item.get("content", [])
                            for block in content:
                                if block.get("type") == "text":
                                    response = block.get("text", "")
                                    logger.info(f"Agent: {response}")
                                    history.append({"role": "assistant", "content": response})

            logger.info("\nAll tasks completed!")

    except Exception as e:
        logger.error(f"Error: {e}")
        import traceback
        traceback.print_exc()

if __name__ == "__main__":
    asyncio.run(automate_hr_workflow())
```
|
||||
|
||||
</Tab>
|
||||
<Tab value="Self-Hosted">
|
||||
|
||||
```python
import asyncio
import logging
import os
from agent import ComputerAgent
from computer import Computer
from dotenv import load_dotenv

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

load_dotenv()

async def automate_hr_workflow():
    try:
        # Connect to self-hosted Windows VM running computer-server
        async with Computer(
            use_host_computer_server=True,
            base_url="http://your-windows-vm-ip:5757",  # Update with your VM IP
            verbosity=logging.INFO,
        ) as computer:

            agent = ComputerAgent(
                model="anthropic/claude-sonnet-4-5-20250929",
                tools=[computer],
                only_n_most_recent_images=3,
                verbosity=logging.INFO,
                trajectory_dir="trajectories",
                use_prompt_caching=True,
                max_trajectory_budget=10.0,
                instructions="""
                You are automating a Windows desktop HR/payroll application.

                IMPORTANT GUIDELINES:
                - Always wait for windows and dialogs to fully load before interacting
                - Verify each action by checking on-screen confirmation messages
                - Save work frequently using File > Save or Ctrl+S
                """.strip()
            )

            tasks = [
                "Launch the HR application",
                "Log in with provided credentials",
                "Complete the required HR workflow",
                "Generate and export report",
                "Log out"
            ]

            history = []

            for task in tasks:
                logger.info(f"\nTask: {task}")
                history.append({"role": "user", "content": task})

                async for result in agent.run(history):
                    for item in result.get("output", []):
                        if item.get("type") == "message":
                            content = item.get("content", [])
                            for block in content:
                                if block.get("type") == "text":
                                    response = block.get("text", "")
                                    logger.info(f"Agent: {response}")
                                    history.append({"role": "assistant", "content": response})

            logger.info("\nAll tasks completed!")

    except Exception as e:
        logger.error(f"Error: {e}")
        import traceback
        traceback.print_exc()

if __name__ == "__main__":
    asyncio.run(automate_hr_workflow())
```
|
||||
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
</Step>
|
||||
|
||||
<Step>
|
||||
|
||||
### Run Your Automation
|
||||
|
||||
Execute the script:
|
||||
|
||||
```bash
|
||||
python hr_automation.py
|
||||
```
|
||||
|
||||
The agent will:
|
||||
1. Connect to your Windows environment (with VPN if configured)
|
||||
2. Launch and navigate the desktop application
|
||||
3. Execute each workflow step sequentially
|
||||
4. Verify actions and handle errors
|
||||
5. Save trajectory logs for audit and debugging
|
||||
|
||||
Monitor the console output to see the agent's progress through each task.
|
||||
|
||||
</Step>
|
||||
|
||||
</Steps>
|
||||
|
||||
---
|
||||
|
||||
## Key Configuration Options
|
||||
|
||||
### Agent Instructions
|
||||
|
||||
The `instructions` parameter is critical for reliable desktop automation:
|
||||
|
||||
```python
|
||||
instructions="""
|
||||
You are automating a Windows desktop HR/payroll application.
|
||||
|
||||
IMPORTANT GUIDELINES:
|
||||
- Always wait for windows and dialogs to fully load before interacting
|
||||
- Look for loading indicators and wait for them to disappear
|
||||
- Verify each action by checking on-screen confirmation messages
|
||||
- If a button or field is not visible, try scrolling or navigating tabs
|
||||
- Desktop apps often have nested menus - explore systematically
|
||||
- Save work frequently using File > Save or Ctrl+S
|
||||
- Before closing, always verify changes were saved
|
||||
|
||||
COMMON UI PATTERNS:
|
||||
- Menu bar navigation (File, Edit, View, etc.)
|
||||
- Ribbon interfaces with tabs
|
||||
- Modal dialogs that block interaction
|
||||
- Data grids/tables for viewing records
|
||||
- Form fields with validation
|
||||
- Status bars showing operation progress
|
||||
|
||||
APPLICATION-SPECIFIC:
|
||||
- Login is at top-left corner
|
||||
- Employee records are under "HR Management" > "Employees"
|
||||
- Reports are generated via "Tools" > "Reports" > "Generate"
|
||||
- Always click "Save" before navigating away from a form
|
||||
""".strip()
|
||||
```
|
||||
|
||||
### Budget Management
|
||||
|
||||
For long-running workflows, adjust budget limits:
|
||||
|
||||
```python
agent = ComputerAgent(
    model="anthropic/claude-sonnet-4-5-20250929",
    tools=[computer],
    max_trajectory_budget=20.0,  # Increase for complex workflows
    # ... other params
)
```
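
To make the idea of a spending cap concrete, here is a minimal sketch of a tracker that stops a run once accumulated cost passes a limit. It is not how the SDK enforces `max_trajectory_budget`; the names `BudgetTracker` and `BudgetExceeded` and the per-step cost values are hypothetical and exist only for illustration.

```python
class BudgetExceeded(RuntimeError):
    """Raised when accumulated spend passes the configured cap (hypothetical)."""


class BudgetTracker:
    """Illustrative spend tracker; not part of the Cua SDK."""

    def __init__(self, max_budget: float) -> None:
        self.max_budget = max_budget
        self.spent = 0.0

    def record(self, step_cost: float) -> float:
        """Add one step's cost and return the remaining budget."""
        self.spent += step_cost
        if self.spent > self.max_budget:
            raise BudgetExceeded(
                f"Spent ${self.spent:.2f} of ${self.max_budget:.2f} budget"
            )
        return self.max_budget - self.spent
```

A caller would create `BudgetTracker(max_budget=20.0)` once per workflow and call `record(...)` after each model response, treating `BudgetExceeded` as the signal to stop and review the trajectory.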
|
||||
|
||||
### Image Retention
|
||||
|
||||
Balance context and cost by retaining only recent screenshots:
|
||||
|
||||
```python
agent = ComputerAgent(
    # ...
    only_n_most_recent_images=3,  # Keep last 3 screenshots
    # ...
)
```
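
The trade-off can be pictured with a minimal sketch of image retention over a message history. This is not the SDK's implementation: the message shape (a `content` list of blocks whose `type` is `"image"` or `"text"`) and the helper name `keep_recent_images` are assumptions made for illustration.

```python
from typing import Any


def keep_recent_images(messages: list[dict[str, Any]], n: int) -> list[dict[str, Any]]:
    """Return a copy of messages in which only the newest n image blocks survive."""
    remaining = n
    trimmed: list[dict[str, Any]] = []
    # Walk from newest to oldest so the most recent screenshots are kept.
    for message in reversed(messages):
        content = message.get("content", [])
        if not isinstance(content, list):
            trimmed.append(message)  # Plain-text content has no images to drop.
            continue
        kept = []
        for block in reversed(content):
            if block.get("type") == "image":
                if remaining <= 0:
                    continue  # Drop older screenshots.
                remaining -= 1
            kept.append(block)
        trimmed.append({**message, "content": list(reversed(kept))})
    return list(reversed(trimmed))
```

Text blocks are never dropped in this sketch, so the agent still sees what it did earlier even when the matching screenshot is gone.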
|
||||
|
||||
---
|
||||
|
||||
## Production Considerations
|
||||
|
||||
<Callout type="warn" title="Production Deployment">
|
||||
For enterprise production deployments, consider these additional steps:
|
||||
</Callout>
|
||||
|
||||
### 1. Workflow Mining
|
||||
|
||||
Before deploying, analyze your actual workflows:
|
||||
- Record user interactions with the application
|
||||
- Identify common patterns and edge cases
|
||||
- Map out decision trees and validation requirements
|
||||
- Document application-specific quirks and timing issues
|
||||
|
||||
### 2. Custom Finetuning
|
||||
|
||||
Create vertical-specific actions instead of generic UI automation:
|
||||
|
||||
```python
|
||||
# Instead of generic steps:
|
||||
tasks = ["Click login", "Type username", "Type password", "Click submit"]
|
||||
|
||||
# Create semantic actions:
|
||||
tasks = ["onboard_employee", "run_payroll", "generate_compliance_report"]
|
||||
```
|
||||
|
||||
This provides:
|
||||
- Better audit trails
|
||||
- Approval gates at business logic level
|
||||
- Higher success rates
|
||||
- Easier maintenance and updates
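
One way to picture semantic actions is as a lookup from a business-level name to the UI-level steps it stands for. The sketch below is an assumption for illustration only: the `SEMANTIC_ACTIONS` table, the `expand_action` helper, and the step wording are hypothetical and not part of the SDK.

```python
# Hypothetical mapping from business-level actions to agent task prompts.
SEMANTIC_ACTIONS: dict[str, list[str]] = {
    "onboard_employee": [
        "Open Employee Management",
        "Create a new employee record from new_hire.xlsx",
        "Save the record and confirm the success message",
    ],
    "generate_compliance_report": [
        "Open the reports screen",
        "Generate the compliance report and export it as PDF",
    ],
}


def expand_action(name: str) -> list[str]:
    """Return the task prompts for a semantic action, or fail loudly."""
    try:
        return list(SEMANTIC_ACTIONS[name])  # Copy so callers cannot mutate the table.
    except KeyError:
        raise ValueError(f"Unknown semantic action: {name}") from None
```

Keeping the table in one place means an audit log can record `onboard_employee` rather than a series of clicks, and a UI change only requires updating the steps under that one key.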
|
||||
|
||||
### 3. Human-in-the-Loop
|
||||
|
||||
Add approval gates for critical operations:
|
||||
|
||||
```python
agent = ComputerAgent(
    model="anthropic/claude-sonnet-4-5-20250929",
    tools=[computer],
    # Add human approval callback for sensitive operations
    callbacks=[ApprovalCallback(require_approval_for=["payroll", "termination"])]
)
```
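
The same gating idea can be sketched at the task level without relying on any callback interface. The helpers `needs_approval` and `gate_tasks` below are hypothetical, and the `approve` callable stands in for whatever review step you use (a prompt, a ticket, a chat message); none of this is SDK API.

```python
from typing import Callable, Iterable


def needs_approval(task: str, keywords: Iterable[str]) -> bool:
    """Return True when the task text mentions a sensitive keyword."""
    lowered = task.lower()
    return any(keyword.lower() in lowered for keyword in keywords)


def gate_tasks(
    tasks: Iterable[str],
    keywords: Iterable[str],
    approve: Callable[[str], bool],
) -> list[str]:
    """Keep tasks that are not sensitive or that the reviewer accepts."""
    keyword_list = list(keywords)  # Materialize once so it can be reused per task.
    approved = []
    for task in tasks:
        if needs_approval(task, keyword_list) and not approve(task):
            continue  # Skip tasks the reviewer rejected.
        approved.append(task)
    return approved
```

For an interactive run, `approve` could be as simple as `lambda task: input(f"Run '{task}'? [y/N] ").lower() == "y"`; the filtered list is then passed to the task loop as usual.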
|
||||
|
||||
### 4. Deployment Options
|
||||
|
||||
Choose your deployment model:
|
||||
|
||||
**Managed (Recommended)**
|
||||
- Cua hosts Windows sandboxes, VPN/RDP stack, and agent runtime
|
||||
- You get UI/API endpoints for triggering workflows
|
||||
- Automatic scaling, monitoring, and maintenance
|
||||
- SLA guarantees and enterprise support
|
||||
|
||||
**Self-Hosted**
|
||||
- You manage Windows VMs, VPN infrastructure, and agent deployment
|
||||
- Full control over data and security
|
||||
- Custom network configurations
|
||||
- On-premise or your preferred cloud
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### VPN Connection Issues
|
||||
|
||||
If the agent cannot reach the application:
|
||||
|
||||
1. Verify VPN is connected: Check VPN client status in the Windows sandbox
|
||||
2. Test network connectivity: Try pinging internal resources
|
||||
3. Check firewall rules: Ensure RDP and application ports are open
|
||||
4. Review VPN logs: Look for authentication or routing errors
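
The connectivity check in step 2 can be scripted. The sketch below uses only the Python standard library; the helper name `can_reach` is hypothetical, and the host and port you pass in are placeholders to replace with your own values.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

For example, `can_reach("your-windows-vm-ip", 5757)` tests the computer-server port used in the self-hosted example, and port 3389 is the default for RDP. A `False` result from inside the sandbox usually points to a VPN routing or firewall problem rather than an agent problem.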
|
||||
|
||||
### Application Not Launching
|
||||
|
||||
If the desktop application fails to start:
|
||||
|
||||
1. Verify installation: Check the application is installed in the sandbox
|
||||
2. Check dependencies: Ensure all required DLLs and frameworks are present
|
||||
3. Review permissions: Application may require admin rights
|
||||
4. Check logs: Look for error messages in Windows Event Viewer
|
||||
|
||||
### UI Element Not Found
|
||||
|
||||
If the agent cannot find buttons or fields:
|
||||
|
||||
1. Increase wait times: Some applications load slowly
|
||||
2. Check screen resolution: UI elements may be off-screen
|
||||
3. Verify DPI scaling: High DPI settings can affect element positions
|
||||
4. Update instructions: Provide more specific navigation guidance
|
||||
|
||||
### Cost Management
|
||||
|
||||
If costs are higher than expected:
|
||||
|
||||
1. Reduce `max_trajectory_budget`
|
||||
2. Decrease `only_n_most_recent_images`
|
||||
3. Use prompt caching: Set `use_prompt_caching=True`
|
||||
4. Optimize task descriptions: Be more specific to reduce retry attempts
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
- **Explore custom tools**: Learn how to create [custom tools](/agent-sdk/custom-tools) for application-specific actions
|
||||
- **Implement callbacks**: Add [monitoring and logging](/agent-sdk/callbacks) for production workflows
|
||||
- **Join community**: Get help in our [Discord](https://discord.com/invite/mVnXXpdE85)
|
||||
|
||||
---
|
||||
|
||||
## Related Examples
|
||||
|
||||
- [Form Filling](/example-usecases/form-filling) - Web form automation
|
||||
- [Post-Event Contact Export](/example-usecases/post-event-contact-export) - Data extraction workflows
|
||||
- [Custom Tools](/agent-sdk/custom-tools) - Building application-specific functions
|
||||
@@ -3,5 +3,5 @@
|
||||
"description": "Get started with Cua",
|
||||
"defaultOpen": true,
|
||||
"icon": "Rocket",
|
||||
"pages": ["quickstart"]
|
||||
"pages": ["../index", "quickstart"]
|
||||
}
|
||||
|
||||
@@ -8,7 +8,7 @@ import { Tab, Tabs } from 'fumadocs-ui/components/tabs';
|
||||
import { Accordion, Accordions } from 'fumadocs-ui/components/accordion';
|
||||
import { Code, Terminal } from 'lucide-react';
|
||||
|
||||
Choose your quickstart path:
|
||||
{/* Choose your quickstart path:
|
||||
|
||||
<div className="grid grid-cols-1 md:grid-cols-2 gap-6 mt-8 mb-8">
|
||||
<Card icon={<Code />} href="#developer-quickstart" title="Developer Quickstart">
|
||||
@@ -17,7 +17,7 @@ Choose your quickstart path:
|
||||
<Card icon={<Terminal />} href="#cli-quickstart" title="CLI Quickstart">
|
||||
Get started quickly with the command-line interface
|
||||
</Card>
|
||||
</div>
|
||||
</div> */}
|
||||
|
||||
---
|
||||
|
||||
@@ -30,11 +30,11 @@ You can run your Cua computer in the cloud (recommended for easiest setup), loca
|
||||
<Tabs items={['Cloud Sandbox', 'Linux on Docker', 'macOS Sandbox', 'Windows Sandbox']}>
|
||||
<Tab value="Cloud Sandbox">
|
||||
|
||||
Cua Cloud Sandbox provides sandboxes that run Linux (Ubuntu) or Windows.
|
||||
Cua Cloud Sandbox provides sandboxes that run Linux (Ubuntu), Windows, or macOS.
|
||||
|
||||
1. Go to [cua.ai/signin](https://cua.ai/signin)
|
||||
2. Navigate to **Dashboard > Containers > Create Instance**
|
||||
3. Create a **Small** sandbox, choosing either **Linux** or **Windows**
|
||||
3. Create a **Small** sandbox, choosing **Linux**, **Windows**, or **macOS**
|
||||
4. Note your sandbox name and API key
|
||||
|
||||
Your Cloud Sandbox will be automatically configured and ready to use.
|
||||
@@ -117,7 +117,7 @@ Connect to your Cua computer and perform basic interactions, such as taking scre
|
||||
from computer import Computer
|
||||
|
||||
computer = Computer(
|
||||
os_type="linux",
|
||||
os_type="linux", # or "windows" or "macos"
|
||||
provider_type="cloud",
|
||||
name="your-sandbox-name",
|
||||
api_key="your-api-key"
|
||||
@@ -192,6 +192,10 @@ Connect to your Cua computer and perform basic interactions, such as taking scre
|
||||
|
||||
</Tab>
|
||||
<Tab value="TypeScript">
|
||||
<Callout type="warn" title="TypeScript SDK Deprecated">
|
||||
The TypeScript interface is currently deprecated. We're working on version 0.2.0 with improved TypeScript support. In the meantime, please use the Python SDK.
|
||||
</Callout>
|
||||
|
||||
Install the Cua computer TypeScript SDK:
|
||||
```bash
|
||||
npm install @trycua/computer
|
||||
@@ -205,7 +209,7 @@ Connect to your Cua computer and perform basic interactions, such as taking scre
|
||||
import { Computer, OSType } from '@trycua/computer';
|
||||
|
||||
const computer = new Computer({
|
||||
osType: OSType.LINUX,
|
||||
osType: OSType.LINUX, // or OSType.WINDOWS or OSType.MACOS
|
||||
name: "your-sandbox-name",
|
||||
apiKey: "your-api-key"
|
||||
});
|
||||
@@ -328,7 +332,7 @@ Learn more about agents in [Agent Loops](/agent-sdk/agent-loops) and available m
|
||||
- Join our [Discord community](https://discord.com/invite/mVnXXpdE85) for help
|
||||
- Try out [Form Filling](/example-usecases/form-filling) preset usecase
|
||||
|
||||
---
|
||||
{/* ---
|
||||
|
||||
## CLI Quickstart
|
||||
|
||||
@@ -354,7 +358,7 @@ Get started quickly with the CUA CLI - the easiest way to manage cloud sandboxes
|
||||
```bash
|
||||
# Install Bun if you don't have it
|
||||
curl -fsSL https://bun.sh/install | bash
|
||||
|
||||
|
||||
# Install CUA CLI
|
||||
bun add -g @trycua/cli
|
||||
```
|
||||
@@ -467,4 +471,4 @@ cua delete my-vm-abc123
|
||||
|
||||
---
|
||||
|
||||
For running models locally, see [Running Models Locally](/agent-sdk/supported-model-providers/local-models).
|
||||
For running models locally, see [Running Models Locally](/agent-sdk/supported-model-providers/local-models). */}
|
||||
|
||||
@@ -4,15 +4,9 @@ title: Introduction
|
||||
|
||||
import { Monitor, Code, BookOpen, Zap, Bot, Boxes, Rocket } from 'lucide-react';
|
||||
|
||||
<Hero>
|
||||
|
||||
<div className="rounded-lg border bg-card text-card-foreground shadow-sm px-4 py-2 mb-6">
|
||||
Cua is an open-source framework for building **Computer-Use Agents** - AI systems that see, understand, and interact with desktop applications through vision and action, just like humans do.
|
||||
|
||||
<br />
|
||||
|
||||
Go from prototype to production with everything you need: multi-provider LLM support, cross-platform sandboxes, and trajectory tracing. Whether you're running locally or deploying to the cloud, Cua gives you the tools to build reliable computer-use agents.
|
||||
|
||||
</Hero>
|
||||
</div>
|
||||
|
||||
## Why Cua?
|
||||
|
||||
@@ -46,14 +40,14 @@ Follow the [Quickstart guide](/docs/get-started/quickstart) for step-by-step set
|
||||
If you're new to computer-use agents, check out our [tutorials](https://cua.ai/blog), [examples](https://github.com/trycua/cua/tree/main/examples), and [notebooks](https://github.com/trycua/cua/tree/main/notebooks) to start building with Cua today.
|
||||
|
||||
<div className="grid grid-cols-1 md:grid-cols-2 gap-6 mt-8">
|
||||
<Card icon={<Rocket />} href="/docs/get-started/quickstart" title="Quickstart">
|
||||
<Card icon={<Rocket />} href="/get-started/quickstart" title="Quickstart">
|
||||
Get up and running in 3 steps with Python or TypeScript.
|
||||
</Card>
|
||||
<Card icon={<BookOpen />} href="/agent-sdk/agent-loops" title="Learn Core Concepts">
|
||||
Understand agent loops, callbacks, and model composition.
|
||||
<Card icon={<Zap />} href="/agent-sdk/agent-loops" title="Agent Loops">
|
||||
Learn how agents work and how to build your own.
|
||||
</Card>
|
||||
<Card icon={<Code />} href="/libraries/agent" title="API Reference">
|
||||
Explore the full Agent SDK and Computer SDK APIs.
|
||||
<Card icon={<BookOpen />} href="/computer-sdk/computers" title="Computer SDK">
|
||||
Control desktop applications with the Computer SDK.
|
||||
</Card>
|
||||
<Card icon={<Monitor />} href="/example-usecases/form-filling" title="Example Use Cases">
|
||||
See Cua in action with real-world examples.
|
||||
|
||||
@@ -7,14 +7,7 @@ github:
|
||||
---
|
||||
|
||||
<Callout>
|
||||
A corresponding{' '}
|
||||
<a
|
||||
href="https://github.com/trycua/cua/blob/main/notebooks/computer_server_nb.ipynb"
|
||||
target="_blank"
|
||||
>
|
||||
Jupyter Notebook
|
||||
</a>{' '}
|
||||
is available for this documentation.
|
||||
A corresponding <a href="https://github.com/trycua/cua/blob/main/notebooks/computer_server_nb.ipynb" target="_blank">Jupyter Notebook</a> is available for this documentation.
|
||||
</Callout>
|
||||
|
||||
The Computer Server API reference documentation is currently under development.
|
||||
|
||||
@@ -7,11 +7,7 @@ github:
|
||||
---
|
||||
|
||||
<Callout>
|
||||
A corresponding{' '}
|
||||
<a href="https://github.com/trycua/cua/blob/main/examples/som_examples.py" target="_blank">
|
||||
Python example
|
||||
</a>{' '}
|
||||
is available for this documentation.
|
||||
A corresponding <a href="https://github.com/trycua/cua/blob/main/examples/som_examples.py" target="_blank">Python example</a> is available for this documentation.
|
||||
</Callout>
|
||||
|
||||
## Overview
|
||||
|
||||
@@ -4,7 +4,6 @@
|
||||
"root": true,
|
||||
"defaultOpen": true,
|
||||
"pages": [
|
||||
"index",
|
||||
"---[Rocket]Get Started---",
|
||||
"...get-started",
|
||||
"---[ChefHat]Cookbook---",
|
||||
|
||||
@@ -37,6 +37,7 @@ export const baseOptions: BaseLayoutProps = {
|
||||
Cua
|
||||
</>
|
||||
),
|
||||
url: 'https://cua.ai',
|
||||
},
|
||||
githubUrl: 'https://github.com/trycua/cua',
|
||||
links: [
|
||||
|
||||