Merge pull request #375 from trycua/feat/move-computer-sdk-readmes-to-docs

Move Computer SDK READMEs to docs
James Murdza
2025-09-01 09:33:57 -04:00
committed by GitHub
7 changed files with 199 additions and 320 deletions


@@ -202,17 +202,17 @@ Direct file and directory manipulation:
</Tab>
<Tab value="TypeScript">
```typescript
// File existence checks
# File existence checks
await computer.interface.fileExists(path); // Check if file exists
await computer.interface.directoryExists(path); // Check if directory exists
// File content operations
# File content operations
await computer.interface.readText(path, "utf-8"); // Read file content
await computer.interface.writeText(path, content, "utf-8"); // Write file content
await computer.interface.readBytes(path); // Read file content as bytes
await computer.interface.writeBytes(path, content); // Write file content as bytes
// File and directory management
# File and directory management
await computer.interface.deleteFile(path); // Delete file
await computer.interface.createDir(path); // Create directory
await computer.interface.deleteDir(path); // Delete directory
@@ -243,3 +243,38 @@ Access system accessibility information:
```
</Tab>
</Tabs>
## Delay Configuration
Control timing between actions:
<Tabs items={['Python']}>
<Tab value="Python">
```python
# Set default delay between all actions (in seconds)
computer.interface.delay = 0.5 # 500ms delay between actions
# Or specify delay for individual actions
await computer.interface.left_click(x, y, delay=1.0) # 1 second delay after click
await computer.interface.type_text("Hello", delay=0.2) # 200ms delay after typing
await computer.interface.press_key("enter", delay=0.5) # 500ms delay after key press
```
</Tab>
</Tabs>
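As a rough mental model of how a default delay and a per-action override might interact, the sketch below uses a hypothetical `FakeInterface` class. It is not the Cua implementation, and the rule that a per-action `delay` takes precedence over the default is an assumption made for illustration.

```python
import asyncio
from typing import List, Optional


class FakeInterface:
    """Hypothetical stand-in for computer.interface (not the real Cua class)."""

    def __init__(self, delay: float = 0.0) -> None:
        self.delay = delay  # default delay applied after every action, in seconds
        self.applied: List[float] = []  # delays actually used, kept for inspection

    async def _pause(self, delay: Optional[float]) -> None:
        # Assumption: a per-action delay, when given, overrides the default.
        effective = self.delay if delay is None else delay
        self.applied.append(effective)
        await asyncio.sleep(effective)

    async def left_click(self, x: int, y: int, delay: Optional[float] = None) -> None:
        # A real interface would send the click here; this sketch only waits.
        await self._pause(delay)


async def demo() -> List[float]:
    iface = FakeInterface()
    iface.delay = 0.01                          # default for all actions
    await iface.left_click(10, 20)              # uses the default
    await iface.left_click(10, 20, delay=0.02)  # per-action override
    return iface.applied
```

Running `asyncio.run(demo())` returns the delays that were applied, in order, which makes the precedence rule easy to check.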
## Python Virtual Environment Operations
Manage Python environments:
<Tabs items={['Python']}>
<Tab value="Python">
```python
# Virtual environment management
await computer.venv_install("demo_venv", ["requests", "macos-pyxa"]) # Install packages in a virtual environment
await computer.venv_cmd("demo_venv", "python -c 'import requests; print(requests.get(\"https://httpbin.org/ip\").json())'") # Run a shell command in a virtual environment
await computer.venv_exec("demo_venv", python_function_or_code, *args, **kwargs) # Run a Python function in a virtual environment and return the result / raise an exception
```
</Tab>
</Tabs>
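Shell commands that embed Python code need nested quoting, which is easy to get wrong. One option is to let the standard library's `shlex.quote` do the quoting. This is a minimal sketch: `build_python_command` is a hypothetical helper, and it assumes the command string is interpreted by a POSIX-style shell on the target computer.

```python
import shlex


def build_python_command(code: str) -> str:
    """Return a `python -c ...` shell command with the code safely quoted.

    Hypothetical helper; assumes a POSIX-style shell runs the command.
    """
    return f"python -c {shlex.quote(code)}"


code = 'import requests; print(requests.get("https://httpbin.org/ip").json())'
command = build_python_command(code)
# The resulting string could then be passed as the command argument of venv_cmd.
```

Because the quoting is generated, the embedded code can use double quotes freely without the outer command breaking.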


@@ -0,0 +1,80 @@
---
title: Computer UI
---
The computer module includes a Gradio UI for creating and sharing demonstration data. Its upload-to-Huggingface feature makes it easy to build community datasets for training better computer-use models.
```bash
# Install with UI support
pip install "cua-computer[ui]"
```
<Callout title="Note">
For precise control of the computer, we recommend using VNC or Screen Sharing instead of the Computer Gradio UI.
</Callout>
### Building and Sharing Demonstrations with Huggingface
Follow these steps to contribute your own demonstrations:
#### 1. Set up Huggingface Access
Set your HF_TOKEN in a .env file or in your environment variables:
```bash
# In .env file
HF_TOKEN=your_huggingface_token
```
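Before launching the UI, it can help to fail early when the token is missing. The following is a minimal sketch using only the standard library; `require_hf_token` is a hypothetical helper and not part of the `computer` package.

```python
import os
from typing import Mapping, Optional


def require_hf_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the HF_TOKEN value, or raise a clear error if it is missing.

    Hypothetical helper; `env` defaults to the process environment.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set; add it to .env or export it")
    return token
```

If the token lives in a `.env` file, call `load_dotenv('.env')` first so the value is present in the environment when the check runs.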
#### 2. Launch the Computer UI
```python
# launch_ui.py
from computer.ui.gradio.app import create_gradio_ui
from dotenv import load_dotenv
load_dotenv('.env')
app = create_gradio_ui()
app.launch(share=False)
```
For examples, see [Computer UI Examples](https://github.com/trycua/cua/tree/main/examples/computer_ui_examples.py)
#### 3. Record Your Tasks
<details open>
<summary>View demonstration video</summary>
<video src="https://github.com/user-attachments/assets/de3c3477-62fe-413c-998d-4063e48de176" controls width="600"></video>
</details>
Record yourself performing various computer tasks using the UI.
#### 4. Save Your Demonstrations
<details open>
<summary>View demonstration video</summary>
<video src="https://github.com/user-attachments/assets/5ad1df37-026a-457f-8b49-922ae805faef" controls width="600"></video>
</details>
Save each task by picking a descriptive name and adding relevant tags (e.g., "office", "web-browsing", "coding").
#### 5. Record Additional Demonstrations
Repeat steps 3 and 4 until you have enough demonstrations to cover a range of tasks and scenarios.
#### 6. Upload to Huggingface
<details open>
<summary>View demonstration video</summary>
<video src="https://github.com/user-attachments/assets/c586d460-3877-4b5f-a736-3248886d2134" controls width="600"></video>
</details>
Upload your dataset to Huggingface by:
- Naming it as `{your_username}/{dataset_name}`
- Choosing public or private visibility
- Optionally selecting specific tags to upload only tasks with certain tags
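The naming and tag-selection rules above can be sketched in a few lines. Everything here is illustrative: the helper names, the task dictionary shape (`name`, `tags`), and the choice that a task matches when it carries at least one selected tag are assumptions, not the UI's actual behavior.

```python
from typing import Dict, Iterable, List, Tuple


def split_repo_id(repo_id: str) -> Tuple[str, str]:
    """Split '{your_username}/{dataset_name}' and reject malformed ids."""
    username, sep, dataset_name = repo_id.partition("/")
    if not sep or not username or not dataset_name or "/" in dataset_name:
        raise ValueError(f"expected 'username/dataset_name', got {repo_id!r}")
    return username, dataset_name


def select_by_tags(tasks: Iterable[Dict], wanted_tags: Iterable[str]) -> List[Dict]:
    """Keep tasks carrying at least one wanted tag; keep all if none are given."""
    wanted = set(wanted_tags)
    tasks = list(tasks)
    if not wanted:
        return tasks
    return [task for task in tasks if wanted & set(task.get("tags", []))]


tasks = [
    {"name": "write-report", "tags": ["office"]},
    {"name": "fix-bug", "tags": ["coding"]},
]
selected = select_by_tags(tasks, ["office"])
```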
#### Examples and Resources
- Example Dataset: [ddupont/test-dataset](https://huggingface.co/datasets/ddupont/test-dataset)
- Find Community Datasets: 🔍 [Browse CUA Datasets on Huggingface](https://huggingface.co/datasets?other=cua)


@@ -4,6 +4,7 @@
"pages": [
"computers",
"commands",
"computer-ui",
"sandboxed-python"
]
}


@@ -44,6 +44,32 @@ You can also install packages in the virtual environment using the `venv_install
await my_computer.venv_install("myenv", ["requests"])
```
## Example: Interacting with macOS Applications
You can use sandboxed functions to interact with macOS applications on a local Cua Computer (requires `os_type="darwin"`). This is particularly useful for automation tasks that involve GUI applications.
```python
# Example: Use sandboxed functions to execute code in a Cua Container
from computer.helpers import sandboxed
await computer.venv_install("demo_venv", ["macos-pyxa"]) # Install packages in a virtual environment
@sandboxed("demo_venv")
def greet_and_print(name):
    """Get the HTML of the current Safari tab"""
    import PyXA
    safari = PyXA.Application("Safari")
    html = safari.current_document.source()
    print(f"Hello from inside the container, {name}!")
    return {"greeted": name, "safari_html": html}
# When a @sandboxed function is called, it will execute in the container
result = await greet_and_print("Cua")
# Result: {"greeted": "Cua", "safari_html": "<html>...</html>"}
# stdout and stderr are also captured and printed / raised
print("Result from sandboxed function:", result)
```
## Error Handling
If the remote execution fails, the decorator will retry up to `max_retries` times. If all attempts fail, the last exception is raised locally.
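The retry behavior described above can be pictured with a small stand-alone decorator. This is a sketch, not the `sandboxed` implementation: `with_retries` is a hypothetical name, and treating `max_retries` as the total number of attempts is an assumption.

```python
import asyncio
import functools


def with_retries(max_retries: int = 3):
    """Hypothetical decorator: retry an async call, re-raising the last error."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")

    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            last_error = None
            for _ in range(max_retries):
                try:
                    return await func(*args, **kwargs)
                except Exception as exc:  # remember the failure and try again
                    last_error = exc
            raise last_error  # all attempts failed: surface the last exception

        return wrapper

    return decorator


attempts = []


@with_retries(max_retries=3)
async def flaky():
    """Simulated remote call that fails twice before succeeding."""
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("simulated remote failure")
    return "ok"
```

Calling `asyncio.run(flaky())` succeeds on the third attempt; a function that always fails would raise its last exception to the caller.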


@@ -8,202 +8,16 @@ github:
- https://github.com/trycua/cua/tree/main/libs/typescript/computer
---
The Computer library provides a Computer class that can be used to control and automate a container running the Computer Server.
The Computer library provides a Computer class for controlling and automating containers running the Computer Server.
## Reference
## Connecting to Computers
### Basic Usage
See the [Cua Computers](../computer-sdk/computers) documentation for how to connect to different computer types (cloud, local, or host desktop).
Connect to a cua cloud container:
## Computer Commands
<Tabs items={['Python', 'TypeScript']}>
<Tab value="Python">
```python
from computer import Computer
See the [Commands](../computer-sdk/commands) documentation for all supported commands and interface methods (Shell, Mouse, Keyboard, File System, etc.).
computer = Computer(
os_type="linux",
provider_type="cloud",
name="your-container-name",
api_key="your-api-key"
)
## Sandboxed Python Functions
computer = await computer.run() # Connect to a cua cloud container
```
</Tab>
<Tab value="TypeScript">
```typescript
import { Computer, OSType } from '@trycua/computer';
const computer = new Computer({
osType: OSType.LINUX,
name: "your-container-name",
apiKey: "your-api-key"
});
await computer.run(); // Connect to a cua cloud container
```
</Tab>
</Tabs>
Connect to a cua local container:
<Tabs items={['Python']}>
<Tab value="Python">
```python
from computer import Computer
computer = Computer(
os_type="macos"
)
computer = await computer.run() # Connect to the container
```
</Tab>
</Tabs>
### Interface Actions
<Tabs items={['Python', 'TypeScript']}>
<Tab value="Python">
```python
# Shell Actions
result = await computer.interface.run_command(cmd) # Run shell command
# result.stdout, result.stderr, result.returncode
# Mouse Actions
await computer.interface.left_click(x, y) # Left click at coordinates
await computer.interface.right_click(x, y) # Right click at coordinates
await computer.interface.double_click(x, y) # Double click at coordinates
await computer.interface.move_cursor(x, y) # Move cursor to coordinates
await computer.interface.drag_to(x, y, duration) # Drag to coordinates
await computer.interface.get_cursor_position() # Get current cursor position
await computer.interface.mouse_down(x, y, button="left") # Press and hold a mouse button
await computer.interface.mouse_up(x, y, button="left") # Release a mouse button
# Keyboard Actions
await computer.interface.type_text("Hello") # Type text
await computer.interface.press_key("enter") # Press a single key
await computer.interface.hotkey("command", "c") # Press key combination
await computer.interface.key_down("command") # Press and hold a key
await computer.interface.key_up("command") # Release a key
# Scrolling Actions
await computer.interface.scroll(x, y) # Scroll the mouse wheel
await computer.interface.scroll_down(clicks) # Scroll down
await computer.interface.scroll_up(clicks) # Scroll up
# Screen Actions
await computer.interface.screenshot() # Take a screenshot
await computer.interface.get_screen_size() # Get screen dimensions
# Clipboard Actions
await computer.interface.set_clipboard(text) # Set clipboard content
await computer.interface.copy_to_clipboard() # Get clipboard content
# File System Operations
await computer.interface.file_exists(path) # Check if file exists
await computer.interface.directory_exists(path) # Check if directory exists
await computer.interface.read_text(path, encoding="utf-8") # Read file content
await computer.interface.write_text(path, content, encoding="utf-8") # Write file content
await computer.interface.read_bytes(path) # Read file content as bytes
await computer.interface.write_bytes(path, content) # Write file content as bytes
await computer.interface.delete_file(path) # Delete file
await computer.interface.create_dir(path) # Create directory
await computer.interface.delete_dir(path) # Delete directory
await computer.interface.list_dir(path) # List directory contents
# Accessibility
await computer.interface.get_accessibility_tree() # Get accessibility tree
# Delay Configuration
# Set default delay between all actions (in seconds)
computer.interface.delay = 0.5 # 500ms delay between actions
# Or specify delay for individual actions
await computer.interface.left_click(x, y, delay=1.0) # 1 second delay after click
await computer.interface.type_text("Hello", delay=0.2) # 200ms delay after typing
await computer.interface.press_key("enter", delay=0.5) # 500ms delay after key press
# Python Virtual Environment Operations
await computer.venv_install("demo_venv", ["requests", "macos-pyxa"]) # Install packages in a virtual environment
await computer.venv_cmd("demo_venv", "python -c 'import requests; print(requests.get(\"https://httpbin.org/ip\").json())'") # Run a shell command in a virtual environment
await computer.venv_exec("demo_venv", python_function_or_code, *args, **kwargs) # Run a Python function in a virtual environment and return the result / raise an exception
# Example: Use sandboxed functions to execute code in a Cua Container
from computer.helpers import sandboxed
@sandboxed("demo_venv")
def greet_and_print(name):
    """Get the HTML of the current Safari tab"""
    import PyXA
    safari = PyXA.Application("Safari")
    html = safari.current_document.source()
    print(f"Hello from inside the container, {name}!")
    return {"greeted": name, "safari_html": html}
# When a @sandboxed function is called, it will execute in the container
result = await greet_and_print("Cua")
# Result: {"greeted": "Cua", "safari_html": "<html>...</html>"}
# stdout and stderr are also captured and printed / raised
print("Result from sandboxed function:", result)
```
</Tab>
<Tab value="TypeScript">
```typescript
// Shell Actions
const result = await computer.interface.runCommand(cmd); // Run shell command
// result.stdout, result.stderr, result.returncode
// Mouse Actions
await computer.interface.leftClick(x, y); // Left click at coordinates
await computer.interface.rightClick(x, y); // Right click at coordinates
await computer.interface.doubleClick(x, y); // Double click at coordinates
await computer.interface.moveCursor(x, y); // Move cursor to coordinates
await computer.interface.dragTo(x, y, duration); // Drag to coordinates
await computer.interface.getCursorPosition(); // Get current cursor position
await computer.interface.mouseDown(x, y, "left"); // Press and hold a mouse button
await computer.interface.mouseUp(x, y, "left"); // Release a mouse button
// Keyboard Actions
await computer.interface.typeText("Hello"); // Type text
await computer.interface.pressKey("enter"); // Press a single key
await computer.interface.hotkey("command", "c"); // Press key combination
await computer.interface.keyDown("command"); // Press and hold a key
await computer.interface.keyUp("command"); // Release a key
// Scrolling Actions
await computer.interface.scroll(x, y); // Scroll the mouse wheel
await computer.interface.scrollDown(clicks); // Scroll down
await computer.interface.scrollUp(clicks); // Scroll up
// Screen Actions
await computer.interface.screenshot(); // Take a screenshot
await computer.interface.getScreenSize(); // Get screen dimensions
// Clipboard Actions
await computer.interface.setClipboard(text); // Set clipboard content
await computer.interface.copyToClipboard(); // Get clipboard content
// File System Operations
await computer.interface.fileExists(path); // Check if file exists
await computer.interface.directoryExists(path); // Check if directory exists
await computer.interface.readText(path, "utf-8"); // Read file content
await computer.interface.writeText(path, content, "utf-8"); // Write file content
await computer.interface.readBytes(path); // Read file content as bytes
await computer.interface.writeBytes(path, content); // Write file content as bytes
await computer.interface.deleteFile(path); // Delete file
await computer.interface.createDir(path); // Create directory
await computer.interface.deleteDir(path); // Delete directory
await computer.interface.listDir(path); // List directory contents
// Accessibility
await computer.interface.getAccessibilityTree(); // Get accessibility tree
```
</Tab>
</Tabs>
See the [Sandboxed Python](../computer-sdk/sandboxed-python) documentation for running Python functions securely in isolated environments on a remote Cua Computer.


@@ -65,80 +65,9 @@ Refer to this notebook for a step-by-step guide on how to use the Computer-Use I
- [Computer-Use Interface (CUI)](https://github.com/trycua/cua/blob/main/notebooks/computer_nb.ipynb)
## Using the Gradio Computer UI
The computer module includes a Gradio UI for creating and sharing demonstration data. We make it easy for people to build community datasets for better computer use models with an upload to Huggingface feature.
```bash
# Install with UI support
pip install "cua-computer[ui]"
```
> **Note:** For precise control of the computer, we recommend using VNC or Screen Sharing instead of the Computer Gradio UI.
### Building and Sharing Demonstrations with Huggingface
Follow these steps to contribute your own demonstrations:
#### 1. Set up Huggingface Access
Set your HF_TOKEN in a .env file or in your environment variables:
```bash
# In .env file
HF_TOKEN=your_huggingface_token
```
#### 2. Launch the Computer UI
```python
# launch_ui.py
from computer.ui.gradio.app import create_gradio_ui
from dotenv import load_dotenv
load_dotenv('.env')
app = create_gradio_ui()
app.launch(share=False)
```
For examples, see [Computer UI Examples](https://github.com/trycua/cua/tree/main/examples/computer_ui_examples.py)
#### 3. Record Your Tasks
<details open>
<summary>View demonstration video</summary>
<video src="https://github.com/user-attachments/assets/de3c3477-62fe-413c-998d-4063e48de176" controls width="600"></video>
</details>
Record yourself performing various computer tasks using the UI.
#### 4. Save Your Demonstrations
<details open>
<summary>View demonstration video</summary>
<video src="https://github.com/user-attachments/assets/5ad1df37-026a-457f-8b49-922ae805faef" controls width="600"></video>
</details>
Save each task by picking a descriptive name and adding relevant tags (e.g., "office", "web-browsing", "coding").
#### 5. Record Additional Demonstrations
Repeat steps 3 and 4 until you have a good amount of demonstrations covering different tasks and scenarios.
#### 6. Upload to Huggingface
<details open>
<summary>View demonstration video</summary>
<video src="https://github.com/user-attachments/assets/c586d460-3877-4b5f-a736-3248886d2134" controls width="600"></video>
</details>
Upload your dataset to Huggingface by:
- Naming it as `{your_username}/{dataset_name}`
- Choosing public or private visibility
- Optionally selecting specific tags to upload only tasks with certain tags
#### Examples and Resources
- Example Dataset: [ddupont/test-dataset](https://huggingface.co/datasets/ddupont/test-dataset)
- Find Community Datasets: 🔍 [Browse CUA Datasets on Huggingface](https://huggingface.co/datasets?other=cua)
## Docs
- [Computers](https://trycua.com/docs/computer-sdk/computers)
- [Commands](https://trycua.com/docs/computer-sdk/commands)
- [Computer UI](https://trycua.com/docs/computer-sdk/computer-ui)
- [Sandboxed Python](https://trycua.com/docs/computer-sdk/sandboxed-python)


@@ -1,28 +1,35 @@
# Cua Computer TypeScript Library
<div align="center">
<h1>
<div class="image-wrapper" style="display: inline-block;">
<picture>
<source media="(prefers-color-scheme: dark)" alt="logo" height="150" srcset="https://raw.githubusercontent.com/trycua/cua/main/img/logo_white.png" style="display: block; margin: auto;">
<source media="(prefers-color-scheme: light)" alt="logo" height="150" srcset="https://raw.githubusercontent.com/trycua/cua/main/img/logo_black.png" style="display: block; margin: auto;">
<img alt="Shows my svg">
</picture>
</div>
The TypeScript library for C/cua Computer - a powerful computer control and automation library.
[![TypeScript](https://img.shields.io/badge/TypeScript-333333?logo=typescript&logoColor=white&labelColor=333333)](#)
[![macOS](https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0)](#)
[![Discord](https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white)](https://discord.com/invite/mVnXXpdE85)
[![NPM](https://img.shields.io/npm/v/@trycua/computer?color=333333)](https://www.npmjs.com/package/@trycua/computer)
</h1>
</div>
## Overview
**@trycua/computer** is a Computer-Use Interface (CUI) framework that powers Cua for interacting with local macOS and Linux sandboxes. It is Playwright-compatible and pluggable with any AI agent system (Cua, Langchain, CrewAI, AutoGen). Computer relies on [Lume](https://github.com/trycua/lume) for creating and managing sandbox environments.
This library is a TypeScript port of the Python computer library, providing the same functionality for controlling virtual machines and computer interfaces. It enables programmatic control of virtual machines through various providers and offers a consistent interface for interacting with the VM's operating system.
### Get started with Computer
## Installation
```bash
npm install @trycua/computer
# or
pnpm add @trycua/computer
```
## Usage
<div align="center">
<img src="https://raw.githubusercontent.com/trycua/cua/main/img/computer.png"/>
</div>
```typescript
import { Computer } from '@trycua/computer';
import { Computer, OSType } from '@trycua/computer';
// Create a new computer instance
const computer = new Computer({
osType: OSType.LINUX,
name: 's-linux-vm_id'
name: 's-linux-vm_id',
apiKey: 'your-api-key'
});
@@ -30,60 +37,47 @@ const computer = new Computer({
await computer.run();
// Get the computer interface for interaction
const interface = computer.interface;
const computerInterface = computer.interface;
// Take a screenshot
const screenshot = await interface.getScreenshot();
const screenshot = await computerInterface.getScreenshot();
// In a Node.js environment, you might save it like this:
// import * as fs from 'fs';
// fs.writeFileSync('screenshot.png', Buffer.from(screenshot));
// Click at coordinates
await interface.click(500, 300);
await computerInterface.click(500, 300);
// Type text
await interface.typeText('Hello, world!');
await computerInterface.typeText('Hello, world!');
// Stop the computer
await computer.stop();
```
## Architecture
## Install
The library is organized into the following structure:
### Core Components
- **Computer Factory**: A factory object that creates appropriate computer instances
- **BaseComputer**: Abstract base class with shared functionality for all computer types
- **Types**: Type definitions for configuration options and shared interfaces
### Provider Implementations
- **Computer**: Implementation for cloud-based VMs
## Development
- Install dependencies:
To install the Computer-Use Interface (CUI):
```bash
pnpm install
npm install @trycua/computer
# or
pnpm add @trycua/computer
```
- Run the unit tests:
The `@trycua/computer` package provides the TypeScript library for interacting with computer interfaces.
```bash
pnpm test
```
## Run
- Build the library:
Refer to this example for a step-by-step guide on how to use the Computer-Use Interface (CUI):
```bash
pnpm build
```
- [Computer-Use Interface (CUI)](https://github.com/trycua/cua/tree/main/examples/computer-example-ts)
- Type checking:
## Docs
```bash
pnpm typecheck
```
- [Computers](https://trycua.com/docs/computer-sdk/computers)
- [Commands](https://trycua.com/docs/computer-sdk/commands)
- [Computer UI](https://trycua.com/docs/computer-sdk/computer-ui)
## License