diff --git a/COMPATIBILITY.md b/COMPATIBILITY.md new file mode 100644 index 00000000..a00d6e45 --- /dev/null +++ b/COMPATIBILITY.md @@ -0,0 +1,86 @@ +# C/ua Compatibility Matrix + +## Table of Contents +- [Host OS Compatibility](#host-os-compatibility) + - [macOS Host](#macos-host) + - [Ubuntu/Linux Host](#ubuntulinux-host) + - [Windows Host](#windows-host) +- [VM Emulation Support](#vm-emulation-support) +- [Model Provider Compatibility](#model-provider-compatibility) + +--- + +## Host OS Compatibility + +*This section shows compatibility based on your **host operating system** (the OS you're running C/ua on).* + +### macOS Host + +| Installation Method | Requirements | Lume | Cloud | Notes | +|-------------------|-------------|------|-------|-------| +| **playground-docker.sh** | Docker Desktop | ✅ Full | ✅ Full | Recommended for quick setup | +| **Dev Container** | VS Code/WindSurf + Docker | ✅ Full | ✅ Full | Best for development | +| **PyPI packages** | Python 3.12+ | ✅ Full | ✅ Full | Most flexible | + +**macOS Host Requirements:** +- macOS 15+ (Sequoia) for local VM support +- Apple Silicon (M1/M2/M3/M4) recommended for best performance +- Docker Desktop for containerized installations + +--- + +### Ubuntu/Linux Host + +| Installation Method | Requirements | Lume | Cloud | Notes | +|-------------------|-------------|------|-------|-------| +| **playground-docker.sh** | Docker Engine | ✅ Full | ✅ Full | Recommended for quick setup | +| **Dev Container** | VS Code/WindSurf + Docker | ✅ Full | ✅ Full | Best for development | +| **PyPI packages** | Python 3.12+ | ✅ Full | ✅ Full | Most flexible | + +**Ubuntu/Linux Host Requirements:** +- Ubuntu 20.04+ or equivalent Linux distribution +- Docker Engine or Docker Desktop +- Python 3.12+ for PyPI installation + +--- + +### Windows Host + +| Installation Method | Requirements | Lume | Winsandbox | Cloud | Notes | +|-------------------|-------------|------|------------|-------|-------| +| **playground-docker.sh** | 
Docker Desktop + WSL2 | ❌ Not supported | ❌ Not supported | ✅ Full | Requires WSL2 | +| **Dev Container** | VS Code/WindSurf + Docker + WSL2 | ❌ Not supported | ❌ Not supported | ✅ Full | Requires WSL2 | +| **PyPI packages** | Python 3.12+ | ❌ Not supported | ⚠️ Limited | ✅ Full | Use WSL for .sh scripts | + +**Windows Host Requirements:** +- Windows 10/11 with WSL2 enabled for shell script execution +- Docker Desktop with WSL2 backend +- Windows Sandbox feature enabled (for Winsandbox support) +- Python 3.12+ installed in WSL2 or Windows +- **Note**: Lume CLI is not available on Windows; use the Cloud or Winsandbox providers + +--- + +## VM Emulation Support + +*This section shows which **virtual machine operating systems** each provider can emulate.* + +| Provider | macOS VM | Ubuntu/Linux VM | Windows VM | Notes | +|----------|----------|-----------------|------------|-------| +| **Lume** | ✅ Full support | ⚠️ Limited support | ⚠️ Limited support | macOS: native; Ubuntu/Linux/Windows require a custom image | +| **Cloud** | 🚧 Coming soon | ✅ Full support | 🚧 Coming soon | Currently Ubuntu only; macOS/Windows in development | +| **Winsandbox** | ❌ Not supported | ❌ Not supported | ✅ Windows only | Windows Sandbox environments only | + +--- + +## Model Provider Compatibility + +*This section shows which **AI model providers** are supported on each host operating system.* + +| Provider | macOS Host | Ubuntu/Linux Host | Windows Host | Notes | +|----------|------------|-------------------|--------------|-------| +| **Anthropic** | ✅ Full support | ✅ Full support | ✅ Full support | Cloud-based API | +| **OpenAI** | ✅ Full support | ✅ Full support | ✅ Full support | Cloud-based API | +| **Ollama** | ✅ Full support | ✅ Full support | ✅ Full support | Local model serving | +| **OpenAI Compatible** | ✅ Full support | ✅ Full support | ✅ Full support | Any OpenAI-compatible API endpoint | +| **MLX VLM** | ✅ macOS only | ❌ Not supported | ❌ Not supported | Apple Silicon required. 
PyPI installation only. | \ No newline at end of file diff --git a/README.md b/README.md index cf1e8004..c07df910 100644 --- a/README.md +++ b/README.md @@ -53,15 +53,15 @@ -### Option 1: Fully-managed install (recommended) -*Guided install for quick use* +### Option 1: Fully-managed install with Docker (recommended) +*Docker-based guided install for quick use* **macOS/Linux/Windows (via WSL):** ```bash -# Requires Python 3.11+ -/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/scripts/playground.sh)" +# Requires Docker +/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/scripts/playground-docker.sh)" ``` -This script will guide you through setup and launch the Computer-Use Agent UI. +This script will guide you through setup using Docker containers and launch the Computer-Use Agent UI. --- @@ -72,26 +72,47 @@ This repository includes a [Dev Container](./.devcontainer/README.md) configurat 1. **Install the Dev Containers extension ([VS Code](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers) or [WindSurf](https://docs.windsurf.com/windsurf/advanced#dev-containers-beta))** 2. **Open the repository in the Dev Container:** - - Press `Ctrl+Shift+P` (or `Cmd+Shift+P` on macOS) - - Select `Dev Containers: Clone Repository in Container Volume...` and paste the repository URL: `https://github.com/trycua/cua.git` (if not cloned) or `Dev Containers: Open Folder in Container...` (if already cloned). **Note**: On WindSurf, the post install hook might not run automatically. If so, run `/bin/bash .devcontainer/post-install.sh` manually. -3. 
**Run the Agent UI example:** Click ![Run Agent UI](https://github.com/user-attachments/assets/7a61ef34-4b22-4dab-9864-f86bf83e290b) + - Press `Ctrl+Shift+P` (or `⌘+Shift+P` on macOS) + - Select `Dev Containers: Clone Repository in Container Volume...` and paste the repository URL: `https://github.com/trycua/cua.git` (if not cloned) or `Dev Containers: Open Folder in Container...` (if already cloned). + > **Note**: On WindSurf, the post-install hook might not run automatically. If so, run `/bin/bash .devcontainer/post-install.sh` manually. +3. **Open the VS Code workspace:** Once `post-install.sh` finishes running, open the `.vscode/py.code-workspace` workspace and press ![Open Workspace](https://github.com/user-attachments/assets/923bdd43-8c8f-4060-8d78-75bfa302b48c) +. +4. **Run the Agent UI example:** Click ![Run Agent UI](https://github.com/user-attachments/assets/7a61ef34-4b22-4dab-9864-f86bf83e290b) to start the Gradio UI. If prompted to install **debugpy (Python Debugger)** to enable remote debugging, select 'Yes' to proceed. -4. **Access the Gradio UI:** The Gradio UI will be available at `http://localhost:7860` and will automatically forward to your host machine. +5. **Access the Gradio UI:** The Gradio UI will be available at `http://localhost:7860` and will automatically forward to your host machine. 
--- -*How it works: Computer module provides secure desktops (Lume CLI locally, [C/ua Cloud Containers](https://trycua.com) remotely), Agent module provides local/API agents with OpenAI AgentResponse format and [trajectory tracing](https://trycua.com/trajectory-viewer).* -### Supported [Agent Loops](https://github.com/trycua/cua/blob/main/libs/agent/README.md#agent-loops) +### Option 3: PyPI +*Direct Python package installation* + +```bash +# conda create -yn cua python==3.12 + +pip install -U "cua-computer[all]" "cua-agent[all]" +python -m agent.ui # Start the agent UI +``` + +Or check out the [Usage Guide](#-usage-guide) to learn how to use our Python SDK in your own code. + +--- + +## Supported [Agent Loops](https://github.com/trycua/cua/blob/main/libs/agent/README.md#agent-loops) - [UITARS-1.5](https://github.com/trycua/cua/blob/main/libs/agent/README.md#agent-loops) - Run locally on Apple Silicon with MLX, or use cloud providers - [OpenAI CUA](https://github.com/trycua/cua/blob/main/libs/agent/README.md#agent-loops) - Use OpenAI's Computer-Use Preview model - [Anthropic CUA](https://github.com/trycua/cua/blob/main/libs/agent/README.md#agent-loops) - Use Anthropic's Computer-Use capabilities - [OmniParser-v2.0](https://github.com/trycua/cua/blob/main/libs/agent/README.md#agent-loops) - Control UI with [Set-of-Marks prompting](https://som-gpt4v.github.io/) using any vision model +## 🖥️ Compatibility +For detailed compatibility information including host OS support, VM emulation capabilities, and model provider compatibility, see the [Compatibility Matrix](./COMPATIBILITY.md). -# 💻 Developer Guide +
+
-Follow these steps to use C/ua in your own code. See [Developer Guide](./docs/Developer-Guide.md) for building from source. +# 🐍 Usage Guide + +Follow these steps to use C/ua in your own Python code. See [Developer Guide](./docs/Developer-Guide.md) for building from source. ### Step 1: Install Lume CLI diff --git a/libs/agent/README.md b/libs/agent/README.md index 31d1accd..d1c82a5e 100644 --- a/libs/agent/README.md +++ b/libs/agent/README.md @@ -34,10 +34,7 @@ pip install "cua-agent[anthropic]" # Anthropic Cua Loop pip install "cua-agent[uitars]" # UI-Tars support pip install "cua-agent[omni]" # Cua Loop based on OmniParser (includes Ollama for local models) pip install "cua-agent[ui]" # Gradio UI for the agent - -# For local UI-TARS with MLX support, you need to manually install mlx-vlm: -pip install "cua-agent[uitars-mlx]" -pip install git+https://github.com/ddupont808/mlx-vlm.git@stable/fix/qwen2-position-id # PR: https://github.com/Blaizzy/mlx-vlm/pull/349 +pip install "cua-agent[uitars-mlx]" # MLX UI-Tars support ``` ## Run diff --git a/libs/agent/agent/__init__.py b/libs/agent/agent/__init__.py index 230eb91a..70c20add 100644 --- a/libs/agent/agent/__init__.py +++ b/libs/agent/agent/__init__.py @@ -6,7 +6,7 @@ import logging __version__ = "0.1.0" # Initialize logging -logger = logging.getLogger("cua.agent") +logger = logging.getLogger("agent") # Initialize telemetry when the package is imported try: diff --git a/libs/agent/agent/core/agent.py b/libs/agent/agent/core/agent.py index 6f2c6278..e9d3b866 100644 --- a/libs/agent/agent/core/agent.py +++ b/libs/agent/agent/core/agent.py @@ -11,10 +11,8 @@ from .types import AgentResponse from .factory import LoopFactory from .provider_config import DEFAULT_MODELS, ENV_VARS -logging.basicConfig(level=logging.INFO) logger = logging.getLogger(__name__) - class ComputerAgent: """A computer agent that can perform automated tasks using natural language instructions.""" diff --git a/libs/agent/agent/core/telemetry.py 
b/libs/agent/agent/core/telemetry.py index 3c708b17..d3e33a25 100644 --- a/libs/agent/agent/core/telemetry.py +++ b/libs/agent/agent/core/telemetry.py @@ -34,7 +34,7 @@ flush = _default_flush is_telemetry_enabled = _default_is_telemetry_enabled is_telemetry_globally_disabled = _default_is_telemetry_globally_disabled -logger = logging.getLogger("cua.agent.telemetry") +logger = logging.getLogger("agent.telemetry") try: # Import from core telemetry diff --git a/libs/agent/agent/providers/omni/loop.py b/libs/agent/agent/providers/omni/loop.py index 751d4fd3..faffefc0 100644 --- a/libs/agent/agent/providers/omni/loop.py +++ b/libs/agent/agent/providers/omni/loop.py @@ -26,10 +26,8 @@ from .api_handler import OmniAPIHandler from .tools.manager import ToolManager from .tools import ToolResult -logging.basicConfig(level=logging.INFO) logger = logging.getLogger(__name__) - def extract_data(input_string: str, data_type: str) -> str: """Extract content from code blocks.""" pattern = f"```{data_type}" + r"(.*?)(```|$)" diff --git a/libs/agent/agent/providers/uitars/loop.py b/libs/agent/agent/providers/uitars/loop.py index 133a3b83..a28cfec4 100644 --- a/libs/agent/agent/providers/uitars/loop.py +++ b/libs/agent/agent/providers/uitars/loop.py @@ -25,10 +25,8 @@ from .prompts import COMPUTER_USE, SYSTEM_PROMPT, MAC_SPECIFIC_NOTES from .clients.oaicompat import OAICompatClient from .clients.mlxvlm import MLXVLMUITarsClient -logging.basicConfig(level=logging.INFO) logger = logging.getLogger(__name__) - class UITARSLoop(BaseLoop): """UI-TARS-specific implementation of the agent loop. 
diff --git a/libs/agent/agent/ui/gradio/app.py b/libs/agent/agent/ui/gradio/app.py index 0593d776..f52a3222 100644 --- a/libs/agent/agent/ui/gradio/app.py +++ b/libs/agent/agent/ui/gradio/app.py @@ -132,6 +132,13 @@ class GradioChatScreenshotHandler(DefaultCallbackHandler): # Detect if current device is MacOS is_mac = platform.system().lower() == "darwin" +# Detect if lume is available (host device is macOS) +is_lume_available = is_mac or (os.environ.get("PYLUME_HOST", "localhost") != "localhost") + +print("PYLUME_HOST: ", os.environ.get("PYLUME_HOST", "localhost")) +print("is_mac: ", is_mac) +print("Lume available: ", is_lume_available) + # Map model names to specific provider model names MODEL_MAPPINGS = { "openai": { @@ -733,9 +740,9 @@ if __name__ == "__main__": is_mac = platform.system().lower() == "darwin" providers = ["cloud"] - if is_mac: + if is_lume_available: providers += ["lume"] - elif is_windows: + if is_windows: providers += ["winsandbox"] computer_provider = gr.Radio( diff --git a/libs/agent/pyproject.toml b/libs/agent/pyproject.toml index 8ea6a3fc..f078a5c9 100644 --- a/libs/agent/pyproject.toml +++ b/libs/agent/pyproject.toml @@ -4,7 +4,7 @@ build-backend = "pdm.backend" [project] name = "cua-agent" -version = "0.1.0" +version = "0.2.0" description = "CUA (Computer Use) Agent for AI-driven computer interaction" readme = "README.md" authors = [ @@ -38,8 +38,7 @@ uitars = [ "httpx>=0.27.0,<0.29.0", ] uitars-mlx = [ - # The mlx-vlm package needs to be installed manually with: - # pip install git+https://github.com/ddupont808/mlx-vlm.git@stable/fix/qwen2-position-id + "mlx-vlm>=0.1.27; sys_platform == 'darwin'" ] ui = [ "gradio>=5.23.3,<6.0.0", @@ -88,9 +87,8 @@ all = [ "requests>=2.31.0,<3.0.0", "ollama>=0.4.7,<0.5.0", "gradio>=5.23.3,<6.0.0", - "python-dotenv>=1.0.1,<2.0.0" - # mlx-vlm needs to be installed manually with: - # pip install git+https://github.com/ddupont808/mlx-vlm.git@stable/fix/qwen2-position-id + "python-dotenv>=1.0.1,<2.0.0", + 
"mlx-vlm>=0.1.27; sys_platform == 'darwin'" ] [tool.pdm] diff --git a/libs/computer-server/computer_server/diorama/diorama.py b/libs/computer-server/computer_server/diorama/diorama.py index fc426a7c..09aa6434 100644 --- a/libs/computer-server/computer_server/diorama/diorama.py +++ b/libs/computer-server/computer_server/diorama/diorama.py @@ -15,13 +15,7 @@ from computer_server.diorama.diorama_computer import DioramaComputer from computer_server.handlers.macos import * # simple, nicely formatted logging -logging.basicConfig( - level=logging.INFO, - format='[%(asctime)s] [%(levelname)s] %(message)s', - datefmt='%H:%M:%S', - stream=sys.stdout -) -logger = logging.getLogger("diorama.virtual_desktop") +logger = logging.getLogger(__name__) automation_handler = MacOSAutomationHandler() diff --git a/libs/computer-server/computer_server/diorama/draw.py b/libs/computer-server/computer_server/diorama/draw.py index 9fce809f..e915b790 100644 --- a/libs/computer-server/computer_server/diorama/draw.py +++ b/libs/computer-server/computer_server/diorama/draw.py @@ -28,13 +28,7 @@ import functools import logging # simple, nicely formatted logging -logging.basicConfig( - level=logging.INFO, - format='[%(asctime)s] [%(levelname)s] %(message)s', - datefmt='%H:%M:%S', - stream=sys.stdout -) -logger = logging.getLogger("diorama.draw") +logger = logging.getLogger(__name__) from computer_server.diorama.safezone import ( get_menubar_bounds, diff --git a/libs/computer-server/computer_server/main.py b/libs/computer-server/computer_server/main.py index bdca3693..7a0dd515 100644 --- a/libs/computer-server/computer_server/main.py +++ b/libs/computer-server/computer_server/main.py @@ -12,8 +12,8 @@ import os import aiohttp # Set up logging with more detail -logging.basicConfig(level=logging.INFO) logger = logging.getLogger(__name__) +logger.setLevel(logging.INFO) # Configure WebSocket with larger message size WEBSOCKET_MAX_SIZE = 1024 * 1024 * 10 # 10MB limit diff --git 
a/libs/computer/python/computer/__init__.py b/libs/computer/python/computer/__init__.py index 90d20454..e2f66bfb 100644 --- a/libs/computer/python/computer/__init__.py +++ b/libs/computer/python/computer/__init__.py @@ -6,14 +6,14 @@ import sys __version__ = "0.1.0" # Initialize logging -logger = logging.getLogger("cua.computer") +logger = logging.getLogger("computer") # Initialize telemetry when the package is imported try: # Import from core telemetry from core.telemetry import ( - is_telemetry_enabled, flush, + is_telemetry_enabled, record_event, ) diff --git a/libs/computer/python/computer/computer.py b/libs/computer/python/computer/computer.py index 8596bc8b..7ba29ee6 100644 --- a/libs/computer/python/computer/computer.py +++ b/libs/computer/python/computer/computer.py @@ -85,7 +85,7 @@ class Computer: experiments: Optional list of experimental features to enable (e.g. ["app-use"]) """ - self.logger = Logger("cua.computer", verbosity) + self.logger = Logger("computer", verbosity) self.logger.info("Initializing Computer...") # Store original parameters @@ -132,11 +132,11 @@ class Computer: # Configure root logger self.verbosity = verbosity - self.logger = Logger("cua", verbosity) + self.logger = Logger("computer", verbosity) # Configure component loggers with proper hierarchy - self.vm_logger = Logger("cua.vm", verbosity) - self.interface_logger = Logger("cua.interface", verbosity) + self.vm_logger = Logger("computer.vm", verbosity) + self.interface_logger = Logger("computer.interface", verbosity) if not use_host_computer_server: if ":" not in image or len(image.split(":")) != 2: diff --git a/libs/computer/python/computer/helpers.py b/libs/computer/python/computer/helpers.py index b472c047..8317b8d9 100644 --- a/libs/computer/python/computer/helpers.py +++ b/libs/computer/python/computer/helpers.py @@ -1,6 +1,7 @@ """ Helper functions and decorators for the Computer module. 
""" +import logging import asyncio from functools import wraps from typing import Any, Callable, Optional, TypeVar, cast @@ -8,6 +9,8 @@ from typing import Any, Callable, Optional, TypeVar, cast # Global reference to the default computer instance _default_computer = None +logger = logging.getLogger(__name__) + def set_default_computer(computer): """ Set the default computer instance to be used by the remote decorator. @@ -41,7 +44,7 @@ def sandboxed(venv_name: str = "default", computer: str = "default", max_retries try: return await comp.venv_exec(venv_name, func, *args, **kwargs) except Exception as e: - print(f"Attempt {i+1} failed: {e}") + logger.error(f"Attempt {i+1} failed: {e}") await asyncio.sleep(1) if i == max_retries - 1: raise e diff --git a/libs/computer/python/computer/interface/linux.py b/libs/computer/python/computer/interface/linux.py index e96cde50..23b542b0 100644 --- a/libs/computer/python/computer/interface/linux.py +++ b/libs/computer/python/computer/interface/linux.py @@ -30,7 +30,7 @@ class LinuxComputerInterface(BaseComputerInterface): self._command_lock = asyncio.Lock() # Lock to ensure only one command at a time # Set logger name for Linux interface - self.logger = Logger("cua.interface.linux", LogLevel.NORMAL) + self.logger = Logger("computer.interface.linux", LogLevel.NORMAL) @property def ws_uri(self) -> str: diff --git a/libs/computer/python/computer/interface/macos.py b/libs/computer/python/computer/interface/macos.py index 539303e4..a8821d60 100644 --- a/libs/computer/python/computer/interface/macos.py +++ b/libs/computer/python/computer/interface/macos.py @@ -29,7 +29,7 @@ class MacOSComputerInterface(BaseComputerInterface): self._command_lock = asyncio.Lock() # Lock to ensure only one command at a time # Set logger name for macOS interface - self.logger = Logger("cua.interface.macos", LogLevel.NORMAL) + self.logger = Logger("computer.interface.macos", LogLevel.NORMAL) @property def ws_uri(self) -> str: diff --git 
a/libs/computer/python/computer/interface/windows.py b/libs/computer/python/computer/interface/windows.py index 6acce1c1..b88c9138 100644 --- a/libs/computer/python/computer/interface/windows.py +++ b/libs/computer/python/computer/interface/windows.py @@ -30,7 +30,7 @@ class WindowsComputerInterface(BaseComputerInterface): self._command_lock = asyncio.Lock() # Lock to ensure only one command at a time # Set logger name for Windows interface - self.logger = Logger("cua.interface.windows", LogLevel.NORMAL) + self.logger = Logger("computer.interface.windows", LogLevel.NORMAL) @property def ws_uri(self) -> str: diff --git a/libs/computer/python/computer/providers/lume_api.py b/libs/computer/python/computer/providers/lume_api.py index fbfaca4b..3cbe1097 100644 --- a/libs/computer/python/computer/providers/lume_api.py +++ b/libs/computer/python/computer/providers/lume_api.py @@ -66,8 +66,6 @@ def lume_api_get( # Only print the curl command when debug is enabled display_curl_string = ' '.join(display_cmd) - if debug or verbose: - print(f"DEBUG: Executing curl API call: {display_curl_string}") logger.debug(f"Executing API request: {display_curl_string}") # Execute the command - for execution we need to use shell=True to handle URLs with special characters @@ -172,8 +170,6 @@ def lume_api_run( payload["sharedDirectories"] = run_opts["shared_directories"] # Log the payload for debugging - if debug or verbose: - print(f"DEBUG: Payload for {vm_name} run request: {json.dumps(payload, indent=2)}") logger.debug(f"API payload: {json.dumps(payload, indent=2)}") # Construct the curl command @@ -184,11 +180,6 @@ def lume_api_run( api_url ] - # Always print the command for debugging - if debug or verbose: - print(f"DEBUG: Executing curl run API call: {' '.join(cmd)}") - print(f"Run payload: {json.dumps(payload, indent=2)}") - # Execute the command try: result = subprocess.run(cmd, capture_output=True, text=True) @@ -405,8 +396,6 @@ def lume_api_pull( f"http://{host}:{port}/lume/pull" 
]) - if debug or verbose: - print(f"DEBUG: Executing curl API call: {' '.join(pull_cmd)}") logger.debug(f"Executing API request: {' '.join(pull_cmd)}") try: @@ -474,8 +463,6 @@ def lume_api_delete( # Only print the curl command when debug is enabled display_curl_string = ' '.join(display_cmd) - if debug or verbose: - print(f"DEBUG: Executing curl API call: {display_curl_string}") logger.debug(f"Executing API request: {display_curl_string}") # Execute the command - for execution we need to use shell=True to handle URLs with special characters diff --git a/libs/computer/python/computer/providers/lumier/provider.py b/libs/computer/python/computer/providers/lumier/provider.py index 14c5620d..67f348be 100644 --- a/libs/computer/python/computer/providers/lumier/provider.py +++ b/libs/computer/python/computer/providers/lumier/provider.py @@ -305,7 +305,7 @@ class LumierProvider(BaseVMProvider): cmd = ["docker", "run", "-d", "--name", self.container_name] cmd.extend(["-p", f"{self.vnc_port}:8006"]) - print(f"Using specified noVNC_port: {self.vnc_port}") + logger.debug(f"Using specified noVNC_port: {self.vnc_port}") # Set API URL using the API port self._api_url = f"http://{self.host}:{self.api_port}" @@ -324,7 +324,7 @@ class LumierProvider(BaseVMProvider): "-v", f"{storage_dir}:/storage", "-e", f"HOST_STORAGE_PATH={storage_dir}" ]) - print(f"Using persistent storage at: {storage_dir}") + logger.debug(f"Using persistent storage at: {storage_dir}") # Add shared folder volume mount if shared_path is specified if self.shared_path: @@ -337,12 +337,12 @@ class LumierProvider(BaseVMProvider): "-v", f"{shared_dir}:/shared", "-e", f"HOST_SHARED_PATH={shared_dir}" ]) - print(f"Using shared folder at: {shared_dir}") + logger.debug(f"Using shared folder at: {shared_dir}") # Add environment variables # Always use the container_name as the VM_NAME for consistency # Use the VM image passed from the Computer class - print(f"Using VM image: {self.image}") + logger.debug(f"Using VM image: 
{self.image}") # If ghcr.io is in the image, use the full image name if "ghcr.io" in self.image: @@ -362,22 +362,22 @@ class LumierProvider(BaseVMProvider): # First check if the image exists locally try: - print(f"Checking if Docker image {lumier_image} exists locally...") + logger.debug(f"Checking if Docker image {lumier_image} exists locally...") check_image_cmd = ["docker", "image", "inspect", lumier_image] subprocess.run(check_image_cmd, capture_output=True, check=True) - print(f"Docker image {lumier_image} found locally.") + logger.debug(f"Docker image {lumier_image} found locally.") except subprocess.CalledProcessError: # Image doesn't exist locally - print(f"\nWARNING: Docker image {lumier_image} not found locally.") - print("The system will attempt to pull it from Docker Hub, which may fail if you have network connectivity issues.") - print("If the Docker pull fails, you may need to manually pull the image first with:") - print(f" docker pull {lumier_image}\n") + logger.warning(f"\nWARNING: Docker image {lumier_image} not found locally.") + logger.warning("The system will attempt to pull it from Docker Hub, which may fail if you have network connectivity issues.") + logger.warning("If the Docker pull fails, you may need to manually pull the image first with:") + logger.warning(f" docker pull {lumier_image}\n") # Add the image to the command cmd.append(lumier_image) # Print the Docker command for debugging - print(f"DOCKER COMMAND: {' '.join(cmd)}") + logger.debug(f"DOCKER COMMAND: {' '.join(cmd)}") # Run the container with improved error handling try: @@ -395,8 +395,8 @@ class LumierProvider(BaseVMProvider): raise # Container started, now check VM status with polling - print("Container started, checking VM status...") - print("NOTE: This may take some time while the VM image is being pulled and initialized") + logger.debug("Container started, checking VM status...") + logger.debug("NOTE: This may take some time while the VM image is being pulled and 
initialized") # Start a background thread to show container logs in real-time import threading @@ -404,8 +404,8 @@ def show_container_logs(): # Give the container a moment to start generating logs time.sleep(1) - print(f"\n---- CONTAINER LOGS FOR '{name}' (LIVE) ----") - print("Showing logs as they are generated. Press Ctrl+C to stop viewing logs...\n") + logger.debug(f"\n---- CONTAINER LOGS FOR '{name}' (LIVE) ----") + logger.debug("Showing logs as they are generated. Press Ctrl+C to stop viewing logs...\n") try: # Use docker logs with follow option @@ -415,17 +415,17 @@ # Read and print logs line by line for line in process.stdout: - print(line, end='') + logger.debug(line.rstrip()) # Logger.debug does not accept print's end= keyword # Break if process has exited if process.poll() is not None: break except Exception as e: - print(f"\nError showing container logs: {e}") + logger.error(f"\nError showing container logs: {e}") if self.verbose: logger.error(f"Error in log streaming thread: {e}") finally: - print("\n---- LOG STREAMING ENDED ----") + logger.debug("\n---- LOG STREAMING ENDED ----") # Make sure process is terminated if 'process' in locals() and process.poll() is None: process.terminate() @@ -452,11 +452,11 @@ else: wait_time = min(30, 5 + (attempt * 2)) - print(f"Waiting {wait_time}s before retry #{attempt+1}...") + logger.debug(f"Waiting {wait_time}s before retry #{attempt+1}...") await asyncio.sleep(wait_time) # Try to get VM status - print(f"Checking VM status (attempt {attempt+1})...") + logger.debug(f"Checking VM status (attempt {attempt+1})...") vm_status = await self.get_vm(name) # Check for API errors @@ -468,20 +468,20 @@ # since _lume_api_get already logged the technical details if consecutive_errors == 1 or attempt % 5 == 0: if 'Empty reply from server' in error_msg: - print("API server is starting up - container is running, but API isn't fully 
initialized yet.") - print("This is expected during the initial VM setup - will continue polling...") + logger.info("API server is starting up - container is running, but API isn't fully initialized yet.") + logger.info("This is expected during the initial VM setup - will continue polling...") else: # Don't repeat the exact same error message each time - logger.debug(f"API request error (attempt {attempt+1}): {error_msg}") + logger.warning(f"API request error (attempt {attempt+1}): {error_msg}") # Just log that we're still working on it if attempt > 3: - print("Still waiting for the API server to become available...") + logger.debug("Still waiting for the API server to become available...") # If we're getting errors but container is running, that's normal during startup if vm_status.get('status') == 'running': if not vm_running: - print("Container is running, waiting for the VM within it to become fully ready...") - print("This might take a minute while the VM initializes...") + logger.info("Container is running, waiting for the VM within it to become fully ready...") + logger.info("This might take a minute while the VM initializes...") vm_running = True # Increase counter and continue @@ -497,35 +497,35 @@ class LumierProvider(BaseVMProvider): # Check if we have an IP address, which means the VM is fully ready if 'ip_address' in vm_status and vm_status['ip_address']: - print(f"VM is now fully running with IP: {vm_status.get('ip_address')}") + logger.info(f"VM is now fully running with IP: {vm_status.get('ip_address')}") if 'vnc_url' in vm_status and vm_status['vnc_url']: - print(f"VNC URL: {vm_status.get('vnc_url')}") + logger.info(f"VNC URL: {vm_status.get('vnc_url')}") return vm_status else: - print("VM is running but still initializing network interfaces...") - print("Waiting for IP address to be assigned...") + logger.debug("VM is running but still initializing network interfaces...") + logger.debug("Waiting for IP address to be assigned...") else: # VM exists 
but might still be starting up
                     status = vm_status.get('status', 'unknown')
-                    print(f"VM found but status is: {status}. Continuing to poll...")
+                    logger.debug(f"VM found but status is: {status}. Continuing to poll...")
 
                 # Increase counter for next iteration's delay calculation
                 attempt += 1
 
                 # If we reach a very large number of attempts, give a reassuring message but continue
                 if attempt % 10 == 0:
-                    print(f"Still waiting after {attempt} attempts. This might take several minutes for first-time setup.")
+                    logger.debug(f"Still waiting after {attempt} attempts. This might take several minutes for first-time setup.")
                 if not vm_running and attempt >= 20:
-                    print("\nNOTE: First-time VM initialization can be slow as images are downloaded.")
-                    print("If this continues for more than 10 minutes, you may want to check:")
-                    print("  1. Docker logs with: docker logs " + name)
-                    print("  2. If your network can access container registries")
-                    print("Press Ctrl+C to abort if needed.\n")
+                    logger.warning("\nNOTE: First-time VM initialization can be slow as images are downloaded.")
+                    logger.warning("If this continues for more than 10 minutes, you may want to check:")
+                    logger.warning("  1. Docker logs with: docker logs " + name)
+                    logger.warning("  2. If your network can access container registries")
+                    logger.warning("Press Ctrl+C to abort if needed.\n")
 
                 # After 150 attempts (likely over 30-40 minutes), return current status
                 if attempt >= 150:
-                    print(f"Reached 150 polling attempts. VM status is: {vm_status.get('status', 'unknown')}")
-                    print("Returning current VM status, but please check Docker logs if there are issues.")
+                    logger.debug(f"Reached 150 polling attempts. VM status is: {vm_status.get('status', 'unknown')}")
+                    logger.debug("Returning current VM status, but please check Docker logs if there are issues.")
                     return vm_status
 
             except Exception as e:
@@ -535,9 +535,9 @@ class LumierProvider(BaseVMProvider):
 
                 # If we've had too many consecutive errors, might be a deeper problem
                 if consecutive_errors >= 10:
-                    print(f"\nWARNING: Encountered {consecutive_errors} consecutive errors while checking VM status.")
-                    print("You may need to check the Docker container logs or restart the process.")
-                    print(f"Error details: {str(e)}\n")
+                    logger.warning(f"\nWARNING: Encountered {consecutive_errors} consecutive errors while checking VM status.")
+                    logger.warning("You may need to check the Docker container logs or restart the process.")
+                    logger.warning(f"Error details: {str(e)}\n")
 
                 # Increase attempt counter for next iteration
                 attempt += 1
@@ -545,7 +545,7 @@ class LumierProvider(BaseVMProvider):
                 # After many consecutive errors, add a delay to avoid hammering the system
                 if attempt > 5:
                     error_delay = min(30, 10 + attempt)
-                    print(f"Multiple connection errors, waiting {error_delay}s before next attempt...")
+                    logger.warning(f"Multiple connection errors, waiting {error_delay}s before next attempt...")
                     await asyncio.sleep(error_delay)
 
         except subprocess.CalledProcessError as e:
@@ -568,7 +568,7 @@ class LumierProvider(BaseVMProvider):
         api_ready = False
         container_running = False
-        print(f"Waiting for container {container_name} to be ready (timeout: {timeout}s)...")
+        logger.debug(f"Waiting for container {container_name} to be ready (timeout: {timeout}s)...")
 
         while time.time() - start_time < timeout:
             # Check if container is running
@@ -579,7 +579,6 @@ class LumierProvider(BaseVMProvider):
             if container_status and container_status.startswith("Up"):
                 container_running = True
-                print(f"Container {container_name} is running")
                 logger.info(f"Container {container_name} is running with status: {container_status}")
             else:
                 logger.warning(f"Container {container_name} not yet running, status: {container_status}")
@@ -603,7 +602,6 @@ class LumierProvider(BaseVMProvider):
                     if result.returncode == 0 and "ok" in result.stdout.lower():
                         api_ready = True
-                        print(f"API is ready at {api_url}")
                         logger.info(f"API is ready at {api_url}")
                         break
                     else:
@@ -621,7 +619,6 @@ class LumierProvider(BaseVMProvider):
                         if vm_result.returncode == 0 and vm_result.stdout.strip():
                             # VM API responded with something - consider the API ready
                             api_ready = True
-                            print(f"VM API is ready at {vm_api_url}")
                             logger.info(f"VM API is ready at {vm_api_url}")
                             break
                         else:
@@ -643,7 +640,6 @@ class LumierProvider(BaseVMProvider):
                     else:
                         curl_error = f"Unknown curl error code: {curl_code}"
-                    print(f"API not ready yet: {curl_error}")
                     logger.info(f"API not ready yet: {curl_error}")
                 except subprocess.SubprocessError as e:
                     logger.warning(f"Error checking API status: {e}")
@@ -652,22 +648,19 @@ class LumierProvider(BaseVMProvider):
             # a bit longer before checking again, as the container may still be initializing
             elapsed_seconds = time.time() - start_time
             if int(elapsed_seconds) % 5 == 0:  # Only print status every 5 seconds to reduce verbosity
-                print(f"Waiting for API to initialize... ({elapsed_seconds:.1f}s / {timeout}s)")
+                logger.debug(f"Waiting for API to initialize... ({elapsed_seconds:.1f}s / {timeout}s)")
 
             await asyncio.sleep(3)  # Longer sleep between API checks
 
         # Handle timeout - if the container is running but API is not ready, that's not
         # necessarily an error - the API might just need more time to start up
         if not container_running:
-            print(f"Timed out waiting for container {container_name} to start")
             logger.warning(f"Timed out waiting for container {container_name} to start")
             return False
 
         if not api_ready:
-            print(f"Container {container_name} is running, but API is not fully ready yet.")
-            print("Proceeding with operations. API will become available shortly.")
-            print("NOTE: You may see some 'API request failed' messages while the API initializes.")
             logger.warning(f"Container {container_name} is running, but API is not fully ready yet.")
+            logger.warning(f"NOTE: You may see some 'API request failed' messages while the API initializes.")
 
         # Return True if container is running, even if API isn't ready yet
         # This allows VM operations to proceed, with appropriate retries for API calls
@@ -777,8 +770,8 @@ class LumierProvider(BaseVMProvider):
             # For follow mode with timeout, we'll run the command and handle the timeout
             log_cmd.append(container_name)
             logger.info(f"Following logs for container '{container_name}' with timeout {timeout}s")
-            print(f"\n---- CONTAINER LOGS FOR '{container_name}' (LIVE) ----")
-            print(f"Press Ctrl+C to stop following logs\n")
+            logger.info(f"\n---- CONTAINER LOGS FOR '{container_name}' (LIVE) ----")
+            logger.info(f"Press Ctrl+C to stop following logs\n")
 
             try:
                 # Run with timeout
@@ -790,7 +783,7 @@ class LumierProvider(BaseVMProvider):
                     process.wait(timeout=timeout)
                 except subprocess.TimeoutExpired:
                     process.terminate()  # Stop after timeout
-                    print(f"\n---- LOG FOLLOWING STOPPED (timeout {timeout}s reached) ----")
+                    logger.info(f"\n---- LOG FOLLOWING STOPPED (timeout {timeout}s reached) ----")
                 else:
                     # Without timeout, wait for user interruption
                     process.wait()
@@ -798,14 +791,14 @@ class LumierProvider(BaseVMProvider):
                 return "Logs were displayed to console in follow mode"
             except KeyboardInterrupt:
                 process.terminate()
-                print("\n---- LOG FOLLOWING STOPPED (user interrupted) ----")
+                logger.info("\n---- LOG FOLLOWING STOPPED (user interrupted) ----")
                 return "Logs were displayed to console in follow mode (interrupted)"
         else:
             # For follow mode without timeout, we'll print a helpful message
             log_cmd.append(container_name)
             logger.info(f"Following logs for container '{container_name}' indefinitely")
-            print(f"\n---- CONTAINER LOGS FOR '{container_name}' (LIVE) ----")
-            print(f"Press Ctrl+C to stop following logs\n")
+            logger.info(f"\n---- CONTAINER LOGS FOR '{container_name}' (LIVE) ----")
+            logger.info(f"Press Ctrl+C to stop following logs\n")
 
             try:
                 # Run the command and let it run until interrupted
@@ -814,7 +807,7 @@ class LumierProvider(BaseVMProvider):
                 return "Logs were displayed to console in follow mode"
             except KeyboardInterrupt:
                 process.terminate()
-                print("\n---- LOG FOLLOWING STOPPED (user interrupted) ----")
+                logger.info("\n---- LOG FOLLOWING STOPPED (user interrupted) ----")
                 return "Logs were displayed to console in follow mode (interrupted)"
     else:
         # For non-follow mode, capture and return the logs as a string
@@ -827,11 +820,11 @@ class LumierProvider(BaseVMProvider):
 
         # Only print header and logs if there's content
         if logs.strip():
-            print(f"\n---- CONTAINER LOGS FOR '{container_name}' (LAST {num_lines} LINES) ----\n")
-            print(logs)
-            print(f"\n---- END OF LOGS ----")
+            logger.info(f"\n---- CONTAINER LOGS FOR '{container_name}' (LAST {num_lines} LINES) ----\n")
+            logger.info(logs)
+            logger.info(f"\n---- END OF LOGS ----")
         else:
-            print(f"\nNo logs available for container '{container_name}'")
+            logger.info(f"\nNo logs available for container '{container_name}'")
 
         return logs
     except subprocess.CalledProcessError as e:
diff --git a/libs/computer/python/computer/telemetry.py b/libs/computer/python/computer/telemetry.py
index 38be92a9..69d064f8 100644
--- a/libs/computer/python/computer/telemetry.py
+++ b/libs/computer/python/computer/telemetry.py
@@ -9,10 +9,10 @@
 TELEMETRY_AVAILABLE = False
 
 try:
     from core.telemetry import (
-        record_event,
         increment,
         is_telemetry_enabled,
         is_telemetry_globally_disabled,
+        record_event,
     )
 
     def increment_counter(counter_name: str, value: int = 1) -> None:
@@ -22,14 +22,14 @@ try:
 
     def set_dimension(name: str, value: Any) -> None:
         """Set a dimension that will be attached to all events."""
-        logger = logging.getLogger("cua.computer.telemetry")
+        logger = logging.getLogger("computer.telemetry")
         logger.debug(f"Setting dimension {name}={value}")
 
     TELEMETRY_AVAILABLE = True
-    logger = logging.getLogger("cua.computer.telemetry")
+    logger = logging.getLogger("computer.telemetry")
     logger.info("Successfully imported telemetry")
 except ImportError as e:
-    logger = logging.getLogger("cua.computer.telemetry")
+    logger = logging.getLogger("computer.telemetry")
     logger.warning(f"Could not import telemetry: {e}")
     TELEMETRY_AVAILABLE = False
 
@@ -40,7 +40,7 @@ def _noop(*args: Any, **kwargs: Any) -> None:
     pass
 
-logger = logging.getLogger("cua.computer.telemetry")
+logger = logging.getLogger("computer.telemetry")
 
 # If telemetry isn't available, use no-op functions
 if not TELEMETRY_AVAILABLE:
diff --git a/libs/computer/python/pyproject.toml b/libs/computer/python/pyproject.toml
index c9aa46da..04bd2dfb 100644
--- a/libs/computer/python/pyproject.toml
+++ b/libs/computer/python/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "pdm.backend"
 
 [project]
 name = "cua-computer"
-version = "0.1.0"
+version = "0.2.0"
 description = "Computer-Use Interface (CUI) framework powering Cua"
 readme = "README.md"
 authors = [
diff --git a/libs/core/core/telemetry/client.py b/libs/core/core/telemetry/client.py
index 22686890..d4eb9c70 100644
--- a/libs/core/core/telemetry/client.py
+++ b/libs/core/core/telemetry/client.py
@@ -15,7 +15,7 @@ from typing import Any, Dict, List, Optional
 from core import __version__
 from core.telemetry.sender import send_telemetry
 
-logger = logging.getLogger("cua.telemetry")
+logger = logging.getLogger("core.telemetry")
 
 # Controls how frequently telemetry will be sent (percentage)
 TELEMETRY_SAMPLE_RATE = 5  # 5% sampling rate
diff --git a/libs/core/core/telemetry/posthog_client.py b/libs/core/core/telemetry/posthog_client.py
index 8ddb1dd2..ad8a7685 100644
--- a/libs/core/core/telemetry/posthog_client.py
+++ b/libs/core/core/telemetry/posthog_client.py
@@ -16,7 +16,7 @@ from typing import Any, Dict, List, Optional
 import posthog
 from core import __version__
 
-logger = logging.getLogger("cua.telemetry")
+logger = logging.getLogger("core.telemetry")
 
 # Controls how frequently telemetry will be sent (percentage)
 TELEMETRY_SAMPLE_RATE = 100  # 100% sampling rate (was 5%)
diff --git a/libs/core/core/telemetry/sender.py b/libs/core/core/telemetry/sender.py
index db96b1ac..8772868f 100644
--- a/libs/core/core/telemetry/sender.py
+++ b/libs/core/core/telemetry/sender.py
@@ -3,7 +3,7 @@ import logging
 from typing import Any, Dict
 
-logger = logging.getLogger("cua.telemetry")
+logger = logging.getLogger("core.telemetry")
 
 def send_telemetry(payload: Dict[str, Any]) -> bool:
diff --git a/libs/core/core/telemetry/telemetry.py b/libs/core/core/telemetry/telemetry.py
index 2a6e052e..f01421cd 100644
--- a/libs/core/core/telemetry/telemetry.py
+++ b/libs/core/core/telemetry/telemetry.py
@@ -30,7 +30,7 @@ def _configure_telemetry_logging() -> None:
         level = logging.ERROR
 
     # Configure the main telemetry logger
-    telemetry_logger = logging.getLogger("cua.telemetry")
+    telemetry_logger = logging.getLogger("core.telemetry")
     telemetry_logger.setLevel(level)
@@ -46,11 +46,11 @@ try:
     POSTHOG_AVAILABLE = True
 except ImportError:
-    logger = logging.getLogger("cua.telemetry")
+    logger = logging.getLogger("core.telemetry")
     logger.info("PostHog not available. Install with: pdm add posthog")
     POSTHOG_AVAILABLE = False
 
-logger = logging.getLogger("cua.telemetry")
+logger = logging.getLogger("core.telemetry")
 
 # Check environment variables for global telemetry opt-out
@@ -292,10 +292,9 @@ def set_telemetry_log_level(level: Optional[int] = None) -> None:
     # Set the level for all telemetry-related loggers
     telemetry_loggers = [
-        "cua.telemetry",
         "core.telemetry",
-        "cua.agent.telemetry",
-        "cua.computer.telemetry",
+        "agent.telemetry",
+        "computer.telemetry",
         "posthog",
     ]
diff --git a/scripts/playground-docker.sh b/scripts/playground-docker.sh
new file mode 100644
index 00000000..61b57901
--- /dev/null
+++ b/scripts/playground-docker.sh
@@ -0,0 +1,323 @@
+#!/bin/bash
+
+set -e
+
+# Colors for output
+GREEN='\033[0;32m'
+BLUE='\033[0;34m'
+RED='\033[0;31m'
+YELLOW='\033[1;33m'
+NC='\033[0m' # No Color
+
+# Print with color
+print_info() {
+    echo -e "${BLUE}==> $1${NC}"
+}
+
+print_success() {
+    echo -e "${GREEN}==> $1${NC}"
+}
+
+print_error() {
+    echo -e "${RED}==> $1${NC}"
+}
+
+print_warning() {
+    echo -e "${YELLOW}==> $1${NC}"
+}
+
+echo "🚀 Launching C/ua Computer-Use Agent UI..."
+
+# Check if Docker is installed
+if ! command -v docker &> /dev/null; then
+    print_error "Docker is not installed!"
+    echo ""
+    echo "To use C/ua with Docker containers, you need to install Docker first:"
+    echo ""
+    echo "📦 Install Docker:"
+    echo "  • macOS: Download Docker Desktop from https://docker.com/products/docker-desktop"
+    echo "  • Windows: Download Docker Desktop from https://docker.com/products/docker-desktop"
+    echo "  • Linux: Follow instructions at https://docs.docker.com/engine/install/"
+    echo ""
+    echo "After installing Docker, run this script again."
+    exit 1
+fi
+
+# Check if Docker daemon is running
+if ! docker info &> /dev/null; then
+    print_error "Docker is installed but not running!"
+    echo ""
+    echo "Please start Docker Desktop and try again."
+    exit 1
+fi
+
+print_success "Docker is installed and running!"
+
+# Save the original working directory
+ORIGINAL_DIR="$(pwd)"
+
+DEMO_DIR="$HOME/.cua"
+mkdir -p "$DEMO_DIR"
+
+# Check if we're already in the cua repository
+# Look for the specific trycua identifier in pyproject.toml
+if [[ -f "pyproject.toml" ]] && grep -q "gh@trycua.com" "pyproject.toml"; then
+    print_success "Already in C/ua repository - using current directory"
+    REPO_DIR="$ORIGINAL_DIR"
+    USE_EXISTING_REPO=true
+else
+    # Directories used by the script when not in repo
+    REPO_DIR="$DEMO_DIR/cua"
+    USE_EXISTING_REPO=false
+fi
+
+# Function to clean up on exit
+cleanup() {
+    cd "$ORIGINAL_DIR" 2>/dev/null || true
+}
+trap cleanup EXIT
+
+echo ""
+echo "Choose your C/ua setup:"
+echo "1) ☁️  C/ua Cloud Containers (works on any system)"
+echo "2) 🖥  Local macOS VMs (requires Apple Silicon Mac + macOS 15+)"
+echo "3) 🖥  Local Windows VMs (requires Windows 10 / 11)"
+echo ""
+read -p "Enter your choice (1, 2, or 3): " CHOICE
+
+if [[ "$CHOICE" == "1" ]]; then
+    # C/ua Cloud Container setup
+    echo ""
+    print_info "Setting up C/ua Cloud Containers..."
+    echo ""
+
+    # Check if existing .env.local already has CUA_API_KEY
+    REPO_ENV_FILE="$REPO_DIR/.env.local"
+    CURRENT_ENV_FILE="$ORIGINAL_DIR/.env.local"
+
+    CUA_API_KEY=""
+
+    # First check current directory
+    if [[ -f "$CURRENT_ENV_FILE" ]] && grep -q "CUA_API_KEY=" "$CURRENT_ENV_FILE"; then
+        EXISTING_CUA_KEY=$(grep "CUA_API_KEY=" "$CURRENT_ENV_FILE" | cut -d'=' -f2- | tr -d '"' | tr -d "'" | xargs)
+        if [[ -n "$EXISTING_CUA_KEY" && "$EXISTING_CUA_KEY" != "your_cua_api_key_here" && "$EXISTING_CUA_KEY" != "" ]]; then
+            CUA_API_KEY="$EXISTING_CUA_KEY"
+        fi
+    fi
+
+    # Then check repo directory if not found in current dir
+    if [[ -z "$CUA_API_KEY" ]] && [[ -f "$REPO_ENV_FILE" ]] && grep -q "CUA_API_KEY=" "$REPO_ENV_FILE"; then
+        EXISTING_CUA_KEY=$(grep "CUA_API_KEY=" "$REPO_ENV_FILE" | cut -d'=' -f2- | tr -d '"' | tr -d "'" | xargs)
+        if [[ -n "$EXISTING_CUA_KEY" && "$EXISTING_CUA_KEY" != "your_cua_api_key_here" && "$EXISTING_CUA_KEY" != "" ]]; then
+            CUA_API_KEY="$EXISTING_CUA_KEY"
+        fi
+    fi
+
+    # If no valid API key found, prompt for one
+    if [[ -z "$CUA_API_KEY" ]]; then
+        echo "To use C/ua Cloud Containers, you need to:"
+        echo "1. Sign up at https://trycua.com"
+        echo "2. Create a Cloud Container"
+        echo "3. Generate an API key"
+        echo ""
+        read -p "Enter your C/ua API key: " CUA_API_KEY
+
+        if [[ -z "$CUA_API_KEY" ]]; then
+            print_error "A C/ua API key is required for Cloud Containers."
+            exit 1
+        fi
+    else
+        print_success "Found existing CUA API key"
+    fi
+
+    USE_CLOUD=true
+    COMPUTER_TYPE="cloud"
+
+elif [[ "$CHOICE" == "2" ]]; then
+    # Local macOS VM setup
+    echo ""
+    print_info "Setting up local macOS VMs..."
+
+    # Check for Apple Silicon Mac
+    if [[ $(uname -s) != "Darwin" || $(uname -m) != "arm64" ]]; then
+        print_error "Local macOS VMs require an Apple Silicon Mac (M1/M2/M3/M4)."
+        echo "💡 Consider using C/ua Cloud Containers instead (option 1)."
+        exit 1
+    fi
+
+    # Check for macOS 15 (Sequoia) or newer
+    OSVERSION=$(sw_vers -productVersion)
+    if [[ $(echo "$OSVERSION 15.0" | tr " " "\n" | sort -V | head -n 1) != "15.0" ]]; then
+        print_error "Local macOS VMs require macOS 15 (Sequoia) or newer. You have $OSVERSION."
+        echo "💡 Consider using C/ua Cloud Containers instead (option 1)."
+        exit 1
+    fi
+
+    USE_CLOUD=false
+    COMPUTER_TYPE="macos"
+
+elif [[ "$CHOICE" == "3" ]]; then
+    # Local Windows VM setup
+    echo ""
+    print_info "Setting up local Windows VMs..."
+
+    # Check if we're on Windows
+    if [[ $(uname -s) != MINGW* && $(uname -s) != CYGWIN* && $(uname -s) != MSYS* ]]; then
+        print_error "Local Windows VMs require Windows 10 or 11."
+        echo "💡 Consider using C/ua Cloud Containers instead (option 1)."
+        echo ""
+        echo "🔗 If you are using WSL, refer to the blog post to get started: https://www.trycua.com/blog/windows-sandbox"
+        exit 1
+    fi
+
+    USE_CLOUD=false
+    COMPUTER_TYPE="windows"
+
+else
+    print_error "Invalid choice. Please run the script again and choose 1, 2, or 3."
+    exit 1
+fi
+
+print_success "All checks passed! 🎉"
+
+# Create demo directory and handle repository
+if [[ "$USE_EXISTING_REPO" == "true" ]]; then
+    print_info "Using existing repository in current directory"
+    cd "$REPO_DIR"
+else
+    # Clone or update the repository
+    if [[ ! -d "$REPO_DIR" ]]; then
+        print_info "Cloning C/ua repository..."
+        cd "$DEMO_DIR"
+        git clone https://github.com/trycua/cua.git
+    else
+        print_info "Updating C/ua repository..."
+        cd "$REPO_DIR"
+        git pull origin main
+    fi
+
+    cd "$REPO_DIR"
+fi
+
+# Create .env.local file with API keys
+ENV_FILE="$REPO_DIR/.env.local"
+if [[ ! -f "$ENV_FILE" ]]; then
+    cat > "$ENV_FILE" << EOF
+# Uncomment and add your API keys here
+# OPENAI_API_KEY=your_openai_api_key_here
+# ANTHROPIC_API_KEY=your_anthropic_api_key_here
+CUA_API_KEY=your_cua_api_key_here
+EOF
+    print_success "Created .env.local file with API key placeholders"
+else
+    print_success "Found existing .env.local file - keeping your current settings"
+fi
+
+if [[ "$USE_CLOUD" == "true" ]]; then
+    # Add CUA API key to .env.local if not already present
+    if ! grep -q "CUA_API_KEY" "$ENV_FILE"; then
+        echo "CUA_API_KEY=$CUA_API_KEY" >> "$ENV_FILE"
+        print_success "Added CUA_API_KEY to .env.local"
+    elif grep -q "CUA_API_KEY=your_cua_api_key_here" "$ENV_FILE"; then
+        # Update placeholder with actual key
+        sed -i.bak "s/CUA_API_KEY=your_cua_api_key_here/CUA_API_KEY=$CUA_API_KEY/" "$ENV_FILE"
+        print_success "Updated CUA_API_KEY in .env.local"
+    fi
+fi
+
+# Build the Docker image if it doesn't exist
+print_info "Checking Docker image..."
+if ! docker image inspect cua-dev-image &> /dev/null; then
+    print_info "Building Docker image (this may take a while)..."
+    ./scripts/run-docker-dev.sh build
+else
+    print_success "Docker image already exists"
+fi
+
+# Install Lume if needed for local VMs
+if [[ "$USE_CLOUD" == "false" && "$COMPUTER_TYPE" == "macos" ]]; then
+    if ! command -v lume &> /dev/null; then
+        print_info "Installing Lume CLI..."
+        curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh | bash
+
+        # Add lume to PATH for this session if it's not already there
+        if ! command -v lume &> /dev/null; then
+            export PATH="$PATH:$HOME/.local/bin"
+        fi
+    fi
+
+    # Pull the macOS CUA image if not already present
+    if ! lume ls | grep -q "macos-sequoia-cua"; then
+        # Check available disk space
+        IMAGE_SIZE_GB=30
+        AVAILABLE_SPACE_KB=$(df -k $HOME | tail -1 | awk '{print $4}')
+        AVAILABLE_SPACE_GB=$(($AVAILABLE_SPACE_KB / 1024 / 1024))
+
+        echo "📊 The macOS CUA image will use approximately ${IMAGE_SIZE_GB}GB of disk space."
+        echo "   You currently have ${AVAILABLE_SPACE_GB}GB available on your system."
+
+        # Prompt for confirmation
+        read -p "   Continue? [y]/n: " CONTINUE
+        CONTINUE=${CONTINUE:-y}
+
+        if [[ $CONTINUE =~ ^[Yy]$ ]]; then
+            print_info "Pulling macOS CUA image (this may take a while)..."
+
+            # Use caffeinate on macOS to prevent system sleep during the pull
+            if command -v caffeinate &> /dev/null; then
+                print_info "Using caffeinate to prevent system sleep during download..."
+                caffeinate -i lume pull macos-sequoia-cua:latest
+            else
+                lume pull macos-sequoia-cua:latest
+            fi
+        else
+            print_error "Installation cancelled."
+            exit 1
+        fi
+    fi
+
+    # Check if the VM is running
+    print_info "Checking if the macOS CUA VM is running..."
+    VM_RUNNING=$(lume ls | grep "macos-sequoia-cua" | grep "running" || echo "")
+
+    if [ -z "$VM_RUNNING" ]; then
+        print_info "Starting the macOS CUA VM in the background..."
+        lume run macos-sequoia-cua:latest &
+        # Wait a moment for the VM to initialize
+        sleep 5
+        print_success "VM started successfully."
+    else
+        print_success "macOS CUA VM is already running."
+    fi
+fi
+
+# Create a convenience script to run the demo
+cat > "$DEMO_DIR/start_ui.sh" << EOF
+#!/bin/bash
+cd "$REPO_DIR"
+./scripts/run-docker-dev.sh run agent_ui_examples.py
+EOF
+chmod +x "$DEMO_DIR/start_ui.sh"
+
+print_success "Setup complete!"
+
+if [[ "$USE_CLOUD" == "true" ]]; then
+    echo "☁️  C/ua Cloud Container setup complete!"
+else
+    echo "🖥  C/ua Local VM setup complete!"
+fi
+
+echo "📝 Edit $ENV_FILE to update your API keys"
+echo "🖥  Start the playground by running: $DEMO_DIR/start_ui.sh"
+
+# Start the demo automatically
+echo
+print_info "Starting the C/ua Computer-Use Agent UI..."
+echo ""
+
+print_success "C/ua Computer-Use Agent UI will be available at http://localhost:7860/ once startup completes"
+echo
+echo "🌐 Open your browser and go to: http://localhost:7860/"
+echo
+"$DEMO_DIR/start_ui.sh"
diff --git a/scripts/playground.sh b/scripts/playground.sh
index 4cdb1ffa..9be712d2 100755
--- a/scripts/playground.sh
+++ b/scripts/playground.sh
@@ -209,13 +209,6 @@
 echo "📦 Updating C/ua packages..."
 pip install -U pip setuptools wheel Cmake
 pip install -U cua-computer "cua-agent[all]"
 
-# Install mlx-vlm on Apple Silicon Macs
-if [[ $(uname -m) == 'arm64' ]]; then
-    echo "Installing mlx-vlm for Apple Silicon Macs..."
-    pip install git+https://github.com/Blaizzy/mlx-vlm.git
-    # pip install git+https://github.com/ddupont808/mlx-vlm.git@stable/fix/qwen2-position-id
-fi
-
 # Create a simple demo script
 mkdir -p "$DEMO_DIR"
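A side effect of the logger renames in this diff (from the `cua.*` hierarchy to bare package names such as `core.telemetry` and `computer.telemetry`): any host application that attached handlers or set levels on the old `cua.*` names will silently stop seeing these records. A minimal sketch of opting back in under the new names — the logger names are taken from this diff, while the handler setup itself is illustrative, not part of the change:

```python
import logging

# Illustrative root configuration: a basic stderr handler, quiet by default.
logging.basicConfig(level=logging.WARNING)

# Opt in to verbose output from the renamed telemetry loggers.
# "core.telemetry" and "computer.telemetry" appear throughout this diff;
# "agent.telemetry" appears in the set_telemetry_log_level() list.
for name in ("core.telemetry", "computer.telemetry", "agent.telemetry"):
    logging.getLogger(name).setLevel(logging.DEBUG)

# Records below WARNING now propagate to the root handler from these
# loggers only; unrelated loggers still fall back to the WARNING root level.
logging.getLogger("computer.telemetry").debug("telemetry logger active")
```

Because Python loggers are hierarchical, setting a level on the package-root name covers children like `core.telemetry.sender` as well, so one line per package is enough.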