{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": "## Agent\n\nThis notebook demonstrates how to use Cua's Agent to run workflows in virtual sandboxes, either using Cua Cloud Sandbox or local VMs on Apple Silicon Macs." }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Installation" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip uninstall -y cua-agent" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install \"cua-agent[all]\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# If locally installed, use this instead:\n", "import os\n", "\n", "os.chdir('../libs/python/agent')\n", "!poetry install\n", "!poetry build\n", "\n", "!pip uninstall cua-agent -y\n", "!pip install ./dist/cua_agent-0.1.0-py3-none-any.whl --force-reinstall" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Initialize a Computer Agent" ] }, { "cell_type": "markdown", "metadata": {}, "source": "Agent allows you to run an agentic workflow in virtual sandbox instances. You can choose between Cloud Sandbox or local VMs." }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from computer import Computer, VMProviderType\n", "from agent import ComputerAgent" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "# Get API keys from environment or prompt user\n", "anthropic_key = os.getenv(\"ANTHROPIC_API_KEY\") or \\\n", " input(\"Enter your Anthropic API key: \")\n", "openai_key = os.getenv(\"OPENAI_API_KEY\") or \\\n", " input(\"Enter your OpenAI API key: \")\n", "\n", "os.environ[\"ANTHROPIC_API_KEY\"] = anthropic_key\n", "os.environ[\"OPENAI_API_KEY\"] = openai_key" ] }, { "cell_type": "markdown", "metadata": {}, "source": "## Option 1: Agent with Cua Cloud Sandbox\n\nUse Cloud Sandbox for running agents from any system without local setup." }, { "cell_type": "markdown", "metadata": {}, "source": "### Prerequisites for Cloud Sandbox\n\nTo use Cua Cloud Sandbox, you need to:\n1. Sign up at https://trycua.com\n2. Create a Cloud Sandbox\n3. Generate an API Key\n\nOnce you have these, you can connect to your Cloud Sandbox and run agents on it." 
}, { "cell_type": "markdown", "metadata": {}, "source": "Get Cua API credentials and sandbox details" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cua_api_key = os.getenv(\"CUA_API_KEY\") or \\\n", " input(\"Enter your Cua API Key: \")\n", "container_name = os.getenv(\"CONTAINER_NAME\") or \\\n", " input(\"Enter your Cloud Container name: \")" ] }, { "cell_type": "markdown", "metadata": {}, "source": "Choose the OS type for your sandbox (linux or macos)" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": "os_type = input(\"Enter the OS type of your sandbox (linux/macos) [default: linux]: \").lower() or \"linux\"" }, { "cell_type": "markdown", "metadata": {}, "source": "### Create an agent with Cloud Sandbox" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": "import logging\nfrom pathlib import Path\n\n# Connect to your existing Cloud Sandbox\ncomputer = Computer(\n os_type=os_type,\n api_key=cua_api_key,\n name=container_name,\n provider_type=VMProviderType.CLOUD,\n verbosity=logging.INFO\n)\n\n# Create agent\nagent = ComputerAgent(\n model=\"openai/computer-use-preview\",\n tools=[computer],\n trajectory_dir=str(Path(\"trajectories\")),\n only_n_most_recent_images=3,\n verbosity=logging.INFO\n)\n" }, { "cell_type": "markdown", "metadata": {}, "source": "Run tasks on Cloud Sandbox" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tasks = [\n", " \"Open a web browser and navigate to GitHub\",\n", " \"Search for the trycua/cua repository\",\n", " \"Take a screenshot of the repository page\"\n", "]\n", "\n", "for i, task in enumerate(tasks):\n", " print(f\"\\nExecuting task {i+1}/{len(tasks)}: {task}\")\n", " async for result in agent.run(task):\n", " # print(result)\n", " pass\n", " print(f\"āœ… Task {i+1}/{len(tasks)} completed: {task}\")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Option 2: KASM Local Docker Containers (cross-platform)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before we can create an agent, we need to initialize a local computer with Docker provider." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import logging\n", "from pathlib import Path\n", "\n", "computer = Computer(\n", " os_type=\"linux\",\n", " provider_type=\"docker\",\n", " image=\"trycua/cua-ubuntu:latest\",\n", " name=\"my-cua-container\"\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Option 3: Agent with Local VMs (Lume daemon)\n", "\n", "For Apple Silicon Macs, run agents on local VMs with near-native performance." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before we can create an agent, we need to initialize a local computer with Lume." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import logging\n", "from pathlib import Path\n", "\n", "\n", "computer = Computer(\n", " verbosity=logging.INFO, \n", " provider_type=VMProviderType.LUME,\n", " display=\"1024x768\",\n", " memory=\"8GB\",\n", " cpu=\"4\",\n", " os_type=\"macos\"\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create an agent" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's start by creating an agent that relies on the OpenAI API computer-use-preview model." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create agent with Anthropic loop and provider\n", "agent = ComputerAgent(\n", " model=\"openai/computer-use-preview\",\n", " tools=[computer],\n", " trajectory_dir=str(Path(\"trajectories\")),\n", " only_n_most_recent_images=3,\n", " verbosity=logging.INFO\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Run tasks on a computer:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tasks = [\n", " \"Look for a repository named trycua/cua on GitHub.\",\n", " \"Check the open issues, open the most recent one and read it.\",\n", " \"Clone the repository in users/lume/projects if it doesn't exist yet.\",\n", " \"Open the repository with an app named Cursor (on the dock, black background and white cube icon).\",\n", " \"From Cursor, open Composer if not already open.\",\n", " \"Focus on the Composer text area, then write and submit a task to help resolve the GitHub issue.\",\n", "]\n", "\n", "for i, task in enumerate(tasks):\n", " print(f\"\\nExecuting task {i}/{len(tasks)}: {task}\")\n", " async for result in agent.run(task):\n", " # print(result)\n", " pass\n", "\n", " print(f\"\\nāœ… Task {i+1}/{len(tasks)} completed: {task}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or using the Omni Agent Loop:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import logging\n", "from pathlib import Path\n", "from agent import ComputerAgent\n", "\n", "computer = Computer(verbosity=logging.INFO)\n", "\n", "# Create agent with Anthropic loop and provider\n", "agent = ComputerAgent(\n", " model=\"omniparser+ollama_chat/gemma3:12b-it-q4_K_M\",\n", " # model=\"omniparser+openai/gpt-4o-mini\",\n", " # model=\"omniparser+anthropic/claude-3-7-sonnet-20250219\",\n", " tools=[computer],\n", " trajectory_dir=str(Path(\"trajectories\")),\n", " only_n_most_recent_images=3,\n", " verbosity=logging.INFO\n", " )\n", "\n", "tasks = [\n", " \"Look for a repository named trycua/cua on GitHub.\",\n", " \"Check the open issues, open the most recent one and read it.\",\n", " \"Clone the repository in users/lume/projects if it doesn't exist yet.\",\n", " \"Open the repository with an app named Cursor (on the dock, black background and white cube icon).\",\n", " \"From Cursor, open Composer if not already open.\",\n", " \"Focus on the Composer text area, then write and submit a task to help resolve the GitHub issue.\",\n", "]\n", "\n", "for i, task in enumerate(tasks):\n", " print(f\"\\nExecuting task {i}/{len(tasks)}: {task}\")\n", " async for result in agent.run(task):\n", " # print(result)\n", " pass\n", "\n", " print(f\"\\nāœ… Task {i+1}/{len(tasks)} completed: {task}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using the Gradio UI\n", "\n", "The agent includes a Gradio-based user interface for easy interaction. 
To use it:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from agent.ui.gradio.ui_components import create_gradio_ui\n", "\n", "app = create_gradio_ui()\n", "app.launch(share=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Advanced Agent Configurations" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using different agent loops" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can use different agent loops depending on your needs:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "1. OpenAI Agent Loop" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "openai_agent = ComputerAgent(\n", " tools=[computer], # Can be cloud or local\n", " model=\"openai/computer-use-preview\",\n", " trajectory_dir=str(Path(\"trajectories\")),\n", " verbosity=logging.INFO\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "2. Anthropic Agent Loop" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "anthropic_agent = ComputerAgent(\n", " tools=[computer],\n", " model=\"anthropic/claude-3-5-sonnet-20241022\",\n", " trajectory_dir=str(Path(\"trajectories\")),\n", " verbosity=logging.INFO\n", ")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "3. Omni Agent Loop (supports multiple providers)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "omni_agent = ComputerAgent(\n", " tools=[computer],\n", " model=\"omniparser+anthropic/claude-3-7-sonnet-20250219\",\n", " # model=\"omniparser+openai/gpt-4o-mini\",\n", " trajectory_dir=str(Path(\"trajectories\")),\n", " only_n_most_recent_images=3,\n", " verbosity=logging.INFO\n", ")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "4. UITARS Agent Loop (for local inference on Apple Silicon)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "uitars_agent = ComputerAgent(\n", " tools=[computer],\n", " model=\"mlx/mlx-community/UI-TARS-1.5-7B-6bit\", # local MLX\n", " # model=\"huggingface-local/ByteDance-Seed/UI-TARS-1.5-7B\", # local Huggingface (transformers)\n", " # model=\"huggingface/ByteDance-Seed/UI-TARS-1.5-7B\", # remote Huggingface (TGI)\n", " trajectory_dir=str(Path(\"trajectories\")),\n", " verbosity=logging.INFO\n", ")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Trajectory viewing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All agent runs save trajectories that can be viewed at https://trycua.com/trajectory-viewer" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(f\"Trajectories saved to: {Path('trajectories').absolute()}\")\n", "print(\"Upload trajectory files to https://trycua.com/trajectory-viewer to visualize agent actions\")\n" ] } ], "metadata": { "kernelspec": { "display_name": "cua", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.5" } }, "nbformat": 4, "nbformat_minor": 2 }