{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Agent\n", "\n", "This notebook demonstrates how to use Cua's Agent to run a workflow in a virtual sandbox on Apple Silicon Macs." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Installation" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip uninstall -y cua-agent" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install \"cua-agent[all]\"\n", "\n", "# Or install individual agent loops (quoted so the shell does not expand the brackets):\n", "# !pip install \"cua-agent[openai]\"\n", "# !pip install \"cua-agent[anthropic]\"\n", "# !pip install \"cua-agent[uitars]\"\n", "# !pip install \"cua-agent[omni]\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# To build and install from a local checkout instead:\n", "import os\n", "\n", "os.chdir('../libs/agent')\n", "!poetry install\n", "!poetry build\n", "\n", "!pip uninstall cua-agent -y\n", "!pip install ./dist/cua_agent-0.1.0-py3-none-any.whl --force-reinstall" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Initialize a Computer Agent" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Agent lets you run agentic workflows in virtual sandbox instances on Apple Silicon. 
Here's a basic example:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from computer import Computer, VMProviderType\n", "from agent import ComputerAgent, LLM, AgentLoop, LLMProvider" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "# Get API keys from environment or prompt user\n", "anthropic_key = os.getenv(\"ANTHROPIC_API_KEY\") or input(\"Enter your Anthropic API key: \")\n", "openai_key = os.getenv(\"OPENAI_API_KEY\") or input(\"Enter your OpenAI API key: \")\n", "\n", "os.environ[\"ANTHROPIC_API_KEY\"] = anthropic_key\n", "os.environ[\"OPENAI_API_KEY\"] = openai_key" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Similar to Computer, you can either use the async context manager pattern or initialize the ComputerAgent instance directly." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's start by creating an agent that relies on the OpenAI API computer-use-preview model." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import logging\n", "from pathlib import Path\n", "\n", "computer = Computer(verbosity=logging.INFO, provider_type=VMProviderType.LUME)\n", "\n", "# Create agent with the OpenAI loop and provider\n", "agent = ComputerAgent(\n", "    computer=computer,\n", "    loop=AgentLoop.OPENAI,\n", "    model=LLM(provider=LLMProvider.OPENAI),\n", "    save_trajectory=True,\n", "    trajectory_dir=str(Path(\"trajectories\")),\n", "    only_n_most_recent_images=3,\n", "    verbosity=logging.INFO\n", ")\n", "\n", "tasks = [\n", "    \"Look for a repository named trycua/cua on GitHub.\",\n", "    \"Check the open issues, open the most recent one and read it.\",\n", "    \"Clone the repository in users/lume/projects if it doesn't exist yet.\",\n", "    \"Open the repository with an app named Cursor (on the dock, black background and white cube icon).\",\n", "    \"From Cursor, open Composer if not already open.\",\n", "    \"Focus on the Composer text area, then write and submit a task to help resolve the GitHub issue.\",\n", "]\n", "\n", "# Run the tasks in order; progress messages use 1-based task numbers\n", "for i, task in enumerate(tasks):\n", "    print(f\"\\nExecuting task {i+1}/{len(tasks)}: {task}\")\n", "    async for result in agent.run(task):\n", "        # print(result)\n", "        pass\n", "\n", "    print(f\"\\n✅ Task {i+1}/{len(tasks)} completed: {task}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or using the Omni Agent Loop:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import logging\n", "from pathlib import Path\n", "from agent import ComputerAgent, LLM, AgentLoop, LLMProvider\n", "\n", "computer = Computer(verbosity=logging.INFO)\n", "\n", "# Create agent with the Omni loop; keep exactly one model line uncommented\n", "agent = ComputerAgent(\n", "    computer=computer,\n", "    loop=AgentLoop.OMNI,\n", "    # model=LLM(provider=LLMProvider.ANTHROPIC, name=\"claude-3-7-sonnet-20250219\"),\n", "    # model=LLM(provider=LLMProvider.OPENAI, name=\"gpt-4.5-preview\"),\n", " 
model=LLM(provider=LLMProvider.OLLAMA, name=\"gemma3:12b-it-q4_K_M\"),\n", "    save_trajectory=True,\n", "    trajectory_dir=str(Path(\"trajectories\")),\n", "    only_n_most_recent_images=3,\n", "    verbosity=logging.INFO\n", ")\n", "\n", "tasks = [\n", "    \"Look for a repository named trycua/cua on GitHub.\",\n", "    \"Check the open issues, open the most recent one and read it.\",\n", "    \"Clone the repository in users/lume/projects if it doesn't exist yet.\",\n", "    \"Open the repository with an app named Cursor (on the dock, black background and white cube icon).\",\n", "    \"From Cursor, open Composer if not already open.\",\n", "    \"Focus on the Composer text area, then write and submit a task to help resolve the GitHub issue.\",\n", "]\n", "\n", "# Run the tasks in order; progress messages use 1-based task numbers\n", "for i, task in enumerate(tasks):\n", "    print(f\"\\nExecuting task {i+1}/{len(tasks)}: {task}\")\n", "    async for result in agent.run(task):\n", "        # print(result)\n", "        pass\n", "\n", "    print(f\"\\n✅ Task {i+1}/{len(tasks)} completed: {task}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using the Gradio UI\n", "\n", "The agent includes a Gradio-based user interface for easy interaction. 
To use it:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "# Get API keys from environment or prompt user\n", "anthropic_key = os.getenv(\"ANTHROPIC_API_KEY\") or input(\"Enter your Anthropic API key: \")\n", "openai_key = os.getenv(\"OPENAI_API_KEY\") or input(\"Enter your OpenAI API key: \")\n", "\n", "os.environ[\"ANTHROPIC_API_KEY\"] = anthropic_key\n", "os.environ[\"OPENAI_API_KEY\"] = openai_key" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from agent.ui.gradio.app import create_gradio_ui\n", "\n", "app = create_gradio_ui()\n", "app.launch(share=False)" ] } ], "metadata": { "kernelspec": { "display_name": "cua312", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.9" } }, "nbformat": 4, "nbformat_minor": 2 }