Replaced computer shim with Docker computer

Dillon DuPont
2025-09-09 11:00:52 -04:00
parent b21c668946
commit 665e65cb85
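
Note on the removed shim: the `DummyComputer` class deleted in this commit exposed three async methods (`screenshot`, `click`, `type`). The sketch below is not part of the commit. It only illustrates how a stand-in with that same surface could be kept for offline tests where Docker is unavailable. The names `ComputerLike` and `StubComputer` are hypothetical. Whether `ComputerAgent` accepts such an object as a tool is not confirmed by this diff.

```python
import asyncio
import base64
from typing import List, Protocol, Tuple, runtime_checkable

# 1x1 PNG placeholder, copied from the removed DummyComputer shim.
_PNG_B64 = (
    "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mP8Xw8AAr8B9k2m0oYAAAAASUVORK5CYII="
)


@runtime_checkable
class ComputerLike(Protocol):  # hypothetical name, not an SDK type
    """The three methods the removed shim implemented."""

    async def screenshot(self) -> str: ...
    async def click(self, x: int, y: int) -> None: ...
    async def type(self, text: str) -> None: ...


class StubComputer:  # hypothetical test double
    """Records actions instead of driving a real machine."""

    def __init__(self) -> None:
        self.actions: List[Tuple] = []

    async def screenshot(self) -> str:
        # Return the placeholder image as a base64 string.
        return _PNG_B64

    async def click(self, x: int, y: int) -> None:
        self.actions.append(("click", x, y))

    async def type(self, text: str) -> None:
        self.actions.append(("type", text))


async def demo():
    stub = StubComputer()
    shot = await stub.screenshot()
    await stub.click(10, 20)
    await stub.type("hello")
    return shot, stub.actions


# In a plain script; inside a notebook cell, use `await demo()` instead.
shot, actions = asyncio.run(demo())
```

Recording calls in a list keeps the stub inspectable in assertions, which the original no-op `pass` bodies did not allow.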


@@ -4,12 +4,15 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Customizing Your ComputerAgent\n\n",
"This notebook demonstrates four practical ways to increase the capabilities and success rate of your `ComputerAgent` in the Agent SDK:\n\n",
"# Customizing Your ComputerAgent\n",
"\n",
"This notebook demonstrates four practical ways to increase the capabilities and success rate of your `ComputerAgent` in the Agent SDK:\n",
"\n",
"1. Simple: Prompt engineering (via optional `instructions`)\n",
"2. Easy: Tools (function tools and custom computer tools)\n",
"3. Intermediate: Callbacks\n",
"4. Expert: Custom `@register_agent` loops\n\n",
"4. Expert: Custom `@register_agent` loops\n",
"\n",
"> Tip: The same patterns work in scripts and services — the notebook just makes it easy to iterate."
]
},
@@ -17,8 +20,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup\n\n",
"We'll import `ComputerAgent`, a simple computer shim, and some utilities."
"## Setup\n",
"\n",
"We'll import `ComputerAgent`, a simple Docker-based computer, and some utilities."
]
},
{
@@ -29,33 +33,31 @@
"source": [
"import logging\n",
"from agent.agent import ComputerAgent\n",
"from agent.callbacks import PromptInstructionsCallback, LoggingCallback\n",
"from agent.callbacks import LoggingCallback\n",
"from computer import Computer\n",
"\n",
"# A very small computer shim for demo purposes (for full computer handlers, see docs)\n",
"class DummyComputer:\n",
" async def screenshot(self):\n",
" # Return a 1x1 transparent PNG as base64 string (placeholder)\n",
" import base64\n",
" png_bytes = base64.b64decode(\"iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mP8Xw8AAr8B9k2m0oYAAAAASUVORK5CYII=\")\n",
" return base64.b64encode(png_bytes).decode()\n",
"computer = Computer(\n",
" os_type=\"linux\",\n",
" provider_type=\"docker\",\n",
" image=\"trycua/cua-ubuntu:latest\",\n",
" name=\"my-cua-container\"\n",
")\n",
"\n",
" async def click(self, x: int, y: int):\n",
" pass\n",
"\n",
" async def type(self, text: str):\n",
" pass\n",
"\n",
"computer = DummyComputer()\n"
"await computer.run() # Launch & connect to Docker container"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1) Simple: Prompt engineering\n\n",
"You can guide your agent with system-like `instructions`.\n\n",
"Under the hood, `ComputerAgent(instructions=...)` adds a `PromptInstructionsCallback` that prepends a user message before each LLM call.\n\n",
"This mirrors the recommended snippet in code:\n\n",
"## 1) Simple: Prompt engineering\n",
"\n",
"You can guide your agent with system-like `instructions`.\n",
"\n",
"Under the hood, `ComputerAgent(instructions=...)` adds a `PromptInstructionsCallback` that prepends a user message before each LLM call.\n",
"\n",
"This mirrors the recommended snippet in code:\n",
"\n",
"```python\n",
"effective_input = full_input\n",
"if instructions:\n",
@@ -101,7 +103,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2) Easy: Tools\n\n",
"## 2) Easy: Tools\n",
"\n",
"Add function tools to expose deterministic capabilities. Tools are auto-extracted to schemas and callable by the agent."
]
},
@@ -135,7 +138,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3) Intermediate: Callbacks\n\n",
"## 3) Intermediate: Callbacks\n",
"\n",
"Callbacks offer lifecycle hooks. For example, limit recent images or record trajectories."
]
},
@@ -161,8 +165,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4) Expert: Custom `@register_agent`\n\n",
"Register custom agent configs that implement `predict_step` (and optionally `predict_click`). This gives you full control over prompting, message shaping, and tool wiring.\n\n",
"## 4) Expert: Custom `@register_agent`\n",
"\n",
"Register custom agent configs that implement `predict_step` (and optionally `predict_click`). This gives you full control over prompting, message shaping, and tool wiring.\n",
"\n",
"See: `libs/python/agent/agent/loops/` for concrete examples."
]
},
@@ -170,7 +176,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Next steps\n\n",
"## Next steps\n",
"\n",
"- Start with `instructions` for fast wins.\n",
"- Add function tools for determinism and reliability.\n",
"- Use callbacks to manage cost, logs, and safety.\n",