mirror of
https://github.com/trycua/computer.git
synced 2026-01-01 11:00:31 -06:00
Replaced computer shim with Docker computer
This commit is contained in:
@@ -4,12 +4,15 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Customizing Your ComputerAgent\n\n",
|
||||
"This notebook demonstrates four practical ways to increase the capabilities and success rate of your `ComputerAgent` in the Agent SDK:\n\n",
|
||||
"# Customizing Your ComputerAgent\n",
|
||||
"\n",
|
||||
"This notebook demonstrates four practical ways to increase the capabilities and success rate of your `ComputerAgent` in the Agent SDK:\n",
|
||||
"\n",
|
||||
"1. Simple: Prompt engineering (via optional `instructions`)\n",
|
||||
"2. Easy: Tools (function tools and custom computer tools)\n",
|
||||
"3. Intermediate: Callbacks\n",
|
||||
"4. Expert: Custom `@register_agent` loops\n\n",
|
||||
"4. Expert: Custom `@register_agent` loops\n",
|
||||
"\n",
|
||||
"> Tip: The same patterns work in scripts and services — the notebook just makes it easy to iterate."
|
||||
]
|
||||
},
|
||||
@@ -17,8 +20,9 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Setup\n\n",
|
||||
"We'll import `ComputerAgent`, a simple computer shim, and some utilities."
|
||||
"## Setup\n",
|
||||
"\n",
|
||||
"We'll import `ComputerAgent`, a simple Docker-based computer, and some utilities."
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -29,33 +33,31 @@
|
||||
"source": [
|
||||
"import logging\n",
|
||||
"from agent.agent import ComputerAgent\n",
|
||||
"from agent.callbacks import PromptInstructionsCallback, LoggingCallback\n",
|
||||
"from agent.callbacks import LoggingCallback\n",
|
||||
"from computer import Computer\n",
|
||||
"\n",
|
||||
"# A very small computer shim for demo purposes (for full computer handlers, see docs)\n",
|
||||
"class DummyComputer:\n",
|
||||
" async def screenshot(self):\n",
|
||||
" # Return a 1x1 transparent PNG as base64 string (placeholder)\n",
|
||||
" import base64\n",
|
||||
" png_bytes = base64.b64decode(\"iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mP8Xw8AAr8B9k2m0oYAAAAASUVORK5CYII=\")\n",
|
||||
" return base64.b64encode(png_bytes).decode()\n",
|
||||
"computer = Computer(\n",
|
||||
" os_type=\"linux\",\n",
|
||||
" provider_type=\"docker\",\n",
|
||||
" image=\"trycua/cua-ubuntu:latest\",\n",
|
||||
" name=\"my-cua-container\"\n",
|
||||
")\n",
|
||||
"\n",
|
||||
" async def click(self, x: int, y: int):\n",
|
||||
" pass\n",
|
||||
"\n",
|
||||
" async def type(self, text: str):\n",
|
||||
" pass\n",
|
||||
"\n",
|
||||
"computer = DummyComputer()\n"
|
||||
"await computer.run() # Launch & connect to Docker container"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## 1) Simple: Prompt engineering\n\n",
|
||||
"You can guide your agent with system-like `instructions`.\n\n",
|
||||
"Under the hood, `ComputerAgent(instructions=...)` adds a `PromptInstructionsCallback` that prepends a user message before each LLM call.\n\n",
|
||||
"This mirrors the recommended snippet in code:\n\n",
|
||||
"## 1) Simple: Prompt engineering\n",
|
||||
"\n",
|
||||
"You can guide your agent with system-like `instructions`.\n",
|
||||
"\n",
|
||||
"Under the hood, `ComputerAgent(instructions=...)` adds a `PromptInstructionsCallback` that prepends a user message before each LLM call.\n",
|
||||
"\n",
|
||||
"This mirrors the recommended snippet in code:\n",
|
||||
"\n",
|
||||
"```python\n",
|
||||
"effective_input = full_input\n",
|
||||
"if instructions:\n",
|
||||
@@ -101,7 +103,8 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## 2) Easy: Tools\n\n",
|
||||
"## 2) Easy: Tools\n",
|
||||
"\n",
|
||||
"Add function tools to expose deterministic capabilities. Tools are auto-extracted to schemas and callable by the agent."
|
||||
]
|
||||
},
|
||||
@@ -135,7 +138,8 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## 3) Intermediate: Callbacks\n\n",
|
||||
"## 3) Intermediate: Callbacks\n",
|
||||
"\n",
|
||||
"Callbacks offer lifecycle hooks. For example, limit recent images or record trajectories."
|
||||
]
|
||||
},
|
||||
@@ -161,8 +165,10 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## 4) Expert: Custom `@register_agent`\n\n",
|
||||
"Register custom agent configs that implement `predict_step` (and optionally `predict_click`). This gives you full control over prompting, message shaping, and tool wiring.\n\n",
|
||||
"## 4) Expert: Custom `@register_agent`\n",
|
||||
"\n",
|
||||
"Register custom agent configs that implement `predict_step` (and optionally `predict_click`). This gives you full control over prompting, message shaping, and tool wiring.\n",
|
||||
"\n",
|
||||
"See: `libs/python/agent/agent/loops/` for concrete examples."
|
||||
]
|
||||
},
|
||||
@@ -170,7 +176,8 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Next steps\n\n",
|
||||
"## Next steps\n",
|
||||
"\n",
|
||||
"- Start with `instructions` for fast wins.\n",
|
||||
"- Add function tools for determinism and reliability.\n",
|
||||
"- Use callbacks to manage cost, logs, and safety.\n",
|
||||
|
||||
Reference in New Issue
Block a user