mirror of
https://github.com/trycua/computer.git
synced 2026-01-01 11:00:31 -06:00
Move text from API Reference to Agent Playbook section
@@ -30,3 +30,58 @@ async for result in agent.run(prompt):
```

For a list of supported models and configurations, see the [Supported Agents](./supported-agents/computer-use-agents) page.

### ComputerAgent Constructor Options

The `ComputerAgent` constructor accepts options for customizing agent behavior, tool integration, callbacks, and resource management.

| Parameter                   | Type              | Default      | Description                                                                                          |
| --------------------------- | ----------------- | ------------ | ---------------------------------------------------------------------------------------------------- |
| `model`                     | `str`             | **required** | Model name (e.g., "claude-3-5-sonnet-20241022", "computer-use-preview", "omni+vertex_ai/gemini-pro") |
| `tools`                     | `List[Any]`       | `None`       | List of tools (e.g., computer objects, decorated functions)                                          |
| `custom_loop`               | `Callable`        | `None`       | Custom agent loop function (overrides auto-selection)                                                |
| `only_n_most_recent_images` | `int`             | `None`       | If set, only keep the N most recent images in message history (adds ImageRetentionCallback)          |
| `callbacks`                 | `List[Any]`       | `None`       | List of AsyncCallbackHandler instances for preprocessing/postprocessing                              |
| `verbosity`                 | `int`             | `None`       | Logging level (`logging.DEBUG`, `logging.INFO`, etc.; adds LoggingCallback)                          |
| `trajectory_dir`            | `str`             | `None`       | Directory to save trajectory data (adds TrajectorySaverCallback)                                     |
| `max_retries`               | `int`             | `3`          | Maximum number of retries for failed API calls                                                       |
| `screenshot_delay`          | `float` \| `int`  | `0.5`        | Delay before screenshots (seconds)                                                                   |
| `use_prompt_caching`        | `bool`            | `False`      | Use prompt caching to avoid reprocessing the same prompt (mainly for Anthropic)                      |
| `max_trajectory_budget`     | `float` \| `dict` | `None`       | If set, adds BudgetManagerCallback to track usage costs and stop when budget is exceeded             |
| `**kwargs`                  | _any_             |              | Additional arguments passed to the agent loop                                                        |

#### Parameter Details

- **model**: The LLM or agent model to use. Determines which agent loop is selected unless `custom_loop` is provided.
- **tools**: List of tools the agent can use (e.g., `Computer`, sandboxed Python functions, etc.).
- **custom_loop**: Optional custom agent loop function. If provided, overrides automatic loop selection.
- **only_n_most_recent_images**: If set, only the N most recent images are kept in the message history. Useful for limiting memory usage. Automatically adds `ImageRetentionCallback`.
- **callbacks**: List of callback instances for advanced preprocessing, postprocessing, logging, or custom hooks. See [Callbacks & Extensibility](#callbacks--extensibility).
- **verbosity**: Logging level (e.g., `logging.INFO`). If set, adds a logging callback.
- **trajectory_dir**: Directory path to save full trajectory data, including screenshots and responses. Adds `TrajectorySaverCallback`.
- **max_retries**: Maximum number of retries for failed API calls (default: 3).
- **screenshot_delay**: Delay (in seconds) before taking screenshots (default: 0.5).
- **use_prompt_caching**: Enables prompt caching for repeated prompts (mainly for Anthropic models).
- **max_trajectory_budget**: If set (float or dict), adds a budget manager callback that tracks usage costs and stops execution if the budget is exceeded. Dict allows advanced options (e.g., `{ "max_budget": 5.0, "raise_error": True }`).
- **\*\*kwargs**: Any additional keyword arguments are passed through to the agent loop or model provider.

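To make the effect of `only_n_most_recent_images` more concrete, here is a minimal sketch of an image-retention pass over a message list. The message shape (a `content` list whose parts carry a `type` of `"image"` or `"text"`) and the function name `keep_recent_images` are assumptions for illustration only; this is not the SDK's `ImageRetentionCallback` implementation.

```python
from typing import Any, Dict, List

Message = Dict[str, Any]


def keep_recent_images(messages: List[Message], n: int) -> List[Message]:
    """Return a copy of `messages` with all but the last `n` image parts dropped."""
    if n < 0:
        raise ValueError("n must be non-negative")
    remaining = n  # how many more images we may still keep
    trimmed_reversed: List[Message] = []
    # Walk from newest to oldest so the most recent images are the ones kept.
    for message in reversed(messages):
        content = message.get("content")
        if not isinstance(content, list):
            trimmed_reversed.append(message)  # plain-text message, nothing to trim
            continue
        kept_parts = []
        for part in reversed(content):
            if isinstance(part, dict) and part.get("type") == "image":
                if remaining == 0:
                    continue  # older than the N most recent images: drop it
                remaining -= 1
            kept_parts.append(part)
        kept_parts.reverse()
        trimmed_reversed.append({**message, "content": kept_parts})
    trimmed_reversed.reverse()
    return trimmed_reversed


# Hypothetical history: three screenshots, of which only the last two are kept.
history = [
    {"role": "user", "content": "open the settings page"},
    {"role": "tool", "content": [{"type": "image", "data": "shot-1"}]},
    {"role": "tool", "content": [{"type": "image", "data": "shot-2"}]},
    {"role": "tool", "content": [{"type": "text", "text": "clicked"},
                                 {"type": "image", "data": "shot-3"}]},
]
trimmed = keep_recent_images(history, n=2)
```

The sketch returns a new list rather than mutating the input, so the full history can still be written elsewhere (for example to a trajectory directory) while the model only sees the trimmed version.
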
**Example with advanced options:**

```python
import logging

from agent import ComputerAgent
from computer import Computer
from agent.callbacks import ImageRetentionCallback

agent = ComputerAgent(
    model="anthropic/claude-3-5-sonnet-20241022",
    tools=[Computer(...)],
    only_n_most_recent_images=3,
    # only_n_most_recent_images already adds an ImageRetentionCallback;
    # the explicit callback below only illustrates the `callbacks` option.
    callbacks=[ImageRetentionCallback(only_n_most_recent_images=3)],
    verbosity=logging.INFO,
    trajectory_dir="trajectories",
    max_retries=5,
    screenshot_delay=1.0,
    use_prompt_caching=True,
    max_trajectory_budget={"max_budget": 5.0, "raise_error": True},
)
```

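`max_trajectory_budget` accepts either a float or a dict. The sketch below shows one way such a value could be normalized and enforced. `BudgetTracker`, `BudgetExceeded`, and `add_cost` are hypothetical names; only the `max_budget` and `raise_error` keys come from the documentation, and the defaults chosen here are assumptions rather than the behavior of `BudgetManagerCallback`.

```python
from dataclasses import dataclass
from typing import Union


class BudgetExceeded(RuntimeError):
    """Raised when spending passes the budget and raise_error is True."""


@dataclass
class BudgetTracker:
    max_budget: float
    raise_error: bool = False  # assumed default, not confirmed by the docs
    spent: float = 0.0

    @classmethod
    def from_option(cls, option: Union[float, int, dict]) -> "BudgetTracker":
        # A bare number is treated as shorthand for {"max_budget": number}.
        if isinstance(option, dict):
            return cls(
                max_budget=float(option["max_budget"]),
                raise_error=bool(option.get("raise_error", False)),
            )
        return cls(max_budget=float(option))

    def add_cost(self, cost: float) -> bool:
        """Record a cost and return True while the run may continue."""
        self.spent += cost
        if self.spent <= self.max_budget:
            return True
        if self.raise_error:
            raise BudgetExceeded(f"spent {self.spent:.2f} of {self.max_budget:.2f}")
        return False


tracker = BudgetTracker.from_option({"max_budget": 5.0, "raise_error": True})
tracker.add_cost(2.0)  # still within budget
simple = BudgetTracker.from_option(1.0)  # float form: stop quietly when exceeded
```

With `raise_error` set, the caller sees an exception once the budget is passed; without it, the tracker only signals that the loop should stop.
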
@@ -6,6 +6,20 @@ Callbacks in the Agent SDK provide hooks into the agent's lifecycle, allowing fo

The callback lifecycle is described in [Agent Lifecycle](callbacks/agent-lifecycle).

## Usage

You can add preprocessing and postprocessing hooks using callbacks, or write your own by subclassing `AsyncCallbackHandler`:

```python
from agent import ComputerAgent
from agent.callbacks import ImageRetentionCallback, PIIAnonymizationCallback

# `computer` is an existing Computer instance
agent = ComputerAgent(
    model="anthropic/claude-3-5-sonnet-20241022",
    tools=[computer],
    callbacks=[ImageRetentionCallback(only_n_most_recent_images=3)]
)
```

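As a rough illustration of the preprocessing/postprocessing idea, the sketch below defines a stand-in base class and one subclass. The hook names `on_llm_start` and `on_llm_end`, the `HookBase` class, and the `run_with_hooks` driver are all assumptions made so the example runs on its own. In real code you would subclass `AsyncCallbackHandler` and use the hook names listed on the Agent Lifecycle page.

```python
import asyncio
import re
from typing import Any, Awaitable, Callable, Dict, List

Message = Dict[str, Any]
ModelCall = Callable[[List[Message]], Awaitable[List[Message]]]


class HookBase:
    """Stand-in for AsyncCallbackHandler (hypothetical hook names)."""

    async def on_llm_start(self, messages: List[Message]) -> List[Message]:
        return messages  # preprocessing: runs before the model call

    async def on_llm_end(self, output: List[Message]) -> List[Message]:
        return output  # postprocessing: runs after the model call


class EmailRedactor(HookBase):
    """Masks email addresses in string content before it reaches the model."""

    _email = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

    async def on_llm_start(self, messages: List[Message]) -> List[Message]:
        return [
            {**m, "content": self._email.sub("[email]", m["content"])}
            if isinstance(m.get("content"), str) else m
            for m in messages
        ]


async def run_with_hooks(
    messages: List[Message], hooks: List[HookBase], model_call: ModelCall
) -> List[Message]:
    # Each hook sees the messages in order, then the output in order.
    for hook in hooks:
        messages = await hook.on_llm_start(messages)
    output = await model_call(messages)
    for hook in hooks:
        output = await hook.on_llm_end(output)
    return output


async def echo_model(messages: List[Message]) -> List[Message]:
    # Fake model used only for this sketch: echoes the last message back.
    return [{"role": "assistant", "content": messages[-1]["content"]}]


result = asyncio.run(
    run_with_hooks(
        [{"role": "user", "content": "mail jane.doe@example.com the report"}],
        [EmailRedactor()],
        echo_model,
    )
)
```
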
## Built-in Callbacks

- [BudgetManagerCallback](callbacks/cost-saving): Stops execution when budget exceeded

@@ -8,109 +8,14 @@ github:
 
 The Agent library provides the ComputerAgent class and tools for building AI agents that automate workflows on Cua Computers.
 
-## Reference
+## Agent Loops
 
-### Basic Usage
+See the [Agent Loops](../agent-sdk/agent-loops) documentation for how agents process information and take actions.
 
-```python
-from agent import ComputerAgent
-from computer import Computer
+## Chat History
 
-computer = Computer()  # Connect to a cua container
-agent = ComputerAgent(
-    model="anthropic/claude-3-5-sonnet-20241022",
-    tools=[computer]
-)
+See the [Chat History](../agent-sdk/chat-history) documentation for managing conversational context and turn-by-turn interactions.
 
-prompt = "open github, navigate to trycua/cua"
+## Callbacks
 
-async for result in agent.run(prompt):
-    print("Agent:", result["output"][-1]["content"][0]["text"])
-```
-
-### Message Array (Multi-turn)
-
-```python
-messages = [
-    {"role": "user", "content": "go to trycua on gh"},
-    # ... (reasoning, computer_call, computer_call_output, etc)
-]
-async for result in agent.run(messages):
-    # Handle output, tool invocations, screenshots, etc.
-    print("Agent:", result["output"][-1]["content"][0]["text"])
-    messages += result["output"]  # Add agent output to message array
-    ...
-```
-
+See the [Callbacks](../agent-sdk/callbacks) documentation for extending and customizing agent behavior with custom hooks.
 