caching, budget pages
@@ -8,6 +8,8 @@
 "callbacks",
 "sandboxed-tools",
 "local-models",
+"prompt-caching",
+"usage-tracking",
 "migration-guide"
 ]
 }
54
docs/content/docs/home/agent-sdk/prompt-caching.mdx
Normal file
@@ -0,0 +1,54 @@
---
title: Prompt Caching
sidebar_position: 8
description: How to use prompt caching in ComputerAgent and agent loops.
---

Prompt caching is a cost-saving feature offered by some LLM API providers that helps avoid reprocessing the same prompt, improving efficiency and reducing costs for repeated or long-running tasks.

## Usage

The `use_prompt_caching` argument is available for `ComputerAgent` and agent loops:

```python
agent = ComputerAgent(
    ...,
    use_prompt_caching=True,
)
```

- **Type:** `bool`
- **Default:** `False`
- **Purpose:** Use prompt caching to avoid reprocessing the same prompt.

## Anthropic CUAs

When using Anthropic-based CUAs (Claude models), setting `use_prompt_caching=True` automatically adds `{ "cache_control": "ephemeral" }` to your messages. This enables prompt caching for the session and can speed up repeated runs with the same prompt.

> **Note:** This argument is only required for Anthropic CUAs. For other providers, it is ignored.
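
As a rough illustration of what this means, the transformation amounts to tagging outgoing messages with the cache-control marker described above. This is a sketch, not the agent's actual internals; where exactly the marker is attached may differ:

```python
# Illustrative sketch only: attaches the ephemeral cache_control marker
# to the last outgoing message. The real placement used internally by
# ComputerAgent may differ.
def add_ephemeral_cache_control(messages: list[dict]) -> list[dict]:
    tagged = [dict(m) for m in messages]
    if tagged:
        # Mark the final message so the provider can cache the prefix up to it.
        tagged[-1]["cache_control"] = "ephemeral"
    return tagged

messages = [
    {"role": "system", "content": "You are a computer-use agent."},
    {"role": "user", "content": "Open the settings app."},
]
print(add_ephemeral_cache_control(messages))
```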

## OpenAI Provider

With the OpenAI provider, prompt caching is handled automatically for prompts of 1000+ tokens. You do **not** need to set `use_prompt_caching`; caching occurs for long prompts without any extra configuration.

## Example

```python
from agent import ComputerAgent

agent = ComputerAgent(
    model="anthropic/claude-3-5-sonnet-20240620",
    use_prompt_caching=True,
)
```

## Implementation Details

- For Anthropic: adds `{ "cache_control": "ephemeral" }` to messages when enabled.
- For OpenAI: caching is automatic for long prompts; the argument is ignored.

## When to Use

- Enable for Anthropic CUAs if you want to avoid reprocessing the same prompt in repeated or iterative tasks.
- Not needed for OpenAI models unless you want explicit ephemeral cache control (not required for most users).

## See Also

- [Agent Loops](./agent-loops)
- [Migration Guide](./migration-guide)
63
docs/content/docs/home/agent-sdk/usage-tracking.mdx
Normal file
@@ -0,0 +1,63 @@
---
title: Usage Tracking
sidebar_position: 9
description: How to track token usage and cost in ComputerAgent and agent loops.
---

Tracking usage is important for monitoring costs and optimizing your agent workflows. The ComputerAgent API provides easy access to token and cost usage for every run.

## Accessing Usage Data

Whenever you run an agent loop, each result contains a `usage` dictionary with token and cost information:

```python
async for result in agent.run(...):
    print(result["usage"])
    # Example output:
    # {
    #     "prompt_tokens": 150,
    #     "completion_tokens": 75,
    #     "total_tokens": 225,
    #     "response_cost": 0.01,
    # }
```

- `prompt_tokens`: Number of tokens in the prompt
- `completion_tokens`: Number of tokens in the agent's response
- `total_tokens`: Total tokens used
- `response_cost`: Estimated cost (USD) for this turn

## Tracking Total Usage

You can accumulate usage across multiple turns:

```python
total_usage = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0, "response_cost": 0.0}
async for result in agent.run(...):
    for k in total_usage:
        total_usage[k] += result["usage"].get(k, 0)
print("Total usage:", total_usage)
```

## Using Callbacks for Usage Tracking

You can also use a callback to automatically track usage. Implement the `on_usage` method in your callback class:

```python
from agent import ComputerAgent
from agent.callbacks import AsyncCallbackHandler

class UsageTrackerCallback(AsyncCallbackHandler):
    async def on_usage(self, usage):
        print("Usage update:", usage)

agent = ComputerAgent(
    ...,
    callbacks=[UsageTrackerCallback()],
)
```

See also: [Budget Manager Callbacks](./callbacks#cost-saving)
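
For a sense of how a budget limit can be layered on top of `on_usage`, here is a minimal sketch; the class name, `max_budget` parameter, and the exception used to stop the run are illustrative, and the budget manager callback linked above may work differently:

```python
from agent.callbacks import AsyncCallbackHandler

class SimpleBudgetCallback(AsyncCallbackHandler):
    """Illustrative sketch only: halts the run once accumulated cost exceeds a budget."""

    def __init__(self, max_budget: float):
        self.max_budget = max_budget
        self.total_cost = 0.0

    async def on_usage(self, usage):
        # Accumulate the per-turn cost reported in the usage dictionary.
        self.total_cost += usage.get("response_cost", 0.0)
        if self.total_cost > self.max_budget:
            raise RuntimeError(f"Budget of ${self.max_budget:.2f} exceeded")
```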

## See Also

- [Prompt Caching](./prompt-caching)
- [Callbacks](./callbacks)