caching, budget pages

Dillon DuPont
2025-07-25 18:54:16 -04:00
parent 1541d522f9
commit d94879aade
3 changed files with 119 additions and 0 deletions


@@ -8,6 +8,8 @@
"callbacks",
"sandboxed-tools",
"local-models",
"prompt-caching",
"usage-tracking",
"migration-guide"
]
}


@@ -0,0 +1,54 @@
---
title: Prompt Caching
sidebar_position: 8
description: How to use prompt caching in ComputerAgent and agent loops.
---
Prompt caching is a cost-saving feature offered by some LLM API providers that helps avoid reprocessing the same prompt, improving efficiency and reducing costs for repeated or long-running tasks.
## Usage
The `use_prompt_caching` argument is available for `ComputerAgent` and agent loops:
```python
agent = ComputerAgent(
    ...,
    use_prompt_caching=True,
)
```
- **Type:** `bool`
- **Default:** `False`
- **Purpose:** Use prompt caching to avoid reprocessing the same prompt.
## Anthropic CUAs
When using Anthropic-based CUAs (Claude models), setting `use_prompt_caching=True` will automatically add `{ "cache_control": "ephemeral" }` to your messages. This enables prompt caching for the session and can speed up repeated runs with the same prompt.
> **Note:** This argument only has an effect with Anthropic CUAs. Other providers ignore it.
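For illustration, here is a minimal sketch of the rewrite this performs on outgoing messages. The helper name and the exact placement of the marker are assumptions for this example, not the agent's internal API:
```python
# Illustrative sketch only: roughly what enabling use_prompt_caching does
# to outgoing messages before they are sent to the Anthropic API.
def mark_messages_cacheable(messages: list[dict]) -> list[dict]:
    # Attach the ephemeral cache-control marker to each message.
    return [{**message, "cache_control": "ephemeral"} for message in messages]

messages = [{"role": "user", "content": "Take a screenshot and describe it."}]
print(mark_messages_cacheable(messages))
# [{'role': 'user', 'content': 'Take a screenshot and describe it.', 'cache_control': 'ephemeral'}]
```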
## OpenAI Provider
With the OpenAI provider, prompt caching is handled automatically for prompts of 1024+ tokens. You do **not** need to set `use_prompt_caching`; caching occurs for long prompts without any extra configuration.
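If you want to verify that caching is happening and you are calling the OpenAI API directly (outside of `ComputerAgent`), Chat Completions responses report cached prompt tokens in the usage details. This sketch uses the `openai` Python SDK and assumes `long_prompt` holds a prompt of 1024+ tokens:
```python
from openai import OpenAI

client = OpenAI()
long_prompt = "..."  # assume a prompt of 1024+ tokens

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": long_prompt}],
)
# On repeated calls with the same prefix, cached tokens appear here.
print(response.usage.prompt_tokens_details.cached_tokens)
```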
## Example
```python
from agent import ComputerAgent

agent = ComputerAgent(
    model="anthropic/claude-3-5-sonnet-20240620",
    use_prompt_caching=True,
)
```
## Implementation Details
- For Anthropic: Adds `{ "cache_control": "ephemeral" }` to messages when enabled.
- For OpenAI: Caching is automatic for long prompts; the argument is ignored.
## When to Use
- Enable for Anthropic CUAs if you want to avoid reprocessing the same prompt in repeated or iterative tasks.
- Not needed for OpenAI models, where caching is automatic and the argument is ignored.
## See Also
- [Agent Loops](./agent-loops)
- [Migration Guide](./migration-guide)


@@ -0,0 +1,63 @@
---
title: Usage Tracking
sidebar_position: 9
description: How to track token usage and cost in ComputerAgent and agent loops.
---
Tracking usage is important for monitoring costs and optimizing your agent workflows. The ComputerAgent API provides easy access to token and cost usage for every run.
## Accessing Usage Data
Whenever you run an agent loop, each result contains a `usage` dictionary with token and cost information:
```python
async for result in agent.run(...):
    print(result["usage"])
    # Example output:
    # {
    #     "prompt_tokens": 150,
    #     "completion_tokens": 75,
    #     "total_tokens": 225,
    #     "response_cost": 0.01,
    # }
```
- `prompt_tokens`: Number of tokens in the prompt
- `completion_tokens`: Number of tokens in the agent's response
- `total_tokens`: Total tokens used
- `response_cost`: Estimated cost (USD) for this turn
## Tracking Total Usage
You can accumulate usage across multiple turns:
```python
total_usage = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0, "response_cost": 0.0}

async for result in agent.run(...):
    for k in total_usage:
        total_usage[k] += result["usage"].get(k, 0)

print("Total usage:", total_usage)
```
## Using Callbacks for Usage Tracking
You can also use a callback to automatically track usage. Implement the `on_usage` method in your callback class:
```python
from agent import ComputerAgent
from agent.callbacks import AsyncCallbackHandler

class UsageTrackerCallback(AsyncCallbackHandler):
    async def on_usage(self, usage):
        print("Usage update:", usage)

agent = ComputerAgent(
    ...,
    callbacks=[UsageTrackerCallback()]
)
```
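Building on `on_usage`, here is a minimal sketch of a budget-style callback that stops a run once accumulated cost crosses a limit. It assumes `on_usage` receives the same usage dictionary shown above; the class and its error handling are illustrative, not the built-in budget manager linked below:
```python
from agent.callbacks import AsyncCallbackHandler

class BudgetCallback(AsyncCallbackHandler):
    """Illustrative: raise once accumulated cost exceeds max_budget."""

    def __init__(self, max_budget: float):
        self.max_budget = max_budget
        self.total_cost = 0.0

    async def on_usage(self, usage):
        # Accumulate the per-turn cost reported in the usage dict.
        self.total_cost += usage.get("response_cost", 0.0)
        if self.total_cost > self.max_budget:
            raise RuntimeError(
                f"Budget exceeded: ${self.total_cost:.4f} > ${self.max_budget:.2f}"
            )
```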
See also: [Budget Manager Callbacks](./callbacks#cost-saving)
## See Also
- [Prompt Caching](./prompt-caching)
- [Callbacks](./callbacks)