caching, budget pages
@@ -8,6 +8,8 @@
 "callbacks",
 "sandboxed-tools",
 "local-models",
+"prompt-caching",
+"usage-tracking",
 "migration-guide"
 ]
 }
54
docs/content/docs/home/agent-sdk/prompt-caching.mdx
Normal file
@@ -0,0 +1,54 @@
---
title: Prompt Caching
sidebar_position: 8
description: How to use prompt caching in ComputerAgent and agent loops.
---

Prompt caching is a cost-saving feature offered by some LLM API providers that helps avoid reprocessing the same prompt, improving efficiency and reducing costs for repeated or long-running tasks.

## Usage

The `use_prompt_caching` argument is available for `ComputerAgent` and agent loops:

```python
agent = ComputerAgent(
    ...,
    use_prompt_caching=True,
)
```

- **Type:** `bool`
- **Default:** `False`
- **Purpose:** Use prompt caching to avoid reprocessing the same prompt.

## Anthropic CUAs

When using Anthropic-based CUAs (Claude models), setting `use_prompt_caching=True` automatically adds `{ "cache_control": "ephemeral" }` to your messages. This enables prompt caching for the session and can speed up repeated runs with the same prompt.

> **Note:** This argument is only required for Anthropic CUAs. For other providers, it is ignored.
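
As a rough illustration of what this means, the transformation amounts to tagging outgoing messages with the cache-control marker described above. This is a sketch, not the agent's actual internals; where exactly the marker is attached may differ:

```python
# Illustrative sketch only: attaches the ephemeral cache_control marker
# to the last outgoing message. The real placement used internally by
# ComputerAgent may differ.
def add_ephemeral_cache_control(messages: list[dict]) -> list[dict]:
    tagged = [dict(m) for m in messages]
    if tagged:
        # Mark the final message so the provider can cache the prefix up to it.
        tagged[-1]["cache_control"] = "ephemeral"
    return tagged

messages = [
    {"role": "system", "content": "You are a computer-use agent."},
    {"role": "user", "content": "Open the settings app."},
]
print(add_ephemeral_cache_control(messages))
```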

## OpenAI Provider

With the OpenAI provider, prompt caching is handled automatically for prompts of 1000+ tokens. You do **not** need to set `use_prompt_caching`; caching occurs for long prompts without any extra configuration.

## Example

```python
from agent import ComputerAgent

agent = ComputerAgent(
    model="anthropic/claude-3-5-sonnet-20240620",
    use_prompt_caching=True,
)
```

## Implementation Details

- For Anthropic: adds `{ "cache_control": "ephemeral" }` to messages when enabled.
- For OpenAI: caching is automatic for long prompts; the argument is ignored.

## When to Use

- Enable for Anthropic CUAs if you want to avoid reprocessing the same prompt in repeated or iterative tasks.
- Not needed for OpenAI models unless you want explicit ephemeral cache control (not required for most users).

## See Also

- [Agent Loops](./agent-loops)
- [Migration Guide](./migration-guide)
63
docs/content/docs/home/agent-sdk/usage-tracking.mdx
Normal file
@@ -0,0 +1,63 @@
---
title: Usage Tracking
sidebar_position: 9
description: How to track token usage and cost in ComputerAgent and agent loops.
---

Tracking usage is important for monitoring costs and optimizing your agent workflows. The ComputerAgent API provides easy access to token and cost usage for every run.

## Accessing Usage Data

Whenever you run an agent loop, each result contains a `usage` dictionary with token and cost information:

```python
async for result in agent.run(...):
    print(result["usage"])
    # Example output:
    # {
    #     "prompt_tokens": 150,
    #     "completion_tokens": 75,
    #     "total_tokens": 225,
    #     "response_cost": 0.01,
    # }
```

- `prompt_tokens`: Number of tokens in the prompt
- `completion_tokens`: Number of tokens in the agent's response
- `total_tokens`: Total tokens used
- `response_cost`: Estimated cost (USD) for this turn

## Tracking Total Usage

You can accumulate usage across multiple turns:

```python
total_usage = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0, "response_cost": 0.0}
async for result in agent.run(...):
    for k in total_usage:
        total_usage[k] += result["usage"].get(k, 0)
print("Total usage:", total_usage)
```

## Using Callbacks for Usage Tracking

You can also use a callback to automatically track usage. Implement the `on_usage` method in your callback class:

```python
from agent import ComputerAgent
from agent.callbacks import AsyncCallbackHandler

class UsageTrackerCallback(AsyncCallbackHandler):
    async def on_usage(self, usage):
        print("Usage update:", usage)

agent = ComputerAgent(
    ...,
    callbacks=[UsageTrackerCallback()],
)
```

See also: [Budget Manager Callbacks](./callbacks#cost-saving)
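
For a sense of how a budget limit can be layered on top of `on_usage`, here is a minimal sketch; the class name, `max_budget` parameter, and the exception used to stop the run are illustrative, and the budget manager callback linked above may work differently:

```python
from agent.callbacks import AsyncCallbackHandler

class SimpleBudgetCallback(AsyncCallbackHandler):
    """Illustrative sketch only: halts the run once accumulated cost exceeds a budget."""

    def __init__(self, max_budget: float):
        self.max_budget = max_budget
        self.total_cost = 0.0

    async def on_usage(self, usage):
        # Accumulate the per-turn cost reported in the usage dictionary.
        self.total_cost += usage.get("response_cost", 0.0)
        if self.total_cost > self.max_budget:
            raise RuntimeError(f"Budget of ${self.max_budget:.2f} exceeded")
```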

## See Also

- [Prompt Caching](./prompt-caching)
- [Callbacks](./callbacks)