diff --git a/docs/content/docs/home/agent-sdk/meta.json b/docs/content/docs/home/agent-sdk/meta.json
index 4e065c19..933452cb 100644
--- a/docs/content/docs/home/agent-sdk/meta.json
+++ b/docs/content/docs/home/agent-sdk/meta.json
@@ -8,6 +8,8 @@
     "callbacks",
     "sandboxed-tools",
     "local-models",
+    "prompt-caching",
+    "usage-tracking",
     "migration-guide"
   ]
 }
diff --git a/docs/content/docs/home/agent-sdk/prompt-caching.mdx b/docs/content/docs/home/agent-sdk/prompt-caching.mdx
new file mode 100644
index 00000000..4c69f99f
--- /dev/null
+++ b/docs/content/docs/home/agent-sdk/prompt-caching.mdx
@@ -0,0 +1,54 @@
+---
+title: Prompt Caching
+sidebar_position: 8
+description: How to use prompt caching in ComputerAgent and agent loops.
+---
+
+Prompt caching is a cost-saving feature offered by some LLM API providers that avoids reprocessing an identical prompt prefix on every request, reducing latency and cost for repeated or long-running tasks.
+
+## Usage
+
+The `use_prompt_caching` argument is available on `ComputerAgent` and agent loops:
+
+```python
+agent = ComputerAgent(
+    ...,
+    use_prompt_caching=True,
+)
+```
+
+- **Type:** `bool`
+- **Default:** `False`
+- **Purpose:** Use prompt caching to avoid reprocessing the same prompt.
+
+## Anthropic CUAs
+
+When using Anthropic-based CUAs (Claude models), setting `use_prompt_caching=True` automatically adds `"cache_control": {"type": "ephemeral"}` to your messages. This enables prompt caching for the session and can speed up repeated runs with the same prompt.
+
+> **Note:** This argument only has an effect for Anthropic CUAs. Other providers ignore it.
+
+## OpenAI Provider
+
+With the OpenAI provider, prompt caching is handled automatically for prompts of 1024 tokens or longer. You do **not** need to set `use_prompt_caching`; caching occurs for long prompts without any extra configuration.
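+
+To make the Anthropic behavior described above concrete, the annotation is roughly equivalent to the hypothetical helper below. This is an illustrative sketch only (the real logic lives inside the agent loop), assuming Anthropic-style messages whose `content` is a list of content blocks:
+
+```python
+def add_cache_control(messages: list[dict]) -> list[dict]:
+    # Hypothetical sketch: mark the last content block of each message as
+    # cacheable so the provider can reuse the processed prefix on later turns.
+    for message in messages:
+        content = message.get("content")
+        if isinstance(content, list) and content:
+            content[-1]["cache_control"] = {"type": "ephemeral"}
+    return messages
+```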
+
+## Example
+
+```python
+from agent import ComputerAgent
+
+agent = ComputerAgent(
+    model="anthropic/claude-3-5-sonnet-20240620",
+    use_prompt_caching=True,
+)
+```
+
+## Implementation Details
+
+- For Anthropic: adds `"cache_control": {"type": "ephemeral"}` to messages when enabled.
+- For OpenAI: caching is automatic for long prompts; the argument is ignored.
+
+## When to Use
+
+- Enable for Anthropic CUAs to avoid reprocessing the same prompt in repeated or iterative tasks.
+- Not needed for OpenAI models, where long prompts are cached automatically.
+
+## See Also
+
+- [Agent Loops](./agent-loops)
+- [Migration Guide](./migration-guide)
diff --git a/docs/content/docs/home/agent-sdk/usage-tracking.mdx b/docs/content/docs/home/agent-sdk/usage-tracking.mdx
new file mode 100644
index 00000000..54bbcaae
--- /dev/null
+++ b/docs/content/docs/home/agent-sdk/usage-tracking.mdx
@@ -0,0 +1,63 @@
+---
+title: Usage Tracking
+sidebar_position: 9
+description: How to track token usage and cost in ComputerAgent and agent loops.
+---
+
+Tracking usage is important for monitoring costs and optimizing your agent workflows. The ComputerAgent API exposes token and cost usage for every run.
+
+## Accessing Usage Data
+
+Whenever you run an agent loop, each result contains a `usage` dictionary with token and cost information:
+
+```python
+async for result in agent.run(...):
+    print(result["usage"])
+    # Example output:
+    # {
+    #     "prompt_tokens": 150,
+    #     "completion_tokens": 75,
+    #     "total_tokens": 225,
+    #     "response_cost": 0.01,
+    # }
+```
+
+- `prompt_tokens`: Number of tokens in the prompt
+- `completion_tokens`: Number of tokens in the agent's response
+- `total_tokens`: Total tokens used
+- `response_cost`: Estimated cost (USD) for this turn
+
+## Tracking Total Usage
+
+You can accumulate usage across multiple turns:
+
+```python
+total_usage = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0, "response_cost": 0.0}
+async for result in agent.run(...):
+    for k in total_usage:
+        total_usage[k] += result["usage"].get(k, 0)
+print("Total usage:", total_usage)
+```
+
+## Using Callbacks for Usage Tracking
+
+You can also use a callback to track usage automatically. Implement the `on_usage` method in your callback class:
+
+```python
+from agent.callbacks import AsyncCallbackHandler
+
+class UsageTrackerCallback(AsyncCallbackHandler):
+    async def on_usage(self, usage):
+        print("Usage update:", usage)
+
+agent = ComputerAgent(
+    ...,
+    callbacks=[UsageTrackerCallback()]
+)
+```
+
+See also: [Budget Manager Callbacks](./callbacks#cost-saving)
+
+## See Also
+
+- [Prompt Caching](./prompt-caching)
+- [Callbacks](./callbacks)
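+
+## Example: Budget-Aware Run
+
+As a closing sketch, the pieces above can be combined to stop a run once it exceeds a spending limit. The model name, task string, and $1 threshold are placeholders, and `agent.run(...)` is assumed to yield result dicts with a `usage` key as shown earlier:
+
+```python
+import asyncio
+
+from agent import ComputerAgent
+
+async def main():
+    agent = ComputerAgent(model="anthropic/claude-3-5-sonnet-20240620")
+    total_cost = 0.0
+    async for result in agent.run("Open the settings page"):
+        # Accumulate the estimated USD cost of each turn.
+        total_cost += result["usage"].get("response_cost", 0.0)
+        if total_cost > 1.00:  # placeholder budget: stop past $1
+            break
+    print(f"Total cost: ${total_cost:.2f}")
+
+asyncio.run(main())
+```
+
+For enforcing budgets without hand-rolled loops, see the [Budget Manager Callbacks](./callbacks#cost-saving).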