Update Agent README with Ollama provider

This commit is contained in:
f-trycua
2025-04-05 09:51:37 -07:00
parent 18fca491b5
commit 0d442ac606

View File

@@ -43,6 +43,12 @@ async with Computer() as macos_computer:
computer=macos_computer,
loop=AgentLoop.OPENAI,
model=LLM(provider=LLMProvider.OPENAI)
# or
# loop=AgentLoop.ANTHROPIC,
# model=LLM(provider=LLMProvider.ANTHROPIC)
# or
# loop=AgentLoop.OMNI,
# model=LLM(provider=LLMProvider.OLLAMA, model="gemma3")
)
tasks = [
@@ -74,7 +80,7 @@ The `cua-agent` package provides three agent loops variations, based on differen
|:-----------|:-----------------|:------------|:-------------|
| `AgentLoop.OPENAI` | • `computer_use_preview` | Use OpenAI Operator CUA model | Not Required |
| `AgentLoop.ANTHROPIC` | • `claude-3-5-sonnet-20240620`<br>• `claude-3-7-sonnet-20250219` | Use Anthropic Computer-Use | Not Required |
| `AgentLoop.OMNI` <br>(experimental) | • `claude-3-5-sonnet-20240620`<br>• `claude-3-7-sonnet-20250219`<br>• `gpt-4.5-preview`<br>• `gpt-4o`<br>• `gpt-4` | Use OmniParser for element pixel-detection (SoM) and any VLMs for UI Grounding and Reasoning | OmniParser |
| `AgentLoop.OMNI` | • `claude-3-5-sonnet-20240620`<br>• `claude-3-7-sonnet-20250219`<br>• `gpt-4.5-preview`<br>• `gpt-4o`<br>• `gpt-4`<br>• `phi4`<br>• `phi4-mini`<br>• `gemma3`<br>• `...`<br>• `Any Ollama-compatible model` | Use OmniParser for element pixel-detection (SoM) and any VLMs for UI Grounding and Reasoning | OmniParser |
## AgentResponse
The `AgentResponse` class represents the structured output returned after each agent turn. It contains the agent's response, reasoning, tool usage, and other metadata. The response format aligns with the new [OpenAI Agent SDK specification](https://platform.openai.com/docs/api-reference/responses) for better consistency across different agent loops.