diff --git a/libs/agent/README.md b/libs/agent/README.md
index 81e8b8f1..399a023b 100644
--- a/libs/agent/README.md
+++ b/libs/agent/README.md
@@ -50,6 +50,9 @@ async with Computer() as macos_computer:
# or
# loop=AgentLoop.OMNI,
# model=LLM(provider=LLMProvider.OLLAMA, model="gemma3")
+ # or
+ # loop=AgentLoop.UITARS,
+ # model=LLM(provider=LLMProvider.OAICOMPAT, model="tgi", provider_base_url="https://**************.us-east-1.aws.endpoints.huggingface.cloud/v1")
)
tasks = [
@@ -124,6 +127,10 @@ The Gradio UI provides:
- Configuration of agent parameters
- Chat interface for interacting with the agent
+### Using UI-TARS
+
+To use UI-TARS, first deploy the model by following the [deployment guide](https://github.com/bytedance/UI-TARS/blob/main/README_deploy.md). The deployment gives you a provider URL such as `https://**************.us-east-1.aws.endpoints.huggingface.cloud/v1`, which you can then use in the Gradio UI.
+
## Agent Loops
The `cua-agent` package provides three agent loops variations, based on different CUA models providers and techniques:
@@ -132,6 +139,7 @@ The `cua-agent` package provides three agent loops variations, based on differen
|:-----------|:-----------------|:------------|:-------------|
| `AgentLoop.OPENAI` | • `computer_use_preview` | Use OpenAI Operator CUA model | Not Required |
| `AgentLoop.ANTHROPIC` | • `claude-3-5-sonnet-20240620`
• `claude-3-7-sonnet-20250219` | Use Anthropic Computer-Use | Not Required |
+| `AgentLoop.UITARS` | • `ByteDance-Seed/UI-TARS-1.5-7B` | Use ByteDance's UI-TARS 1.5 model | Not Required |
| `AgentLoop.OMNI` | • `claude-3-5-sonnet-20240620`
• `claude-3-7-sonnet-20250219`
• `gpt-4.5-preview`
• `gpt-4o`
• `gpt-4`
• `phi4`
• `phi4-mini`
• `gemma3`
• `...`
• `Any Ollama or OpenAI-compatible model` | Use OmniParser for element pixel-detection (SoM) and any VLMs for UI Grounding and Reasoning | OmniParser |
## AgentResponse
@@ -173,25 +181,9 @@ async for result in agent.run(task):
print(output)
```
-### Gradio UI
-
-You can also interact with the agent using a Gradio interface.
-
-```python
-# Ensure environment variables (e.g., API keys) are loaded
-# You might need a helper function like load_dotenv_files() if using .env
-# from utils import load_dotenv_files
-# load_dotenv_files()
-
-from agent.ui.gradio.app import create_gradio_ui
-
-app = create_gradio_ui()
-app.launch(share=False)
-```
-
**Note on Settings Persistence:**
* The Gradio UI automatically saves your configuration (Agent Loop, Model Choice, Custom Base URL, Save Trajectory state, Recent Images count) to a file named `.gradio_settings.json` in the project's root directory when you successfully run a task.
* This allows your preferences to persist between sessions.
* API keys entered into the custom provider field are **not** saved in this file for security reasons. Manage API keys using environment variables (e.g., `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`) or a `.env` file.
-* It's recommended to add `.gradio_settings.json` to your `.gitignore` file.
\ No newline at end of file
+* It's recommended to add `.gradio_settings.json` to your `.gitignore` file.