Mirror of https://github.com/trycua/computer.git, synced 2026-01-01 19:10:30 -06:00
add UI-TARS-2 to docs
@@ -216,6 +216,7 @@ The following table shows which capabilities are supported by each model:
| [Gemini CU Preview](https://ai.google.dev/gemini-api/docs/computer-use) | 🖥️ | 🎯 | | 👁️ |
| [InternVL](https://huggingface.co/OpenGVLab/InternVL3_5-1B) | 🖥️ | 🎯 | 🛠️ | 👁️ |
| [UI-TARS](https://huggingface.co/ByteDance-Seed/UI-TARS-1.5-7B) | 🖥️ | 🎯 | 🛠️ | 👁️ |
| [UI-TARS-2](https://cua.ai/dashboard/vlm-router) | 🖥️ | 🎯 | 🛠️ | 👁️ |
| [OpenCUA](https://huggingface.co/xlangai/OpenCUA-7B) | | 🎯 | | |
| [GTA](https://huggingface.co/HelloKKMe/GTA1-7B) | | 🎯 | | |
| [Holo](https://huggingface.co/Hcompany/Holo1.5-3B) | | 🎯 | | |
@@ -264,6 +265,7 @@ agent = ComputerAgent(model="moondream3+openai/gpt-4o")
| [Gemini CU Preview](https://ai.google.dev/gemini-api/docs/computer-use) | `gemini-2.5-computer-use-preview` |
| [InternVL](https://huggingface.co/OpenGVLab/InternVL3_5-1B) | `huggingface-local/OpenGVLab/InternVL3_5-{1B,2B,4B,8B,...}` |
| [UI-TARS](https://huggingface.co/ByteDance-Seed/UI-TARS-1.5-7B) | `huggingface-local/ByteDance-Seed/UI-TARS-1.5-7B` |
| [UI-TARS-2](https://cua.ai/dashboard/vlm-router) | `cua/bytedance/ui-tars-2` |
| [OpenCUA](https://huggingface.co/xlangai/OpenCUA-7B) | `huggingface-local/xlangai/OpenCUA-{7B,32B}` |
| [GTA](https://huggingface.co/HelloKKMe/GTA1-7B) | `huggingface-local/HelloKKMe/GTA1-{7B,32B,72B}` |
| [Holo](https://huggingface.co/Hcompany/Holo1.5-3B) | `huggingface-local/Hcompany/Holo1.5-{3B,7B,72B}` |
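
The `{...}` groups in the model strings above read as shell-style shorthand for one model ID per listed size. That reading is an interpretation of the table, not something the agent library is known to parse, so the string passed to `ComputerAgent` should presumably be a single expanded ID. A minimal sketch, using a hypothetical helper named `expand_sizes` (not part of any library), that expands one such row and drops the `...` placeholder instead of guessing the elided sizes:

```python
import re


def expand_sizes(pattern: str) -> list[str]:
    """Expand a single `{a,b,c}` group into concrete strings (hypothetical helper)."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]  # no brace group, nothing to expand
    options = [opt.strip() for opt in match.group(1).split(",")]
    # "..." marks sizes the table elides; skip it rather than invent values.
    options = [opt for opt in options if opt and opt != "..."]
    prefix, suffix = pattern[: match.start()], pattern[match.end():]
    return [prefix + opt + suffix for opt in options]


if __name__ == "__main__":
    for model_id in expand_sizes("huggingface-local/xlangai/OpenCUA-{7B,32B}"):
        print(model_id)
```

Rows without a brace group, such as `cua/bytedance/ui-tars-2`, pass through unchanged.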
@@ -99,6 +99,18 @@ async for _ in agent.run("Open the settings menu and change the theme to dark mo
    pass
```

## UI-TARS-2

Next‑generation UI‑TARS via Cua Router:

- `cua/bytedance/ui-tars-2`

```python
agent = ComputerAgent("cua/bytedance/ui-tars-2", tools=[computer])
async for _ in agent.run("Open a browser and search for Python tutorials"):
    pass
```

---

CUAs also support direct click prediction. See [Grounding Models](./grounding-models) for details on `predict_click()`.