Fix styling, fix images
14
docs/.prettierrc
Normal file
@@ -0,0 +1,14 @@
{
  "printWidth": 80,
  "tabWidth": 2,
  "useTabs": false,
  "semi": true,
  "singleQuote": true,
  "quoteProps": "as-needed",
  "jsxSingleQuote": false,
  "trailingComma": "es5",
  "bracketSpacing": true,
  "bracketSameLine": false,
  "arrowParens": "always",
  "endOfLine": "lf"
}
@@ -1,86 +0,0 @@
{
  "$schema": "https://biomejs.dev/schemas/1.9.4/schema.json",
  "vcs": {
    "enabled": false,
    "clientKind": "git",
    "useIgnoreFile": false
  },
  "files": {
    "ignoreUnknown": false,
    "ignore": [
      ".next",
      "build"
    ]
  },
  "formatter": {
    "enabled": true,
    "useEditorconfig": true,
    "formatWithErrors": false,
    "indentStyle": "space",
    "indentWidth": 2,
    "lineEnding": "lf",
    "lineWidth": 80,
    "attributePosition": "auto",
    "bracketSpacing": true
  },
  "organizeImports": {
    "enabled": true
  },
  "linter": {
    "enabled": true,
    "rules": {
      "recommended": true,
      "style": {
        "useSelfClosingElements": "warn",
        "noUnusedTemplateLiteral": "warn",
        "noNonNullAssertion": "off"
      },
      "a11y": {
        "useMediaCaption": "off",
        "useKeyWithClickEvents": "warn",
        "useKeyWithMouseEvents": "warn",
        "noSvgWithoutTitle": "off",
        "useButtonType": "warn",
        "noAutofocus": "off"
      },
      "suspicious": {
        "noArrayIndexKey": "off"
      },
      "correctness": {
        "noUnusedVariables": "warn",
        "noUnusedFunctionParameters": "warn",
        "noUnusedImports": "warn"
      },
      "complexity": {
        "useOptionalChain": "info"
      },
      "nursery": {
        "useSortedClasses": {
          "level": "warn",
          "fix": "safe",
          "options": {
            "attributes": [
              "className"
            ],
            "functions": [
              "cn"
            ]
          }
        }
      }
    }
  },
  "javascript": {
    "formatter": {
      "jsxQuoteStyle": "double",
      "quoteProperties": "asNeeded",
      "trailingCommas": "es5",
      "semicolons": "always",
      "arrowParentheses": "always",
      "bracketSameLine": false,
      "quoteStyle": "single",
      "attributePosition": "auto",
      "bracketSpacing": true
    }
  }
}
@@ -2,11 +2,40 @@
|
||||
title: Agent
|
||||
---
|
||||
|
||||
<div align="center" style={{display: 'flex', gap: '10px', margin: '0 auto', width: '100%', justifyContent: 'center'}}>
|
||||
<a href="#"><img src="https://img.shields.io/badge/Python-333333?logo=python&logoColor=white&labelColor=333333" alt="Python" /></a>
|
||||
<a href="#"><img src="https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0" alt="macOS" /></a>
|
||||
<a href="https://discord.com/invite/mVnXXpdE85"><img src="https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white" alt="Discord" /></a>
|
||||
<a href="https://pypi.org/project/cua-computer/"><img src="https://img.shields.io/pypi/v/cua-computer?color=333333" alt="PyPI" /></a>
|
||||
<div
|
||||
align="center"
|
||||
style={{
|
||||
display: 'flex',
|
||||
gap: '10px',
|
||||
margin: '0 auto',
|
||||
width: '100%',
|
||||
justifyContent: 'center',
|
||||
}}
|
||||
>
|
||||
<a href="#">
|
||||
<img
|
||||
src="https://img.shields.io/badge/Python-333333?logo=python&logoColor=white&labelColor=333333"
|
||||
alt="Python"
|
||||
/>
|
||||
</a>
|
||||
<a href="#">
|
||||
<img
|
||||
src="https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0"
|
||||
alt="macOS"
|
||||
/>
|
||||
</a>
|
||||
<a href="https://discord.com/invite/mVnXXpdE85">
|
||||
<img
|
||||
src="https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white"
|
||||
alt="Discord"
|
||||
/>
|
||||
</a>
|
||||
<a href="https://pypi.org/project/cua-computer/">
|
||||
<img
|
||||
src="https://img.shields.io/pypi/v/cua-computer?color=333333"
|
||||
alt="PyPI"
|
||||
/>
|
||||
</a>
|
||||
</div>
|
||||
|
||||
**cua-agent** is a general Computer-Use framework for running multi-app agentic workflows targeting macOS and Linux sandboxes created with Cua, supporting local (Ollama) and cloud model providers (OpenAI, Anthropic, Groq, DeepSeek, Qwen).
|
||||
@@ -14,7 +43,7 @@ title: Agent
|
||||
### Get started with Agent
|
||||
|
||||
<div align="center">
|
||||
<img src="./agent.png"/>
|
||||
<img src="/img/agent.png" />
|
||||
</div>
|
||||
|
||||
## Install
|
||||
@@ -80,7 +109,7 @@ Refer to these notebooks for step-by-step guides on how to use the Computer-Use
|
||||
The agent includes a Gradio-based user interface for easier interaction.
|
||||
|
||||
<div align="center">
|
||||
<img src="./agent_gradio_ui.png"/>
|
||||
<img src="/img/agent_gradio_ui.png" />
|
||||
</div>
|
||||
|
||||
To use it:
|
||||
@@ -119,14 +148,16 @@ Without these environment variables, the UI will show "No models available" for
|
||||
|
||||
### Using Local Models
|
||||
|
||||
You can use local models with the OMNI loop provider by selecting "Custom model..." from the dropdown. The default provider URL is `http://localhost:1234/v1`, which works with LM Studio.
|
||||
You can use local models with the OMNI loop provider by selecting "Custom model..." from the dropdown. The default provider URL is `http://localhost:1234/v1`, which works with LM Studio.
|
||||
|
||||
If you're using a different local model server:
|
||||
|
||||
- vLLM: `http://localhost:8000/v1`
|
||||
- LocalAI: `http://localhost:8080/v1`
|
||||
- Ollama with OpenAI compat API: `http://localhost:11434/v1`
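
As a rough sketch, the default base URLs listed above can be kept in a small lookup table. The `LOCAL_SERVER_URLS` mapping and the `base_url_for` helper are hypothetical names used only for illustration; they are not part of `cua-agent`.

```python
# Hypothetical helper (not part of cua-agent): maps local model servers
# to the default OpenAI-compatible base URLs listed above.
LOCAL_SERVER_URLS = {
    "lm-studio": "http://localhost:1234/v1",
    "vllm": "http://localhost:8000/v1",
    "localai": "http://localhost:8080/v1",
    "ollama": "http://localhost:11434/v1",
}


def base_url_for(server: str) -> str:
    """Return the default base URL for a known local server name."""
    key = server.strip().lower()
    if key not in LOCAL_SERVER_URLS:
        known = ", ".join(sorted(LOCAL_SERVER_URLS))
        raise ValueError(f"Unknown server {server!r}; expected one of: {known}")
    return LOCAL_SERVER_URLS[key]
```

The returned string is the value you would enter as the custom provider URL in the Gradio UI. These are only the servers' usual defaults, so adjust the port if your server is configured differently.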
|
||||
|
||||
The Gradio UI provides:
|
||||
|
||||
- Selection of different agent loops (OpenAI, Anthropic, OMNI)
|
||||
- Model selection for each provider
|
||||
- Configuration of agent parameters
|
||||
@@ -137,6 +168,7 @@ The Gradio UI provides:
|
||||
The UI-TARS models are available in two forms:
|
||||
|
||||
1. **MLX UI-TARS models** (Default): These models run locally using the MLXVLM provider
|
||||
|
||||
- `mlx-community/UI-TARS-1.5-7B-4bit` (default) - 4-bit quantized version
|
||||
- `mlx-community/UI-TARS-1.5-7B-6bit` - 6-bit quantized version for higher quality
|
||||
|
||||
@@ -149,14 +181,15 @@ The UI-TARS models are available in two forms:
|
||||
```
|
||||
|
||||
2. **OpenAI-compatible UI-TARS**: For using the original ByteDance model
|
||||
|
||||
- If you want to use the original ByteDance UI-TARS model via an OpenAI-compatible API, follow the [deployment guide](https://github.com/bytedance/UI-TARS/blob/main/README_deploy.md)
|
||||
- This will give you a provider URL like `https://**************.us-east-1.aws.endpoints.huggingface.cloud/v1` which you can use in the code or Gradio UI:
|
||||
|
||||
```python
|
||||
```python
|
||||
agent = ComputerAgent(
|
||||
computer=macos_computer,
|
||||
loop=AgentLoop.UITARS,
|
||||
model=LLM(provider=LLMProvider.OAICOMPAT, name="tgi",
|
||||
model=LLM(provider=LLMProvider.OAICOMPAT, name="tgi",
|
||||
provider_base_url="https://**************.us-east-1.aws.endpoints.huggingface.cloud/v1")
|
||||
)
|
||||
```
|
||||
@@ -165,14 +198,15 @@ The UI-TARS models are available in two forms:
|
||||
|
||||
The `cua-agent` package provides four agent loop variations, based on different CUA model providers and techniques:
|
||||
|
||||
| Agent Loop | Supported Models | Description | Set-Of-Marks |
|
||||
|:-----------|:-----------------|:------------|:-------------|
|
||||
| `AgentLoop.OPENAI` | • `computer_use_preview` | Use OpenAI Operator CUA model | Not Required |
|
||||
| `AgentLoop.ANTHROPIC` | • `claude-3-5-sonnet-20240620`<br/>• `claude-3-7-sonnet-20250219` | Use Anthropic Computer-Use | Not Required |
|
||||
| `AgentLoop.UITARS` | • `mlx-community/UI-TARS-1.5-7B-4bit` (default)<br/>• `mlx-community/UI-TARS-1.5-7B-6bit`<br/>• `ByteDance-Seed/UI-TARS-1.5-7B` (via OpenAI-compatible endpoint) | Uses UI-TARS models with MLXVLM (default) or OAICOMPAT providers | Not Required |
|
||||
| `AgentLoop.OMNI` | • `claude-3-5-sonnet-20240620`<br/>• `claude-3-7-sonnet-20250219`<br/>• `gpt-4.5-preview`<br/>• `gpt-4o`<br/>• `gpt-4`<br/>• `phi4`<br/>• `phi4-mini`<br/>• `gemma3`<br/>• `...`<br/>• `Any Ollama or OpenAI-compatible model` | Use OmniParser for element pixel-detection (SoM) and any VLMs for UI Grounding and Reasoning | OmniParser |
|
||||
| Agent Loop | Supported Models | Description | Set-Of-Marks |
|
||||
| :-------------------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------------------------------------------------------------------------------------------- | :----------- |
|
||||
| `AgentLoop.OPENAI` | • `computer_use_preview` | Use OpenAI Operator CUA model | Not Required |
|
||||
| `AgentLoop.ANTHROPIC` | • `claude-3-5-sonnet-20240620`<br/>• `claude-3-7-sonnet-20250219` | Use Anthropic Computer-Use | Not Required |
|
||||
| `AgentLoop.UITARS` | • `mlx-community/UI-TARS-1.5-7B-4bit` (default)<br/>• `mlx-community/UI-TARS-1.5-7B-6bit`<br/>• `ByteDance-Seed/UI-TARS-1.5-7B` (via OpenAI-compatible endpoint) | Uses UI-TARS models with MLXVLM (default) or OAICOMPAT providers | Not Required |
|
||||
| `AgentLoop.OMNI` | • `claude-3-5-sonnet-20240620`<br/>• `claude-3-7-sonnet-20250219`<br/>• `gpt-4.5-preview`<br/>• `gpt-4o`<br/>• `gpt-4`<br/>• `phi4`<br/>• `phi4-mini`<br/>• `gemma3`<br/>• `...`<br/>• `Any Ollama or OpenAI-compatible model` | Use OmniParser for element pixel-detection (SoM) and any VLMs for UI Grounding and Reasoning | OmniParser |
|
||||
|
||||
## AgentResponse
|
||||
|
||||
The `AgentResponse` class represents the structured output returned after each agent turn. It contains the agent's response, reasoning, tool usage, and other metadata. The response format aligns with the new [OpenAI Agent SDK specification](https://platform.openai.com/docs/api-reference/responses) for better consistency across different agent loops.
|
||||
|
||||
```python
|
||||
@@ -213,7 +247,7 @@ async for result in agent.run(task):
|
||||
|
||||
**Note on Settings Persistence:**
|
||||
|
||||
* The Gradio UI automatically saves your configuration (Agent Loop, Model Choice, Custom Base URL, Save Trajectory state, Recent Images count) to a file named `.gradio_settings.json` in the project's root directory when you successfully run a task.
|
||||
* This allows your preferences to persist between sessions.
|
||||
* API keys entered into the custom provider field are **not** saved in this file for security reasons. Manage API keys using environment variables (e.g., `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`) or a `.env` file.
|
||||
* It's recommended to add `.gradio_settings.json` to your `.gitignore` file.
|
||||
- The Gradio UI automatically saves your configuration (Agent Loop, Model Choice, Custom Base URL, Save Trajectory state, Recent Images count) to a file named `.gradio_settings.json` in the project's root directory when you successfully run a task.
|
||||
- This allows your preferences to persist between sessions.
|
||||
- API keys entered into the custom provider field are **not** saved in this file for security reasons. Manage API keys using environment variables (e.g., `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`) or a `.env` file.
|
||||
- It's recommended to add `.gradio_settings.json` to your `.gitignore` file.
|
||||
|
||||
@@ -2,11 +2,40 @@
|
||||
title: Computer Server
|
||||
---
|
||||
|
||||
<div align="center" style={{display: 'flex', gap: '10px', margin: '0 auto', width: '100%', justifyContent: 'center'}}>
|
||||
<a href="#"><img src="https://img.shields.io/badge/Python-333333?logo=python&logoColor=white&labelColor=333333" alt="Python" /></a>
|
||||
<a href="#"><img src="https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0" alt="macOS" /></a>
|
||||
<a href="https://discord.com/invite/mVnXXpdE85"><img src="https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white" alt="Discord" /></a>
|
||||
<a href="https://pypi.org/project/cua-computer-server/"><img src="https://img.shields.io/pypi/v/cua-computer-server?color=333333" alt="PyPI" /></a>
|
||||
<div
|
||||
align="center"
|
||||
style={{
|
||||
display: 'flex',
|
||||
gap: '10px',
|
||||
margin: '0 auto',
|
||||
width: '100%',
|
||||
justifyContent: 'center',
|
||||
}}
|
||||
>
|
||||
<a href="#">
|
||||
<img
|
||||
src="https://img.shields.io/badge/Python-333333?logo=python&logoColor=white&labelColor=333333"
|
||||
alt="Python"
|
||||
/>
|
||||
</a>
|
||||
<a href="#">
|
||||
<img
|
||||
src="https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0"
|
||||
alt="macOS"
|
||||
/>
|
||||
</a>
|
||||
<a href="https://discord.com/invite/mVnXXpdE85">
|
||||
<img
|
||||
src="https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white"
|
||||
alt="Discord"
|
||||
/>
|
||||
</a>
|
||||
<a href="https://pypi.org/project/cua-computer-server/">
|
||||
<img
|
||||
src="https://img.shields.io/pypi/v/cua-computer-server?color=333333"
|
||||
alt="PyPI"
|
||||
/>
|
||||
</a>
|
||||
</div>
|
||||
|
||||
**Computer Server** is the server component of the Computer-Use Interface (CUI) framework that powers Cua. It interacts with local macOS and Linux sandboxes, is PyAutoGUI-compatible, and plugs into any AI agent system (Cua, Langchain, CrewAI, AutoGen).
|
||||
@@ -29,4 +58,4 @@ pip install cua-computer-server
|
||||
|
||||
Refer to this notebook for a step-by-step guide on how to use the Computer-Use Server on the host system or VM:
|
||||
|
||||
- [Computer-Use Server](https://github.com/trycua/cua/tree/main/notebooks/samples/computer_server_nb.ipynb)
|
||||
- [Computer-Use Server](https://github.com/trycua/cua/tree/main/notebooks/samples/computer_server_nb.ipynb)
|
||||
|
||||
@@ -2,11 +2,40 @@
|
||||
title: Computer
|
||||
---
|
||||
|
||||
<div align="center" style={{display: 'flex', gap: '10px', margin: '0 auto', width: '100%', justifyContent: 'center'}}>
|
||||
<a href="#"><img src="https://img.shields.io/badge/Python-333333?logo=python&logoColor=white&labelColor=333333" alt="Python" /></a>
|
||||
<a href="#"><img src="https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0" alt="macOS" /></a>
|
||||
<a href="https://discord.com/invite/mVnXXpdE85"><img src="https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white" alt="Discord" /></a>
|
||||
<a href="https://pypi.org/project/cua-computer/"><img src="https://img.shields.io/pypi/v/cua-computer?color=333333" alt="PyPI" /></a>
|
||||
<div
|
||||
align="center"
|
||||
style={{
|
||||
display: 'flex',
|
||||
gap: '10px',
|
||||
margin: '0 auto',
|
||||
width: '100%',
|
||||
justifyContent: 'center',
|
||||
}}
|
||||
>
|
||||
<a href="#">
|
||||
<img
|
||||
src="https://img.shields.io/badge/Python-333333?logo=python&logoColor=white&labelColor=333333"
|
||||
alt="Python"
|
||||
/>
|
||||
</a>
|
||||
<a href="#">
|
||||
<img
|
||||
src="https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0"
|
||||
alt="macOS"
|
||||
/>
|
||||
</a>
|
||||
<a href="https://discord.com/invite/mVnXXpdE85">
|
||||
<img
|
||||
src="https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white"
|
||||
alt="Discord"
|
||||
/>
|
||||
</a>
|
||||
<a href="https://pypi.org/project/cua-computer/">
|
||||
<img
|
||||
src="https://img.shields.io/pypi/v/cua-computer?color=333333"
|
||||
alt="PyPI"
|
||||
/>
|
||||
</a>
|
||||
</div>
|
||||
|
||||
**cua-computer** is a Computer-Use Interface (CUI) framework that powers Cua. It interacts with local macOS and Linux sandboxes, is PyAutoGUI-compatible, and plugs into any AI agent system (Cua, Langchain, CrewAI, AutoGen). Computer relies on [Lume](https://github.com/trycua/lume) for creating and managing sandbox environments.
|
||||
@@ -14,7 +43,7 @@ title: Computer
|
||||
### Get started with Computer
|
||||
|
||||
<div align="center">
|
||||
<img src="./computer.png"/>
|
||||
<img src="/img/computer.png" />
|
||||
</div>
|
||||
|
||||
```python
|
||||
@@ -23,11 +52,11 @@ from computer import Computer
|
||||
computer = Computer(os_type="macos", display="1024x768", memory="8GB", cpu="4")
|
||||
try:
|
||||
await computer.run()
|
||||
|
||||
|
||||
screenshot = await computer.interface.screenshot()
|
||||
with open("screenshot.png", "wb") as f:
|
||||
f.write(screenshot)
|
||||
|
||||
|
||||
await computer.interface.move_cursor(100, 100)
|
||||
await computer.interface.left_click()
|
||||
await computer.interface.right_click(300, 300)
|
||||
@@ -100,8 +129,12 @@ For examples, see [Computer UI Examples](https://github.com/trycua/cua/tree/main
|
||||
#### 3. Record Your Tasks
|
||||
|
||||
<details open>
|
||||
<summary>View demonstration video</summary>
|
||||
<video src="https://github.com/user-attachments/assets/de3c3477-62fe-413c-998d-4063e48de176" controls width="600"></video>
|
||||
<summary>View demonstration video</summary>
|
||||
<video
|
||||
src="https://github.com/user-attachments/assets/de3c3477-62fe-413c-998d-4063e48de176"
|
||||
controls
|
||||
width="600"
|
||||
></video>
|
||||
</details>
|
||||
|
||||
Record yourself performing various computer tasks using the UI.
|
||||
@@ -109,8 +142,12 @@ Record yourself performing various computer tasks using the UI.
|
||||
#### 4. Save Your Demonstrations
|
||||
|
||||
<details open>
|
||||
<summary>View demonstration video</summary>
|
||||
<video src="https://github.com/user-attachments/assets/5ad1df37-026a-457f-8b49-922ae805faef" controls width="600"></video>
|
||||
<summary>View demonstration video</summary>
|
||||
<video
|
||||
src="https://github.com/user-attachments/assets/5ad1df37-026a-457f-8b49-922ae805faef"
|
||||
controls
|
||||
width="600"
|
||||
></video>
|
||||
</details>
|
||||
|
||||
Save each task by picking a descriptive name and adding relevant tags (e.g., "office", "web-browsing", "coding").
|
||||
@@ -122,11 +159,16 @@ Repeat steps 3 and 4 until you have a good amount of demonstrations covering dif
|
||||
#### 6. Upload to Huggingface
|
||||
|
||||
<details open>
|
||||
<summary>View demonstration video</summary>
|
||||
<video src="https://github.com/user-attachments/assets/c586d460-3877-4b5f-a736-3248886d2134" controls width="600"></video>
|
||||
<summary>View demonstration video</summary>
|
||||
<video
|
||||
src="https://github.com/user-attachments/assets/c586d460-3877-4b5f-a736-3248886d2134"
|
||||
controls
|
||||
width="600"
|
||||
></video>
|
||||
</details>
|
||||
|
||||
Upload your dataset to Huggingface by:
|
||||
|
||||
- Naming it as `{your_username}/{dataset_name}`
|
||||
- Choosing public or private visibility
|
||||
- Optionally selecting specific tags to upload only tasks with certain tags
|
||||
@@ -135,4 +177,3 @@ Upload your dataset to Huggingface by:
|
||||
|
||||
- Example Dataset: [ddupont/test-dataset](https://huggingface.co/datasets/ddupont/test-dataset)
|
||||
- Find Community Datasets: 🔍 [Browse CUA Datasets on Huggingface](https://huggingface.co/datasets?other=cua)
|
||||
|
||||
|
||||
51
docs/content/docs/libraries/core/index.mdx
Normal file
@@ -0,0 +1,51 @@
|
||||
---
|
||||
title: c/ua Core
|
||||
---
|
||||
|
||||
<div
|
||||
align="center"
|
||||
style={{
|
||||
display: 'flex',
|
||||
gap: '10px',
|
||||
margin: '0 auto',
|
||||
width: '100%',
|
||||
justifyContent: 'center',
|
||||
}}
|
||||
>
|
||||
<a href="#">
|
||||
<img
|
||||
src="https://img.shields.io/badge/Python-333333?logo=python&logoColor=white&labelColor=333333"
|
||||
alt="Python"
|
||||
/>
|
||||
</a>
|
||||
<a href="#">
|
||||
<img
|
||||
src="https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0"
|
||||
alt="macOS"
|
||||
/>
|
||||
</a>
|
||||
<a href="https://discord.com/invite/mVnXXpdE85">
|
||||
<img
|
||||
src="https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white"
|
||||
alt="Discord"
|
||||
/>
|
||||
</a>
|
||||
<a href="https://pypi.org/project/cua-core/">
<img
src="https://img.shields.io/pypi/v/cua-core?color=333333"
|
||||
alt="PyPI"
|
||||
/>
|
||||
</a>
|
||||
</div>
|
||||
|
||||
**Cua Core** provides essential shared functionality and utilities used across the Cua ecosystem:
|
||||
|
||||
- Privacy-focused telemetry system for transparent usage analytics
|
||||
- Common helper functions and utilities used by other Cua packages
|
||||
- Core infrastructure components shared between modules
|
||||
|
||||
## Installation
|
||||
|
||||
```bash
|
||||
pip install cua-core
|
||||
```
|
||||
@@ -5,4 +5,4 @@ description: Libraries
|
||||
|
||||
## Libraries
|
||||
|
||||
The CUA project provides several libraries for building Computer-Use AI agents.
|
||||
The CUA project provides several libraries for building Computer-Use AI agents.
|
||||
|
||||
@@ -23,6 +23,7 @@ curl --connect-timeout 6000 \
|
||||
}' \
|
||||
http://localhost:7777/lume/vms
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details open>
|
||||
@@ -53,6 +54,7 @@ curl --connect-timeout 6000 \
|
||||
}' \
|
||||
http://localhost:7777/lume/vms/lume_vm/run
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details open>
|
||||
@@ -63,6 +65,7 @@ curl --connect-timeout 6000 \
|
||||
--max-time 5000 \
|
||||
http://localhost:7777/lume/vms
|
||||
```
|
||||
|
||||
```
|
||||
[
|
||||
{
|
||||
@@ -83,6 +86,7 @@ curl --connect-timeout 6000 \
|
||||
}
|
||||
]
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details open>
|
||||
@@ -99,6 +103,7 @@ curl --connect-timeout 6000 \
|
||||
--max-time 5000 \
|
||||
http://localhost:7777/lume/vms/lume_vm?storage=ssd
|
||||
```
|
||||
|
||||
```
|
||||
{
|
||||
"name": "lume_vm",
|
||||
@@ -109,6 +114,7 @@ curl --connect-timeout 6000 \
|
||||
"diskSize": "64GB"
|
||||
}
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details open>
|
||||
@@ -127,6 +133,7 @@ curl --connect-timeout 6000 \
|
||||
}' \
|
||||
http://localhost:7777/lume/vms/my-vm-name
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details open>
|
||||
@@ -145,6 +152,7 @@ curl --connect-timeout 6000 \
|
||||
-X POST \
|
||||
http://localhost:7777/lume/vms/my-vm-name/stop?storage=ssd
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details open>
|
||||
@@ -163,6 +171,7 @@ curl --connect-timeout 6000 \
|
||||
-X DELETE \
|
||||
http://localhost:7777/lume/vms/my-vm-name?storage=ssd
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details open>
|
||||
@@ -194,6 +203,7 @@ curl --connect-timeout 6000 \
|
||||
}' \
|
||||
http://localhost:7777/lume/pull
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details open>
|
||||
@@ -206,15 +216,15 @@ curl --connect-timeout 6000 \
|
||||
-X POST \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{
|
||||
"name": "my-local-vm",
|
||||
"name": "my-local-vm",
|
||||
"imageName": "my-image",
|
||||
"tags": ["latest", "v1"],
|
||||
"organization": "my-org",
|
||||
"organization": "my-org",
|
||||
"registry": "ghcr.io",
|
||||
"chunkSizeMb": 512,
|
||||
"storage": null
|
||||
"storage": null
|
||||
}' \
|
||||
http://localhost:7777/lume/vms/push
|
||||
http://localhost:7777/lume/vms/push
|
||||
```
|
||||
|
||||
**Response (202 Accepted):**
|
||||
@@ -224,12 +234,10 @@ curl --connect-timeout 6000 \
|
||||
"message": "Push initiated in background",
|
||||
"name": "my-local-vm",
|
||||
"imageName": "my-image",
|
||||
"tags": [
|
||||
"latest",
|
||||
"v1"
|
||||
]
|
||||
"tags": ["latest", "v1"]
|
||||
}
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details open>
|
||||
@@ -248,6 +256,7 @@ curl --connect-timeout 6000 \
|
||||
}' \
|
||||
http://localhost:7777/lume/vms/clone
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details open>
|
||||
@@ -258,6 +267,7 @@ curl --connect-timeout 6000 \
|
||||
--max-time 5000 \
|
||||
http://localhost:7777/lume/ipsw
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details open>
|
||||
@@ -272,12 +282,10 @@ curl --connect-timeout 6000 \
|
||||
|
||||
```json
|
||||
{
|
||||
"local": [
|
||||
"macos-sequoia-xcode:latest",
|
||||
"macos-sequoia-vanilla:latest"
|
||||
]
|
||||
"local": ["macos-sequoia-xcode:latest", "macos-sequoia-vanilla:latest"]
|
||||
}
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details open>
|
||||
@@ -289,6 +297,7 @@ curl --connect-timeout 6000 \
|
||||
-X POST \
|
||||
http://localhost:7777/lume/prune
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details open>
|
||||
@@ -307,6 +316,7 @@ curl --connect-timeout 6000 \
|
||||
"cachingEnabled": true
|
||||
}
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details open>
|
||||
@@ -324,6 +334,7 @@ curl --connect-timeout 6000 \
|
||||
}' \
|
||||
http://localhost:7777/lume/config
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details open>
|
||||
@@ -349,6 +360,7 @@ curl --connect-timeout 6000 \
|
||||
}
|
||||
]
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details open>
|
||||
@@ -365,6 +377,7 @@ curl --connect-timeout 6000 \
|
||||
}' \
|
||||
http://localhost:7777/lume/config/locations
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details open>
|
||||
@@ -376,6 +389,7 @@ curl --connect-timeout 6000 \
|
||||
-X DELETE \
|
||||
http://localhost:7777/lume/config/locations/ssd
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details open>
|
||||
@@ -387,4 +401,5 @@ curl --connect-timeout 6000 \
|
||||
-X POST \
|
||||
http://localhost:7777/lume/config/locations/default/ssd
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
@@ -1,6 +1,7 @@
|
||||
---
|
||||
title: Development Guide
|
||||
---
|
||||
|
||||
# Development Guide
|
||||
|
||||
This guide will help you set up your development environment and understand the process for contributing code to lume.
|
||||
@@ -8,6 +9,7 @@ This guide will help you set up your development environment and understand the
|
||||
## Environment Setup
|
||||
|
||||
Lume development requires:
|
||||
|
||||
- Swift 6 or higher
|
||||
- Xcode 15 or higher
|
||||
- macOS Sequoia 15.2 or higher
|
||||
@@ -16,7 +18,7 @@ Lume development requires:
|
||||
## Setting Up the Repository Locally
|
||||
|
||||
1. **Fork the Repository**: Create your own fork of lume
|
||||
2. **Clone the Repository**:
|
||||
2. **Clone the Repository**:
|
||||
```bash
|
||||
git clone https://github.com/trycua/lume.git
|
||||
cd lume
|
||||
|
||||
@@ -1,6 +1,7 @@
|
||||
---
|
||||
title: FAQs
|
||||
---
|
||||
|
||||
# FAQs
|
||||
|
||||
### Where are the VMs stored?
|
||||
@@ -18,10 +19,12 @@ Lume follows the XDG Base Directory specification for the configuration file:
|
||||
- Configuration is stored in `$XDG_CONFIG_HOME/lume/config.yaml` (defaults to `~/.config/lume/config.yaml`)
|
||||
|
||||
By default, other data is stored in:
|
||||
|
||||
- VM data: `~/.lume`
|
||||
- Cache files: `~/.lume/cache`
|
||||
|
||||
The config file contains settings for:
|
||||
|
||||
- VM storage locations and the default location
|
||||
- Cache directory location
|
||||
- Whether caching is enabled
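
The lookup rule for the config file described above can be sketched in a few lines of Python. This only mirrors the stated rule (`$XDG_CONFIG_HOME/lume/config.yaml`, falling back to `~/.config/lume/config.yaml`); it is not Lume's own implementation, and `lume_config_path` is a hypothetical name.

```python
import os
from pathlib import Path


def lume_config_path(env=None) -> Path:
    """Resolve the Lume config file location from the environment."""
    env = os.environ if env is None else env
    xdg = env.get("XDG_CONFIG_HOME")
    # Fall back to ~/.config when XDG_CONFIG_HOME is unset or empty.
    base = Path(xdg) if xdg else Path.home() / ".config"
    return base / "lume" / "config.yaml"
```

Passing a plain dictionary as `env` makes the function easy to check without touching the real environment, for example `lume_config_path({"XDG_CONFIG_HOME": "/tmp/xdg"})`.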
|
||||
@@ -89,6 +92,7 @@ lume delete <name>
|
||||
### How to Install macOS from an IPSW Image
|
||||
|
||||
#### Create a new macOS VM using the latest supported IPSW image:
|
||||
|
||||
Run the following command to create a new macOS virtual machine using the latest available IPSW image:
|
||||
|
||||
```bash
|
||||
@@ -96,6 +100,7 @@ lume create <name> --os macos --ipsw latest
|
||||
```
|
||||
|
||||
#### Create a new macOS VM using a specific IPSW image:
|
||||
|
||||
To create a macOS virtual machine from an older or specific IPSW file, first download the desired IPSW (UniversalMac) from a trusted source.
|
||||
|
||||
Then, use the downloaded IPSW path:
|
||||
|
||||
@@ -2,22 +2,44 @@
|
||||
title: Lume
|
||||
---
|
||||
|
||||
<div align="center" style={{display: 'flex', gap: '10px', margin: '0 auto', width: '100%', justifyContent: 'center'}}>
|
||||
<a href="#"><img src="https://img.shields.io/badge/Swift_6-F54A2A?logo=swift&logoColor=white&labelColor=F54A2A" alt="Swift 6" /></a>
|
||||
<a href="#"><img src="https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0" alt="macOS" /></a>
|
||||
<a href="https://discord.com/invite/mVnXXpdE85"><img src="https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white" alt="Discord" /></a>
|
||||
<div
|
||||
align="center"
|
||||
style={{
|
||||
display: 'flex',
|
||||
gap: '10px',
|
||||
margin: '0 auto',
|
||||
width: '100%',
|
||||
justifyContent: 'center',
|
||||
}}
|
||||
>
|
||||
<a href="#">
|
||||
<img
|
||||
src="https://img.shields.io/badge/Swift_6-F54A2A?logo=swift&logoColor=white&labelColor=F54A2A"
|
||||
alt="Swift 6"
|
||||
/>
|
||||
</a>
|
||||
<a href="#">
|
||||
<img
|
||||
src="https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0"
|
||||
alt="macOS"
|
||||
/>
|
||||
</a>
|
||||
<a href="https://discord.com/invite/mVnXXpdE85">
|
||||
<img
|
||||
src="https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white"
|
||||
alt="Discord"
|
||||
/>
|
||||
</a>
|
||||
</div>
|
||||
|
||||
|
||||
**lume** is a lightweight Command Line Interface and local API server to create, run and manage macOS and Linux virtual machines (VMs) with near-native performance on Apple Silicon, using Apple's `Virtualization.Framework`.
|
||||
|
||||
### Run prebuilt macOS images in just 1 step
|
||||
|
||||
<div align="center">
|
||||
<img src="./cli.png" alt="lume cli"/>
|
||||
<img src="/img/cli.png" alt="lume cli" />
|
||||
</div>
|
||||
|
||||
|
||||
```bash
|
||||
lume run macos-sequoia-vanilla:latest
|
||||
```
|
||||
@@ -30,6 +52,7 @@ If you're working on Lume in the context of the CUA monorepo, we recommend using
|
||||
# Open VS Code workspace from the root of the monorepo
|
||||
code .vscode/lume.code-workspace
|
||||
```
|
||||
|
||||
This workspace is preconfigured with Swift language support, build tasks, and debug configurations.
|
||||
|
||||
## Usage
|
||||
@@ -153,7 +176,7 @@ You can also download the `lume.pkg.tar.gz` archive from the [latest release](ht
|
||||
|
||||
## Prebuilt Images
|
||||
|
||||
Pre-built images are available in the registry [ghcr.io/trycua](https://github.com/orgs/trycua/packages).
|
||||
Pre-built images are available in the registry [ghcr.io/trycua](https://github.com/orgs/trycua/packages).
|
||||
|
||||
**Important Note (v0.2.0+):** Images are being re-uploaded with sparse file system optimizations enabled, resulting in significantly lower actual disk usage. Older images (without the `-sparse` suffix) are now **deprecated**. The last version of `lume` fully supporting the non-sparse images was `v0.1.x`. Starting from `v0.2.0`, lume will automatically pull images optimized with sparse file system support.
|
||||
|
||||
@@ -161,17 +184,17 @@ These images come with an SSH server pre-configured and auto-login enabled.
|
||||
|
||||
For the security of your VM, change the default password `lume` immediately after your first login.
|
||||
|
||||
| Image | Tag | Description | Logical Size |
| ----------------------- | ------------------- | ----------------------------------------------------------------------------------------------- | ------------ |
| `macos-sequoia-vanilla` | `latest`, `15.2` | macOS Sequoia 15.2 image | 20GB |
| `macos-sequoia-xcode` | `latest`, `15.2` | macOS Sequoia 15.2 image with Xcode command line tools | 22GB |
| `macos-sequoia-cua` | `latest`, `15.3` | macOS Sequoia 15.3 image compatible with the Computer interface | 24GB |
| `ubuntu-noble-vanilla` | `latest`, `24.04.1` | [Ubuntu Server for ARM 24.04.1 LTS](https://ubuntu.com/download/server/arm) with Ubuntu Desktop | 20GB |
|
||||
|
||||
For additional disk space, resize the VM disk after pulling the image using the `lume set <name> --disk-size <size>` command. Note that the actual disk space used by sparse images will be much lower than the logical size listed.
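As a rough illustration of scripting the resize step, the sketch below builds the argument list for `lume set <name> --disk-size <size>`. The helper name `lume_resize_command`, the VM name `my-vm`, and the `50GB` size string are hypothetical; the size format is assumed to match the style of the table above.

```python
import shlex


def lume_resize_command(vm_name: str, disk_size: str) -> list[str]:
    """Build the argv for `lume set <name> --disk-size <size>`.

    Only the command shape comes from the docs; the validation is illustrative.
    """
    if not vm_name or not disk_size:
        raise ValueError("vm_name and disk_size must be non-empty")
    return ["lume", "set", vm_name, "--disk-size", disk_size]


# Hypothetical VM name and size, for illustration only.
cmd = lume_resize_command("my-vm", "50GB")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a host where `lume` is installed.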
|
||||
|
||||
## Local API Server
|
||||
|
||||
|
||||
`lume` exposes a local HTTP API server that listens on `http://localhost:7777/lume`, enabling automated management of VMs.
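A minimal Python sketch of talking to this server, using only the standard library. The base URL comes from the text above; the helper names and the `vms` path in the usage note are assumptions made for illustration and should be checked against the API reference.

```python
import json
import urllib.request

# Base URL as documented above.
BASE_URL = "http://localhost:7777/lume"


def api_url(path: str) -> str:
    """Join a relative path onto the lume API base URL."""
    return f"{BASE_URL}/{path.lstrip('/')}"


def get_json(path: str, timeout: float = 5.0):
    """GET a path on the local API and decode the JSON response body."""
    with urllib.request.urlopen(api_url(path), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, a call such as `get_json("vms")` would fetch that path, assuming the endpoint exists.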
|
||||
|
||||
```bash
|
||||
|
||||
@@ -2,19 +2,48 @@
|
||||
title: Lumier
|
||||
---
|
||||
|
||||
<div align="center" style={{display: 'flex', gap: '10px', margin: '0 auto', width: '100%', justifyContent: 'center'}}>
|
||||
<a href="#"><img src="https://img.shields.io/badge/Swift_6-F54A2A?logo=swift&logoColor=white&labelColor=F54A2A" alt="Swift 6" /></a>
|
||||
<a href="#"><img src="https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0" alt="macOS" /></a>
|
||||
<a href="https://discord.com/invite/mVnXXpdE85"><img src="https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white" alt="Discord" /></a>
|
||||
<div
|
||||
align="center"
|
||||
style={{
|
||||
display: 'flex',
|
||||
gap: '10px',
|
||||
margin: '0 auto',
|
||||
width: '100%',
|
||||
justifyContent: 'center',
|
||||
}}
|
||||
>
|
||||
<a href="#">
|
||||
<img
|
||||
src="https://img.shields.io/badge/Swift_6-F54A2A?logo=swift&logoColor=white&labelColor=F54A2A"
|
||||
alt="Swift 6"
|
||||
/>
|
||||
</a>
|
||||
<a href="#">
|
||||
<img
|
||||
src="https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0"
|
||||
alt="macOS"
|
||||
/>
|
||||
</a>
|
||||
<a href="https://discord.com/invite/mVnXXpdE85">
|
||||
<img
|
||||
src="https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white"
|
||||
alt="Discord"
|
||||
/>
|
||||
</a>
|
||||
</div>
|
||||
|
||||
macOS and Linux virtual machines in a Docker container.
|
||||
|
||||
<div align="center">
|
||||
<video src="https://github.com/user-attachments/assets/2ecca01c-cb6f-4c35-a5a7-69bc58bd94e2" width="800" controls></video>
|
||||
<video
|
||||
src="https://github.com/user-attachments/assets/2ecca01c-cb6f-4c35-a5a7-69bc58bd94e2"
|
||||
width="800"
|
||||
controls
|
||||
></video>
|
||||
</div>
|
||||
|
||||
## What is Lumier?
|
||||
|
||||
**Lumier** is an interface for running macOS virtual machines with minimal setup. It uses Docker as a packaging system to deliver a pre-configured environment that connects to the `lume` virtualization service running on your host machine. With Lumier, you get:
|
||||
|
||||
- A ready-to-use macOS or Linux virtual machine in minutes
|
||||
@@ -29,6 +58,7 @@ Before using Lumier, make sure you have:
|
||||
1. **Docker for Apple Silicon** - download it [here](https://desktop.docker.com/mac/main/arm64/Docker.dmg) and follow the installation instructions.
|
||||
|
||||
2. **Lume** - This is the virtualization CLI that powers Lumier. Install it with this command:
|
||||
|
||||
```bash
|
||||
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh)"
|
||||
```
|
||||
@@ -160,10 +190,10 @@ services:
|
||||
container_name: lumier-vm
|
||||
restart: unless-stopped
|
||||
ports:
|
||||
- "8006:8006" # Port for VNC access
|
||||
- '8006:8006' # Port for VNC access
|
||||
volumes:
|
||||
- ./storage:/storage # VM persistent storage
|
||||
- ./shared:/shared # Shared folder accessible in the VM
|
||||
- ./storage:/storage # VM persistent storage
|
||||
- ./shared:/shared # Shared folder accessible in the VM
|
||||
environment:
|
||||
- VM_NAME=lumier-vm
|
||||
- VERSION=ghcr.io/trycua/macos-sequoia-cua:latest
|
||||
@@ -239,6 +269,7 @@ When running Lumier, you'll need to configure a few things:
|
||||
- **Port forwarding** (`-p 8006:8006`): Makes the VM's VNC interface accessible in your browser. If port 8006 is already in use, you can use a different port like `-p 8007:8006`.
|
||||
|
||||
- **Environment variables** (`-e`): Configure your VM settings:
|
||||
|
||||
- `VM_NAME`: A name for your virtual machine
|
||||
- `VERSION`: The macOS image to use
|
||||
- `CPU_CORES`: Number of CPU cores to allocate
|
||||
@@ -253,6 +284,7 @@ When running Lumier, you'll need to configure a few things:
|
||||
This project was inspired by [dockur/windows](https://github.com/dockur/windows) and [dockur/macos](https://github.com/dockur/macos), which pioneered the approach of running Windows and macOS VMs in Docker containers.
|
||||
|
||||
Main differences with dockur/macos:
|
||||
|
||||
- Lumier is specifically designed for macOS virtualization
|
||||
- Lumier supports Apple Silicon (M1/M2/M3/M4) while dockur/macos only supports Intel
|
||||
- Lumier uses the Apple Virtualization Framework (Vz) through the `lume` CLI to create true virtual machines, while dockur relies on KVM.
|
||||
|
||||
@@ -2,14 +2,44 @@
|
||||
title: MCP Server
|
||||
---
|
||||
|
||||
<div align="center" style={{display: 'flex', gap: '10px', margin: '0 auto', width: '100%', justifyContent: 'center'}}>
|
||||
<a href="#"><img src="https://img.shields.io/badge/Swift_6-F54A2A?logo=swift&logoColor=white&labelColor=F54A2A" alt="Swift 6" /></a>
|
||||
<a href="#"><img src="https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0" alt="macOS" /></a>
|
||||
<a href="https://discord.com/invite/mVnXXpdE85"><img src="https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white" alt="Discord" /></a>
|
||||
<a href="https://pypi.org/project/cua-computer/"><img src="https://img.shields.io/pypi/v/cua-computer?color=333333" alt="Python" /></a>
|
||||
<div
|
||||
align="center"
|
||||
style={{
|
||||
display: 'flex',
|
||||
gap: '10px',
|
||||
margin: '0 auto',
|
||||
width: '100%',
|
||||
justifyContent: 'center',
|
||||
}}
|
||||
>
|
||||
<a href="#">
|
||||
<img
|
||||
src="https://img.shields.io/badge/Swift_6-F54A2A?logo=swift&logoColor=white&labelColor=F54A2A"
|
||||
alt="Swift 6"
|
||||
/>
|
||||
</a>
|
||||
<a href="#">
|
||||
<img
|
||||
src="https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0"
|
||||
alt="macOS"
|
||||
/>
|
||||
</a>
|
||||
<a href="https://discord.com/invite/mVnXXpdE85">
|
||||
<img
|
||||
src="https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white"
|
||||
alt="Discord"
|
||||
/>
|
||||
</a>
|
||||
<a href="https://pypi.org/project/cua-computer/">
|
||||
<img
|
||||
src="https://img.shields.io/pypi/v/cua-computer?color=333333"
|
||||
alt="Python"
|
||||
/>
|
||||
</a>
|
||||
</div>
|
||||
|
||||
**cua-mcp-server** is an MCP server for the Computer-Use Agent (CUA), allowing you to run CUA through Claude Desktop or other MCP clients.
|
||||
|
||||
### Get started with Agent
|
||||
|
||||
## Prerequisites
|
||||
@@ -32,8 +62,9 @@ pip install cua-mcp-server
|
||||
```
|
||||
|
||||
This will install:
|
||||
|
||||
- The MCP server
|
||||
- CUA agent and computer dependencies
|
||||
- CUA agent and computer dependencies
|
||||
- An executable `cua-mcp-server` script in your PATH
|
||||
|
||||
## Easy Setup Script
|
||||
@@ -45,6 +76,7 @@ curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/mcp-server/scr
|
||||
```
|
||||
|
||||
This script will:
|
||||
|
||||
- Create the ~/.cua directory if it doesn't exist
|
||||
- Generate a startup script at ~/.cua/start_mcp_server.sh
|
||||
- Make the script executable
|
||||
@@ -53,7 +85,7 @@ This script will:
|
||||
You can then use the script in your MCP configuration like this:
|
||||
|
||||
```json
|
||||
{
|
||||
{
|
||||
"mcpServers": {
|
||||
"cua-agent": {
|
||||
"command": "/bin/bash",
|
||||
@@ -92,6 +124,7 @@ If you want to develop with the cua-mcp-server directly without installation, yo
|
||||
```
|
||||
|
||||
This configuration:
|
||||
|
||||
- Uses the start_mcp_server.sh script which automatically sets up the Python path and runs the server module
|
||||
- Works with Claude Desktop, Cursor, or any other MCP client
|
||||
- Automatically uses your development code without requiring installation
|
||||
@@ -103,6 +136,7 @@ Just add this to your MCP client's configuration and it will use your local deve
|
||||
If you get a `/bin/bash: ~/cua/libs/mcp-server/scripts/start_mcp_server.sh: No such file or directory` error, try changing the path to the script to be absolute instead of relative.
|
||||
|
||||
To see the logs:
|
||||
|
||||
```bash
|
||||
tail -n 20 -f ~/Library/Logs/Claude/mcp*.log
|
||||
```
|
||||
@@ -127,20 +161,21 @@ For more information on MCP with Cursor, see the [official Cursor MCP documentat
|
||||
### First-time Usage Notes
|
||||
|
||||
**API Keys**: Ensure you have valid API keys:
|
||||
- Add your Anthropic API key, or other model provider API key in the Claude Desktop config (as shown above)
|
||||
- Or set it as an environment variable in your shell profile
|
||||
|
||||
- Add your Anthropic API key, or other model provider API key in the Claude Desktop config (as shown above)
|
||||
- Or set it as an environment variable in your shell profile
|
||||
|
||||
## Configuration
|
||||
|
||||
The server is configured using environment variables (can be set in the Claude Desktop config):
|
||||
|
||||
| Variable | Description | Default |
| ----------------------- | ----------------------------------------------------- | ----------------------- |
| `CUA_AGENT_LOOP` | Agent loop to use (OPENAI, ANTHROPIC, UITARS, OMNI) | OMNI |
| `CUA_MODEL_PROVIDER` | Model provider (ANTHROPIC, OPENAI, OLLAMA, OAICOMPAT) | ANTHROPIC |
| `CUA_MODEL_NAME` | Model name to use | None (provider default) |
| `CUA_PROVIDER_BASE_URL` | Base URL for provider API | None |
| `CUA_MAX_IMAGES` | Maximum number of images to keep in context | 3 |
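To make the defaults concrete, here is a small sketch of how a script might resolve these variables. It is not the server's own loading code; `DEFAULTS` and `load_config` are hypothetical names, and the integer conversion of `CUA_MAX_IMAGES` is an assumption.

```python
import os

# Defaults copied from the table above; None means "no default".
DEFAULTS = {
    "CUA_AGENT_LOOP": "OMNI",
    "CUA_MODEL_PROVIDER": "ANTHROPIC",
    "CUA_MODEL_NAME": None,
    "CUA_PROVIDER_BASE_URL": None,
    "CUA_MAX_IMAGES": "3",
}


def load_config(env=None) -> dict:
    """Resolve each variable from the environment, falling back to the default."""
    env = os.environ if env is None else env
    config = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    # Assumed to be an integer count of images.
    config["CUA_MAX_IMAGES"] = int(config["CUA_MAX_IMAGES"])
    return config


print(load_config({"CUA_AGENT_LOOP": "ANTHROPIC"}))
```

Passing a plain dict instead of `os.environ` keeps the function easy to test.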
|
||||
|
||||
## Available Tools
|
||||
|
||||
@@ -158,4 +193,4 @@ Once configured, you can simply ask Claude to perform computer tasks:
|
||||
- "Find all PDFs in my Downloads folder"
|
||||
- "Take a screenshot and highlight the error message"
|
||||
|
||||
Claude will automatically use your CUA agent to perform these tasks.
|
||||
Claude will automatically use your CUA agent to perform these tasks.
|
||||
|
||||
@@ -2,21 +2,45 @@
|
||||
title: PyLume
|
||||
---
|
||||
|
||||
<div align="center" style={{display: 'flex', gap: '10px', margin: '0 auto', width: '100%', justifyContent: 'center'}}>
|
||||
<a href="#"><img src="https://img.shields.io/badge/Python-333333?logo=python&logoColor=white&labelColor=333333" alt="Python" /></a>
|
||||
<a href="#"><img src="https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0" alt="macOS" /></a>
|
||||
<a href="https://discord.com/invite/mVnXXpdE85"><img src="https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white" alt="Discord" /></a>
|
||||
<a href="https://pypi.org/project/pylume/"><img src="https://img.shields.io/pypi/v/pylume?color=333333" alt="PyPI" /></a>
|
||||
<div
|
||||
align="center"
|
||||
style={{
|
||||
display: 'flex',
|
||||
gap: '10px',
|
||||
margin: '0 auto',
|
||||
width: '100%',
|
||||
justifyContent: 'center',
|
||||
}}
|
||||
>
|
||||
<a href="#">
|
||||
<img
|
||||
src="https://img.shields.io/badge/Python-333333?logo=python&logoColor=white&labelColor=333333"
|
||||
alt="Python"
|
||||
/>
|
||||
</a>
|
||||
<a href="#">
|
||||
<img
|
||||
src="https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0"
|
||||
alt="macOS"
|
||||
/>
|
||||
</a>
|
||||
<a href="https://discord.com/invite/mVnXXpdE85">
|
||||
<img
|
||||
src="https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white"
|
||||
alt="Discord"
|
||||
/>
|
||||
</a>
|
||||
<a href="https://pypi.org/project/pylume/">
|
||||
<img src="https://img.shields.io/pypi/v/pylume?color=333333" alt="PyPI" />
|
||||
</a>
|
||||
</div>
|
||||
|
||||
|
||||
**pylume** is a lightweight Python library based on [lume](https://github.com/trycua/lume) to create, run and manage macOS and Linux virtual machines (VMs) natively on Apple Silicon.
|
||||
|
||||
<div align="center">
|
||||
<img src="img/py.png" alt="lume-py"/>
|
||||
<img src="/img/py.png" alt="lume-py" />
|
||||
</div>
|
||||
|
||||
|
||||
```bash
|
||||
pip install pylume
|
||||
```
|
||||
@@ -27,7 +51,7 @@ Please refer to this [Notebook](https://github.com/trycua/cua/blob/main/notebook
|
||||
|
||||
## Prebuilt Images
|
||||
|
||||
Pre-built images are available on [ghcr.io/trycua](https://github.com/orgs/trycua/packages).
|
||||
Pre-built images are available on [ghcr.io/trycua](https://github.com/orgs/trycua/packages).
|
||||
These images come pre-configured with an SSH server and auto-login enabled.
|
||||
|
||||
## Contributing
|
||||
|
||||
@@ -2,11 +2,40 @@
|
||||
title: Set-of-Mark
|
||||
---
|
||||
|
||||
<div align="center" style={{display: 'flex', gap: '10px', margin: '0 auto', width: '100%', justifyContent: 'center'}}>
|
||||
<a href="#"><img src="https://img.shields.io/badge/Python-333333?logo=python&logoColor=white&labelColor=333333" alt="Python" /></a>
|
||||
<a href="#"><img src="https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0" alt="macOS" /></a>
|
||||
<a href="https://discord.com/invite/mVnXXpdE85"><img src="https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white" alt="Discord" /></a>
|
||||
<a href="https://pypi.org/project/cua-computer/"><img src="https://img.shields.io/pypi/v/cua-computer?color=333333" alt="PyPI" /></a>
|
||||
<div
|
||||
align="center"
|
||||
style={{
|
||||
display: 'flex',
|
||||
gap: '10px',
|
||||
margin: '0 auto',
|
||||
width: '100%',
|
||||
justifyContent: 'center',
|
||||
}}
|
||||
>
|
||||
<a href="#">
|
||||
<img
|
||||
src="https://img.shields.io/badge/Python-333333?logo=python&logoColor=white&labelColor=333333"
|
||||
alt="Python"
|
||||
/>
|
||||
</a>
|
||||
<a href="#">
|
||||
<img
|
||||
src="https://img.shields.io/badge/macOS-000000?logo=apple&logoColor=F0F0F0"
|
||||
alt="macOS"
|
||||
/>
|
||||
</a>
|
||||
<a href="https://discord.com/invite/mVnXXpdE85">
|
||||
<img
|
||||
src="https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white"
|
||||
alt="Discord"
|
||||
/>
|
||||
</a>
|
||||
<a href="https://pypi.org/project/cua-computer/">
|
||||
<img
|
||||
src="https://img.shields.io/pypi/v/cua-computer?color=333333"
|
||||
alt="PyPI"
|
||||
/>
|
||||
</a>
|
||||
</div>
|
||||
|
||||
**Som** (Set-of-Mark) is a visual grounding component for the Computer-Use Agent (CUA) framework powering Cua. It detects and analyzes UI elements in screenshots. Optimized for Apple Silicon with Metal Performance Shaders (MPS), it combines YOLO-based icon detection with EasyOCR text recognition to provide comprehensive UI element analysis.
|
||||
@@ -27,7 +56,6 @@ title: Set-of-Mark
|
||||
- Uses Metal Performance Shaders (MPS)
|
||||
- Multi-scale detection enabled
|
||||
- ~0.4s average detection time
|
||||
|
||||
- **Supported**: Any Python 3.11+ environment
|
||||
- Falls back to CPU if no GPU available
|
||||
- Single-scale detection on CPU
|
||||
@@ -74,7 +102,9 @@ for elem in result.elements:
|
||||
### Detection Parameters
|
||||
|
||||
#### Box Threshold (0.3)
|
||||
|
||||
Controls the confidence threshold for accepting detections:
|
||||
|
||||
```
|
||||
High Threshold (0.3): Low Threshold (0.01):
|
||||
+----------------+ +----------------+
|
||||
@@ -86,12 +116,15 @@ High Threshold (0.3): Low Threshold (0.01):
|
||||
+----------------+ +----------------+
|
||||
conf = 0.85 conf = 0.02
|
||||
```
|
||||
|
||||
- Higher values (0.3) yield more precise but fewer detections
|
||||
- Lower values (0.01) catch more potential icons but increase false positives
|
||||
- Default is 0.3 for optimal precision/recall balance
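The effect of the box threshold can be sketched in a few lines of plain Python. This is an illustration of confidence filtering in general, not Som's internal code; the `Detection` class, the function name, and the inclusive `>=` comparison are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str
    confidence: float


def filter_by_confidence(detections, box_threshold=0.3):
    """Keep only detections whose confidence meets the threshold."""
    return [d for d in detections if d.confidence >= box_threshold]


# Confidence values taken from the diagram above.
candidates = [Detection("icon", 0.85), Detection("noise", 0.02)]

print([d.label for d in filter_by_confidence(candidates)])        # default 0.3
print([d.label for d in filter_by_confidence(candidates, 0.01)])  # low threshold
```

At the default threshold only the 0.85 detection survives; at 0.01 the 0.02 detection is accepted as well, which is the false-positive trade-off described above.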
|
||||
|
||||
#### IOU Threshold (0.1)
|
||||
|
||||
Controls how overlapping detections are merged:
|
||||
|
||||
```
|
||||
IOU = Intersection Area / Union Area
|
||||
|
||||
@@ -106,6 +139,7 @@ Low Overlap (Keep Both): High Overlap (Merge):
|
||||
+----------+
|
||||
IOU ≈ 0.05 (Keep Both) IOU ≈ 0.7 (Merge)
|
||||
```
|
||||
|
||||
- Lower values (0.1) more aggressively remove overlapping boxes
|
||||
- Higher values (0.5) allow more overlapping detections
|
||||
- Default is 0.1 to handle densely packed UI elements
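The IOU formula and the role of the threshold can be sketched as follows. Note the swap: this uses greedy suppression (keep the highest-confidence box, drop overlapping ones) as a stand-in for whatever merging Som performs internally. The `(x1, y1, x2, y2, confidence)` box layout and both function names are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def suppress_overlaps(boxes, iou_threshold=0.1):
    """Greedy suppression over (x1, y1, x2, y2, confidence) boxes.

    Boxes are visited from highest to lowest confidence; a box is kept only
    if its IOU with every already-kept box is at or below the threshold.
    """
    kept = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box[:4], k[:4]) <= iou_threshold for k in kept):
            kept.append(box)
    return kept


boxes = [(0, 0, 10, 10, 0.9), (5, 0, 15, 10, 0.8), (20, 20, 30, 30, 0.7)]
print(len(suppress_overlaps(boxes, 0.1)))  # stricter: overlapping box removed
print(len(suppress_overlaps(boxes, 0.5)))  # looser: overlapping box tolerated
```

The first two boxes overlap with an IOU of 1/3, so the lower-confidence one is dropped at a threshold of 0.1 but kept at 0.5.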
|
||||
@@ -113,6 +147,7 @@ IOU ≈ 0.05 (Keep Both) IOU ≈ 0.7 (Merge)
|
||||
### OCR Configuration
|
||||
|
||||
- **Engine**: EasyOCR
|
||||
|
||||
- Primary choice for all platforms
|
||||
- Fast initialization and processing
|
||||
- Built-in English language support
|
||||
@@ -129,6 +164,7 @@ IOU ≈ 0.05 (Keep Both) IOU ≈ 0.7 (Merge)
|
||||
### Hardware Acceleration
|
||||
|
||||
#### MPS (Metal Performance Shaders)
|
||||
|
||||
- Multi-scale detection (640px, 1280px, 1920px)
|
||||
- Test-time augmentation enabled
|
||||
- Half-precision (FP16)
|
||||
@@ -136,6 +172,7 @@ IOU ≈ 0.05 (Keep Both) IOU ≈ 0.7 (Merge)
|
||||
- Best for production use when available
|
||||
|
||||
#### CPU
|
||||
|
||||
- Single-scale detection (1280px)
|
||||
- Full-precision (FP32)
|
||||
- Average detection time: ~1.3s
|
||||
@@ -160,11 +197,13 @@ examples/output/
|
||||
## Development
|
||||
|
||||
### Test Data
|
||||
|
||||
- Place test screenshots in `examples/test_data/`
|
||||
- Not tracked in git to keep repository size manageable
|
||||
- Default test image: `test_screen.png` (1920x1080)
|
||||
|
||||
### Running Tests
|
||||
|
||||
```bash
|
||||
# Run benchmark with no OCR
|
||||
python examples/omniparser_examples.py examples/test_data/test_screen.png --runs 5 --ocr none
|
||||
|
||||
|