Merge pull request #493 from YeIIcw/docs/mcp-server-locally

Add Local Desktop Mode for MCP Server with updated docs
Adam
2025-11-17 14:56:10 +00:00
committed by GitHub
11 changed files with 914 additions and 15 deletions

@@ -6,6 +6,67 @@ title: Client Integrations
To use with Claude Desktop, add an entry to your Claude Desktop configuration (`claude_desktop_config.json`, typically found in `~/Library/Application Support/Claude/` on macOS or `%APPDATA%\Claude\` on Windows):
### Package Installation Method
```json
{
"mcpServers": {
"cua-agent": {
"command": "/bin/bash",
"args": ["~/.cua/start_mcp_server.sh"],
"env": {
"CUA_MODEL_NAME": "anthropic/claude-sonnet-4-20250514",
"ANTHROPIC_API_KEY": "your-anthropic-api-key-here",
"CUA_MAX_IMAGES": "3",
"CUA_USE_HOST_COMPUTER_SERVER": "false"
}
}
}
}
```
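The entry above can also be merged into an existing configuration programmatically. The following is a minimal Python sketch, not part of CUA: `add_cua_server` is a hypothetical helper, and it expands `~` to an absolute path on the assumption that the MCP client passes `args` to the command without shell expansion.

```python
import json
import os


def add_cua_server(config: dict, api_key: str,
                   script: str = "~/.cua/start_mcp_server.sh") -> dict:
    """Return a copy of `config` with a `cua-agent` entry under `mcpServers`."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["cua-agent"] = {
        "command": "/bin/bash",
        # Assumption: the client does not run args through a shell,
        # so "~" is expanded here to an absolute path.
        "args": [os.path.expanduser(script)],
        "env": {
            "CUA_MODEL_NAME": "anthropic/claude-sonnet-4-20250514",
            "ANTHROPIC_API_KEY": api_key,
            "CUA_MAX_IMAGES": "3",
            "CUA_USE_HOST_COMPUTER_SERVER": "false",
        },
    }
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    existing = {"mcpServers": {"other-server": {"command": "node"}}}
    updated = add_cua_server(existing, "your-anthropic-api-key-here")
    print(json.dumps(updated, indent=2))
```

The helper leaves any other servers in the configuration untouched and does not modify the input dictionary.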
### Development Method
If you're working with the CUA source code:
**Standard VM Mode:**
```json
{
"mcpServers": {
"cua-agent": {
"command": "/usr/bin/env",
"args": [
"bash", "-lc",
"export CUA_MODEL_NAME='anthropic/claude-sonnet-4-20250514'; export ANTHROPIC_API_KEY='your-anthropic-api-key-here'; /path/to/cua/libs/python/mcp-server/scripts/start_mcp_server.sh"
]
}
}
}
```
**Host Computer Control Mode:**
```json
{
"mcpServers": {
"cua-agent": {
"command": "/usr/bin/env",
"args": [
"bash", "-lc",
"export CUA_MODEL_NAME='anthropic/claude-sonnet-4-20250514'; export ANTHROPIC_API_KEY='your-anthropic-api-key-here'; export CUA_USE_HOST_COMPUTER_SERVER='true'; export CUA_MAX_IMAGES='1'; /path/to/cua/libs/python/mcp-server/scripts/start_mcp_server.sh"
]
}
}
}
```
**Note**: Replace `/path/to/cua` with the absolute path to your CUA repository directory.
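Because the development configs pack several `export` statements into one `bash -lc` string, quoting mistakes are easy to make. The sketch below shows one way to assemble that `args` list in Python; `build_dev_args` is a hypothetical helper, not something shipped with CUA, and it only assumes the script location shown above.

```python
import shlex


def build_dev_args(repo_path: str, env: dict) -> list:
    """Build the `args` list passed to `/usr/bin/env` in the development configs."""
    # Quote every value so spaces or special characters survive the shell.
    parts = [f"export {name}={shlex.quote(value)}" for name, value in env.items()]
    script = f"{repo_path}/libs/python/mcp-server/scripts/start_mcp_server.sh"
    parts.append(shlex.quote(script))
    return ["bash", "-lc", "; ".join(parts)]


if __name__ == "__main__":
    args = build_dev_args(
        "/path/to/cua",
        {
            "CUA_MODEL_NAME": "anthropic/claude-sonnet-4-20250514",
            "ANTHROPIC_API_KEY": "your-anthropic-api-key-here",
            "CUA_USE_HOST_COMPUTER_SERVER": "true",
            "CUA_MAX_IMAGES": "1",
        },
    )
    print(args)
```

`shlex.quote` only adds quotes where the shell needs them, so simple values such as `true` are emitted unquoted.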
**⚠️ Host Computer Control Setup**: When using `CUA_USE_HOST_COMPUTER_SERVER='true'`, you must also:
1. Install computer server dependencies: `python3 -m pip install uvicorn fastapi`
2. Install the computer server (from the CUA repository root): `python3 -m pip install -e libs/python/computer-server --break-system-packages`
3. Start the computer server: `python -m computer_server --log-level debug`
In this mode the AI has direct access to your desktop, so use it with caution.
For more information on MCP with Claude Desktop, see the [official MCP User Guide](https://modelcontextprotocol.io/quickstart/user).
## Cursor Integration
@@ -15,6 +76,43 @@ To use with Cursor, add an MCP configuration file in one of these locations:
- **Project-specific**: Create `.cursor/mcp.json` in your project directory
- **Global**: Create `~/.cursor/mcp.json` in your home directory
Example configuration for Cursor:
```json
{
"mcpServers": {
"cua-agent": {
"command": "/bin/bash",
"args": ["~/.cua/start_mcp_server.sh"],
"env": {
"CUA_MODEL_NAME": "anthropic/claude-sonnet-4-20250514",
"ANTHROPIC_API_KEY": "your-anthropic-api-key-here"
}
}
}
}
```
After configuration, you can simply tell Cursor's Agent to perform computer tasks by explicitly mentioning the CUA agent, such as "Use the computer control tools to open Safari."
For more information on MCP with Cursor, see the [official Cursor MCP documentation](https://docs.cursor.com/context/model-context-protocol).
## Other MCP Clients
The MCP server is compatible with any MCP-compliant client. The server exposes the following tools:
- `run_cua_task` - Execute single computer tasks
- `run_multi_cua_tasks` - Execute multiple tasks (sequential or concurrent)
- `screenshot_cua` - Capture screenshots
- `get_session_stats` - Monitor session statistics
- `cleanup_session` - Manage session lifecycle
### Configuration Options
All MCP clients can configure the server using environment variables:
- `CUA_MODEL_NAME` - Model to use for task execution
- `CUA_MAX_IMAGES` - Maximum images to keep in context
- `CUA_USE_HOST_COMPUTER_SERVER` - Use host system instead of VM
See the [Configuration](/docs/libraries/mcp-server/configuration) page for detailed configuration options.

@@ -4,7 +4,66 @@ title: Configuration
The server is configured using environment variables (can be set in the Claude Desktop config):
| Variable | Description | Default |
|----------|-------------|---------|
| `CUA_MODEL_NAME` | Model string (e.g., "anthropic/claude-sonnet-4-20250514", "anthropic/claude-3-5-sonnet-20240620", "openai/computer-use-preview", "huggingface-local/ByteDance-Seed/UI-TARS-1.5-7B", "omniparser+litellm/gpt-4o", "omniparser+ollama_chat/gemma3") | anthropic/claude-sonnet-4-20250514 |
| `ANTHROPIC_API_KEY` | Your Anthropic API key (required for Anthropic models) | None |
| `CUA_MAX_IMAGES` | Maximum number of images to keep in context | 3 |
| `CUA_USE_HOST_COMPUTER_SERVER` | Target your local desktop instead of a VM. Set to "true" to use your host system. **Warning:** AI models may perform risky actions. | false |
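To make the defaults in the table concrete, here is a minimal Python sketch of reading these variables. `ServerSettings` and `load_settings` are hypothetical names, and treating only the string "true" (case-insensitive) as enabled is an assumption; the actual server may parse the values differently.

```python
import os
from dataclasses import dataclass


@dataclass
class ServerSettings:
    model_name: str
    max_images: int
    use_host_computer_server: bool


def load_settings(env=None) -> ServerSettings:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    host_flag = env.get("CUA_USE_HOST_COMPUTER_SERVER", "false")
    return ServerSettings(
        model_name=env.get("CUA_MODEL_NAME", "anthropic/claude-sonnet-4-20250514"),
        max_images=int(env.get("CUA_MAX_IMAGES", "3")),
        # Assumption: only "true" (any letter case) enables host control.
        use_host_computer_server=host_flag.strip().lower() == "true",
    )


if __name__ == "__main__":
    print(load_settings({}))
    print(load_settings({"CUA_USE_HOST_COMPUTER_SERVER": "true", "CUA_MAX_IMAGES": "1"}))
```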
## Model Configuration
The `CUA_MODEL_NAME` environment variable supports various model providers through LiteLLM integration:
### Supported Providers
- **Anthropic**: `anthropic/claude-sonnet-4-20250514`, `anthropic/claude-3-5-sonnet-20240620`, `anthropic/claude-3-haiku-20240307`
- **OpenAI**: `openai/computer-use-preview`, `openai/gpt-4o`
- **Local Models**: `huggingface-local/ByteDance-Seed/UI-TARS-1.5-7B`
- **Omni + LiteLLM**: `omniparser+litellm/gpt-4o`, `omniparser+litellm/claude-3-haiku`
- **Ollama**: `omniparser+ollama_chat/gemma3`
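The model strings above share a `provider/model` shape, where the model part may itself contain slashes. The following Python sketch only illustrates that naming convention; `split_model_string` is a hypothetical helper and does not reflect how CUA or LiteLLM resolve models internally.

```python
def split_model_string(model: str) -> tuple:
    """Split a model string into (provider, model name) at the first slash."""
    provider, sep, name = model.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider/model', got {model!r}")
    return provider, name


if __name__ == "__main__":
    for model in (
        "anthropic/claude-sonnet-4-20250514",
        "huggingface-local/ByteDance-Seed/UI-TARS-1.5-7B",
        "omniparser+ollama_chat/gemma3",
    ):
        print(split_model_string(model))
```

A check like this can catch a missing provider prefix before the value is written to `CUA_MODEL_NAME`.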
### Example Configurations
**Claude Desktop Configuration:**
```json
{
"mcpServers": {
"cua-agent": {
"command": "/bin/bash",
"args": ["~/.cua/start_mcp_server.sh"],
"env": {
"CUA_MODEL_NAME": "anthropic/claude-sonnet-4-20250514",
"ANTHROPIC_API_KEY": "your-anthropic-api-key-here",
"CUA_MAX_IMAGES": "5",
"CUA_USE_HOST_COMPUTER_SERVER": "false"
}
}
}
}
```
**Local Model Configuration:**
```json
{
"mcpServers": {
"cua-agent": {
"command": "/bin/bash",
"args": ["~/.cua/start_mcp_server.sh"],
"env": {
"CUA_MODEL_NAME": "huggingface-local/ByteDance-Seed/UI-TARS-1.5-7B",
"CUA_MAX_IMAGES": "3"
}
}
}
}
```
## Session Management Configuration
The MCP server automatically manages sessions with the following defaults:
- **Max Concurrent Sessions**: 10
- **Session Timeout**: 10 minutes of inactivity
- **Computer Pool Size**: 5 instances
- **Automatic Cleanup**: Enabled
These settings are optimized for typical usage and don't require configuration for most users.
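As an illustration of what these defaults mean in practice, the sketch below models the session limit and inactivity timeout in Python. `SessionRegistry` is a hypothetical class written for this example, not the server's implementation; only the two numbers (10 sessions, 10 minutes) come from the list above.

```python
import time

MAX_SESSIONS = 10            # documented default
SESSION_TIMEOUT_S = 10 * 60  # documented default: 10 minutes of inactivity


class SessionRegistry:
    """Tracks the last activity time of each session (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._last_seen = {}

    def touch(self, session_id: str) -> None:
        """Record activity, refusing new sessions once the limit is reached."""
        is_new = session_id not in self._last_seen
        if is_new and len(self._last_seen) >= MAX_SESSIONS:
            raise RuntimeError("maximum number of concurrent sessions reached")
        self._last_seen[session_id] = self._clock()

    def active(self) -> list:
        return list(self._last_seen)

    def cleanup_idle(self) -> list:
        """Remove and return the sessions idle for at least the timeout."""
        now = self._clock()
        expired = [sid for sid, seen in self._last_seen.items()
                   if now - seen >= SESSION_TIMEOUT_S]
        for sid in expired:
            del self._last_seen[sid]
        return expired
```

The clock is injectable so that the timeout logic can be exercised without waiting ten minutes.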

@@ -7,3 +7,21 @@ github:
---
**cua-mcp-server** is an MCP server for the Computer-Use Agent (CUA), allowing you to run CUA through Claude Desktop or other MCP clients.
## Features
- **Multi-Client Support**: Concurrent sessions with automatic resource management
- **Progress Reporting**: Real-time progress updates during task execution
- **Error Handling**: Robust error recovery with screenshot capture
- **Concurrent Execution**: Run multiple tasks in parallel for improved performance
- **Session Management**: Automatic cleanup and resource pooling
- **LiteLLM Integration**: Support for multiple model providers
- **VM Safety**: Default VM execution with optional host system control
## Quick Start
1. **Install**: `pip install cua-mcp-server`
2. **Configure**: Add to your MCP client configuration
3. **Use**: Ask Claude to perform computer tasks
See the [Installation](/docs/libraries/mcp-server/installation) guide for detailed setup instructions.

@@ -38,19 +38,98 @@ You can then use the script in your MCP configuration like this:
"command": "/bin/bash",
"args": ["~/.cua/start_mcp_server.sh"],
"env": {
"CUA_MODEL_NAME": "anthropic/claude-sonnet-4-20250514",
"ANTHROPIC_API_KEY": "your-anthropic-api-key-here"
}
}
}
}
```
**Important**: You must include your Anthropic API key for the MCP server to work properly.
## Development Setup
If you're working with the CUA source code directly (like in the CUA repository), you can use the development script instead:
```json
{
"mcpServers": {
"cua-agent": {
"command": "/usr/bin/env",
"args": [
"bash", "-lc",
"export CUA_MODEL_NAME='anthropic/claude-sonnet-4-20250514'; export ANTHROPIC_API_KEY='your-anthropic-api-key-here'; /path/to/cua/libs/python/mcp-server/scripts/start_mcp_server.sh"
]
}
}
}
```
**For host computer control** (development setup):
1. **Install Computer Server Dependencies**:
```bash
python3 -m pip install uvicorn fastapi
python3 -m pip install -e libs/python/computer-server --break-system-packages
```
2. **Start the Computer Server**:
```bash
cd /path/to/cua
python -m computer_server --log-level debug
```
This will start the computer server on `http://localhost:8000` that controls your actual desktop.
3. **Configure Claude Desktop**:
```json
{
"mcpServers": {
"cua-agent": {
"command": "/usr/bin/env",
"args": [
"bash", "-lc",
"export CUA_MODEL_NAME='anthropic/claude-sonnet-4-20250514'; export ANTHROPIC_API_KEY='your-anthropic-api-key-here'; export CUA_USE_HOST_COMPUTER_SERVER='true'; export CUA_MAX_IMAGES='1'; /path/to/cua/libs/python/mcp-server/scripts/start_mcp_server.sh"
]
}
}
}
```
**Note**: Replace `/path/to/cua` with the absolute path to your CUA repository directory.
**⚠️ Important**: When using host computer control (`CUA_USE_HOST_COMPUTER_SERVER='true'`), the AI will have direct access to your desktop and can perform actions like opening applications, clicking, typing, and taking screenshots. Make sure you're comfortable with this level of access.
### Troubleshooting
**Common Issues:**
1. **"Claude's response was interrupted"** - This usually means:
- Missing API key: Add `ANTHROPIC_API_KEY` to your environment variables
- Invalid model name: Use a valid model like `anthropic/claude-sonnet-4-20250514`
- Check logs for specific error messages
2. **"Missing Anthropic API Key"** - Add your API key to the configuration:
```json
"env": {
"ANTHROPIC_API_KEY": "your-api-key-here"
}
```
3. **"model not found"** - Use a valid model name:
- ✅ `anthropic/claude-sonnet-4-20250514`
- ✅ `anthropic/claude-3-5-sonnet-20240620`
- ❌ `anthropic/claude-3-5-sonnet-20241022` (doesn't exist)
4. **Script not found** - If you get a `/bin/bash: ~/cua/libs/python/mcp-server/scripts/start_mcp_server.sh: No such file or directory` error, try changing the path to the script to be absolute instead of relative.
5. **Host Computer Control Issues** - If using `CUA_USE_HOST_COMPUTER_SERVER='true'`:
- **Computer Server not running**: Make sure you've started the computer server with `python -m computer_server --log-level debug`
- **Port 8000 in use**: Check if another process is using port 8000 with `lsof -i :8000`
- **Missing dependencies**: Install `uvicorn` and `fastapi` with `python3 -m pip install uvicorn fastapi`
- **Image size errors**: Use `CUA_MAX_IMAGES='1'` to reduce image context size
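The "Computer Server not running" and "Port 8000 in use" checks can also be scripted. The snippet below is a small Python sketch using only the standard library; `is_port_open` is a hypothetical helper, and it only tells you whether something accepts TCP connections on the port, not whether that process is the computer server.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if is_port_open("localhost", 8000):
        print("Something is listening on port 8000")
    else:
        print("Nothing is listening on port 8000; start the computer server first")
```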
**Viewing Logs:**
```bash
tail -n 20 -f ~/Library/Logs/Claude/mcp*.log
```

@@ -6,5 +6,58 @@ title: Tools
The MCP server exposes the following tools to Claude:
### Core Task Execution Tools
1. **`run_cua_task`** - Run a single Computer-Use Agent task with the given instruction
- `task` (string): The task description for the agent to execute
- `session_id` (string, optional): Session ID for multi-client support. If not provided, a new session will be created
- Returns: Tuple of (combined text output, final screenshot)
2. **`run_multi_cua_tasks`** - Run multiple tasks in sequence or concurrently
- `tasks` (list of strings): List of task descriptions to execute
- `session_id` (string, optional): Session ID for multi-client support. If not provided, a new session will be created
- `concurrent` (boolean, optional): If true, run tasks concurrently. If false, run sequentially (default)
- Returns: List of tuples (combined text output, screenshot) for each task
### Utility Tools
3. **`screenshot_cua`** - Take a screenshot of the current screen
- `session_id` (string, optional): Session ID for multi-client support. If not provided, a new session will be created
- Returns: Screenshot image
4. **`get_session_stats`** - Get statistics about active sessions and resource usage
- Returns: Dictionary with session statistics including total sessions, active tasks, and session details
5. **`cleanup_session`** - Cleanup a specific session and release its resources
- `session_id` (string): The session ID to cleanup
- Returns: Confirmation message
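For clients that speak MCP directly, a tool invocation is a JSON-RPC 2.0 request with the method `tools/call`. The Python sketch below builds such a request for `run_multi_cua_tasks` using the parameter names listed above; `tool_call_request` is a hypothetical helper, and the transport (stdio framing, initialization handshake) is deliberately left out.

```python
import json


def tool_call_request(request_id: int, name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    request = tool_call_request(
        1,
        "run_multi_cua_tasks",
        {
            "tasks": ["Open Finder", "Navigate to Documents"],
            "concurrent": False,  # sequential is the documented default
        },
    )
    print(json.dumps(request, indent=2))
```

In practice an MCP client library would construct and send this message for you; the sketch only shows how the documented parameters map onto the request.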
## Session Management
The MCP server supports multi-client sessions with automatic resource management:
- **Session Isolation**: Each client can have its own session with isolated computer instances
- **Resource Pooling**: Computer instances are pooled for efficient resource usage
- **Automatic Cleanup**: Idle sessions are automatically cleaned up after 10 minutes
- **Concurrent Tasks**: Multiple tasks can run concurrently within the same session
- **Progress Reporting**: Real-time progress updates during task execution
## Usage Examples
### Basic Task Execution
```
"Open Chrome and navigate to github.com"
"Create a folder called 'Projects' on my desktop"
```
### Multi-Task Execution
```
"Run these tasks: 1) Open Finder, 2) Navigate to Documents, 3) Create a new folder called 'Work'"
```
### Session Management
```
"Take a screenshot of the current screen"
"Show me the session statistics"
"Cleanup session abc123"
```

@@ -2,7 +2,7 @@
title: Usage
---
## Basic Usage
Once configured, you can simply ask Claude to perform computer tasks:
@@ -13,9 +13,140 @@ Once configured, you can simply ask Claude to perform computer tasks:
Claude will automatically use your CUA agent to perform these tasks.
## Advanced Features
### Progress Reporting
The MCP server provides real-time progress updates during task execution:
- Task progress is reported as percentages (0-100%)
- Multi-task operations show progress for each individual task
- Progress updates are streamed to the MCP client for real-time feedback
### Error Handling
Robust error handling ensures reliable operation:
- Failed tasks return error messages with screenshots when possible
- Session state is preserved even when individual tasks fail
- Automatic cleanup prevents resource leaks
- Detailed error logging for troubleshooting
### Concurrent Task Execution
For improved performance, multiple tasks can run concurrently:
- Set `concurrent=true` in `run_multi_cua_tasks` for parallel execution
- Each task runs in its own context with isolated state
- Progress tracking works for both sequential and concurrent modes
- Resource pooling ensures efficient computer instance usage
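The difference between the two modes can be sketched with `asyncio`. This is an illustrative Python example under stated assumptions: `run_task` is a hypothetical stand-in for one agent task, and the real `run_multi_cua_tasks` implementation may schedule work differently.

```python
import asyncio


async def run_task(task: str) -> str:
    """Hypothetical stand-in for executing one agent task."""
    await asyncio.sleep(0.01)  # simulate work
    return f"done: {task}"


async def run_tasks(tasks: list, concurrent: bool = False) -> list:
    """Run tasks one after another, or all at once when `concurrent` is true."""
    if concurrent:
        # gather() returns results in the order the tasks were passed in.
        return list(await asyncio.gather(*(run_task(t) for t in tasks)))
    results = []
    for task in tasks:
        results.append(await run_task(task))
    return results


if __name__ == "__main__":
    tasks = ["Open Chrome", "Open Safari", "Open Finder"]
    print(asyncio.run(run_tasks(tasks, concurrent=True)))
```

In both modes the results come back in input order, so a caller can match each result to its task by position.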
### Session Management
Multi-client support with automatic resource management:
- Each client gets isolated sessions with separate computer instances
- Sessions automatically clean up after 10 minutes of inactivity
- Resource pooling prevents resource exhaustion
- Session statistics available for monitoring
## Target Computer Options
By default, the MCP server runs CUA in a virtual machine for safety. However, you can also configure it to run on your local system.
### Default: Using a VM (Recommended)
The MCP server will automatically start and connect to a VM based on your platform. This is the safest option as AI actions are isolated from your host system.
No additional configuration is needed - this is the default behavior.
### Option: Targeting Your Local Desktop
<Callout type="warn">
**Warning:** When targeting your local system, AI models have direct access to your desktop and may perform risky actions. Use with caution.
</Callout>
To have the MCP server control your local desktop instead of a VM:
1. **Start the Computer Server on your host:**
```bash
pip install cua-computer-server
python -m computer_server
```
2. **Configure the MCP server to use your host system:**
Add the `CUA_USE_HOST_COMPUTER_SERVER` environment variable to your MCP client configuration:
<Tabs items={['Claude Desktop', 'Other MCP Clients']}>
<Tab value="Claude Desktop">
Update your Claude Desktop config (see [Installation](/docs/libraries/mcp-server/installation)) to include the environment variable:
```json
{
"mcpServers": {
"cua-agent": {
"command": "/bin/bash",
"args": ["~/.cua/start_mcp_server.sh"],
"env": {
"CUA_MODEL_NAME": "anthropic/claude-sonnet-4-20250514",
"CUA_USE_HOST_COMPUTER_SERVER": "true"
}
}
}
}
```
</Tab>
<Tab value="Other MCP Clients">
Set the environment variable in your MCP client configuration:
```bash
export CUA_USE_HOST_COMPUTER_SERVER=true
```
Then start your MCP client as usual.
</Tab>
</Tabs>
3. **Restart your MCP client** (e.g., Claude Desktop) to apply the changes.
Now Claude will control your local desktop directly when you ask it to perform computer tasks.
## Usage Examples
### Single Task Execution
```
"Open Safari and navigate to apple.com"
"Create a new folder on the desktop called 'My Projects'"
"Take a screenshot of the current screen"
```
### Multi-Task Execution (Sequential)
```
"Run these tasks in order: 1) Open Finder, 2) Navigate to Documents folder, 3) Create a new folder called 'Work'"
```
### Multi-Task Execution (Concurrent)
```
"Run these tasks simultaneously: 1) Open Chrome, 2) Open Safari, 3) Open Finder"
```
### Session Management
```
"Show me the current session statistics"
"Take a screenshot using session abc123"
"Cleanup session xyz789"
```
### Error Recovery
```
"Try to open a non-existent application and show me the error"
"Find all files with .tmp extension and delete them safely"
```
## First-time Usage Notes
**API Keys**: Ensure you have valid API keys:
- **Required**: The MCP server needs an API key to authenticate with the model provider
- Add your Anthropic API key (or another model provider's API key) in the Claude Desktop config (as shown above)
- Or set it as an environment variable in your shell profile
**Model Selection**: Choose the appropriate model for your needs:
- **Claude Sonnet 4**: The default model for this server (`anthropic/claude-sonnet-4-20250514`)
- **Claude 3.5 Sonnet**: Reliable performance (`anthropic/claude-3-5-sonnet-20240620`)
- **Computer-Use Preview**: Specialized for computer tasks (`openai/computer-use-preview`)
- **Local Models**: For privacy-sensitive environments
- **Ollama**: For offline usage