diff --git a/docs/content/docs/libraries/mcp-server/client-integrations.mdx b/docs/content/docs/libraries/mcp-server/client-integrations.mdx
index 8699cda0..6a79f5b3 100644
--- a/docs/content/docs/libraries/mcp-server/client-integrations.mdx
+++ b/docs/content/docs/libraries/mcp-server/client-integrations.mdx
@@ -6,6 +6,67 @@ title: Client Integrations
To use with Claude Desktop, add an entry to your Claude Desktop configuration (`claude_desktop_config.json`, typically found in `~/.config/claude-desktop/`):
+### Package Installation Method
+
+```json
+{
+ "mcpServers": {
+ "cua-agent": {
+ "command": "/bin/bash",
+ "args": ["~/.cua/start_mcp_server.sh"],
+ "env": {
+ "CUA_MODEL_NAME": "anthropic/claude-sonnet-4-20250514",
+ "ANTHROPIC_API_KEY": "your-anthropic-api-key-here",
+ "CUA_MAX_IMAGES": "3",
+ "CUA_USE_HOST_COMPUTER_SERVER": "false"
+ }
+ }
+ }
+}
+```
+
+### Development Method
+
+If you're working with the CUA source code:
+
+**Standard VM Mode:**
+```json
+{
+ "mcpServers": {
+ "cua-agent": {
+ "command": "/usr/bin/env",
+ "args": [
+ "bash", "-lc",
+ "export CUA_MODEL_NAME='anthropic/claude-sonnet-4-20250514'; export ANTHROPIC_API_KEY='your-anthropic-api-key-here'; /path/to/cua/libs/python/mcp-server/scripts/start_mcp_server.sh"
+ ]
+ }
+ }
+}
+```
+
+**Host Computer Control Mode:**
+```json
+{
+ "mcpServers": {
+ "cua-agent": {
+ "command": "/usr/bin/env",
+ "args": [
+ "bash", "-lc",
+ "export CUA_MODEL_NAME='anthropic/claude-sonnet-4-20250514'; export ANTHROPIC_API_KEY='your-anthropic-api-key-here'; export CUA_USE_HOST_COMPUTER_SERVER='true'; export CUA_MAX_IMAGES='1'; /path/to/cua/libs/python/mcp-server/scripts/start_mcp_server.sh"
+ ]
+ }
+ }
+}
+```
+
+**Note**: Replace `/path/to/cua` with the absolute path to your CUA repository directory.
+
+**⚠️ Host Computer Control Setup**: When using `CUA_USE_HOST_COMPUTER_SERVER='true'`, you must also:
+1. Install computer server dependencies: `python3 -m pip install uvicorn fastapi`
+2. Install the computer server from the repository root (the path below is relative to it): `python3 -m pip install -e libs/python/computer-server --break-system-packages`
+3. Start the computer server: `python -m computer_server --log-level debug`
+4. Keep in mind that the AI will have direct access to your desktop. Use with caution!
+
For more information on MCP with Claude Desktop, see the [official MCP User Guide](https://modelcontextprotocol.io/quickstart/user).
## Cursor Integration
@@ -15,6 +76,43 @@ To use with Cursor, add an MCP configuration file in one of these locations:
- **Project-specific**: Create `.cursor/mcp.json` in your project directory
- **Global**: Create `~/.cursor/mcp.json` in your home directory
+Example configuration for Cursor:
+
+```json
+{
+ "mcpServers": {
+ "cua-agent": {
+ "command": "/bin/bash",
+ "args": ["~/.cua/start_mcp_server.sh"],
+ "env": {
+ "CUA_MODEL_NAME": "anthropic/claude-sonnet-4-20250514",
+ "ANTHROPIC_API_KEY": "your-anthropic-api-key-here"
+ }
+ }
+ }
+}
+```
+
After configuration, you can simply tell Cursor's Agent to perform computer tasks by explicitly mentioning the CUA agent, such as "Use the computer control tools to open Safari."
-For more information on MCP with Cursor, see the [official Cursor MCP documentation](https://docs.cursor.com/context/model-context-protocol).
\ No newline at end of file
+For more information on MCP with Cursor, see the [official Cursor MCP documentation](https://docs.cursor.com/context/model-context-protocol).
+
+## Other MCP Clients
+
+The MCP server is compatible with any MCP-compliant client. The server exposes the following tools:
+
+- `run_cua_task` - Execute single computer tasks
+- `run_multi_cua_tasks` - Execute multiple tasks (sequential or concurrent)
+- `screenshot_cua` - Capture screenshots
+- `get_session_stats` - Monitor session statistics
+- `cleanup_session` - Manage session lifecycle
+
+### Configuration Options
+
+All MCP clients can configure the server using environment variables:
+
+- `CUA_MODEL_NAME` - Model to use for task execution
+- `CUA_MAX_IMAGES` - Maximum images to keep in context
+- `CUA_USE_HOST_COMPUTER_SERVER` - Use host system instead of VM
+
+See the [Configuration](/docs/libraries/mcp-server/configuration) page for detailed configuration options.
\ No newline at end of file
diff --git a/docs/content/docs/libraries/mcp-server/configuration.mdx b/docs/content/docs/libraries/mcp-server/configuration.mdx
index e5df8293..cce1957c 100644
--- a/docs/content/docs/libraries/mcp-server/configuration.mdx
+++ b/docs/content/docs/libraries/mcp-server/configuration.mdx
@@ -6,5 +6,64 @@ The server is configured using environment variables (can be set in the Claude D
| Variable | Description | Default |
|----------|-------------|---------|
-| `CUA_MODEL_NAME` | Model string (e.g., "anthropic/claude-3-5-sonnet-20241022", "openai/computer-use-preview", "huggingface-local/ByteDance-Seed/UI-TARS-1.5-7B", "omniparser+litellm/gpt-4o", "omniparser+ollama_chat/gemma3") | anthropic/claude-3-5-sonnet-20241022 |
+| `CUA_MODEL_NAME` | Model string (e.g., "anthropic/claude-sonnet-4-20250514", "anthropic/claude-3-5-sonnet-20240620", "openai/computer-use-preview", "huggingface-local/ByteDance-Seed/UI-TARS-1.5-7B", "omniparser+litellm/gpt-4o", "omniparser+ollama_chat/gemma3") | anthropic/claude-sonnet-4-20250514 |
+| `ANTHROPIC_API_KEY` | Your Anthropic API key (required for Anthropic models) | None |
| `CUA_MAX_IMAGES` | Maximum number of images to keep in context | 3 |
+| `CUA_USE_HOST_COMPUTER_SERVER` | Target your local desktop instead of a VM. Set to "true" to use your host system. **Warning:** AI models may perform risky actions. | false |
+
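As a rough illustration of how the `CUA_USE_HOST_COMPUTER_SERVER` value is interpreted, the sketch below mirrors the check added to `session_manager.py` in this change: the value is lowercased and compared against `true`, `1`, and `yes`. The helper name `host_computer_server_enabled` and the `environ` parameter are hypothetical and exist only for this example.

```python
import os
from typing import Mapping


def host_computer_server_enabled(environ: Mapping[str, str] = os.environ) -> bool:
    """Return True when CUA_USE_HOST_COMPUTER_SERVER holds a truthy spelling.

    Hypothetical helper: it only mirrors the comparison shown in the diff.
    """
    value = environ.get("CUA_USE_HOST_COMPUTER_SERVER", "false")
    # Case-insensitive match against the three accepted spellings.
    return value.lower() in ("true", "1", "yes")


print(host_computer_server_enabled({"CUA_USE_HOST_COMPUTER_SERVER": "TRUE"}))
print(host_computer_server_enabled({"CUA_USE_HOST_COMPUTER_SERVER": "on"}))
```

Under this check, any other spelling (for example `on`) leaves the server in the default VM mode.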
+## Model Configuration
+
+The `CUA_MODEL_NAME` environment variable supports various model providers through LiteLLM integration:
+
+### Supported Providers
+- **Anthropic**: `anthropic/claude-sonnet-4-20250514`, `anthropic/claude-3-5-sonnet-20240620`, `anthropic/claude-3-haiku-20240307`
+- **OpenAI**: `openai/computer-use-preview`, `openai/gpt-4o`
+- **Local Models**: `huggingface-local/ByteDance-Seed/UI-TARS-1.5-7B`
+- **Omni + LiteLLM**: `omniparser+litellm/gpt-4o`, `omniparser+litellm/claude-3-haiku`
+- **Ollama**: `omniparser+ollama_chat/gemma3`
+
+### Example Configurations
+
+**Claude Desktop Configuration:**
+```json
+{
+ "mcpServers": {
+ "cua-agent": {
+ "command": "/bin/bash",
+ "args": ["~/.cua/start_mcp_server.sh"],
+ "env": {
+ "CUA_MODEL_NAME": "anthropic/claude-sonnet-4-20250514",
+ "ANTHROPIC_API_KEY": "your-anthropic-api-key-here",
+ "CUA_MAX_IMAGES": "5",
+ "CUA_USE_HOST_COMPUTER_SERVER": "false"
+ }
+ }
+ }
+}
+```
+
+**Local Model Configuration:**
+```json
+{
+ "mcpServers": {
+ "cua-agent": {
+ "command": "/bin/bash",
+ "args": ["~/.cua/start_mcp_server.sh"],
+ "env": {
+ "CUA_MODEL_NAME": "huggingface-local/ByteDance-Seed/UI-TARS-1.5-7B",
+ "CUA_MAX_IMAGES": "3"
+ }
+ }
+ }
+}
+```
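If you prefer to generate the configuration rather than edit JSON by hand, a minimal sketch could look like the following. It assumes the same keys as the examples above; the function name `build_mcp_config` and its parameters are hypothetical, and the script only prints the JSON rather than writing it to any client's configuration file.

```python
import json
import os


def build_mcp_config(script_path: str, model: str, api_key: str,
                     max_images: int = 3, use_host: bool = False) -> dict:
    """Build an mcpServers entry shaped like the JSON examples above."""
    env = {
        "CUA_MODEL_NAME": model,
        "ANTHROPIC_API_KEY": api_key,
        # Environment variable values are strings, as in the examples.
        "CUA_MAX_IMAGES": str(max_images),
        "CUA_USE_HOST_COMPUTER_SERVER": "true" if use_host else "false",
    }
    # Expand "~" up front in case the client passes the path to bash unexpanded.
    absolute_script = os.path.expanduser(script_path)
    return {
        "mcpServers": {
            "cua-agent": {
                "command": "/bin/bash",
                "args": [absolute_script],
                "env": env,
            }
        }
    }


config = build_mcp_config(
    "~/.cua/start_mcp_server.sh",
    "anthropic/claude-sonnet-4-20250514",
    "your-anthropic-api-key-here",
)
print(json.dumps(config, indent=2))
```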
+
+## Session Management Configuration
+
+The MCP server automatically manages sessions with the following defaults:
+- **Max Concurrent Sessions**: 10
+- **Session Timeout**: 10 minutes of inactivity
+- **Computer Pool Size**: 5 instances
+- **Automatic Cleanup**: Enabled
+
+These defaults are intended for typical usage, and most users do not need to change them.
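
To make the session limit and the inactivity timeout concrete, here is a minimal sketch of idle-session bookkeeping. It is not the server's actual implementation: the class `SessionRegistry` and its methods are hypothetical, and only the two constants (10 sessions, 10 minutes) come from the defaults listed above.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional

MAX_CONCURRENT_SESSIONS = 10        # default listed above
SESSION_TIMEOUT_SECONDS = 10 * 60   # 10 minutes of inactivity


@dataclass
class SessionRegistry:
    """Hypothetical registry tracking the last activity time per session."""
    last_activity: Dict[str, float] = field(default_factory=dict)

    def touch(self, session_id: str, now: Optional[float] = None) -> None:
        """Record activity, refusing new sessions once the limit is reached."""
        now = time.monotonic() if now is None else now
        is_new = session_id not in self.last_activity
        if is_new and len(self.last_activity) >= MAX_CONCURRENT_SESSIONS:
            raise RuntimeError("session limit reached")
        self.last_activity[session_id] = now

    def cleanup_idle(self, now: Optional[float] = None) -> List[str]:
        """Remove and return sessions idle for longer than the timeout."""
        now = time.monotonic() if now is None else now
        expired = [sid for sid, seen in self.last_activity.items()
                   if now - seen > SESSION_TIMEOUT_SECONDS]
        for sid in expired:
            del self.last_activity[sid]
        return expired


registry = SessionRegistry()
registry.touch("a", now=0.0)
registry.touch("b", now=500.0)
print(registry.cleanup_idle(now=601.0))
```

Passing `now` explicitly keeps the example deterministic; in real use the monotonic clock would be read on each call.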
diff --git a/docs/content/docs/libraries/mcp-server/index.mdx b/docs/content/docs/libraries/mcp-server/index.mdx
index 87c9a342..a20b5d09 100644
--- a/docs/content/docs/libraries/mcp-server/index.mdx
+++ b/docs/content/docs/libraries/mcp-server/index.mdx
@@ -6,4 +6,22 @@ github:
- https://github.com/trycua/cua/tree/main/libs/python/mcp-server
---
-**cua-mcp-server** is a MCP server for the Computer-Use Agent (CUA), allowing you to run CUA through Claude Desktop or other MCP clients.
\ No newline at end of file
+**cua-mcp-server** is an MCP server for the Computer-Use Agent (CUA), allowing you to run CUA through Claude Desktop or other MCP clients.
+
+## Features
+
+- **Multi-Client Support**: Concurrent sessions with automatic resource management
+- **Progress Reporting**: Real-time progress updates during task execution
+- **Error Handling**: Robust error recovery with screenshot capture
+- **Concurrent Execution**: Run multiple tasks in parallel for improved performance
+- **Session Management**: Automatic cleanup and resource pooling
+- **LiteLLM Integration**: Support for multiple model providers
+- **VM Safety**: Default VM execution with optional host system control
+
+## Quick Start
+
+1. **Install**: `pip install cua-mcp-server`
+2. **Configure**: Add to your MCP client configuration
+3. **Use**: Ask Claude to perform computer tasks
+
+See the [Installation](/docs/libraries/mcp-server/installation) guide for detailed setup instructions.
\ No newline at end of file
diff --git a/docs/content/docs/libraries/mcp-server/installation.mdx b/docs/content/docs/libraries/mcp-server/installation.mdx
index c04a4917..ce4f87a6 100644
--- a/docs/content/docs/libraries/mcp-server/installation.mdx
+++ b/docs/content/docs/libraries/mcp-server/installation.mdx
@@ -36,18 +36,98 @@ You can then use the script in your MCP configuration like this:
"command": "/bin/bash",
"args": ["~/.cua/start_mcp_server.sh"],
"env": {
- "CUA_MODEL_NAME": "anthropic/claude-3-5-sonnet-20241022"
+ "CUA_MODEL_NAME": "anthropic/claude-sonnet-4-20250514",
+ "ANTHROPIC_API_KEY": "your-anthropic-api-key-here"
}
}
}
}
```
+**Important**: When using an Anthropic model, you must include your Anthropic API key. Without it, the MCP server cannot authenticate with the model provider.
+
+## Development Setup
+
+If you're working from a checkout of the CUA repository, you can use the development script instead:
+
+```json
+{
+ "mcpServers": {
+ "cua-agent": {
+ "command": "/usr/bin/env",
+ "args": [
+ "bash", "-lc",
+ "export CUA_MODEL_NAME='anthropic/claude-sonnet-4-20250514'; export ANTHROPIC_API_KEY='your-anthropic-api-key-here'; /path/to/cua/libs/python/mcp-server/scripts/start_mcp_server.sh"
+ ]
+ }
+ }
+}
+```
+
+**For host computer control** (development setup):
+
+1. **Install Computer Server Dependencies**:
+ ```bash
+ cd /path/to/cua  # the editable install below uses a path relative to the repository root
+ python3 -m pip install uvicorn fastapi
+ python3 -m pip install -e libs/python/computer-server --break-system-packages
+ ```
+
+2. **Start the Computer Server**:
+ ```bash
+ cd /path/to/cua
+ python -m computer_server --log-level debug
+ ```
+ This starts the computer server on `http://localhost:8000`, which controls your actual desktop.
+
+3. **Configure Claude Desktop**:
+ ```json
+ {
+ "mcpServers": {
+ "cua-agent": {
+ "command": "/usr/bin/env",
+ "args": [
+ "bash", "-lc",
+ "export CUA_MODEL_NAME='anthropic/claude-sonnet-4-20250514'; export ANTHROPIC_API_KEY='your-anthropic-api-key-here'; export CUA_USE_HOST_COMPUTER_SERVER='true'; export CUA_MAX_IMAGES='1'; /path/to/cua/libs/python/mcp-server/scripts/start_mcp_server.sh"
+ ]
+ }
+ }
+ }
+ ```
+
+**Note**: Replace `/path/to/cua` with the absolute path to your CUA repository directory.
+
+**⚠️ Important**: When using host computer control (`CUA_USE_HOST_COMPUTER_SERVER='true'`), the AI will have direct access to your desktop and can perform actions like opening applications, clicking, typing, and taking screenshots. Make sure you're comfortable with this level of access.
+
### Troubleshooting
-If you get a `/bin/bash: ~/cua/libs/python/mcp-server/scripts/start_mcp_server.sh: No such file or directory` error, try changing the path to the script to be absolute instead of relative.
+**Common Issues:**
-To see the logs:
-```
+1. **"Claude's response was interrupted"** - This usually means:
+ - Missing API key: Add `ANTHROPIC_API_KEY` to your environment variables
+ - Invalid model name: Use a valid model like `anthropic/claude-sonnet-4-20250514`
+ - Check logs for specific error messages
+
+2. **"Missing Anthropic API Key"** - Add your API key to the configuration:
+ ```json
+ "env": {
+ "ANTHROPIC_API_KEY": "your-api-key-here"
+ }
+ ```
+
+3. **"model not found"** - Use a valid model name:
+ - ✅ `anthropic/claude-sonnet-4-20250514`
+ - ✅ `anthropic/claude-3-5-sonnet-20240620`
+ - ❌ `anthropic/claude-3-5-sonnet-20241022` (the previous default in these docs; switch to one of the names above if the provider rejects it)
+
+4. **Script not found** - If you get a `/bin/bash: ~/cua/libs/python/mcp-server/scripts/start_mcp_server.sh: No such file or directory` error, replace the `~` in the script path with the full absolute path. The literal `~` in the error message indicates that the tilde was not expanded before the path reached bash.
+
+5. **Host Computer Control Issues** - If using `CUA_USE_HOST_COMPUTER_SERVER='true'`:
+ - **Computer Server not running**: Make sure you've started the computer server with `python -m computer_server --log-level debug`
+ - **Port 8000 in use**: Check if another process is using port 8000 with `lsof -i :8000`
+ - **Missing dependencies**: Install `uvicorn` and `fastapi` with `python3 -m pip install uvicorn fastapi`
+ - **Image size errors**: Use `CUA_MAX_IMAGES='1'` to reduce image context size
+
+**Viewing Logs:**
+```bash
tail -n 20 -f ~/Library/Logs/Claude/mcp*.log
```
\ No newline at end of file
diff --git a/docs/content/docs/libraries/mcp-server/tools.mdx b/docs/content/docs/libraries/mcp-server/tools.mdx
index edf29c0b..20e91311 100644
--- a/docs/content/docs/libraries/mcp-server/tools.mdx
+++ b/docs/content/docs/libraries/mcp-server/tools.mdx
@@ -6,5 +6,58 @@ title: Tools
The MCP server exposes the following tools to Claude:
-1. `run_cua_task` - Run a single Computer-Use Agent task with the given instruction
-2. `run_multi_cua_tasks` - Run multiple tasks in sequence
\ No newline at end of file
+### Core Task Execution Tools
+
+1. **`run_cua_task`** - Run a single Computer-Use Agent task with the given instruction
+ - `task` (string): The task description for the agent to execute
+ - `session_id` (string, optional): Session ID for multi-client support. If not provided, a new session will be created
+ - Returns: Tuple of (combined text output, final screenshot)
+
+2. **`run_multi_cua_tasks`** - Run multiple tasks in sequence or concurrently
+ - `tasks` (list of strings): List of task descriptions to execute
+ - `session_id` (string, optional): Session ID for multi-client support. If not provided, a new session will be created
+ - `concurrent` (boolean, optional): If true, run tasks concurrently. If false, run sequentially (default)
+ - Returns: List of tuples (combined text output, screenshot) for each task
+
+### Utility Tools
+
+3. **`screenshot_cua`** - Take a screenshot of the current screen
+ - `session_id` (string, optional): Session ID for multi-client support. If not provided, a new session will be created
+ - Returns: Screenshot image
+
+4. **`get_session_stats`** - Get statistics about active sessions and resource usage
+ - Returns: Dictionary with session statistics including total sessions, active tasks, and session details
+
+5. **`cleanup_session`** - Clean up a specific session and release its resources
+ - `session_id` (string): The ID of the session to clean up
+ - Returns: Confirmation message
+
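For clients that talk to the server programmatically, MCP tool invocations are JSON-RPC `tools/call` requests carrying the tool `name` and its `arguments`. The sketch below only builds such a payload for `run_multi_cua_tasks` using the parameters listed above; the helper `tool_call_request` is hypothetical, and transport, initialization, and response handling are left out.

```python
import json
from itertools import count
from typing import Any, Dict

_request_ids = count(1)  # simple incrementing JSON-RPC request IDs


def tool_call_request(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 `tools/call` request body (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = tool_call_request(
    "run_multi_cua_tasks",
    {
        "tasks": ["Open Finder", "Navigate to Documents"],
        "concurrent": False,  # sequential is the documented default
    },
)
print(json.dumps(request, indent=2))
```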
+## Session Management
+
+The MCP server supports multi-client sessions with automatic resource management:
+
+- **Session Isolation**: Each client can have its own session with isolated computer instances
+- **Resource Pooling**: Computer instances are pooled for efficient resource usage
+- **Automatic Cleanup**: Idle sessions are automatically cleaned up after 10 minutes
+- **Concurrent Tasks**: Multiple tasks can run concurrently within the same session
+- **Progress Reporting**: Real-time progress updates during task execution
+
+## Usage Examples
+
+### Basic Task Execution
+```
+"Open Chrome and navigate to github.com"
+"Create a folder called 'Projects' on my desktop"
+```
+
+### Multi-Task Execution
+```
+"Run these tasks: 1) Open Finder, 2) Navigate to Documents, 3) Create a new folder called 'Work'"
+```
+
+### Session Management
+```
+"Take a screenshot of the current screen"
+"Show me the session statistics"
+"Clean up session abc123"
+```
\ No newline at end of file
diff --git a/docs/content/docs/libraries/mcp-server/usage.mdx b/docs/content/docs/libraries/mcp-server/usage.mdx
index 19eef934..1748490a 100644
--- a/docs/content/docs/libraries/mcp-server/usage.mdx
+++ b/docs/content/docs/libraries/mcp-server/usage.mdx
@@ -2,7 +2,7 @@
title: Usage
---
-## Usage
+## Basic Usage
Once configured, you can simply ask Claude to perform computer tasks:
@@ -13,8 +13,140 @@ Once configured, you can simply ask Claude to perform computer tasks:
Claude will automatically use your CUA agent to perform these tasks.
-### First-time Usage Notes
+## Advanced Features
+
+### Progress Reporting
+The MCP server provides real-time progress updates during task execution:
+- Task progress is reported as percentages (0-100%)
+- Multi-task operations show progress for each individual task
+- Progress updates are streamed to the MCP client for real-time feedback
+
+### Error Handling
+Robust error handling ensures reliable operation:
+- Failed tasks return error messages with screenshots when possible
+- Session state is preserved even when individual tasks fail
+- Automatic cleanup prevents resource leaks
+- Detailed error logging for troubleshooting
+
+### Concurrent Task Execution
+For improved performance, multiple tasks can run concurrently:
+- Set `concurrent=true` in `run_multi_cua_tasks` for parallel execution
+- Each task runs in its own context with isolated state
+- Progress tracking works for both sequential and concurrent modes
+- Resource pooling ensures efficient computer instance usage
+
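The difference between the sequential and concurrent modes can be sketched with `asyncio`. This is an illustration of the general pattern, not the server's code: `run_task` is a placeholder coroutine that sleeps instead of driving an agent, and `run_tasks` is a hypothetical stand-in for the `concurrent` switch described above.

```python
import asyncio
from typing import List


async def run_task(task: str) -> str:
    """Placeholder for executing one task; a real agent call would go here."""
    await asyncio.sleep(0.01)
    return f"done: {task}"


async def run_tasks(tasks: List[str], concurrent: bool = False) -> List[str]:
    """Run tasks one after another, or all at once when concurrent is True."""
    if concurrent:
        # gather() schedules every task at once and returns results in input order.
        return list(await asyncio.gather(*(run_task(t) for t in tasks)))
    results = []
    for task in tasks:
        # Sequential mode: each task finishes before the next one starts.
        results.append(await run_task(task))
    return results


results = asyncio.run(
    run_tasks(["Open Chrome", "Open Safari", "Open Finder"], concurrent=True)
)
print(results)
```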
+### Session Management
+Multi-client support with automatic resource management:
+- Each client gets isolated sessions with separate computer instances
+- Sessions automatically clean up after 10 minutes of inactivity
+- Resource pooling prevents resource exhaustion
+- Session statistics available for monitoring
+
+## Target Computer Options
+
+By default, the MCP server runs CUA in a virtual machine for safety. However, you can also configure it to run on your local system.
+
+### Default: Using a VM (Recommended)
+
+The MCP server will automatically start and connect to a VM based on your platform. This is the safest option as AI actions are isolated from your host system.
+
+No additional configuration is needed - this is the default behavior.
+
+### Option: Targeting Your Local Desktop
+
+
+ **Warning:** When targeting your local system, AI models have direct access to your desktop and may perform risky actions. Use with caution.
+
+
+To have the MCP server control your local desktop instead of a VM:
+
+1. **Start the Computer Server on your host:**
+
+```bash
+pip install cua-computer-server
+python -m computer_server
+```
+
+2. **Configure the MCP server to use your host system:**
+
+Add the `CUA_USE_HOST_COMPUTER_SERVER` environment variable to your MCP client configuration:
+
+
+
+ Update your Claude Desktop config (see [Installation](/docs/libraries/mcp-server/installation)) to include the environment variable:
+
+ ```json
+ {
+ "mcpServers": {
+ "cua-agent": {
+ "command": "/bin/bash",
+ "args": ["~/.cua/start_mcp_server.sh"],
+ "env": {
+ "CUA_MODEL_NAME": "anthropic/claude-sonnet-4-20250514",
+ "ANTHROPIC_API_KEY": "your-anthropic-api-key-here",
+ "CUA_USE_HOST_COMPUTER_SERVER": "true"
+ }
+ }
+ }
+ }
+ ```
+
+
+ For other MCP clients, set the environment variable in the shell that launches the client:
+
+ ```bash
+ export CUA_USE_HOST_COMPUTER_SERVER=true
+ ```
+
+ Then start your MCP client as usual.
+
+
+
+3. **Restart your MCP client** (e.g., Claude Desktop) to apply the changes.
+
+Now Claude will control your local desktop directly when you ask it to perform computer tasks.
+
+## Usage Examples
+
+### Single Task Execution
+```
+"Open Safari and navigate to apple.com"
+"Create a new folder on the desktop called 'My Projects'"
+"Take a screenshot of the current screen"
+```
+
+### Multi-Task Execution (Sequential)
+```
+"Run these tasks in order: 1) Open Finder, 2) Navigate to Documents folder, 3) Create a new folder called 'Work'"
+```
+
+### Multi-Task Execution (Concurrent)
+```
+"Run these tasks simultaneously: 1) Open Chrome, 2) Open Safari, 3) Open Finder"
+```
+
+### Session Management
+```
+"Show me the current session statistics"
+"Take a screenshot using session abc123"
+"Clean up session xyz789"
+```
+
+### Error Recovery
+```
+"Try to open a non-existent application and show me the error"
+"Find all files with .tmp extension and delete them safely"
+```
+
+## First-time Usage Notes
**API Keys**: Ensure you have valid API keys:
- - Add your Anthropic API key, or other model provider API key in the Claude Desktop config (as shown above)
+ - Add your Anthropic API key in the Claude Desktop config (as shown above)
- Or set it as an environment variable in your shell profile
+ - **Required**: The MCP server needs an API key to authenticate with the model provider
+
+**Model Selection**: Choose the appropriate model for your needs:
+ - **Claude Sonnet 4**: The default model (`anthropic/claude-sonnet-4-20250514`)
+ - **Claude 3.5 Sonnet**: An earlier Anthropic model (`anthropic/claude-3-5-sonnet-20240620`)
+ - **Computer-Use Preview**: Specialized for computer tasks (`openai/computer-use-preview`)
+ - **Local Models**: For privacy-sensitive environments
+ - **Ollama**: For offline usage
diff --git a/libs/python/computer-server/computer_server/handlers/macos.py b/libs/python/computer-server/computer_server/handlers/macos.py
index ce341668..6a831c17 100644
--- a/libs/python/computer-server/computer_server/handlers/macos.py
+++ b/libs/python/computer-server/computer_server/handlers/macos.py
@@ -1287,7 +1287,15 @@ class MacOSAutomationHandler(BaseAutomationHandler):
if not isinstance(screenshot, Image.Image):
return {"success": False, "error": "Failed to capture screenshot"}
+ # Resize image to reduce size (max width 1920, maintain aspect ratio)
+ max_width = 1920
+ if screenshot.width > max_width:
+ ratio = max_width / screenshot.width
+ new_height = int(screenshot.height * ratio)
+ screenshot = screenshot.resize((max_width, new_height), Image.Resampling.LANCZOS)
+
buffered = BytesIO()
+ # Use PNG format with optimization to reduce file size
screenshot.save(buffered, format="PNG", optimize=True)
buffered.seek(0)
image_data = base64.b64encode(buffered.getvalue()).decode()
diff --git a/libs/python/mcp-server/mcp_server/session_manager.py b/libs/python/mcp-server/mcp_server/session_manager.py
index dc8d480b..a415feac 100644
--- a/libs/python/mcp-server/mcp_server/session_manager.py
+++ b/libs/python/mcp-server/mcp_server/session_manager.py
@@ -10,6 +10,7 @@ This module provides:
import asyncio
import logging
+import os
import time
import uuid
import weakref
@@ -57,7 +58,14 @@ class ComputerPool:
logger.debug("Creating new computer instance")
from computer import Computer
- computer = Computer(verbosity=logging.INFO)
+ # Check if we should use host computer server
+ use_host = os.getenv("CUA_USE_HOST_COMPUTER_SERVER", "false").lower() in (
+ "true",
+ "1",
+ "yes",
+ )
+
+ computer = Computer(verbosity=logging.INFO, use_host_computer_server=use_host)
await computer.run()
self._in_use.add(computer)
return computer