cua-driver: auto-delegate mcp to daemon for correct TCC context (#1479)

* cua-driver: auto-delegate mcp to daemon for correct TCC context (#1465)

`cua-driver mcp` (stdio MCP server) always failed the Accessibility
permission check when launched from an IDE terminal (Claude Code, Cursor,
VS Code, Warp), because macOS TCC attributes the subprocess to the
parent terminal — not to CuaDriver.app. `serve` already side-stepped
this by relaunching itself via `open -n -g -a CuaDriver --args serve`,
but `mcp` couldn't take the same path without disconnecting the stdio
pipes its MCP client owns.

Fix: when `mcp` detects it's running in the wrong TCC context (bare
binary symlink resolving into a CuaDriver.app bundle, ppid != launchd),
it auto-launches a `cua-driver serve` daemon and then runs an
in-process MCP server whose ListTools / CallTool handlers forward
every request through the daemon's existing Unix socket at
~/Library/Caches/cua-driver/cua-driver.sock. Tool semantics are
identical to the in-process path; the MCP client just sees an
ordinary stdio server. No Python bridge required.

When mcp IS launched with the right TCC context (from CuaDriver.app
directly, or with ppid == launchd post-LaunchServices), the
in-process path runs unchanged. Pass `--no-daemon-relaunch` (or
CUA_DRIVER_MCP_NO_RELAUNCH=1) to force in-process behavior.

Compat-mode (`--claude-code-computer-use-compat`) rewrites the
client-visible `screenshot` tool descriptor and translates inbound
`{pid, window_id}` screenshot calls into the daemon's native
`{window_id, format:"jpeg", quality:85}` shape.

Docs regenerated from source via the existing dump-docs pipeline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* cua-driver: address CodeRabbit nits on MCP TCC auto-delegation

- Extract the "is this binary running from inside an installed
  CuaDriver.app bundle?" heuristic into a single shared
  `isExecutableInsideCuaDriverApp()` helper in
  `CuaDriverCLI/BundleHelpers.swift`. `ServeCommand` and `MCPCommand`
  both call into it now, instead of carrying byte-identical local
  copies. Subcommands can still wrap it with extra env/flag/ppid
  gating where their relaunch heuristics diverge.

- Make `CuaDriverMCPServer.fetchProxyToolList` (and therefore
  `makeProxy`) throwing. Previously a missing/unhealthy daemon
  silently returned an empty tool list, so the MCP client saw a
  "successful" handshake advertising zero tools and then errored on
  every subsequent `CallTool`. Now it throws a descriptive
  `MCPError.internalError` pointing at the socket path and the
  `open -n -g -a CuaDriver --args serve` recovery, which surfaces
  during proxy init and gets logged to stderr by `AppKitBootstrap`'s
  catch path before the process exits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(mcp): document in-process vs daemon-proxy lifecycle

Add a dedicated `process-model` guide page covering the two `cua-driver mcp`
runtime modes introduced by the TCC auto-delegation work in this PR:

- In-process mode (default when spawned by CuaDriver.app) and daemon-proxy
  mode (auto-engaged from IDE terminals), with ASCII diagrams.
- How to tell which mode is active (the stderr log line, plus the
  heuristic the code uses).
- Daemon lifecycle — spawned once via `open -n -g -a CuaDriver --args
  serve`, survives mcp-client restarts, exits only on user `stop` / app
  quit / reboot / permissions denial.
- mcp-client lifecycle in proxy mode (exits when stdio closes; does NOT
  terminate the daemon).
- Failure modes — `open` fails, daemon doesn't appear within 10s, daemon
  refuses the initial ListTools (fail-fast per CodeRabbit nit 2), daemon
  dies mid-session (next CallTool raises MCPError.internalError with the
  daemon-restart hint).
- Forcing flags — `--no-daemon-relaunch`, `CUA_DRIVER_MCP_NO_RELAUNCH=1`,
  `--socket`, launching from CuaDriver.app directly.
- Recommendation for wrapper authors (Hermes, custom MCP shims): don't
  manage the daemon yourself, treat the stdio MCP transport as the
  contract, handle disconnects via MCP-level reconnect (matches
  trycua/hermes#22821's pattern).

Cross-link from the autogenerated `mcp-tools.mdx` TCC-auto-delegation
callout, and update the generator script so the link survives the next
regen.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Francesco Bonacci
2026-05-12 20:14:29 +02:00
committed by GitHub
parent e8e629107d
commit 41c6afdf53
12 changed files with 643 additions and 30 deletions
@@ -167,6 +167,10 @@ open -n -g -a CuaDriver --args serve
cua-driver check_permissions # forwards to the daemon — authoritative answer
```
### `cua-driver mcp` from an IDE terminal can't see Accessibility.
This is the same TCC-attribution issue as the previous question, applied to the stdio MCP server. `cua-driver mcp` detects it and auto-launches the daemon via `open -n -g -a CuaDriver --args serve`, then proxies every MCP tool call through the daemon's Unix socket. From the MCP client's perspective nothing changes — the same stdio server, the same tool names, the same response shapes — but every AX probe now hits a process that LaunchServices attributed to `CuaDriver.app`. No Python bridge needed. Force the in-process path with `--no-daemon-relaunch` (or `CUA_DRIVER_MCP_NO_RELAUNCH=1`) if you really want it, e.g. when `mcp` is launched from `CuaDriver.app` directly.
### I keep seeing the permissions dialog on every launch.
macOS is attributing the process to a different bundle id than the one you granted. Run `cua-driver diagnose` and share the output when filing an issue. It reports cdhash, team id, and which bundle TCC matched against.
@@ -155,6 +155,17 @@ cua-driver mcp-config | pbcopy
The client spawns `cua-driver mcp` on demand once registered.
<Callout type="info">
**TCC auto-delegation.** When MCP clients spawn `cua-driver mcp` from an IDE terminal (Claude Code,
Cursor, VS Code, Warp), macOS attributes the subprocess to the terminal — not `CuaDriver.app` — so
AX probes would silently fail against the wrong bundle id. `mcp` detects this and auto-launches a
`cua-driver serve` daemon via `open -n -g -a CuaDriver --args serve`, then proxies every tool call
through the daemon's Unix socket. The client sees an ordinary stdio MCP server; the daemon runs
under LaunchServices with the right TCC context. No Python bridge, no manual `serve` step required.
Pass `--no-daemon-relaunch` (or set `CUA_DRIVER_MCP_NO_RELAUNCH=1`) to opt out, e.g. when MCP is
launched from CuaDriver.app directly and already has the right TCC grants.
</Callout>
## Agent skills (auto-wired)
The bundle ships an Anthropic-format SKILL.md pack at `/Applications/CuaDriver.app/Contents/Resources/Skills/cua-driver/`. The installer detects the agents you have and creates symlinks pointing them at the bundle so the skill auto-loads:
@@ -3,5 +3,5 @@
"description": "Get up and running with Cua Driver",
"icon": "Rocket",
"defaultOpen": true,
"pages": ["introduction", "installation", "quickstart", "integrations", "swift-integration", "comparison", "faq"]
"pages": ["introduction", "installation", "quickstart", "integrations", "swift-integration", "process-model", "comparison", "faq"]
}
@@ -0,0 +1,187 @@
---
title: MCP process model
description: How cua-driver mcp runs — in-process vs daemon-proxy modes, lifecycles, failure modes, and wrapper-author guidance
---
import { Callout } from 'fumadocs-ui/components/callout';
`cua-driver mcp` runs in one of two modes depending on the TCC context it inherits from its parent process. Both expose the **exact same MCP tool surface** — same names, same arguments, same response shapes — so MCP clients never need to care which mode is active. This page documents how the two modes differ at the process level, so wrapper authors (Hermes, custom MCP shims, etc.) and operators debugging the lifecycle have a precise mental model.
## The two modes
### In-process mode (default when TCC context is correct)
```text
CuaDriver.app (LaunchServices) → cua-driver mcp (stdio)
                                 └─ AX + ScreenCaptureKit calls
                                    attributed to com.trycua.driver
```
When `cua-driver mcp` is spawned by `CuaDriver.app` directly (LaunchServices-attributed), the process inherits the bundle's TCC grants. Every tool runs in-process: AX walks, screenshots, click synthesis, the agent-cursor overlay, all of it. Lifecycle is dead simple — the process lives for the duration of the stdio session and exits when stdin closes.
### Daemon-proxy mode (auto-engaged from IDE terminals)
```text
Claude Code / Cursor / Warp                LaunchServices
│                                          │
└─ cua-driver mcp (stdio client)           └─ CuaDriver.app
   │                                          │
   │  Unix socket (~/Library/Caches/          │
   │  cua-driver/cua-driver.sock)             │
   └──────────────────────────────────────────┴─ cua-driver serve (daemon)
                                                 └─ AX + ScreenCaptureKit calls
                                                    attributed to com.trycua.driver
```
When `cua-driver mcp` is spawned from an IDE terminal (Claude Code, Cursor, VS Code, Warp), macOS attributes the subprocess to the terminal — not `CuaDriver.app` — so AX probes would silently fail against the wrong bundle id. To sidestep this, `mcp` does three things on startup:
1. Checks whether a `cua-driver serve` daemon is already listening on `~/Library/Caches/cua-driver/cua-driver.sock`.
2. If not, runs `open -n -g -a CuaDriver --args serve` to spawn one under LaunchServices (`-n` = new instance so we never reuse an existing CuaDriver.app process without `--args serve`; `-g` = stay backgrounded). Then polls the socket for up to 10s.
3. Proxies every MCP `ListTools` / `CallTool` request through the daemon's line-delimited JSON socket protocol. The mcp client process never touches AX directly.
Two processes, one stdio session. The daemon runs under the right TCC context; the mcp client just shuttles bytes.
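As a rough illustration of the polling in step 2, the following shell sketch waits for something to appear at the socket path. `wait_for_socket` is a hypothetical helper, not part of `cua-driver`, and its plain existence test is weaker than the protocol-speaking probe the driver itself performs (a stale socket file would pass it).

```bash
# Hypothetical helper (not shipped with cua-driver): poll every 100ms until
# something exists at PATH, giving up after TIMEOUT_SECONDS (default 10).
# usage: wait_for_socket PATH [TIMEOUT_SECONDS]
wait_for_socket() {
  local path="$1" timeout_s="${2:-10}"
  local attempts=$(( timeout_s * 10 ))   # 10 attempts per second
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    if [ -e "$path" ]; then
      return 0
    fi
    sleep 0.1
    i=$(( i + 1 ))
  done
  echo "daemon did not appear on $path within ${timeout_s}s" >&2
  return 1
}

# Example combination (assumes CuaDriver.app is installed):
#   sock="$HOME/Library/Caches/cua-driver/cua-driver.sock"
#   [ -e "$sock" ] || open -n -g -a CuaDriver --args serve
#   wait_for_socket "$sock" 10
```

Wrappers normally don't need this, since `cua-driver mcp` performs the launch and wait itself; the sketch is only meant to make the timing concrete.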
## Detecting which mode is active
`cua-driver mcp` emits a single stderr line on startup when it engages the daemon-proxy path and has to launch the daemon:
```text
cua-driver: mcp launched without CuaDriver.app's TCC grants; auto-launching the daemon
via `open -n -g -a CuaDriver --args serve` and proxying MCP requests through it.
Pass --no-daemon-relaunch to stay in-process.
```
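In-process mode is silent at startup, with no extra log line. Proxy mode is also silent when a daemon is already listening on the socket, because the notice is only written just before a launch. So:
- **Stderr contains that line** → daemon-proxy mode.
- **No log line, mcp serving normally** → in-process mode, or daemon-proxy mode that reused a running daemon.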
You can also infer the mode from the heuristic the code uses. Daemon-proxy mode engages when **all** of the following are true:
- `--no-daemon-relaunch` is not set.
- `CUA_DRIVER_MCP_NO_RELAUNCH` is not truthy (`1`, `true`, `yes`, `on`).
- The binary's `Bundle.main.bundlePath` does not end in `.app` (i.e. we're invoked through the `~/.local/bin/cua-driver` symlink, not the bundle's main executable directly).
- The binary resolves into an installed `CuaDriver.app` bundle via `realpath` (raw `swift run` dev invocations fail this check and stay in-process).
- `getppid() != 1` (a ppid of 1 means launchd reparented us — we already have the right TCC context).
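The checklist above can be read as an ordered series of early exits. The shell sketch below mirrors that order as a pure function. `choose_mcp_mode` and `is_env_truthy` are hypothetical names, and every input (flag, environment value, bundle path, resolved executable path, parent pid) is passed in by the caller rather than detected, so this illustrates the decision only, not how the driver gathers those values.

```bash
# Hypothetical mirror of the documented heuristic; not part of cua-driver.
# Truthy values follow the list above: 1, true, yes, on (case-insensitive).
is_env_truthy() {
  case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
    1|true|yes|on) return 0 ;;
    *) return 1 ;;
  esac
}

# usage: choose_mcp_mode NO_RELAUNCH_FLAG ENV_VALUE BUNDLE_PATH RESOLVED_EXE PARENT_PID
# Prints "in-process" or "daemon-proxy".
choose_mcp_mode() {
  local flag="$1" env_value="$2" bundle_path="$3" resolved_exe="$4" parent_pid="$5"
  # 1. Explicit opt-out flag.
  if [ "$flag" = "1" ]; then echo in-process; return; fi
  # 2. Environment opt-out.
  if is_env_truthy "$env_value"; then echo in-process; return; fi
  # 3. Already attributed to an .app bundle.
  case "$bundle_path" in *.app) echo in-process; return ;; esac
  # 4. Must resolve into an installed CuaDriver.app bundle.
  case "$resolved_exe" in
    */CuaDriver.app/Contents/MacOS/*) ;;
    *) echo in-process; return ;;
  esac
  # 5. ppid 1: already reparented by launchd.
  if [ "$parent_pid" -eq 1 ]; then echo in-process; return; fi
  echo daemon-proxy
}

# Example with made-up inputs:
#   choose_mcp_mode 0 "" /Users/you/.local/bin \
#     /Applications/CuaDriver.app/Contents/MacOS/cua-driver 4242
#   prints: daemon-proxy
```

Any single opt-out or mismatch is enough to stay in-process; proxy mode only engages when all five checks fall through.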
## Process lifecycles
### Daemon lifecycle (proxy mode only)
The daemon is **not** owned by the mcp client. Concretely:
- **Spawned once.** The first `cua-driver mcp` invocation from an IDE terminal calls `open -n -g -a CuaDriver --args serve`. macOS LaunchServices keeps a handle on the resulting process.
- **Survives mcp-client restarts.** When the MCP client (Claude Code, Cursor) closes stdio and the `cua-driver mcp` proxy exits, the daemon keeps running. The next `cua-driver mcp` invocation reuses the same daemon — it sees the socket already listening and skips the relaunch.
- **Persists across IDE restarts.** As long as nothing kills the `CuaDriver.app` process, the daemon is there to serve any number of mcp-client sessions.
- **Exits when:**
  - The user runs `cua-driver stop` (sends a clean shutdown over the socket).
  - The user quits `CuaDriver.app` (via the Activity Monitor, `killall CuaDriver`, or the menubar item).
  - The Mac reboots / logs out.
  - The daemon's own permissions gate is closed without granting Accessibility + Screen Recording (the daemon then exits with a diagnostic).
Run `cua-driver status` at any point to verify:
```bash
cua-driver status
# cua-driver daemon is running
# socket: /Users/you/Library/Caches/cua-driver/cua-driver.sock
# pid: 12345
```
### mcp-client lifecycle (proxy mode)
The `cua-driver mcp` process in proxy mode is a thin stdio→UDS pump:
- Lives for the duration of the stdio session — exits cleanly when the MCP client (parent) closes stdin or the host process is killed.
- Does **not** terminate the daemon on exit. The daemon stays running, ready for the next mcp client.
- Holds no AX caches or per-pid state of its own — everything lives on the daemon side. Restarting just the mcp client preserves whatever element-index caches the daemon has built up.
### mcp-client lifecycle (in-process mode)
Single process. Lives for the duration of the stdio session. Exits when stdin closes. All per-pid state (element-index cache, recording state, agent-cursor toggles) lives in the mcp process and is lost on exit. For element-indexed workflows that need to survive restarts, prefer daemon-proxy mode or run `cua-driver serve` explicitly.
## Failure modes
### Daemon unreachable at mcp startup
`mcp` tries to spawn the daemon via `open -n -g -a CuaDriver --args serve` and polls the socket for 10s. If the daemon never appears (CuaDriver.app missing, app refused to launch, permissions gate not yet granted), `mcp` exits with:
```text
cua-driver: daemon did not appear on <socket> within 10s. If this is the first launch,
grant Accessibility + Screen Recording to CuaDriver.app in System Settings and retry.
Pass --no-daemon-relaunch to stay in-process.
```
If `open` itself fails (CuaDriver.app not installed at all), the error is reported and the process exits with a clear pointer at the `--no-daemon-relaunch` escape hatch.
### Daemon present but refuses the initial ListTools
`makeProxy` caches the tool list with a single `list` RPC at startup. If that fails — daemon transport error, unexpected response, daemon reports failure — `mcp` fails fast with an `MCPError.internalError` carrying the daemon hint, **before** completing the MCP handshake. The MCP client sees a clear startup error rather than a successful handshake that advertises zero tools and then errors on every `CallTool`.
### Daemon dies mid-session
If the daemon is killed (user quits CuaDriver.app, `killall`, crash) while an mcp client is connected, the proxy can't tell until the next `CallTool` lands. That call raises:
```text
MCPError.internalError("cua-driver daemon not reachable on <socket>. Start it with
`open -n -g -a CuaDriver --args serve` and retry.")
```
The MCP client surfaces this as a tool error. The mcp-client process **does not** auto-restart the daemon mid-session. The recommended client behavior is: catch the error, optionally restart the daemon yourself (`open -n -g -a CuaDriver --args serve`), reconnect the MCP transport, and retry. Hermes' wrapper in [trycua/hermes#22821](https://github.com/trycua/hermes/pull/22821) implements exactly this pattern.
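The recovery sequence described above (catch the error, restart the daemon, reconnect, retry) might look like this in a shell-based wrapper. `retry_once_after` and `restart_daemon` are hypothetical helpers, and the command passed in stands for whatever your wrapper uses to open a fresh MCP session and replay the call.

```bash
# Hypothetical wrapper-side helper (not part of cua-driver): run a command;
# if it fails, run a recovery function once and try the command again.
# usage: retry_once_after RECOVER_FN COMMAND [ARGS...]
retry_once_after() {
  local recover="$1"
  shift
  if "$@"; then
    return 0
  fi
  # First attempt failed: attempt recovery, and give up if that fails too.
  "$recover" || return 1
  # Second and final attempt; its exit status is the result.
  "$@"
}

# Recovery step using the restart command quoted in the error message.
restart_daemon() {
  open -n -g -a CuaDriver --args serve
}

# Example (run_mcp_session is a placeholder for your own session command):
#   retry_once_after restart_daemon run_mcp_session
```

Retrying only once keeps a persistently broken daemon from turning into a restart loop; a real wrapper would likely add logging around the recovery step.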
### Socket path ownership
The socket lives at `~/Library/Caches/cua-driver/cua-driver.sock`. macOS cache hygiene may sweep `~/Library/Caches/` periodically; the daemon recreates the socket on startup, and the `--socket` flag overrides the path if you want to pin it elsewhere. A stale socket file from a crashed daemon is detected by the next `serve` invocation's flock probe (`cua-driver.lock`) and replaced atomically.
## Forcing one mode or the other
### Force in-process mode
Useful when the calling context already has the right TCC grants (e.g. you launched `cua-driver mcp` directly from `CuaDriver.app`), or for diagnosing in-process failures:
```bash
# CLI flag
cua-driver mcp --no-daemon-relaunch
# Environment variable (same effect; useful in MCP client config)
CUA_DRIVER_MCP_NO_RELAUNCH=1 cua-driver mcp
```
Either option keeps the entire MCP server in-process. In this mode the `cua-driver mcp` process relies on whatever TCC grants its calling context has: if Accessibility / Screen Recording aren't granted to the calling shell, AX probes fail.
### Force daemon-proxy mode against a specific socket
For multi-daemon setups, or pointing at a daemon running under an alternate user / install:
```bash
cua-driver mcp --socket /tmp/my-cua-driver.sock
```
`--socket` only overrides the path; the relaunch heuristic still applies. If no daemon is listening at that path, `mcp` runs `open -n -g -a CuaDriver --args serve`, which starts a daemon on the default socket, not the override, so `mcp` keeps polling the override path and exits after the 10s timeout. For non-default sockets, start the daemon explicitly first:
```bash
cua-driver serve --socket /tmp/my-cua-driver.sock &
cua-driver mcp --socket /tmp/my-cua-driver.sock
```
### Force in-process mode by launching from CuaDriver.app
Launching `cua-driver mcp` through the bundle's main executable (or any path that LaunchServices attributes to `CuaDriver.app`) makes `Bundle.main.bundlePath` end in `.app`, which trips the heuristic's bundle-path short-circuit (checked right after the flag and environment opt-outs). In-process mode runs automatically; no flag needed.
## Recommendations for wrapper authors
If you ship a tool that spawns `cua-driver mcp` as a subprocess (Hermes, custom MCP brokers, agent-platform shims):
1. **Don't try to manage the daemon yourself.** `cua-driver mcp` already auto-spawns the daemon on demand. Adding a manual `cua-driver serve` step in your install flow creates two ways to start the daemon — and the user will hit ordering bugs.
2. **Treat the stdio MCP transport as the contract.** Both modes look identical from the MCP-client side. Don't branch on which mode you think is active; the heuristic is the driver's responsibility.
3. **Handle mid-session disconnects via MCP-level reconnect.** If a `CallTool` raises `MCPError.internalError` mentioning "daemon not reachable," tear down the MCP transport and reconnect with a fresh `cua-driver mcp` subprocess. The new subprocess will either find the daemon healthy or relaunch it.
4. **Surface the stderr line.** `cua-driver mcp` writes the daemon-relaunch notice to stderr exactly once at startup. Forward your subprocess's stderr to your own logs so operators can see which mode is active when debugging.
5. **Don't pass `--no-daemon-relaunch` by default.** That flag exists for niche cases (in-process diagnostics, fully sandboxed runtimes where `open -n -g -a` won't work). Defaulting it on makes IDE-terminal users hit the wrong-bundle TCC trap.
## See also
- [Installation](./installation) — granting TCC permissions to `CuaDriver.app`.
- [FAQ](./faq) — common TCC-attribution gotchas in IDE terminals.
- [CLI reference](/cua-driver/reference/cli-reference) — full surface area of `cua-driver mcp` flags, `cua-driver serve`, `cua-driver status`, `cua-driver stop`.
@@ -106,11 +106,32 @@ Print a tool's full description and JSON input schema.
Run the stdio MCP server.
When invoked from a shell or IDE terminal (Claude Code, Cursor,
VS Code, Warp), macOS TCC attributes the process to the parent
terminal — not to CuaDriver.app — so AX probes silently fail
against the wrong bundle id. To sidestep this without breaking
the stdio MCP transport, `mcp` detects the context, ensures a
`cua-driver serve` daemon is running under LaunchServices
(relaunching via `open -n -g -a CuaDriver --args serve` if not),
and proxies every MCP tool call through the daemon's Unix
socket. Tool semantics are identical to the in-process path.
Pass `--no-daemon-relaunch` (or set CUA_DRIVER_MCP_NO_RELAUNCH=1)
to force in-process execution — useful when the calling context
already has the right TCC grants (e.g. spawned from CuaDriver.app
directly), or for diagnosing in-process failures.
**Options:**
| Name | Type | Default | Description |
| ---- | ---- | ------- | ----------- |
| `--socket` | String | — | Override the daemon Unix socket path used by the proxy fallback. |
**Flags:**
| Name | Description |
| ---- | ----------- |
| `--claude-code-computer-use-compat` | Expose normal CuaDriver tools, replacing only `screenshot` with a Claude Code-friendly window-only screenshot that establishes the vision coordinate frame. |
| `--no-daemon-relaunch` | Stay in the current process instead of auto-launching a daemon and proxying through its Unix socket when invoked from a shell without CuaDriver.app's TCC grants. Also toggleable via CUA_DRIVER_MCP_NO_RELAUNCH=1. |
### cua-driver serve
@@ -20,6 +20,10 @@ Tool names are `snake_case`. Responses are MCP `CallTool.Result` envelopes: a te
Tool names here match the CLI form exactly. `cua-driver list_apps` and the MCP `list_apps` tool run the same code path.
</Callout>
<Callout type="info">
**TCC auto-delegation.** When an MCP client spawns `cua-driver mcp` from an IDE terminal (Claude Code, Cursor, VS Code, Warp), macOS attributes the subprocess to the parent terminal — not `CuaDriver.app` — so AX probes fail against the wrong bundle id. `mcp` detects this and auto-launches a `cua-driver serve` daemon via `open -n -g -a CuaDriver --args serve`, then proxies every tool call through the daemon's Unix socket. Tool semantics are identical to the in-process path; no Python bridge is needed. Pass `--no-daemon-relaunch` (or set `CUA_DRIVER_MCP_NO_RELAUNCH=1`) to force in-process execution. See the [process model guide](/cua-driver/guide/getting-started/process-model) for the full lifecycle, failure modes, and wrapper-author guidance.
</Callout>
### check_permissions
Report TCC permission status for Accessibility and Screen Recording.
@@ -0,0 +1,34 @@
import Darwin
import Foundation
/// Shared "is this binary running from inside an installed CuaDriver.app
/// bundle?" heuristic used by both `ServeCommand` (for the
/// auto-relaunch-via-`open` path) and `MCPCommand` (for the daemon proxy
/// path). Resolves `Bundle.main.executablePath` (falling back to
/// `CommandLine.arguments.first`) through any symlinks via `realpath` and
/// checks whether the resolved path lives inside some
/// `CuaDriver.app/Contents/MacOS/` directory.
///
/// That's the "installed via install-local.sh / install.sh" shape:
/// `~/.local/bin/cua-driver` is a symlink into `/Applications/CuaDriver.app`,
/// and `realpath` walks into the bundle. Returns `false` for `swift run` /
/// raw `.build/<config>/cua-driver` dev invocations, which have no installed
/// bundle to relaunch into.
///
/// Subcommands may wrap this with additional gating (env vars, flags,
/// parent-pid checks, etc.) when their relaunch heuristics diverge.
func isExecutableInsideCuaDriverApp() -> Bool {
// Prefer Foundation's executablePath (stable, absolute).
// Fall back to argv[0] when unset, which realpath() still
// resolves via $PATH lookup at the shell level; good enough
// for the cases we care about.
let candidate = Bundle.main.executablePath
?? CommandLine.arguments.first
?? ""
guard !candidate.isEmpty else { return false }
var buffer = [CChar](repeating: 0, count: Int(PATH_MAX))
guard realpath(candidate, &buffer) != nil else { return false }
let resolved = String(cString: buffer)
return resolved.contains("/CuaDriver.app/Contents/MacOS/")
}
@@ -342,7 +342,22 @@ struct CuaDriverEntryPoint {
struct MCPCommand: ParsableCommand {
static let configuration = CommandConfiguration(
commandName: "mcp",
abstract: "Run the stdio MCP server."
abstract: "Run the stdio MCP server.",
discussion: """
When invoked from a shell or IDE terminal (Claude Code, Cursor, \
VS Code, Warp), macOS TCC attributes the process to the parent \
terminal — not to CuaDriver.app — so AX probes silently fail \
against the wrong bundle id. To sidestep this without breaking \
the stdio MCP transport, `mcp` detects the context, ensures a \
`cua-driver serve` daemon is running under LaunchServices \
(relaunching via `open -n -g -a CuaDriver --args serve` if not), \
and proxies every MCP tool call through the daemon's Unix \
socket. Tool semantics are identical to the in-process path. \
Pass `--no-daemon-relaunch` (or set CUA_DRIVER_MCP_NO_RELAUNCH=1) \
to force in-process execution — useful when the calling context \
already has the right TCC grants (e.g. spawned from CuaDriver.app \
directly), or for diagnosing in-process failures.
"""
)
@Flag(
@@ -356,7 +371,38 @@ struct MCPCommand: ParsableCommand {
)
var claudeCodeComputerUseCompat: Bool = false
@Flag(
name: .long,
help: """
Stay in the current process instead of auto-launching a daemon \
and proxying through its Unix socket when invoked from a shell \
without CuaDriver.app's TCC grants. Also toggleable via \
CUA_DRIVER_MCP_NO_RELAUNCH=1.
"""
)
var noDaemonRelaunch: Bool = false
@Option(
name: .long,
help: "Override the daemon Unix socket path used by the proxy fallback."
)
var socket: String?
func run() throws {
// TCC sidestep. Same heuristic the `serve` subcommand uses
// (shell-spawned bare binary that resolves into a CuaDriver.app
// bundle), gated by an explicit env / flag opt-out. When the
// shell already has the right TCC context (e.g. CuaDriver.app
// launched us directly), this returns false and we stay
// in-process exactly like before. The proxy path is purely
// additive: it gives stdio MCP clients spawned from IDE
// terminals a correct TCC context without requiring an external
// bridge.
if shouldUseDaemonProxy() {
try runViaDaemonProxy()
return
}
// MCP stdio runs for the lifetime of the host process, so we
// bootstrap AppKit here: the agent cursor overlay (disabled
// by default, enabled via `set_agent_cursor_enabled`) needs a
@@ -404,6 +450,135 @@ struct MCPCommand: ParsableCommand {
}
}
extension MCPCommand {
/// Decide whether the current `mcp` invocation should auto-launch a
/// daemon and proxy every MCP tool call through its Unix socket.
/// Mirror of `ServeCommand.shouldRelaunchViaOpen()`: same heuristic,
/// same env override convention, separate flag so callers can opt
/// each surface in/out independently.
fileprivate func shouldUseDaemonProxy() -> Bool {
if noDaemonRelaunch { return false }
if isEnvTruthy(ProcessInfo.processInfo.environment["CUA_DRIVER_MCP_NO_RELAUNCH"]) {
return false
}
// When AppKit already attributes us to CuaDriver.app (either
// because LaunchServices spawned us, or the user invoked the
// bundle's main executable directly), `Bundle.main.bundlePath`
// ends in `.app`. Either case has the right TCC context.
if Bundle.main.bundlePath.hasSuffix(".app") { return false }
// The bare-binary path must resolve into an installed
// CuaDriver.app bundle, otherwise there's nothing for the
// daemon side to land in. Raw `swift run` dev invocations fail
// this check and stay in-process.
guard isExecutableInsideCuaDriverApp() else { return false }
// ppid == 1 means launchd already reparented us: we're
// post-LaunchServices and have the right TCC context.
if getppid() == 1 { return false }
return true
}
/// Ensure a `cua-driver serve` daemon is running under the right TCC
/// context, then run the MCP stdio server with `ListTools` /
/// `CallTool` handlers that forward every request through
/// `~/Library/Caches/cua-driver/cua-driver.sock`. On launch failure,
/// exits with a diagnostic and a pointer at the
/// `--no-daemon-relaunch` escape hatch (it does not fall back to in-process).
fileprivate func runViaDaemonProxy() throws {
let socketPath = socket ?? DaemonPaths.defaultSocketPath()
if !DaemonClient.isDaemonListening(socketPath: socketPath) {
FileHandle.standardError.write(
Data(
"cua-driver: mcp launched without CuaDriver.app's TCC grants; auto-launching the daemon via `open -n -g -a CuaDriver --args serve` and proxying MCP requests through it. Pass --no-daemon-relaunch to stay in-process.\n"
.utf8))
try launchDaemonViaOpen()
try waitForDaemon(socketPath: socketPath, timeout: 10.0)
}
let serverName = claudeCodeComputerUseCompat ? "computer-use" : "cua-driver"
let compat = claudeCodeComputerUseCompat
// The MCP `Server` actor + `StdioTransport` use Swift
// concurrency, so we need a live async runtime. Reuse
// `AppKitBootstrap` for that: it's the same sync-to-async bridge
// the in-process path already takes, and the idle AppKit
// run-loop costs us nothing here (no AX work runs in this
// process). Critically we skip PermissionsGate entirely: the
// daemon owns TCC, and AX probes against this process would
// lie because we're attributed to the calling shell.
AppKitBootstrap.runBlockingAppKitWith {
let server = try await CuaDriverMCPServer.makeProxy(
serverName: serverName,
socketPath: socketPath,
claudeCodeComputerUseCompat: compat
)
let transport = StdioTransport()
try await server.start(transport: transport)
await server.waitUntilCompleted()
}
}
/// Spawn `/usr/bin/open -n -g -a CuaDriver --args serve`. Mirror of
/// `ServeCommand.relaunchViaOpen` minus the post-launch probe (we
/// poll separately via `waitForDaemon`, since the timeout there is
/// MCP-specific).
fileprivate func launchDaemonViaOpen() throws {
let process = Process()
process.executableURL = URL(fileURLWithPath: "/usr/bin/open")
// -n: force a new instance. CuaDriver.app may already be
// running from a previous `mcp` (different MCP client
// session); without -n, `open -a` would re-use it and
// drop our `--args serve`, leaving no daemon up.
// -g: keep the new instance backgrounded. CuaDriver.app is
// LSUIElement=true anyway, but this makes that explicit.
process.arguments = ["-n", "-g", "-a", "CuaDriver", "--args", "serve"]
process.standardOutput = FileHandle.nullDevice
process.standardError = FileHandle.nullDevice
do {
try process.run()
} catch {
FileHandle.standardError.write(
Data(
"cua-driver: failed to exec `/usr/bin/open`: \(error). Pass --no-daemon-relaunch to bypass.\n"
.utf8))
throw ExitCode(1)
}
process.waitUntilExit()
if process.terminationStatus != 0 {
FileHandle.standardError.write(
Data(
"cua-driver: `open -n -g -a CuaDriver --args serve` exited \(process.terminationStatus). Check that `/Applications/CuaDriver.app` is installed, or pass --no-daemon-relaunch to bypass.\n"
.utf8))
throw ExitCode(1)
}
}
/// Block (up to `timeout` seconds) until `socketPath` accepts a
/// protocol-speaking probe. Throws `ExitCode(1)` with a diagnostic
/// if the daemon never appears, which usually means the user hasn't
/// granted Accessibility / Screen Recording to CuaDriver.app yet
/// and the daemon's PermissionsGate is waiting on a dialog.
fileprivate func waitForDaemon(socketPath: String, timeout: TimeInterval) throws {
let deadline = Date().addingTimeInterval(timeout)
while Date() < deadline {
if DaemonClient.isDaemonListening(socketPath: socketPath) {
return
}
usleep(100_000) // 100ms
}
FileHandle.standardError.write(
Data(
"cua-driver: daemon did not appear on \(socketPath) within \(Int(timeout))s. If this is the first launch, grant Accessibility + Screen Recording to CuaDriver.app in System Settings and retry. Pass --no-daemon-relaunch to stay in-process.\n"
.utf8))
throw ExitCode(1)
}
private func isEnvTruthy(_ value: String?) -> Bool {
guard let value = value?.lowercased() else { return false }
return ["1", "true", "yes", "on"].contains(value)
}
}
/// Bootstrap AppKit on the main thread so `AgentCursor` can draw its
/// overlay window + CA animations. The caller's async work runs on a
/// detached Task; the main thread blocks inside `NSApplication.run()`
@@ -91,11 +91,28 @@ enum CLIDocExtractor {
CommandDoc(
name: "mcp",
abstract: "Run the stdio MCP server.",
discussion: nil,
discussion: """
When invoked from a shell or IDE terminal (Claude Code, Cursor,
VS Code, Warp), macOS TCC attributes the process to the parent
terminal — not to CuaDriver.app — so AX probes silently fail
against the wrong bundle id. To sidestep this without breaking
the stdio MCP transport, `mcp` detects the context, ensures a
`cua-driver serve` daemon is running under LaunchServices
(relaunching via `open -n -g -a CuaDriver --args serve` if not),
and proxies every MCP tool call through the daemon's Unix
socket. Tool semantics are identical to the in-process path.
Pass `--no-daemon-relaunch` (or set CUA_DRIVER_MCP_NO_RELAUNCH=1)
to force in-process execution — useful when the calling context
already has the right TCC grants (e.g. spawned from CuaDriver.app
directly), or for diagnosing in-process failures.
""",
arguments: [],
options: [],
options: [
OptionDoc(name: "socket", shortName: nil, help: "Override the daemon Unix socket path used by the proxy fallback.", type: "String", defaultValue: nil, isOptional: true),
],
flags: [
FlagDoc(name: "claude-code-computer-use-compat", shortName: nil, help: "Expose normal CuaDriver tools, replacing only `screenshot` with a Claude Code-friendly window-only screenshot that establishes the vision coordinate frame.", defaultValue: false),
FlagDoc(name: "no-daemon-relaunch", shortName: nil, help: "Stay in the current process instead of auto-launching a daemon and proxying through its Unix socket when invoked from a shell without CuaDriver.app's TCC grants. Also toggleable via CUA_DRIVER_MCP_NO_RELAUNCH=1.", defaultValue: false),
],
subcommands: []
)
@@ -212,7 +212,7 @@ extension ServeCommand {
// bundle on disk (the symlink case). Raw `swift run` dev
// invocations resolve into `.build/<config>/cua-driver`
// instead, and have no bundle to relaunch into.
guard resolvedExecutableIsInsideCuaDriverApp() else { return false }
guard isExecutableInsideCuaDriverApp() else { return false }
// ppid == 1 means we're already a LaunchServices-spawned process
// (or orphaned into init, in which case relaunching wouldn't
// change anything useful anyway).
@@ -308,31 +308,6 @@ extension ServeCommand {
throw ExitCode(1)
}
/// True when the argv[0] / executablePath resolves (through any
/// symlinks) to a binary physically living inside some
/// `CuaDriver.app/Contents/MacOS/` directory. That's the "installed
/// via install-local.sh / install.sh" shape: `~/.local/bin/cua-driver`
/// is a symlink into `/Applications/CuaDriver.app`, and `realpath`
/// walks into the bundle.
///
/// Returns false for `swift run` / raw `.build/<config>/cua-driver`
/// dev invocations, which have no installed bundle to relaunch into.
private func resolvedExecutableIsInsideCuaDriverApp() -> Bool {
// Prefer Foundation's executablePath (stable, absolute).
// Fall back to argv[0] when unset, which realpath() still
// resolves via $PATH lookup at the shell level; good enough
// for the cases we care about.
let candidate = Bundle.main.executablePath
?? CommandLine.arguments.first
?? ""
guard !candidate.isEmpty else { return false }
var buffer = [CChar](repeating: 0, count: Int(PATH_MAX))
guard realpath(candidate, &buffer) != nil else { return false }
let resolved = String(cString: buffer)
return resolved.contains("/CuaDriver.app/Contents/MacOS/")
}
/// Accepts the same truthy-value conventions the rest of the CLI
/// uses for env overrides (see `UpdateCommand` / `TelemetryClient`).
private func isEnvTruthy(_ value: String?) -> Bool {
@@ -27,4 +27,183 @@ public enum CuaDriverMCPServer {
return server
}
/// Build an MCP Server whose `ListTools` / `CallTool` handlers forward
/// every request to a running `cua-driver serve` daemon over its Unix
/// domain socket. Used by the `mcp` subcommand's TCC-sidestep path:
/// when stdio MCP is spawned from an IDE terminal, the process inherits
/// the terminal's TCC responsibility chain so AX probes silently fail.
/// Proxying through the daemon, which runs under LaunchServices and is
/// correctly attributed to `com.trycua.driver`, gives MCP clients
/// identical behavior without requiring an external Python bridge.
///
/// `claudeCodeComputerUseCompat` advertises the compat tool set in
/// `ListTools`, but every `CallTool` still hits the daemon. The daemon
/// always exposes the full native registry; the shim only changes the
/// client-visible shape of `screenshot` and is implemented entirely by
/// the in-process MCP layer. When proxying, we therefore rewrite the
/// `screenshot` tool advertised to the client into its compat-mode
/// shape and translate inbound `screenshot` calls back into the
/// equivalent native daemon call.
public static func makeProxy(
serverName: String = "cua-driver",
version: String = CuaDriverCore.version,
socketPath: String,
claudeCodeComputerUseCompat: Bool = false
) async throws -> Server {
let server = Server(
name: serverName,
version: version,
capabilities: Server.Capabilities(tools: .init(listChanged: false))
)
// Cache the tool list once at startup. Daemon registries are
// static (every connected client sees the same handlers), so a
// single fetch is enough for the life of the stdio MCP session.
// Fail fast on a missing/unhealthy daemon so the MCP client sees
// a clear startup error instead of a "successful" handshake that
// advertises zero tools and then errors on every `CallTool`.
let cachedToolList = try await fetchProxyToolList(
socketPath: socketPath,
claudeCodeComputerUseCompat: claudeCodeComputerUseCompat
)
await server.withMethodHandler(ListTools.self) { _ in
ListTools.Result(tools: cachedToolList)
}
await server.withMethodHandler(CallTool.self) { params in
let (name, args) = rewriteForProxy(
name: params.name,
arguments: params.arguments,
claudeCodeComputerUseCompat: claudeCodeComputerUseCompat
)
return try await forwardCallToDaemon(
name: name,
arguments: args,
socketPath: socketPath
)
}
return server
}
/// Translate `(name, arguments)` from the MCP client's view of the
/// compat tool surface into the native daemon registry's view.
///
/// Compat-mode `screenshot` takes `{pid, window_id}` and returns a
/// JPEG; the daemon's native `screenshot` takes `{window_id, format,
/// quality}` and defaults to PNG. We map the former onto the latter
/// by dropping the unused `pid` and pinning `format: "jpeg",
/// quality: 85` to match the compat shim's output shape.
///
/// Non-compat mode passes through unchanged.
private static func rewriteForProxy(
name: String,
arguments: [String: Value]?,
claudeCodeComputerUseCompat: Bool
) -> (String, [String: Value]?) {
guard claudeCodeComputerUseCompat else { return (name, arguments) }
if name == "screenshot" {
var rewritten: [String: Value] = [:]
if let windowID = arguments?["window_id"] {
rewritten["window_id"] = windowID
}
rewritten["format"] = .string("jpeg")
rewritten["quality"] = .int(85)
return (name, rewritten)
}
return (name, arguments)
}
/// One-shot daemon `list` over the UDS, with the compat-mode `screenshot`
/// descriptor swap applied client-side. Throws a descriptive `MCPError.internalError`
/// if the daemon is unreachable, the transport fails, or the daemon
/// returns an unexpected envelope, surfacing the failure during `makeProxy`'s
/// init rather than producing a proxy that advertises zero tools and
/// errors on every subsequent `CallTool`.
private static func fetchProxyToolList(
socketPath: String,
claudeCodeComputerUseCompat: Bool
) async throws -> [Tool] {
let request = DaemonRequest(method: "list")
let result = DaemonClient.sendRequest(request, socketPath: socketPath)
let tools: [Tool]
switch result {
case .noDaemon:
throw MCPError.internalError(
"cua-driver daemon not reachable on \(socketPath). "
+ "Start it with `open -n -g -a CuaDriver --args serve` and retry."
)
case .error(let message):
throw MCPError.internalError(
"cua-driver daemon transport error while listing tools on \(socketPath): \(message)"
)
case .ok(let response):
guard response.ok, case let .list(listed) = response.result else {
let reason = response.error ?? "daemon returned unexpected result kind for list"
throw MCPError.internalError(
"cua-driver daemon refused tool list on \(socketPath): \(reason)"
)
}
tools = listed
}
if !claudeCodeComputerUseCompat {
return tools
}
// Compat mode: swap the native `screenshot` tool descriptor for
// the window-only shim's descriptor so MCP clients see the same
// schema they'd see in the in-process compat registry.
let compatHandlers = ClaudeCodeComputerUseCompatTools.all
let compatToolsByName = Dictionary(
uniqueKeysWithValues: compatHandlers.map { ($0.tool.name, $0.tool) }
)
return tools.map { tool in
compatToolsByName[tool.name] ?? tool
}
}
/// Forward a single `CallTool` invocation to the daemon and translate
/// the `DaemonResponse` back into an MCP `CallTool.Result` (or throw
/// `MCPError` on protocol-level failures).
///
/// Tool-level errors (i.e. the tool ran but returned `isError: true`)
/// round-trip cleanly as part of the `.call` payload, so MCP clients
/// see exactly the same error envelope they would in the in-process
/// path. Only daemon-level failures (socket gone, decode error, unknown
/// tool) throw.
private static func forwardCallToDaemon(
name: String,
arguments: [String: Value]?,
socketPath: String
) async throws -> CallTool.Result {
let request = DaemonRequest(method: "call", name: name, args: arguments)
// Match the daemon's own per-call read budget. AX-heavy tools
// (e.g. `screenshot`, `get_window_state`) regularly take a few
// seconds; the default 120s in `DaemonClient` is plenty.
let result = DaemonClient.sendRequest(request, socketPath: socketPath)
switch result {
case .noDaemon:
throw MCPError.internalError(
"cua-driver daemon not reachable on \(socketPath). "
+ "Start it with `open -n -g -a CuaDriver --args serve` and retry."
)
case .error(let message):
throw MCPError.internalError("daemon transport: \(message)")
case .ok(let response):
if !response.ok {
let reason = response.error ?? "daemon reported failure"
if response.exitCode == DaemonExit.usage {
throw MCPError.invalidParams(reason)
}
throw MCPError.internalError(reason)
}
guard case let .call(callResult) = response.result else {
throw MCPError.internalError(
"daemon returned unexpected result kind for call"
)
}
return callResult
}
}
}
@@ -599,6 +599,12 @@ export function generateMCPToolsMDX(docs: MCPDocumentation, releasedVersion: str
);
lines.push('</Callout>');
lines.push('');
lines.push('<Callout type="info">');
lines.push(
' **TCC auto-delegation.** When an MCP client spawns `cua-driver mcp` from an IDE terminal (Claude Code, Cursor, VS Code, Warp), macOS attributes the subprocess to the parent terminal — not `CuaDriver.app` — so AX probes fail against the wrong bundle id. `mcp` detects this and auto-launches a `cua-driver serve` daemon via `open -n -g -a CuaDriver --args serve`, then proxies every tool call through the daemon\'s Unix socket. Tool semantics are identical to the in-process path; no Python bridge is needed. Pass `--no-daemon-relaunch` (or set `CUA_DRIVER_MCP_NO_RELAUNCH=1`) to force in-process execution. See the [process model guide](/cua-driver/guide/getting-started/process-model) for the full lifecycle, failure modes, and wrapper-author guidance.'
);
lines.push('</Callout>');
lines.push('');
// Emit each tool
for (const tool of docs.tools) {