docs: expand CUA description with screenshot-VLM-action loop

Explain how computer-use agents work: they capture screenshots, feed them to a VLM, and let the model determine the next action, repeating in a loop until the task is complete.
2026-01-05 04:50:08 -06:00 · 2025-12-08 21:16:57 -08:00
parent 4023b191ca
commit 7fe1b16070
1 changed files with 1 additions and 1 deletions
--- a/docs/content/docs/index.mdx
+++ b/docs/content/docs/index.mdx
@@ -17,7 +17,7 @@ import { Monitor, Code, BookOpen, Zap, Bot, Boxes, Rocket } from 'lucide-react';

 ## What is a Computer-Use Agent?

-Computer-Use Agents (CUAs) are AI systems that can autonomously interact with computer interfaces through visual understanding and action execution.
+Computer-Use Agents (CUAs) are AI systems that can autonomously interact with computer interfaces through visual understanding and action execution. They work by capturing screenshots, feeding them to a vision-language model (VLM), and letting the model determine the next action to take - such as clicking, typing, or scrolling - in a continuous loop until the task is complete.

 ## What is a Computer-Use Sandbox?
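
Note for reviewers: the loop described in the added sentence can be sketched roughly as follows. This is a minimal illustration, not code from this repository. `Action`, `run_agent`, `fake_capture`, and `scripted_model` are hypothetical names, the "model" is a scripted stand-in rather than a real VLM call, and the `max_steps` cap is an assumption added so the loop always terminates.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    """One step chosen by the model (hypothetical shape)."""
    kind: str                      # "click", "type", "scroll", or "done"
    payload: Optional[str] = None  # e.g. the target element or text to type


def run_agent(
    capture: Callable[[], bytes],
    choose_action: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    task: str,
    max_steps: int = 20,
) -> List[Action]:
    """Screenshot -> model -> action loop; returns the actions that were executed."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = capture()                            # 1. capture the screen
        action = choose_action(task, screenshot, history) # 2. ask the model
        if action.kind == "done":                         # 3. stop when the task is complete
            break
        execute(action)                                   # 4. perform the action
        history.append(action)
    return history


# Stand-ins so the sketch runs without a screen or a real model.
def fake_capture() -> bytes:
    return b"fake-png-bytes"


def scripted_model(task: str, screenshot: bytes, history: List[Action]) -> Action:
    script = [Action("click", "search box"), Action("type", "hello"), Action("done")]
    return script[len(history)]


executed: List[Action] = []
result = run_agent(fake_capture, scripted_model, executed.append, task="search for hello")
print([a.kind for a in result])  # prints ['click', 'type']
```

In a real agent, `capture` would grab an actual screenshot, `choose_action` would call a vision-language model, and `execute` would drive the mouse and keyboard inside the sandbox.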