mirror of
https://github.com/trycua/computer.git
synced 2026-05-09 08:49:33 -05:00
Merge pull request #380 from trycua/feat/move-library-readmes-to-docs
Move miscellaneous library READMEs to docs
This commit is contained in:
@@ -0,0 +1,66 @@
|
||||
---
|
||||
title: Configuration
|
||||
---
|
||||
|
||||
### Detection Parameters
|
||||
|
||||
#### Box Threshold (0.3)
|
||||
Controls the confidence threshold for accepting detections:
|
||||
<img src="/docs/img/som_box_threshold.png" alt="Illustration of confidence thresholds in object detection, with a high-confidence detection accepted and a low-confidence detection rejected." width="500px" />
|
||||
- Higher values (0.3) yield more precise but fewer detections
|
||||
- Lower values (0.01) catch more potential icons but increase false positives
|
||||
- Default is 0.3 for optimal precision/recall balance
|
||||
|
||||
#### IOU Threshold (0.1)
|
||||
Controls how overlapping detections are merged:
|
||||
<img src="/docs/img/som_iou_threshold.png" alt="Diagram showing Intersection over Union (IOU) with low overlap between two boxes kept separate and high overlap leading to merging." width="500px" />
|
||||
- Lower values (0.1) more aggressively remove overlapping boxes
|
||||
- Higher values (0.5) allow more overlapping detections
|
||||
- Default is 0.1 to handle densely packed UI elements
|
||||
|
||||
### OCR Configuration
|
||||
|
||||
- **Engine**: EasyOCR
|
||||
- Primary choice for all platforms
|
||||
- Fast initialization and processing
|
||||
- Built-in English language support
|
||||
- GPU acceleration when available
|
||||
|
||||
- **Settings**:
|
||||
- Timeout: 5 seconds
|
||||
- Confidence threshold: 0.5
|
||||
- Paragraph mode: Disabled
|
||||
- Language: English only
|
||||
|
||||
## Performance
|
||||
|
||||
### Hardware Acceleration
|
||||
|
||||
#### MPS (Metal Performance Shaders)
|
||||
- Multi-scale detection (640px, 1280px, 1920px)
|
||||
- Test-time augmentation enabled
|
||||
- Half-precision (FP16)
|
||||
- Average detection time: ~0.4s
|
||||
- Best for production use when available
|
||||
|
||||
#### CPU
|
||||
- Single-scale detection (1280px)
|
||||
- Full-precision (FP32)
|
||||
- Average detection time: ~1.3s
|
||||
- Reliable fallback option
|
||||
|
||||
### Example Output Structure
|
||||
|
||||
```
|
||||
examples/output/
|
||||
├── {timestamp}_no_ocr/
|
||||
│ ├── annotated_images/
|
||||
│ │ └── screenshot_analyzed.png
|
||||
│ ├── screen_details.txt
|
||||
│ └── summary.json
|
||||
└── {timestamp}_ocr/
|
||||
├── annotated_images/
|
||||
│ └── screenshot_analyzed.png
|
||||
├── screen_details.txt
|
||||
└── summary.json
|
||||
```
|
||||
Binary file not shown.
|
After Width: | Height: | Size: 974 KiB |
Binary file not shown.
|
After Width: | Height: | Size: 2.2 MiB |
@@ -35,4 +35,11 @@ pip install cua-computer-server
|
||||
|
||||
Refer to this notebook for a step-by-step guide on how to use the Computer-Use Server on the host system or VM:
|
||||
|
||||
- [Computer-Use Server](../../notebooks/computer_server_nb.ipynb)
|
||||
- [Computer-Use Server](../../notebooks/computer_server_nb.ipynb)
|
||||
|
||||
## Docs
|
||||
|
||||
- [Commands](https://trycua.com/docs/libraries/computer-server/Commands)
|
||||
- [REST-API](https://trycua.com/docs/libraries/computer-server/REST-API)
|
||||
- [WebSocket-API](https://trycua.com/docs/libraries/computer-server/WebSocket-API)
|
||||
- [Index](https://trycua.com/docs/libraries/computer-server/index)
|
||||
@@ -75,93 +75,9 @@ for elem in result.elements:
|
||||
print(f"Text: '{elem.content}', confidence={elem.confidence:.3f}")
|
||||
```
|
||||
|
||||
## Configuration
|
||||
## Docs
|
||||
|
||||
### Detection Parameters
|
||||
|
||||
#### Box Threshold (0.3)
|
||||
Controls the confidence threshold for accepting detections:
|
||||
```
|
||||
High Threshold (0.3): Low Threshold (0.01):
|
||||
+----------------+ +----------------+
|
||||
| | | +--------+ |
|
||||
| Confident | | |Unsure?| |
|
||||
| Detection | | +--------+ |
|
||||
| (✓ Accept) | | (? Reject) |
|
||||
| | | |
|
||||
+----------------+ +----------------+
|
||||
conf = 0.85 conf = 0.02
|
||||
```
|
||||
- Higher values (0.3) yield more precise but fewer detections
|
||||
- Lower values (0.01) catch more potential icons but increase false positives
|
||||
- Default is 0.3 for optimal precision/recall balance
|
||||
|
||||
#### IOU Threshold (0.1)
|
||||
Controls how overlapping detections are merged:
|
||||
```
|
||||
IOU = Intersection Area / Union Area
|
||||
|
||||
Low Overlap (Keep Both): High Overlap (Merge):
|
||||
+----------+ +----------+
|
||||
| Box1 | | Box1 |
|
||||
| | vs. |+-----+ |
|
||||
+----------+ ||Box2 | |
|
||||
+----------+ |+-----+ |
|
||||
| Box2 | +----------+
|
||||
| |
|
||||
+----------+
|
||||
IOU ≈ 0.05 (Keep Both) IOU ≈ 0.7 (Merge)
|
||||
```
|
||||
- Lower values (0.1) more aggressively remove overlapping boxes
|
||||
- Higher values (0.5) allow more overlapping detections
|
||||
- Default is 0.1 to handle densely packed UI elements
|
||||
|
||||
### OCR Configuration
|
||||
|
||||
- **Engine**: EasyOCR
|
||||
- Primary choice for all platforms
|
||||
- Fast initialization and processing
|
||||
- Built-in English language support
|
||||
- GPU acceleration when available
|
||||
|
||||
- **Settings**:
|
||||
- Timeout: 5 seconds
|
||||
- Confidence threshold: 0.5
|
||||
- Paragraph mode: Disabled
|
||||
- Language: English only
|
||||
|
||||
## Performance
|
||||
|
||||
### Hardware Acceleration
|
||||
|
||||
#### MPS (Metal Performance Shaders)
|
||||
- Multi-scale detection (640px, 1280px, 1920px)
|
||||
- Test-time augmentation enabled
|
||||
- Half-precision (FP16)
|
||||
- Average detection time: ~0.4s
|
||||
- Best for production use when available
|
||||
|
||||
#### CPU
|
||||
- Single-scale detection (1280px)
|
||||
- Full-precision (FP32)
|
||||
- Average detection time: ~1.3s
|
||||
- Reliable fallback option
|
||||
|
||||
### Example Output Structure
|
||||
|
||||
```
|
||||
examples/output/
|
||||
├── {timestamp}_no_ocr/
|
||||
│ ├── annotated_images/
|
||||
│ │ └── screenshot_analyzed.png
|
||||
│ ├── screen_details.txt
|
||||
│ └── summary.json
|
||||
└── {timestamp}_ocr/
|
||||
├── annotated_images/
|
||||
│ └── screenshot_analyzed.png
|
||||
├── screen_details.txt
|
||||
└── summary.json
|
||||
```
|
||||
- [Configuration](http://localhost:8090/docs/libraries/som/configuration)
|
||||
|
||||
## Development
|
||||
|
||||
|
||||
@@ -1,22 +1,47 @@
|
||||
# Cua Core TypeScript Library
|
||||
<div align="center">
|
||||
<h1>
|
||||
<div class="image-wrapper" style="display: inline-block;">
|
||||
<picture>
|
||||
<source media="(prefers-color-scheme: dark)" alt="logo" height="150" srcset="https://raw.githubusercontent.com/trycua/cua/main/img/logo_white.png" style="display: block; margin: auto;">
|
||||
<source media="(prefers-color-scheme: light)" alt="logo" height="150" srcset="https://raw.githubusercontent.com/trycua/cua/main/img/logo_black.png" style="display: block; margin: auto;">
|
||||
<img alt="Shows my svg">
|
||||
</picture>
|
||||
</div>
|
||||
|
||||
The core cua library with support for telemetry and other utilities.
|
||||
[](#)
|
||||
[](#)
|
||||
[](https://discord.com/invite/mVnXXpdE85)
|
||||
[](https://www.npmjs.com/package/@trycua/core)
|
||||
</h1>
|
||||
</div>
|
||||
|
||||
**Cua Core** provides essential shared functionality and utilities used across the Cua ecosystem:
|
||||
|
||||
- Privacy-focused telemetry system for transparent usage analytics
|
||||
- Common helper functions and utilities used by other Cua packages
|
||||
- Core infrastructure components shared between modules
|
||||
|
||||
## Installation
|
||||
|
||||
```bash
|
||||
pnpm install @trycua/core
|
||||
```
|
||||
|
||||
## Development
|
||||
|
||||
- Install dependencies:
|
||||
Install dependencies:
|
||||
|
||||
```bash
|
||||
pnpm install
|
||||
```
|
||||
|
||||
- Run the unit tests:
|
||||
Run the unit tests:
|
||||
|
||||
```bash
|
||||
pnpm test
|
||||
```
|
||||
|
||||
- Build the library:
|
||||
Build the library:
|
||||
|
||||
```bash
|
||||
pnpm build
|
||||
|
||||
Reference in New Issue
Block a user