Commit Graph

76 Commits

Author SHA1 Message Date
James Murdza
deb2132aef Fix hackathon notebook 2025-09-13 03:20:08 -04:00
James Murdza
993d52527f Rename hackathon notebook 2025-09-13 01:50:01 -04:00
James Murdza
981a081672 Improve notebook structure 2025-09-13 01:42:06 -04:00
James Murdza
8938b37ca7 Change hackathon notebook to use Docker 2025-09-13 01:42:05 -04:00
James Murdza
77d91ef6e1 Clarify instructions in hackathon notebook 2025-09-13 00:57:14 -04:00
James Murdza
4ec4bbc888 Add link to HUD integration documentation 2025-09-12 21:41:14 -04:00
James Murdza
3552ef62a8 Add relevant links to docs 2025-09-12 18:41:38 -04:00
James Murdza
73fb0002f0 Improve notebook structure 2025-09-12 18:36:56 -04:00
James Murdza
4c52aaa298 Assert that HUD_API_KEY is set 2025-09-12 18:27:34 -04:00
James Murdza
b72d8da8a7 Add Prequisites section 2025-09-12 18:13:16 -04:00
James Murdza
ea1caea73c Reuse agent configuration for HUD evaluation 2025-09-12 18:00:27 -04:00
James Murdza
48e42d2334 Only load .env file in notebook directory 2025-09-12 18:00:27 -04:00
James Murdza
4b3e2077fb Remove dataset size limit during HUD evaluation 2025-09-12 12:55:43 -04:00
James Murdza
68ecdcc99a Assert Cua API keys exist in notebook 2025-09-12 12:55:22 -04:00
James Murdza
1aca043006 Automatically create .env file in notebook 2025-09-12 12:55:11 -04:00
James Murdza
28f206d824 Improve explanatory text in notebook 2025-09-12 12:54:50 -04:00
James Murdza
4dedd06c5b Improve notebook structure 2025-09-12 12:39:37 -04:00
Dillon DuPont
665e65cb85 Replaced computer shim with Docker computer 2025-09-09 11:00:52 -04:00
Dillon DuPont
f270af30e1 added notebook 2025-09-09 10:57:16 -04:00
James Murdza
796835b9e5 Add trajectory viewer and VNC instructions to notebook 2025-09-08 09:53:42 -04:00
James Murdza
64b555bb34 Update notebook to use OSWorld-Tiny dataset 2025-09-08 09:53:42 -04:00
James Murdza
cd59a63a49 Fix URL in example notebook 2025-09-08 09:53:42 -04:00
James Murdza
c5ca6e9e9f Add Jupyter notebook for the SOTA challenge 2025-09-05 07:43:45 -04:00
James Murdza
1882f099a2 Change HUD dataset name from OSWorld-Verified-XLang to OSWorld-Verified 2025-09-03 11:18:43 -04:00
James Murdza
c820d5124d Load environment variables in HUD notebook 2025-09-02 16:10:13 -04:00
James Murdza
33ce7515a5 Clear HUD notebook outputs 2025-09-02 16:08:24 -04:00
Dillon DuPont
c4ce791a49 Update OSWorld output 2025-08-28 12:07:56 -04:00
ddupont
311bbf9709 Merge pull request #371 from trycua/chore/hud-upgrade: [Agent] Upgrade HUD SDK to 0.4.12 2025-08-28 11:29:18 -04:00
Dillon DuPont
95cefc50f0 added extended kwargs, renamed callback to normalizer 2025-08-27 20:49:31 -04:00
Dillon DuPont
0d3f8ea3ff Improved trajectory saving 2025-08-27 16:48:57 -04:00
Dillon DuPont
e8eaf66e2a Added latest nb 2025-08-27 13:38:55 -04:00
Dillon DuPont
3c502354a8 added simple task id 2025-08-27 13:28:24 -04:00
Dillon DuPont
61a442da56 fixed getattr crash 2025-08-27 13:21:46 -04:00
James Murdza
afe01ff831 Add missing comma in example code 2025-08-27 12:53:55 -04:00
f-trycua
a6406ae179 Update notebooks for KASM Docker 2025-08-26 13:05:35 +00:00
James Murdza
69713ce677 Merge pull request #361 from onel/main: Added a readme file to the notebooks folder 2025-08-23 22:41:28 -04:00
James Murdza
f9a317c190 Use correct return type for run_command 2025-08-21 19:13:28 -04:00
James Murdza
3b76612d66 Initialize computer before taking screenshot 2025-08-21 19:13:28 -04:00
James Murdza
33f81131d9 Fix variable name 2025-08-21 19:13:28 -04:00
James Murdza
141c9d3ad7 Fix out-of-date agent SDK code in notebook 2025-08-21 19:13:28 -04:00
James Murdza
b302a44ccf Remove nonexistent script from computer server notebook 2025-08-21 19:13:28 -04:00
James Murdza
02e5e62591 Fix out-of-date agent SDK code in operator example 2025-08-21 19:13:28 -04:00
Andrei Onel
145a845232 Update README.md 2025-08-20 00:03:48 +03:00
copilot-swe-agent[bot]
8a51468162 Add README.md to notebooks folder explaining content and purpose (Co-authored-by: onel <1862405+onel@users.noreply.github.com>) 2025-08-19 19:38:15 +00:00
James Murdza
9a4c7215d8 Upgrade Claude 3.5 snapshot to a version with computer use support 2025-08-19 11:40:05 -04:00
Dillon DuPont
a60cf26bb8 Added last run 2025-08-12 12:00:35 -04:00
Dillon DuPont
8bbcbec54b updated notebook 2025-08-08 19:46:19 -04:00
Dillon DuPont
5495529462 limited tasks in notebook 2025-08-08 18:26:44 -04:00
Dillon DuPont
f819c578b7 Add example notebook 2025-08-08 13:14:56 -04:00
James Murdza
d27ee728b5 Fix broken import after refactor in 5bfadf8f9a 2025-08-04 17:02:11 -04:00