Commit Graph

176 Commits

Author SHA1 Message Date
abelanger5 55eb63d9a4 fix: replay without group keys and status updates (#883) 2024-09-16 16:59:34 -04:00
Gabe Ruttner 2379e3638a fix: reset on replay (#875) 2024-09-16 17:01:51 +00:00
Gabe Ruttner af9ed49f1e fix: events list view (#878)
* fix: filter by event id

* fix: run count

* feat: filter by id api

* feat: filter by Event Id

* chore: default page is runs

* feat: cancel event runs

---------

Co-authored-by: Alexander Belanger <alexander@hatchet.run>
2024-09-16 16:46:31 +00:00
Gabe Ruttner c64c62f66a feat: improved workflow run details page (#821)
* wip: rip prisma

* wip

* wip

* fix: lint

* wip

* wip

* gen

* wip

* wip

* fix trigger

* hide overview

* revert db changes

* feat: wrap up frontend changes and perf

* chore: generate

* chore: frontend build

* fix: workflow transformer

* fix: avoid race conditions on simultaneous parent completions

* fix: 2025 started

* feat: toast for replay/cancel

* fix: toast

---------

Co-authored-by: Alexander Belanger <alexander@hatchet.run>
2024-09-16 15:39:49 +00:00
abelanger5 893637cb0f fix: improve LinkStepRunParents to prevent usage of temp files (#874) 2024-09-12 17:19:58 -04:00
abelanger5 bed2cb559a fix: add back sem slots, without row contention (#868)
* fix: add back sem slots, without row contention

* fix: serialize queue step runs to prevent dirty reads

* remove serializable for now

* statement timeouts on create workflow run

* statement timeout for reassign

* proper migration + cleanup

* remove old tables and code

* fix: worker slot state

* remove last unused table from workers
2024-09-11 20:47:49 +00:00
abelanger5 f4c5cd973e feat: more efficient step run timeouts (#863) 2024-09-10 18:23:11 -04:00
abelanger5 b635c875f6 fix: race conditions on release slot (#858)
* fix: race conditions on release slot

* better engine logs for ci

* fix: improve cancellation

* better debug logs and increase timeout
2024-09-10 14:22:32 -04:00
abelanger5 9efcebe6af fix: better logic for multiple restricted domains (#860) 2024-09-10 12:07:55 -04:00
abelanger5 77ab9460d3 fix: unset statement timeouts for session, print more debug info (#861) 2024-09-10 12:07:47 -04:00
abelanger5 a1324d43db fix: improve status updates and event writes (#855)
* fix: improve status updates and event writes

* chore: better debug line
2024-09-09 18:09:44 +00:00
abelanger5 e59ce100e3 fix: cleanup worker assign events + limit semaphore query (#854) 2024-09-09 11:11:44 -04:00
abelanger5 23daa89ba0 fix: list initial step runs query + better serialized queue operations (#851)
* fix: list initial step runs

* fix: better tenant operation logic
2024-09-08 19:29:11 -04:00
abelanger5 478c897035 fix: proper migration to counts, startable step runs query (#850) 2024-09-08 12:48:51 -04:00
abelanger5 70855ecc6b refactor: bulk step run updates (#847)
* wip: v4 of queue

* fix: correct query for updating counts

* tmp: save migration files

* feat: wrap up initial queue

* fix compilation

* fix: reassigns

* temp: refactor on step runs

* feat: bulk step run updates and processing

* fix: cancellation

* address review comments
2024-09-06 17:11:19 -04:00
abelanger5 891514b461 feat: queue v4 (#842)
* wip: v4 of queue

* fix: correct query for updating counts

* tmp: save migration files

* feat: wrap up initial queue

* fix compilation

* fix: reassigns
2024-09-06 16:12:22 -04:00
abelanger5 7308876776 fix: use separate database pool for queueing, statement timeouts on tx (#839)
* fix: different queue pool and statement timeouts on step runs

* fix: implement prepareTx

* fix: defer rollback properly

* fix: race condition
2024-09-03 21:07:26 +00:00
abelanger5 b5014f6b3d chore: more visibility and debug lines for queues (#836)
* chore: more visibility and debug options for queues

* better debug lines on queue repo

* don't log so much in load test
2024-08-29 14:49:24 -04:00
abelanger5 17b7e84876 fix: delete queue items when no longer used (#831) 2024-08-28 17:12:31 -04:00
abelanger5 263eaf069b feat: pass otel through msgqueue (#802)
* feat: pass otel through msgqueue

* feat: more spans on scheduling

* otel increase batch size
2024-08-28 14:45:02 +00:00
abelanger5 6317f86793 refactor: consolidate partition logic (#826)
* refactor: consolidate partition logic

* fix: race on scheduler

* fix: move partition uuid to db query

* fix: generate
2024-08-27 15:28:53 -04:00
Gabe Ruttner 526e7ef308 feat: expose priority queue (#814)
* feat: workflow default priority

* feat: write priority on run

* feat: propagate to queue

* chore: squash migrations

* chore: generate
2024-08-26 14:11:28 -04:00
Gabe Ruttner ee5d86796f fix: required affinity (#812)
* fix: required affinity

* chore: rm dead code
2024-08-23 15:19:29 -04:00
Gabe Ruttner 53be615d5f Enhancement webhook usability (#807)
* feat: secret copier

* feat: improved form

* fix: quotes

* wip: improved flow

* feat: health check logging

* fix: page design

* fix: hard delete, no upsert

* fix: reset modal state

* fix: empty text

* fix: worker state

* fix: update only token

* fix: dont delete name

* fix: logs component

* fix: sort order

* chore: build

* fix: webhook worker cleanup

* chore: squash migrations

* Update api-contracts/openapi/paths/webhook-worker/webhook-worker.yaml

Co-authored-by: abelanger5 <belanger@sas.upenn.edu>

* chore: rename

* fix: wrong query

---------

Co-authored-by: abelanger5 <belanger@sas.upenn.edu>
2024-08-23 10:09:09 -04:00
abelanger5 dd8a4144cb fix: hard sticky assignment to workers when no desired worker id (#809) 2024-08-23 07:42:52 -04:00
abelanger5 7a3c06884f fix: don't queue cancelled step runs (#805)
* fix: cancelled step runs should not be assigned

* check cancellations before planning

* remove reassign logic from controller
2024-08-22 11:21:49 -04:00
abelanger5 e40d77d9d7 fix: reassignment should reset scheduling timeout (#804) 2024-08-22 10:07:10 -04:00
Gabe Ruttner 9bea55438a Fix webhook healthcheck race (#797)
* fix: race

* fix: partition no rows

* chore: move to workers tab

* feat: redirect empty worker path to all

* chore: add worker type and webhook id

* fix: upsert webhook worker

* fix: update by webhookId

* fix: only stub on create

* feat: url on worker

* chore: migration version

* fix: move

* fix: upsert

* fix: upsert

* chore: fix migration

* fix: migrations

* chore: generate
2024-08-21 19:23:24 +00:00
Gabe Ruttner 46956a4153 fix: sort (#803) 2024-08-21 17:55:15 +00:00
Gabe Ruttner 651be542c3 fix: migration state (#800)
* fix: migration state

* almost there...

* fix: hack for constraints

* chore: lint
2024-08-21 17:07:00 +00:00
abelanger5 6b7e6de4c4 fix: use single queue limit (#801) 2024-08-21 12:46:05 -04:00
abelanger5 84f7334a06 feat: add windows for metrics and selector (#794)
* feat: add windows for metrics and selector

* better placeholder
2024-08-20 15:29:32 +00:00
abelanger5 67357cfa64 fix: add correct indexes for get group key runs, improve queries (#786)
* fix: add correct indexes for get group key runs

* chore: generate

* fix: hash
2024-08-20 09:31:30 -04:00
Gabe Ruttner 31e5b70441 fix: filter on joins (#790) 2024-08-19 13:21:31 -04:00
abelanger5 dd50515aeb fix: cancelling status on frontend (#779)
* fix: cancelling status on frontend

* fix: slot grid and dedupe on slots
2024-08-13 17:01:26 +00:00
Gabe Ruttner 4ea4712d4d refactor: performance and throughput (#756)
Refactors the queueing logic to be fairly balanced between actions, with each action backed by a separate FIFO queue. Also adds support for priority queueing and custom queues, though those aren't exposed on the API layer yet. Improves throughput to more than 5000 tasks/second on a single queue.

---------

Co-authored-by: Alexander Belanger <alexander@hatchet.run>
2024-08-12 14:38:47 +00:00
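
Note on #756: the commit message describes fair balancing between actions, with one FIFO queue per action. The log does not show the implementation, so the following is only a minimal in-memory sketch of that idea, not the project's code. The class name `FairActionQueues` and its methods are hypothetical, and priority queueing and custom queues are deliberately left out.

```python
from collections import OrderedDict, deque


class FairActionQueues:
    """Hypothetical sketch: one FIFO queue per action, drained round-robin."""

    def __init__(self):
        # Maps action name -> deque of pending tasks (FIFO within an action).
        self._queues = OrderedDict()

    def enqueue(self, action, task):
        # Create the per-action queue on first use, then append at the tail.
        self._queues.setdefault(action, deque()).append(task)

    def dequeue_batch(self, limit):
        """Take up to `limit` tasks, at most one per action per pass."""
        batch = []
        while len(batch) < limit and self._queues:
            # Snapshot the keys so emptied queues can be removed mid-pass.
            for action in list(self._queues):
                if len(batch) >= limit:
                    break
                queue = self._queues[action]
                batch.append((action, queue.popleft()))
                if not queue:
                    del self._queues[action]
        return batch


if __name__ == "__main__":
    queues = FairActionQueues()
    for task in (1, 2, 3):
        queues.enqueue("a", task)
    queues.enqueue("b", 10)
    # A busy action "a" does not starve action "b".
    print(queues.dequeue_batch(3))  # [('a', 1), ('b', 10), ('a', 2)]
```

Taking one task per action per pass is what keeps a backlog on one action from delaying the others; ordering within each action stays first-in, first-out.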
Viktor Szépe 0948598749 Fix typos (#775) 2024-08-10 10:58:33 +00:00
abelanger5 652f604873 fix: add max msg size as env var (#759) 2024-08-01 09:25:05 -04:00
Gabe Ruttner b4670af138 Fix qos otel config (#754)
* feat: otel trace id ratio

* feat: rabbitmq qos

* feat: requeue limit

* fix: tests
2024-07-30 18:11:10 -04:00
Gabe Ruttner b802f9f45f feat: stream by addl meta (#751)
* feat: prop schedule and run

* wip

* fix: filter wfrid

* feat: hangup

* chore: rm debug log

* chore: func name

* fix: cancelled payload

* fix: load

* fix: cleanup the cache

* fix: single proto

* fix: key -> val

* chore: case

* chore: rm dead code

* chore: rm dead code

* feat: go and docs

* fix: docs
2024-07-29 19:09:51 +00:00
abelanger5 a245151d91 feat: add workflow kind to workflow versions (#750)
* feat: support workflow kinds

* chore: generate
2024-07-29 12:07:34 -07:00
Gabe Ruttner 5c5f1c5b7b feat: prop schedule and run (#749) 2024-07-29 06:38:53 -07:00
abelanger5 9efd9368fd feat: deduplicated enqueue error and additional context methods (#747)
* feat: additional context fields and dedupe error

* fix: case on error properly
2024-07-26 18:32:56 +00:00
abelanger5 aafdd278db make max msg size configurable (#745) 2024-07-26 10:58:16 -07:00
Gabe Ruttner fd947cb5bc feat: go worker assignment (#741)
* feat: create worker with label

* feat: worker context

* feat: dynamic labels

* feat: affinity

* fix: ptr

* fix: nil labels

* feat: sticky dag

* feat: sticky docs

* feat: sticky children

* chore: lint

* fix: tests

* fix: possibly nil workerId

* chore: cleanup unneeded pointers
2024-07-26 10:19:11 -07:00
abelanger5 1ea4dfc5de feat: deduplicated enqueue (#735)
* wip

* wip: functional query

* feat: expose affinity config

* feat: add weight to proto

* feat: upsert affinity state on worker start

* fix: linting

* feat: add upsert proto

* feat: upsert handler

* feat: revise model

* fix: labels

* feat: functional desired worker

* wip: ui

* feat: add state to step run events

* fix: filter empty keys

* fix: labels as badges

* feat: empty state and descriptive text

* chore: add todo

* chore: whitespace

* chore: cleanup

* chore: cleanup

* chore: fix hash

* chore: squash migrations

* fix: fair worker assignment

* fix: remaining slots on valid desired workers

* wip: sticky

* fix: count slots

* chore: rm log line

* feat: expose sticky config

* wip: sticky dag

* feat: expose desired worker id to trigger

* feat: trigger on desired worker

* feat: typescript docs

* feat: sticky python

* feat: py sticky children

* wip: py affinity

* serverless note

* feat: complete python examples

* linting

* feat: deduplicated enqueue

* fix: address changes from PR review

* chore: generate

---------

Co-authored-by: gabriel ruttner <gabriel.ruttner@gmail.com>
2024-07-26 16:47:46 +00:00
Gabe Ruttner 2711fb84cb fix: small resolver (#742) 2024-07-26 09:31:09 -07:00
abelanger5 77193df928 feat: expose additional fields on assigned action (#738) 2024-07-26 09:17:05 -07:00
abelanger5 c0b01f1b9b fix: workflow runs replays and show workflow run input (#744) 2024-07-25 17:35:10 +00:00
Gabe Ruttner ee68786d69 feat: sticky workers (#695)
* wip

* wip: functional query

* feat: expose affinity config

* feat: add weight to proto

* feat: upsert affinity state on worker start

* fix: linting

* feat: add upsert proto

* feat: upsert handler

* feat: revise model

* fix: labels

* feat: functional desired worker

* wip: ui

* feat: add state to step run events

* fix: filter empty keys

* fix: labels as badges

* feat: empty state and descriptive text

* chore: add todo

* chore: whitespace

* chore: cleanup

* chore: cleanup

* chore: fix hash

* chore: squash migrations

* fix: fair worker assignment

* fix: remaining slots on valid desired workers

* wip: sticky

* fix: count slots

* chore: rm log line

* feat: expose sticky config

* wip: sticky dag

* feat: expose desired worker id to trigger

* feat: trigger on desired worker

* feat: typescript docs

* feat: sticky python

* feat: py sticky children

* wip: py affinity

* serverless note

* feat: complete python examples

* linting

* fix: doc link

* chore: rm debug log

* fix: simplify list labels

* fix: typo
2024-07-22 17:20:23 -04:00