70 Commits

Author SHA1 Message Date
abelanger5
ac968e94b8 fix: concurrency issues and a few small improvements (#1324) 2025-03-12 16:30:34 -04:00
abelanger5
1f2096313d feat: v1 engine (#1318) 2025-03-11 14:57:13 -04:00
Gabe Ruttner
0e91542d87 wip: backoff state (#1225)
* wip: backoff state

* fix: retry state and step run start condition

* fix: missing key

* fix: gen

* chore: squash migration

* chore: rm todos

* ops: upgrade proto
2025-01-28 19:16:12 +00:00
abelanger5
dcb67a1dac feat: postgres-backed message queue (#1119) 2024-12-18 09:00:54 -05:00
abelanger5
e12e700980 feat: CANCEL_NEWEST strategy and make cancel in progress more reliable (#1127) 2024-12-18 01:40:14 +00:00
Gabe Ruttner
23c8523a28 fix: add missing index to LogLines table (#1124) 2024-12-16 17:45:28 -05:00
Sean Reilly
e32f353587 Speed up the delete worker query (#1103)
* add an index on lastHeartbeatAt and don't do highly related actions concurrently

---------

Co-authored-by: Sean Reilly <sean@hatchet.run>
2024-12-12 20:49:22 -05:00
abelanger5
db6558a8a8 fix: v0.52.5 migrations (#1094) 2024-12-05 15:31:07 -05:00
abelanger5
b0c6c7cd46 feat(go-sdk): cron and schedules API, minor fixes (#1083)
* feat(go-sdk): cron and schedules API, minor fixes

* try to improve code block and docs

* revert pre-commit

* fix: generate

* fix: put overflow in right place

* remove branch specs
2024-12-04 21:18:05 +00:00
abelanger5
e6cd65300f fix: make cron migration more performant (#1076)
* fix: make cron migration more performant

* whitespace
2024-11-27 19:50:00 +00:00
abelanger5
fbbe02fa33 fix: revert previous migration for new build of 0.52.0 (#1072)
* fix: revert previous migration for new build of 0.52.0

* also remove identityId
2024-11-25 14:03:36 -05:00
Gabe Ruttner
574eb0b67e feat: dynamic crons (#1000)
* wip: stub schedule page

* wip: stub list

* fix: 2025 bug...

* feat: wip cron list

* feat: addl meta

* feat: expose metadata column

* feat: sort and created at

* cron to recurring

* scheduled: with statuses

* fix: links

* feat: expose schedule ids

* feat: delete run

* fix: remove search

* feat: filterable scheduled

* fix: remove broken features

* chore: lint

* rm metadata for now

* chore: lint

* chore: recurring to cron job

* fix: review comments

* fix: populator

* wip cron changes

* fix: ids are helpful

* fix: populator

* wip

* wip: create crons, stub scheduled

* wip: create schedule

* wip add trigger buttons to all the pages

* wip: reusable trigger form

* fix: hash

* fixes: cron bugs

* fixes: cron sort

* fix: out of order migrations

* fix: add internalRetryCount

* feat: api things survive version transitions

* feat: table things

* feat: delete disabled for non api

* feat: prevent delete non api

* feat: filters

* require cron name for api

* default name

* fix: migrations

* frontend improvements and migrations

* fix: pagination

---------

Co-authored-by: Alexander Belanger <alexander@hatchet.run>
2024-11-21 16:18:24 -05:00
abelanger5
197bdd1f88 feat: exponential backoff (#1062)
* initial migration

* feat: exp backoff, fix linting

* fix utc issue and cleanup
2024-11-21 13:39:02 -05:00
abelanger5
06c4f642ea fix: out of order migrations (#1061)
* fix: out of order migrations

* fix: add internalRetryCount
2024-11-21 10:18:07 -05:00
Sean Reilly
42afe083cf Partition Step Run and Remove Prisma (#982)
* add in the migration for now

* Update step_runs.sql

remove TODO

* change the schema so we don't undo it

* add the migration for step run partition. remove prisma. add a helper task for recreating the db

* do a manual merge of the schema.sql

* add in the serial

* update docs

* PR feedback

* add Identity to all tables that don't have a Bigserial

* do the atlas hash with the new migration

* squash the migrations

---------

Co-authored-by: Sean Reilly <sean@hatchet.run>
2024-11-20 15:20:36 -08:00
Sean Reilly
9a5acc5179 modify the Event created at to be a clock_timestamp instead of a transaction timestamp so we maintain ordering of inserted events - also extend the length of the timestamp so we have enough significant bits (#1044)
* add the migration for the timestamp and clock

* regenerate

---------

Co-authored-by: Sean Reilly <sean@hatchet.run>
2024-11-14 11:15:45 -08:00
abelanger5
780496e7fb fix: prevent infinite reassign loop (#1028) 2024-11-07 17:28:12 +00:00
Gabe Ruttner
44addbb47e Feat scheduled improvements (#992)
* wip: stub schedule page

* wip: stub list

* fix: 2025 bug...

* feat: wip cron list

* feat: addl meta

* feat: expose metadata column

* feat: sort and created at

* cron to recurring

* scheduled: with statuses

* fix: links

* feat: expose schedule ids

* feat: delete run

* fix: remove search

* feat: filterable scheduled

* fix: remove broken features

* chore: lint

* rm metadata for now

* chore: lint

* chore: recurring to cron job

* fix: review comments

* fix: populator
2024-11-01 07:16:20 -04:00
Gabe Ruttner
4932e7f863 Feat sdk runtime (#942)
* feat: runtime signature

* feat: add sdk runtime to worker model

* feat: post runtime

* feat: expose sdk version on worker

* feat: go inf

* chore: gen

* chore: migrations and generation

* fix: simpler runtime

* feat: hatchet sdk ver

* fix: rm debug line
2024-10-28 13:47:12 -07:00
abelanger5
718d8f59c9 fix: rewrite queries for checking child workflows (#983)
* rewrite queries for child workflows

* add index

* fix: remove tenant id where it's not needed
2024-10-23 19:18:26 -04:00
abelanger5
2cdee59aea refactor: optimize v0.50.0 release (#975)
- Simplifies the architecture for splitting engine services into different components. The three supported services are now `grpc-api`, `scheduler`, and `controllers`. The `grpc-api` service is the only one that needs to be exposed to workers. The other two can run as unexposed services.
- Fixes a set of bugs and race conditions in the `v2` scheduler.
- Adds a `lastActive` time to the `Queue` table and includes a migration which sets this `lastActive` time for the most recent 24 hours of queues. Effectively, this means that the maximum scheduling time in a queue is 24 hours.
- Rewrites the `ListWorkflowsForEvent` query to improve performance and select far fewer rows.
2024-10-23 12:05:16 +00:00
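
The `lastActive` note in #975 can be illustrated with a minimal sketch. It assumes the scheduler only considers queues whose `lastActive` timestamp falls inside a 24-hour window. The function name, the dictionary shape, and the queue names are hypothetical and are not taken from the repository.

```python
from datetime import datetime, timedelta, timezone

# Assumed window, taken from the 24-hour figure in the commit description.
ACTIVE_WINDOW = timedelta(hours=24)


def recently_active(queues, now, window=ACTIVE_WINDOW):
    """Return the names of queues whose last-active time is inside the window.

    `queues` maps a queue name to its last-active timestamp. This shape is
    illustrative only and does not mirror the real Queue table.
    """
    cutoff = now - window
    return sorted(name for name, last_active in queues.items() if last_active >= cutoff)


now = datetime(2024, 10, 23, 12, 0, tzinfo=timezone.utc)
queues = {
    "default": now - timedelta(hours=1),
    "emails": now - timedelta(hours=23),
    "stale": now - timedelta(hours=30),  # older than the window, so skipped
}
print(recently_active(queues, now))
```

Under this assumption, a queue that has been idle for more than 24 hours would simply drop out of the set the scheduler looks at.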
Gabe Ruttner
7cd08077d5 feat: improved sdk ack (#931)
* feat: add step run event reasons

* feat: ack

* fix: remove rejected reason

* fix: merge

* fix: correct buffer

* fix: consistent message

* chore: rm todo
2024-10-15 15:52:42 +00:00
abelanger5
67a96d7166 feat(throughput): single process per queue (#956)
* feat(throughput): single process per queue

* fix data race

* fix: golint and data race on load test

* wrap up initial v2 scheduler

* fix: more debug logs and tighten channel logic/blocking sends

* improved casing on dispatcher and lease manager

* fix: data race on min id

* increase wait on load test, fix data race

* fix: trylock -> lock

* clean up queue when no longer in set

* fix: clean up cache on exit

* ensure cleanup is only called once

* address review comments
2024-10-15 11:05:19 -04:00
Sean Reilly
29721cd1f0 Feat bulk workflows (#940)
Adds support for inserting workflows in bulk via the API and an optional buffered insert on the engine.
2024-10-14 15:35:29 -04:00
Gabe Ruttner
c8711f7f83 fix: id constraint (#957)
* fix: id constraint

* chore: gen
2024-10-11 18:00:12 -04:00
Gabe Ruttner
3340ec8626 fix: event keys (#951)
* feat: insert unique event keys

* fix: list query

* feat: bulk

* chore: gen
2024-10-10 08:54:52 -04:00
abelanger5
fd4ee804d3 refactor: buffered writes of step run statuses (#941)
* (wip) handle step run updates without deferred updates

* refactor: buffered writes of step run statuses

* fix: add more safety on tenant pools

* add configurable flush period, remove wait for started

* flush immediately if last flush time plus flush period is in the past

* feat: add configurable flush internal/max items
2024-10-04 15:08:21 -04:00
Sean Reilly
27736fa30f bulk insert buffering (#913)
Adds bulk inserts to event writes, and adds a generic buffer which can be used by future batch implementations.
2024-10-03 16:26:12 -04:00
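
A generic buffer of the kind #913 describes might look roughly like the sketch below. It combines the two triggers mentioned in #941: a maximum item count, and flushing immediately when the last flush time plus the flush period is in the past. The class name, parameters, and injected clock are assumptions for illustration, not the engine's actual implementation. A real buffer would probably also flush from a background timer, which this sketch omits.

```python
import time


class FlushBuffer:
    """Collects items and hands them to `flush_fn` in batches."""

    def __init__(self, flush_fn, max_items=100, flush_period=0.01, clock=time.monotonic):
        self._flush_fn = flush_fn          # called with a list of buffered items
        self._max_items = max_items        # flush once this many items are buffered
        self._flush_period = flush_period  # seconds allowed between flushes
        self._clock = clock                # injectable so the example is deterministic
        self._items = []
        self._last_flush = clock()

    def add(self, item):
        self._items.append(item)
        # Flush immediately if last flush time plus flush period is in the past.
        overdue = self._last_flush + self._flush_period <= self._clock()
        if len(self._items) >= self._max_items or overdue:
            self.flush()

    def flush(self):
        if self._items:
            batch, self._items = self._items, []
            self._flush_fn(batch)
        self._last_flush = self._clock()


# Usage with a fake clock: one flush by size, one by elapsed time.
batches = []
fake_now = [0.0]
buf = FlushBuffer(batches.append, max_items=3, flush_period=10.0, clock=lambda: fake_now[0])
for i in range(3):
    buf.add(i)        # the third add reaches max_items and flushes
buf.add(3)            # buffered, not yet overdue
fake_now[0] = 11.0
buf.add(4)            # overdue now, so both buffered items are flushed
print(batches)
```

Batching this way trades a small, bounded delay (at most one flush period) for far fewer round trips to the database.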
abelanger5
117533c1b5 fix: remove more fks (#922)
* fix: remove more fks

* chore: generate
2024-09-30 16:53:38 -04:00
Gabe Ruttner
7d7e43d4e1 feat: pauseable workflows (#879)
* feat: pause workflow state

* feat: dont run paused workflows

* feat: skipped paused

* implement unpaused behavior for workflow runs

* fix: frontend

* fix: more frontend

* fix: imports

---------

Co-authored-by: Alexander Belanger <alexander@hatchet.run>
2024-09-29 10:58:10 -04:00
abelanger5
6172956bbd refactor: remove foreign keys from unchanged/non-cascading parent tables (#918)
* refactor: remove fks from unchanged/non-cascading parent tables

* fix: cleanup cache for engine repository

* fix: remove streamevent
2024-09-27 14:21:45 -04:00
abelanger5
a1a10b4073 feat: dynamic rate limits (#904)
* wip: step run expressions on rate limits

* feat: dynamic rate limits

* chore: v0.47.0

* chore: address changes from PR review

* fix: improved error handling

* address pr review

* better error messages for step run cels, remove debug logs

* fix: hash

---------

Co-authored-by: gabriel ruttner <gabriel.ruttner@gmail.com>
2024-09-26 22:00:34 +00:00
abelanger5
baf13bd577 fix: duration int -> bigint (#902) 2024-09-23 08:30:16 -07:00
abelanger5
d23e5d9963 feat: expression-based concurrency keys (#889)
* feat: expression-based concurrency keys

* fix: build

* fix: typos

* fix: gen

* fix: migration

* fix: remove print statements

* fix: reassignment bugs, retries on closed transport, pr review
2024-09-19 10:32:22 -04:00
Gabe Ruttner
c64c62f66a feat: improved workflow run details page (#821)
* wip: rip prisma

* wip

* wip

* fix: lint

* wip

* wip

* gen

* wip

* wip

* fix trigger

* hide overview

* revert db changes

* feat: wrap up frontend changes and perf

* chore: generate

* chore: frontend build

* fix: workflow transformer

* fix: avoid race conditions on simultaneous parent completions

* fix: 2025 started

* feat: toast for replay/cancel

* fix: toast

---------

Co-authored-by: Alexander Belanger <alexander@hatchet.run>
2024-09-16 15:39:49 +00:00
abelanger5
bed2cb559a fix: add back sem slots, without row contention (#868)
* fix: add back sem slots, without row contention

* fix: serialize queue step runs to prevent dirty reads

* remove serializable for now

* statement timeouts on create workflow run

* statement timeout for reassign

* proper migration + cleanup

* remove old tables and code

* fix: worker slot state

* remove last unused table from workers
2024-09-11 20:47:49 +00:00
abelanger5
f4c5cd973e feat: more efficient step run timeouts (#863) 2024-09-10 18:23:11 -04:00
abelanger5
478c897035 fix: proper migration to counts, startable step runs query (#850) 2024-09-08 12:48:51 -04:00
abelanger5
891514b461 feat: queue v4 (#842)
* wip: v4 of queue

* fix: correct query for updating counts

* tmp: save migration files

* feat: wrap up initial queue

* fix compilation

* fix: reassigns
2024-09-06 16:12:22 -04:00
abelanger5
b5014f6b3d chore: more visibility and debug lines for queues (#836)
* chore: more visibility and debug options for queues

* better debug lines on queue repo

* don't log so much in load test
2024-08-29 14:49:24 -04:00
Gabe Ruttner
526e7ef308 feat: expose priority queue (#814)
* feat: workflow default priority

* feat: write priority on run

* feat: propagate to queue

* chore: squash migrations

* chore: generate
2024-08-26 14:11:28 -04:00
Gabe Ruttner
53be615d5f Enhancement webhook usability (#807)
* feat: secret copier

* feat: improved form

* fix: quotes

* wip: improved flow

* feat: health check logging

* fix: page design

* fix: hard delete, no upsert

* fix: reset modal state

* fix: empty text

* fix: worker state

* fix: update only token

* fix: dont delete name

* fix: logs component

* fix: sort order

* chore: build

* fix: webhook worker cleanup

* chore: squash migrations

* Update api-contracts/openapi/paths/webhook-worker/webhook-worker.yaml

Co-authored-by: abelanger5 <belanger@sas.upenn.edu>

* chore: rename

* fix: wrong query

---------

Co-authored-by: abelanger5 <belanger@sas.upenn.edu>
2024-08-23 10:09:09 -04:00
Gabe Ruttner
9bea55438a Fix webhook healthcheck race (#797)
* fix: race

* fix: partition no rows

* chore: move to workers tab

* feat: redirect empty worker path to all

* chore: add worker type and webhook id

* fix: upsert webhook worker

* fix: update by webhookId

* fix: only stub on create

* feat: url on worker

* chore: migration version

* fix: move

* fix: upsert

* fix: upert

* chore: fix migration

* fix: migrations

* chore: generate
2024-08-21 19:23:24 +00:00
Gabe Ruttner
651be542c3 fix: migration state (#800)
* fix: migration state

* almost there...

* fix: hack for constraints

* chore: lint
2024-08-21 17:07:00 +00:00
abelanger5
67357cfa64 fix: add correct indexes for get group key runs, improve queries (#786)
* fix: add correct indexes for get group key runs

* chore: generate

* fix: hash
2024-08-20 09:31:30 -04:00
abelanger5
d4d3512a28 fix: add constraint to priority (#777)
* fix: add constraint to priority

* fix: new index

* fix: drop old index

* fix: typo
2024-08-12 17:29:57 +00:00
Gabe Ruttner
4ea4712d4d refactor: performance and throughput (#756)
Refactors the queueing logic to be fairly balanced between actions, with each action backed as a separate FIFO queue. Also adds support for priority queueing and custom queues, though those aren't exposed on the API layer yet. Improves throughput to be > 5000 tasks/second on a single queue. 

---------

Co-authored-by: Alexander Belanger <alexander@hatchet.run>
2024-08-12 14:38:47 +00:00
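
The balancing idea in #756, where each action is backed by its own FIFO queue, can be sketched as a round-robin over per-action queues. This is only an illustration under that assumption. The class, method names, and action names are hypothetical, and priority queueing is left out.

```python
from collections import deque


class ActionQueues:
    """One FIFO queue per action, drained one task per action per round."""

    def __init__(self):
        self._queues = {}  # action name -> deque of tasks, in insertion order

    def enqueue(self, action, task):
        self._queues.setdefault(action, deque()).append(task)

    def dequeue_round(self):
        """Take at most one task from each non-empty action queue."""
        taken = []
        for action in list(self._queues):
            queue = self._queues[action]
            taken.append((action, queue.popleft()))
            if not queue:
                del self._queues[action]  # drop empty queues
        return taken


q = ActionQueues()
for task in ("e1", "e2", "e3"):
    q.enqueue("send_email", task)
q.enqueue("resize_image", "r1")

first = q.dequeue_round()
second = q.dequeue_round()
print(first)
print(second)
```

With a single shared FIFO, `r1` would wait behind all three email tasks. With one queue per action it is picked up in the first round.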
abelanger5
a245151d91 feat: add workflow kind to workflow versions (#750)
* feat: support workflow kinds

* chore: generate
2024-07-29 12:07:34 -07:00
abelanger5
1ea4dfc5de feat: deduplicated enqueue (#735)
* wip

* wip: functional query

* feat: expose affinity config

* feat: add weight to proto

* feat: upsert affinity state on worker start

* fix: linting

* feat: add upsert proto

* feat: upsert handler

* feat: revise model

* fix: labels

* feat: functional desired worker

* wip: ui

* feat: add state to step run events

* fix: filter empty keys

* fix: labels as badges

* feat: empty state and descriptive text

* chore: add todo

* chore: whitespace

* chore: cleanup

* chore: cleanup

* chore: fix hash

* chore: squash migrations

* fix: fair worker assignment

* fix: remaining slots on valid desired workers

* wip: sticky

* fix: count slots

* chore: rm log line

* feat: expose sticky config

* wip: sticky dag

* feat: expose desired worker id to trigger

* feat: trigger on desired worker

* feat: typescript docs

* feat: sticky python

* feat: py sticky children

* wip: py affinity

* serverless note

* feat: complete python examples

* linting

* feat: deduplicated enqueue

* fix: address changes from PR review

* chore: generate

---------

Co-authored-by: gabriel ruttner <gabriel.ruttner@gmail.com>
2024-07-26 16:47:46 +00:00
Gabe Ruttner
ee68786d69 feat: sticky workers (#695)
* wip

* wip: functional query

* feat: expose affinity config

* feat: add weight to proto

* feat: upsert affinity state on worker start

* fix: linting

* feat: add upsert proto

* feat: upsert handler

* feat: revise model

* fix: labels

* feat: functional desired worker

* wip: ui

* feat: add state to step run events

* fix: filter empty keys

* fix: labels as badges

* feat: empty state and descriptive text

* chore: add todo

* chore: whitespace

* chore: cleanup

* chore: cleanup

* chore: fix hash

* chore: squash migrations

* fix: fair worker assignment

* fix: remaining slots on valid desired workers

* wip: sticky

* fix: count slots

* chore: rm log line

* feat: expose sticky config

* wip: sticky dag

* feat: expose desired worker id to trigger

* feat: trigger on desired worker

* feat: typescript docs

* feat: sticky python

* feat: py sticky children

* wip: py affinity

* serverless note

* feat: complete python examples

* linting

* fix: doc link

* chore: rm debug log

* fix: simplify list labels

* fix: typo
2024-07-22 17:20:23 -04:00