81 Commits

Author SHA1 Message Date
Matt Kaye
1eeb8e915d Fix: Queue blocking on many concurrency keys + failures (#1622)
* fix: move around case ordering

* feat: move worker fixture around

* fix: clean up fixtures

* feat: expand test, try to repro

* debug: more minimal repro

* fix: bug bashing

* fix: factor out concurrency queries into overwrite

* feat: improve test

* fix: improve test

* fix: lint

* feat: migration for trigger

* Fix: Retry + Cancel Bugs (#1620)

* fix: send original retry count to cancel

* fix: key threads, contexts, and tasks on retry count

* fix: store the key on the action and use it everywhere

* refactor: signature consistency

* fix: instrumentor types

* chore: version

* feat: comment

* fix: thank you mypy

* fix: simplify callback

* fix: ts implementation

* chore: lint

* fix: rework how retries are passed

* Fix: Add Retries to Log Pushes (#1594)

* fix: retry log pushes

* chore: version
2025-04-25 21:49:30 -04:00
Matt Kaye
80137736af Feat: Priority (#1513)
* feat: initial work wiring up priorities

* fix: add default to default prio in the db

* feat: wire priority through api on wf creation

* feat: extend python test

* feat: priority for scheduled workflows

* feat: wire priority through python api

* feat: more wiring priority through the api

* feat: I think it works?

* feat: e2e test for priority

* it works!

* feat: expand tests for default priorities

* feat: e2e scheduling test

* fix: skip broken test for now

* fix: lint

* feat: add priority columns to cron and schedule ref tables

* feat: update inserts to include prio

* feat: wire up more apis

* feat: more wiring

* feat: wire up more rest api fields

* chore: cruft

* fix: more wiring

* fix: lint

* chore: gen + wire up priorities

* fix: retries

* fix: try changing fixture scope

* chore: bump version again

* feat: send priority with action payload

* fix: generate script

* Feat priority ts (#1518)

* feat: initial work wiring up priorities

* fix: add default to default prio in the db

* feat: wire priority through api on wf creation

* feat: extend python test

* feat: priority for scheduled workflows

* feat: wire priority through python api

* feat: more wiring priority through the api

* feat: I think it works?

* feat: e2e test for priority

* it works!

* feat: expand tests for default priorities

* feat: e2e scheduling test

* chore: minor version for priority

* fix: skip broken test for now

* fix: lint

* feat: add priority columns to cron and schedule ref tables

* feat: update inserts to include prio

* feat: wire up more apis

* feat: more wiring

* feat: wire up more rest api fields

* chore: cruft

* fix: more wiring

* fix: lint

* chore: gen + wire up priorities

* fix: increase timeout

* fix: retries

* fix: try changing fixture scope

* chore: generate

* fix: set schedule priority

* feat: priority

* fix: move priority to wf

* release: 1.2.0

* rm log

* fix: import

* fix: add priority to step

---------

Co-authored-by: mrkaye97 <mrkaye97@gmail.com>

* fix: add dummy runs to priority test to prevent race conditions

* fix: non-breaking field

* fix: gen

* feat: initial pass at docs

* feat: priority in go sdk

* feat: initial work on go example

* fix: doc examples

* fix: proofread

* chore: version

* feat: go sdk

* fix: lint

* fix: declarations and add back RunAsChild

* fix: child workflows

* fix: namespace

* fix: faster child workflows

* fix: sticky

* add back run as child

---------

Co-authored-by: Gabe Ruttner <gabriel.ruttner@gmail.com>
Co-authored-by: Alexander Belanger <alexander@hatchet.run>
2025-04-14 16:22:00 -04:00
abelanger5
d4e489996c fix: v1 edge cases on concurrency, go SDK, parent outputs (#1497)
* fix: v1 edge cases on concurrency, go SDK, parent outputs

* fix: overflow on queue metrics

* revert changes to DAG

* fix: remove prefix on error for Result method

* cleanup schema, fix migrations

* fix panic edge case
2025-04-07 08:19:13 -04:00
abelanger5
b6d077f96d feat: show concurrency queue counts in the UI (#1495)
* feat: show concurrency queue counts in the UI

* fix: parent concurrency queues
2025-04-04 12:22:14 -04:00
abelanger5
1bad2de35f fix: dag statuses should wait for all tasks to be created (#1428) 2025-03-27 15:22:01 -07:00
abelanger5
c54bf9266c feat(v1): tenant limits (#1388)
* feat(v1): tenant limits

* fix: migration

* fix: kill metered cache
2025-03-23 19:03:55 -07:00
abelanger5
00c4bbff09 feat(v1): new gRPC API endpoints (#1367)
* wip: api contracts

* feat: implement put workflow version endpoint

* add support for match existing data, get scaffolding in place for additional triggers

* create additional matches

* feat: durable sleep, user event matching

* update protos

* fix: working poc of user events, durable sleep

* add migration

* fix: migration column

* feat: durable event listener

* fix: skip overrides

* fix: input -> output
2025-03-23 18:58:20 -07:00
abelanger5
e91047d7b3 feat: add back tenant alerting to v1 (#1372) 2025-03-19 17:50:42 -04:00
abelanger5
4164b80f13 fix: race conditions on retries w/out backoff and concurrency keys (#1368) 2025-03-19 12:36:36 -04:00
Gabe Ruttner
3670b94fc4 Feat v1 UI tweaks (#1344)
* fix: drop uncached loader

* feat: upgrade modal

* add beta

* hacky feature flag

* fix: build

* refetch interval

* 5s

* stop flashing on load

* lint

* fix: map

* fix: last redir

* nil check

* small styling and wording things, change default canUpgrade -> true

* switch link to github discussion

---------

Co-authored-by: Alexander Belanger <alexander@hatchet.run>
2025-03-15 09:23:32 -04:00
abelanger5
4cbde4405a fix: more v1 bug bashing (#1334) 2025-03-13 17:13:04 -04:00
abelanger5
ac968e94b8 fix: concurrency issues and a few small improvements (#1324) 2025-03-12 16:30:34 -04:00
abelanger5
1f2096313d feat: v1 engine (#1318) 2025-03-11 14:57:13 -04:00
Gabe Ruttner
0e91542d87 wip: backoff state (#1225)
* wip: backoff state

* fix: retry state and step run start condition

* fix: missing key

* fix: gen

* chore: squash migration

* chore: rm todos

* ops: upgrade proto
2025-01-28 19:16:12 +00:00
abelanger5
dcb67a1dac feat: postgres-backed message queue (#1119) 2024-12-18 09:00:54 -05:00
abelanger5
e12e700980 feat: CANCEL_NEWEST strategy and make cancel in progress more reliable (#1127) 2024-12-18 01:40:14 +00:00
Gabe Ruttner
23c8523a28 fix: add missing index to LogLines table (#1124) 2024-12-16 17:45:28 -05:00
Sean Reilly
e32f353587 Speed up the delete worker query (#1103)
* add an index on lastHeartbeatAt and don't do highly related actions concurrently

---------

Co-authored-by: Sean Reilly <sean@hatchet.run>
2024-12-12 20:49:22 -05:00
abelanger5
db6558a8a8 fix: v0.52.5 migrations (#1094) 2024-12-05 15:31:07 -05:00
abelanger5
b0c6c7cd46 feat(go-sdk): cron and schedules API, minor fixes (#1083)
* feat(go-sdk): cron and schedules API, minor fixes

* try to improve code block and docs

* revert pre-commit

* fix: generate

* fix: put overflow in right place

* remove branch specs
2024-12-04 21:18:05 +00:00
abelanger5
e6cd65300f fix: make cron migration more performant (#1076)
* fix: make cron migration more performant

* whitespace
2024-11-27 19:50:00 +00:00
abelanger5
fbbe02fa33 fix: revert previous migration for new build of 0.52.0 (#1072)
* fix: revert previous migration for new build of 0.52.0

* also remove identityId
2024-11-25 14:03:36 -05:00
Gabe Ruttner
574eb0b67e feat: dynamic crons (#1000)
* wip: stub schedule page

* wip: stub list

* fix: 2025 bug...

* feat: wip cron list

* feat: addl meta

* feat: expose metadata column

* feat: sort and created at

* cron to recurring

* scheduled: with statuses

* fix: links

* feat: expose schedule ids

* feat: delete run

* fix: remove search

* feat: filterable scheduled

* fix: remove broken features

* chore: lint

* rm metadata for now

* chore: lint

* chore: recurring to cron job

* fix: review comments

* fix: populator

* wip cron changes

* fix: ids are helpful

* fix: populator

* wip

* wip: create crons, stub scheduled

* wip: create schedule

* wip add trigger buttons to all the pages

* wip: reusable trigger form

* fix: hash

* fixes: cron bugs

* fixes: cron sort

* fix: out of order migrations

* fix: add internalRetryCount

* feat: api things survive version transitions

* feat: table things

* feat: delete disabled for non api

* feat: prevent delete non api

* feat: filters

* require cron name for api

* default name

* fix: migrations

* frontend improvements and migrations

* fix: pagination

---------

Co-authored-by: Alexander Belanger <alexander@hatchet.run>
2024-11-21 16:18:24 -05:00
abelanger5
197bdd1f88 feat: exponential backoff (#1062)
* initial migration

* feat: exp backoff, fix linting

* fix utc issue and cleanup
2024-11-21 13:39:02 -05:00
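
Note on the exponential backoff entry above: the commit message does not spell out the formula, so the following is only a generic sketch of exponential backoff, not the project's implementation. The names `backoff_seconds`, `factor`, and `max_seconds`, and their default values, are assumptions made for illustration.

```python
def backoff_seconds(retry_count, factor=2.0, max_seconds=60.0):
    """Delay before retry number `retry_count` (1-based), capped at max_seconds.

    Hypothetical helper: `factor` and `max_seconds` are illustrative
    parameters, not names taken from the commit.
    """
    if retry_count < 1:
        raise ValueError("retry_count must be >= 1")
    # The delay grows geometrically with each retry, then saturates at the cap.
    return min(max_seconds, factor ** retry_count)


# Delays for the first six retries with the illustrative defaults.
delays = [backoff_seconds(n) for n in range(1, 7)]
print(delays)
```

The cap keeps late retries from waiting without bound; without it the sixth retry in this sketch would wait 64 seconds rather than 60.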
abelanger5
06c4f642ea fix: out of order migrations (#1061)
* fix: out of order migrations

* fix: add internalRetryCount
2024-11-21 10:18:07 -05:00
Sean Reilly
42afe083cf Partition Step Run and Remove Prisma (#982)
* add in the migration for now

* Update step_runs.sql

remove TODO

* change the schema so we don't undo it

* add the migration for step run partition. remove prisma. add a helper task for recreating the db

* do a manual merge of the schema.sql

* add in the serial

* update docs

* PR feedback

* add Identity to all tables that don't have a Bigserial

* do the atlas hash with the new migration

* squash the migrations

---------

Co-authored-by: Sean Reilly <sean@hatchet.run>
2024-11-20 15:20:36 -08:00
Sean Reilly
9a5acc5179 modify the Event created at to be a clock_timestamp instead of a transaction timestamp so we maintain ordering of inserted events - also extend the length of the timestamp so we have enough significant bits (#1044)
* add the migration for the timestamp and clock

* regenerate

---------

Co-authored-by: Sean Reilly <sean@hatchet.run>
2024-11-14 11:15:45 -08:00
abelanger5
780496e7fb fix: prevent infinite reassign loop (#1028) 2024-11-07 17:28:12 +00:00
Gabe Ruttner
44addbb47e Feat scheduled improvements (#992)
* wip: stub schedule page

* wip: stub list

* fix: 2025 bug...

* feat: wip cron list

* feat: addl meta

* feat: expose metadata column

* feat: sort and created at

* cron to recurring

* scheduled: with statuses

* fix: links

* feat: expose schedule ids

* feat: delete run

* fix: remove search

* feat: filterable scheduled

* fix: remove broken features

* chore: lint

* rm metadata for now

* chore: lint

* chore: recurring to cron job

* fix: review comments

* fix: populator
2024-11-01 07:16:20 -04:00
Gabe Ruttner
4932e7f863 Feat sdk runtime (#942)
* feat: runtime signature

* feat: add sdk runtime to worker model

* feat: post runtime

* feat: expose sdk version on worker

* feat: go inf

* chore: gen

* chore: migrations and generation

* fix: simpler runtime

* feat: hatchet sdk ver

* fix: rm debug line
2024-10-28 13:47:12 -07:00
abelanger5
718d8f59c9 fix: rewrite queries for checking child workflows (#983)
* rewrite queries for child workflows

* add index

* fix: remove tenant id where it's not needed
2024-10-23 19:18:26 -04:00
abelanger5
2cdee59aea refactor: optimize v0.50.0 release (#975)
- Simplifies architecture for splitting engine services into different components. The three supported services are now `grpc-api`, `scheduler`, and `controllers`. The `grpc-api` service is the only one which needs to be exposed for workers. The other two can run as unexposed services.
- Fixes a set of bugs and race conditions in the `v2` scheduler
- Adds a `lastActive` time to the `Queue` table and includes a migration which sets this `lastActive` time for the most recent 24 hours of queues. Effectively this means that the max scheduling time in a queue is 24 hours. 
- Rewrites the `ListWorkflowsForEvent` query to improve performance and select far fewer rows.
2024-10-23 12:05:16 +00:00
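
Note on the `lastActive` item in the entry above: one reading of that note is that only queues active within the last 24 hours are considered for scheduling. A minimal sketch of that filter follows, assuming a hypothetical in-memory mapping of queue name to last-active timestamp; the function name, the data shape, and the sample queue names are illustrative and do not come from the commit.

```python
from datetime import datetime, timedelta, timezone

# The note above states 24 hours; the constant name is hypothetical.
ACTIVE_WINDOW = timedelta(hours=24)


def active_queues(queues, now, window=ACTIVE_WINDOW):
    """Return the names of queues whose last-active time is inside the window.

    `queues` maps a queue name to its last-active timestamp (assumed shape).
    """
    cutoff = now - window
    return sorted(name for name, last_active in queues.items() if last_active >= cutoff)


now = datetime(2024, 10, 23, 12, 0, tzinfo=timezone.utc)
queues = {
    "default": now - timedelta(minutes=5),
    "emails": now - timedelta(hours=23),
    "stale": now - timedelta(hours=30),  # outside the window, so skipped
}
print(active_queues(queues, now))
```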
Gabe Ruttner
7cd08077d5 feat: improved sdk ack (#931)
* feat: add step run event reasons

* feat: ack

* fix: remove rejected reason

* fix: merge

* fix: correct buffer

* fix: consistent message

* chore: rm todo
2024-10-15 15:52:42 +00:00
abelanger5
67a96d7166 feat(throughput): single process per queue (#956)
* feat(throughput): single process per queue

* fix data race

* fix: golint and data race on load test

* wrap up initial v2 scheduler

* fix: more debug logs and tighten channel logic/blocking sends

* improved casing on dispatcher and lease manager

* fix: data race on min id

* increase wait on load test, fix data race

* fix: trylock -> lock

* clean up queue when no longer in set

* fix: clean up cache on exit

* ensure cleanup is only called once

* address review comments
2024-10-15 11:05:19 -04:00
Sean Reilly
29721cd1f0 Feat bulk workflows (#940)
Adds support for inserting workflows in bulk via the API and an optional buffered insert on the engine.
2024-10-14 15:35:29 -04:00
Gabe Ruttner
c8711f7f83 fix: id constraint (#957)
* fix: id constraint

* chore: gen
2024-10-11 18:00:12 -04:00
Gabe Ruttner
3340ec8626 fix: event keys (#951)
* feat: insert unique event keys

* fix: list query

* feat: bulk

* chore: gen
2024-10-10 08:54:52 -04:00
abelanger5
fd4ee804d3 refactor: buffered writes of step run statuses (#941)
* (wip) handle step run updates without deferred updates

* refactor: buffered writes of step run statuses

* fix: add more safety on tenant pools

* add configurable flush period, remove wait for started

* flush immediately if last flush time plus flush period is in the past

* feat: add configurable flush interval/max items
2024-10-04 15:08:21 -04:00
Sean Reilly
27736fa30f bulk insert buffering (#913)
Adds bulk inserts to event writes, and adds a generic buffer which can be used by future batch implementations.
2024-10-03 16:26:12 -04:00
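
Note on the two buffering entries above: they mention a generic buffer, a configurable flush period and item limit, and flushing immediately when the last flush time plus the flush period is in the past. The sketch below illustrates that rule under stated assumptions. The class `FlushBuffer`, its method names, and the choice to pass the clock in as `now` are all hypothetical and are not taken from the commits.

```python
class FlushBuffer:
    """Collects items and returns them in batches (illustrative sketch only)."""

    def __init__(self, flush_period, max_items, now=0.0):
        self.flush_period = flush_period  # seconds allowed between flushes
        self.max_items = max_items        # size threshold that forces a flush
        self.last_flush = now
        self.items = []

    def add(self, item, now):
        """Buffer one item; return a batch if a flush is due, otherwise None."""
        self.items.append(item)
        # Flush at once if the last flush time plus the period is already in the past.
        overdue = self.last_flush + self.flush_period < now
        full = len(self.items) >= self.max_items
        if overdue or full:
            return self.flush(now)
        return None

    def flush(self, now):
        """Hand back everything buffered so far and reset the flush clock."""
        batch, self.items = self.items, []
        self.last_flush = now
        return batch


buf = FlushBuffer(flush_period=1.0, max_items=3)
r1 = buf.add("a", now=0.1)  # buffered
r2 = buf.add("b", now=0.2)  # buffered
r3 = buf.add("c", now=0.3)  # item limit reached, batch returned
r4 = buf.add("d", now=2.0)  # flush period elapsed, batch returned
print(r1, r2, r3, r4)
```

Passing `now` explicitly keeps the sketch deterministic; a real buffer would more likely read a clock and flush from a background timer as well.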
abelanger5
117533c1b5 fix: remove more fks (#922)
* fix: remove more fks

* chore: generate
2024-09-30 16:53:38 -04:00
Gabe Ruttner
7d7e43d4e1 feat: pauseable workflows (#879)
* feat: pause workflow state

* feat: dont run paused workflows

* feat: skipped paused

* implement unpaused behavior for workflow runs

* fix: frontend

* fix: more frontend

* fix: imports

---------

Co-authored-by: Alexander Belanger <alexander@hatchet.run>
2024-09-29 10:58:10 -04:00
abelanger5
6172956bbd refactor: remove foreign keys from unchanged/non-cascading parent tables (#918)
* refactor: remove fks from unchanged/non-cascading parent tables

* fix: cleanup cache for engine repository

* fix: remove streamevent
2024-09-27 14:21:45 -04:00
abelanger5
a1a10b4073 feat: dynamic rate limits (#904)
* wip: step run expressions on rate limits

* feat: dynamic rate limits

* chore: v0.47.0

* chore: address changes from PR review

* fix: improved error handling

* address pr review

* better error messages for step run cels, remove debug logs

* fix: hash

---------

Co-authored-by: gabriel ruttner <gabriel.ruttner@gmail.com>
2024-09-26 22:00:34 +00:00
abelanger5
baf13bd577 fix: duration int -> bigint (#902) 2024-09-23 08:30:16 -07:00
abelanger5
d23e5d9963 feat: expression-based concurrency keys (#889)
* feat: expression-based concurrency keys

* fix: build

* fix: typos

* fix: gen

* fix: migration

* fix: remove print statements

* fix: reassignment bugs, retries on closed transport, pr review
2024-09-19 10:32:22 -04:00
Gabe Ruttner
c64c62f66a feat: improved workflow run details page (#821)
* wip: rip prisma

* wip

* wip

* fix: lint

* wip

* wip

* gen

* wip

* wip

* fix trigger

* hide overview

* revert db changes

* feat: wrap up frontend changes and perf

* chore: generate

* chore: frontend build

* fix: workflow transformer

* fix: avoid race conditions on simultaneous parent completions

* fix: 2025 started

* feat: toast for replay/cancel

* fix: toast

---------

Co-authored-by: Alexander Belanger <alexander@hatchet.run>
2024-09-16 15:39:49 +00:00
abelanger5
bed2cb559a fix: add back sem slots, without row contention (#868)
* fix: add back sem slots, without row contention

* fix: serialize queue step runs to prevent dirty reads

* remove serializable for now

* statement timeouts on create workflow run

* statement timeout for reassign

* proper migration + cleanup

* remove old tables and code

* fix: worker slot state

* remove last unused table from workers
2024-09-11 20:47:49 +00:00
abelanger5
f4c5cd973e feat: more efficient step run timeouts (#863) 2024-09-10 18:23:11 -04:00
abelanger5
478c897035 fix: proper migration to counts, startable step runs query (#850) 2024-09-08 12:48:51 -04:00
abelanger5
891514b461 feat: queue v4 (#842)
* wip: v4 of queue

* fix: correct query for updating counts

* tmp: save migration files

* feat: wrap up initial queue

* fix compilation

* fix: reassigns
2024-09-06 16:12:22 -04:00