hatchet

mirror of https://github.com/hatchet-dev/hatchet.git synced 2026-04-23 10:39:45 -05:00

Author	SHA1	Message	Date
abelanger5	fbbe02fa33	fix: revert previous migration for new build of 0.52.0 (#1072 ) * fix: revert previous migration for new build of 0.52.0 * also remove identityId	2024-11-25 14:03:36 -05:00
Gabe Ruttner	574eb0b67e	feat: dynamic crons (#1000 ) * wip: stub schedule page * wip: stub list * fix: 2025 bug... * feat: wip cron list * feat: addl meta * feat: expose metadata column * feat: sort and created at * cron to recurring * scheduled: with statuses * fix: links * feat: expose schedule ids * feat: delete run * fix: remove search * feat: filterable scheduled * fix: remove broken features * chore: lint * rm metadata for now * chore: lint * chore: recurring to cron job * fix: review comments * fix: populator * wip cron changes * fix: ids are helpful * fix: populator * wip * wip: create crons, stub scheduled * wip: create schedule * wip add trigger buttons to all the pages * wip: reusable trigger form * fix: hash * fixes: cron bugs * fixes: cron sort * fix: out of order migrations * fix: add internalRetryCount * feat: api things survive version transitions * feat: table things * feat: delete disabled for non api * feat: prevent delete non api * feat: filters * require cron name for api * default name * fix: migrations * frontend improvements and migrations * fix: pagination --------- Co-authored-by: Alexander Belanger <alexander@hatchet.run>	2024-11-21 16:18:24 -05:00
Sean Reilly	31e425a858	lets make retry configurable and do not retry for unavailable because the retry is slower than regular heartbeat (#1046 ) Co-authored-by: Sean Reilly <sean@hatchet.run>	2024-11-21 13:39:31 -05:00
abelanger5	197bdd1f88	feat: exponential backoff (#1062 ) * initial migration * feat: exp backoff, fix linting * fix utc issue and cleanup	2024-11-21 13:39:02 -05:00
Sean Reilly	42afe083cf	Partition Step Run and Remove Prisma (#982 ) * add in the migration for now * Update step_runs.sql remove TODO * change the schema so we don't undo it * add the migration for step run partition. remove prisma. add a helper task for recreating the db * do a manual merge of the schema.sql * add in the serial * update docs * PR feedback * add Identity to all tables that don't have a Bigserial * do the atlas hash with the new migration * squash the migrations --------- Co-authored-by: Sean Reilly <sean@hatchet.run>	2024-11-20 15:20:36 -08:00
Sean Reilly	b5de6e26ff	Add a dynamic strategy for flushing as a function of currently flushing (#1055 ) * add a dynamic strategy for flushing where we make the trigger for flush a funciton of the depth of the concurrency * default value for tests and New for FlushStrategy * clean up the currently flushing locking and add deadlock.Mutex * don't wait as long for the buffer * lets see if this 2ms thing is what is causing things to break * lets error for this to see if we are actually hitting these limits * put a really short deadline on the lock timeout to see if github actions will blow up * lets use RW mutexs se we don't block as much * lets extend this out to 100ms * lets just do fewer locks * add a lock to prevent a queue behind the semaphore * deal with potential data races * a simpler loop fib and now locks * lets get rid of the wait for flush * remove the deadlock stuff * mod tidy --------- Co-authored-by: Sean Reilly <sean@hatchet.run>	2024-11-20 19:49:30 +00:00
abelanger5	ae5df5b88d	fix: make race condition on reassignment more rare (#1052 ) * fix: make race condition on reassignment more rare * fix: proper concurrency on bulk dispatch * prevent concurrent err assignments	2024-11-15 14:17:51 -05:00
abelanger5	c40b9154d8	fix: tenant race conditions, cleanup logic, old workers getting assigned (#1050 )	2024-11-15 09:19:36 -05:00
Gabe Ruttner	4eaa9e7fd9	feat: configurable internal retry (#1049 ) * feat: configurable internal retry * fix: bump default to 3	2024-11-15 09:19:24 -05:00
Sean Reilly	9a5acc5179	modify the Event created at to be a clock_timestamp instead of a transaction timestamp so we maintain ordering of inserted events - also extend the length of the timestamp so we have enough significant bits (#1044 ) * add the migration for the timestamp and clock * regenerate --------- Co-authored-by: Sean Reilly <sean@hatchet.run>	2024-11-14 11:15:45 -08:00
Gabe Ruttner	3850964a98	feat: initial doc pages (#1020 ) * generate initial cloud client * feat: initial doc pages * feat: cloud register id, action filtering * feat:cloud register * fix: env var * chore:lint --------- Co-authored-by: Alexander Belanger <alexander@hatchet.run>	2024-11-08 07:46:43 -08:00
abelanger5	48aadc6ace	fix: avoid panics in lease manager (#1029 )	2024-11-07 16:07:01 -05:00
abelanger5	780496e7fb	fix: prevent infinite reassign loop (#1028 )	2024-11-07 17:28:12 +00:00
Gabe Ruttner	c531c36870	fix: filter-cancel-cases (#1027 ) * fix: filter-cancel-cases * fix: case CANCELLED_BY_CONCURRENCY_LIMIT	2024-11-07 11:18:50 -05:00
Alexander Belanger	5b59af076e	fix: cancellation status propagation and minimap view	2024-11-07 11:13:14 -05:00
Gabe Ruttner	3871df01ee	fix: dont bump deleted (#1024 )	2024-11-06 16:11:36 -05:00
Gabe Ruttner	5759311574	fix: ratelimit and invalid output blocking queue (#1023 ) * fix: rm unused offending code, handle unacked * fix: handle invalid outputs * fix: dont reset failed * fix: case on json err * fix: completed step run ids * fix: scope	2024-11-06 18:21:22 +00:00
abelanger5	71e01b3b5a	fix: compute wording and add user callback (#1018 ) * user callbacks and move location of managed workers * rename pools to compute * move managed workers to right fs location, remove prefix on /workers	2024-11-05 20:14:57 +00:00
abelanger5	9d133bc15c	fix: catch all nack cases for rate limits (#1015 ) * fix: properly nack rate limit when failing to schedule * more nack cases	2024-11-05 11:37:47 -05:00
abelanger5	68bc5a0197	fix: unacked messages in the queuer (#1014 ) * fix: when scheduling fails with schedule timeouts, we never ack the queue item * add error line if we don't process everything we pass into the scheduler	2024-11-05 10:27:53 -05:00
abelanger5	75a89d00f0	use essential pool for dispatcher heartbeats too (#1007 )	2024-11-01 08:55:54 -04:00
Gabe Ruttner	abdd81c1eb	fix: orderby (#1008 )	2024-11-01 08:48:09 -04:00
Sean Reilly	b456382429	add multiple rate limiter in grpc using a token bucket (#984 ) * add multiple rate limiter in grpc using a token bucket * PR feedback * add in client retry for go client * update test files * remove log line only retry on ResourceExhausted and Unavailable * add some concurrency limits so we don't swamp ourselves * add some logging for when we are getting backed up * lets not queue up when we are too full to prevent OOM problems * fix spelling * add config options for maximum concurrent and how long to wait for flush , let the wait for flush setting be used as back pressure and a signal to writers that we are slowing up * lots of changes to buffering * fix data race * add some comments explaing how this works, change errors to be ResourceExhausted now that we have client retry and limit how many gofuncs we can create on cleanup and wait for them to finish before we exit * hooking up the config values so they go to the right place * Update config.go to default to 1 ms waitForFlush * disable grpc_retry for client streams * explicitly set the limit if it is 0 * weirdness because we were using an older version of the lib --------- Co-authored-by: Sean Reilly <sean@hatchet.run> Co-authored-by: Alexander Belanger <alexander@hatchet.run>	2024-11-01 11:48:23 +00:00
Gabe Ruttner	1003a1f5e7	fix: filter alert runs by failure only (#1001 ) * fix: filter runs by failure only * fix: post-lookup filter * fix: filtered failures --------- Co-authored-by: Alexander Belanger <alexander@hatchet.run>	2024-11-01 11:46:27 +00:00
Gabe Ruttner	44addbb47e	Feat scheduled improvements (#992 ) * wip: stub schedule page * wip: stub list * fix: 2025 bug... * feat: wip cron list * feat: addl meta * feat: expose metadata column * feat: sort and created at * cron to recurring * scheduled: with statuses * fix: links * feat: expose schedule ids * feat: delete run * fix: remove search * feat: filterable scheduled * fix: remove broken features * chore: lint * rm metadata for now * chore: lint * chore: recurring to cron job * fix: review comments * fix: populator	2024-11-01 07:16:20 -04:00
Sean Reilly	7d5b41b082	add an essential pool for heatbeats (#1003 ) * add an essential pool for heatbeats * add some telemetry spans to heartbeat and capture any errors --------- Co-authored-by: Sean Reilly <sean@hatchet.run>	2024-11-01 07:09:45 -04:00
Sean Reilly	ea682f5c6b	Feat concurrency limit for flush (#991 ) * add some concurrency limits so we don't swamp ourselves * lets not queue up when we are too full to prevent OOM problems * add config options for maximum concurrent and how long to wait for flush , let the wait for flush setting be used as back pressure and a signal to writers that we are slowing up --------- Co-authored-by: Sean Reilly <sean@hatchet.run>	2024-10-31 09:43:21 -07:00
abelanger5	a9936ef687	fix: set otel insecure flag for all telemetry instantiations (#999 )	2024-10-30 17:34:36 -04:00
abelanger5	6158aa2a4c	feat: docs for performance (#997 ) * feat: docs for performance * wrap up perf doc * address review comments	2024-10-29 18:29:03 -04:00
Gabe Ruttner	4932e7f863	Feat sdk runtime (#942 ) * feat: runtime signature * feat: add sdk runtime to worker model * feat: post runtime * feat: expose sdk version on worker * feat: go inf * chore: gen * chore: migrations and generation * fix: simpler runtime * feat: hatchet sdk ver * fix: rm debug line	2024-10-28 13:47:12 -07:00
abelanger5	3e0f15c0d8	fix: divide by zero panic (#995 ) * fix: divide by zero panic * fix: add continue	2024-10-25 19:57:55 -04:00
Sean Reilly	9f4b63817d	add a serial write for step run events (#990 ) * add a serial write for step run events * update other problematic queries * tmp: don't upsert queue * add SerialBuffer to the config * revert the change to config * fix: add back queue upsert * add statement timeout to upsert queue --------- Co-authored-by: Sean Reilly <sean@hatchet.run> Co-authored-by: Alexander Belanger <alexander@hatchet.run>	2024-10-25 16:56:38 +00:00
abelanger5	509542b804	fix: duplicate assignments in queuer (#993 ) * wip: individual mutexes for actions * tmp: debug panic * remove debug code * remove deadlocks package and don't write unassigned events * fix: race condition in scheduler and add internal retries * fix: data race	2024-10-25 16:52:43 +00:00
abelanger5	718d8f59c9	fix: rewrite queries for checking child workflows (#983 ) * rewrite queries for child workflows * add index * fix: remove tenant id where it's not needed	2024-10-23 19:18:26 -04:00
abelanger5	dd5bc90497	fix: more efficient step run events, reduce caching on queue (#981 )	2024-10-23 16:23:59 -04:00
Sean Reilly	35b115cb4f	don't need to filter on tenant id for step runs & some debug for buffers (#980 ) Co-authored-by: Sean Reilly <sean@hatchet.run>	2024-10-23 15:04:11 -04:00
abelanger5	2cdee59aea	refactor: optimize v0.50.0 release (#975 ) - Simplifies architecture for splitting engine services into different components. The three supported services are now `grpc-api`, `scheduler`, and `controllers`. The `grpc-api` service is the only one which needs to be exposed for workers. The other two can run as unexposed services. - Fixes a set of bugs and race conditions in the `v2` scheduler - Adds a `lastActive` time to the `Queue` table and includes a migration which sets this `lastActive` time for the most recent 24 hours of queues. Effectively this means that the max scheduling time in a queue is 24 hours. - Rewrites the `ListWorkflowsForEvent` query to improve performance and select far fewer rows.	2024-10-23 12:05:16 +00:00
abelanger5	7b701ed209	fix: proper deletion of tenants from the scheduling pool (#974 ) * fix: proper deletion of tenants from the scheduling pool * adds some assignment spans * feat: caching for rankings * remove cache	2024-10-17 15:47:15 -04:00
Sean Reilly	ecb9ce1e1e	rejig the query for creating multiple sticky states (#973 ) * rejig the query for creating multiple sticky states * fix: sticky strategy of soft and improve query * fix: sort method was using indexes that didn't necessarilly correspond to original indexes, leading to inconsistent behavior --------- Co-authored-by: Sean Reilly <sean@hatchet.run> Co-authored-by: Alexander Belanger <alexander@hatchet.run>	2024-10-17 13:29:19 +00:00
abelanger5	17dc80cad8	fix: don't append invalid slots with a hard sticky strategy (#972 )	2024-10-16 20:21:39 +00:00
abelanger5	c86a50711b	fix: don't reset input for concurrency keys on replay (#970 )	2024-10-16 15:55:28 -04:00
abelanger5	e4af494f69	fix: add slot expiry and delete actions from scheduler properly (#969 ) * fix: add back slot expiry * fix: remove action if all slots are inactive	2024-10-16 15:55:18 -04:00
abelanger5	cb39c938b3	fix: ack rate limits properly (#968 )	2024-10-16 13:32:10 -04:00
Sean Reilly	7e526de381	fix: deadlocks on events and incorrect step run ordering query (#966 ) * make it so the bulk example succeeds * make the bulk workflows work a little harder * add some ordering to mitigate deadlocks * fix: link step run parents bad query, improvements to locking * add timed mutex and telemetry * remove for update on cancel --------- Co-authored-by: Sean Reilly <sean@hatchet.run> Co-authored-by: Alexander Belanger <alexander@hatchet.run>	2024-10-16 10:28:33 -04:00
Gabe Ruttner	7cd08077d5	feat: improved sdk ack (#931 ) * feat: add step run event reasons * feat: ack * fix: remove rejected reason * fix: merge * fix: correct buffer * fix: consistent message * chore: rm todo	2024-10-15 15:52:42 +00:00
abelanger5	19e151e29a	fix: RunWorkflow and SpawnWorkflow should respond with consistent APIs (#965 )	2024-10-15 11:09:58 -04:00
abelanger5	67a96d7166	feat(throughput): single process per queue (#956 ) * feat(throughput): single process per queue * fix data race * fix: golint and data race on load test * wrap up initial v2 scheduler * fix: more debug logs and tighten channel logic/blocking sends * improved casing on dispatcher and lease manager * fix: data race on min id * increase wait on load test, fix data race * fix: trylock -> lock * clean up queue when no longer in set * fix: clean up cache on exit * ensure cleanup is only called once * address review comments	2024-10-15 11:05:19 -04:00
Sean Reilly	29721cd1f0	Feat bulk workflows (#940 ) Adds support for inserting workflows in bulk via the API and an optional buffered insert on the engine.	2024-10-14 15:35:29 -04:00
Gabe Ruttner	c8711f7f83	fix: id constraint (#957 ) * fix: id constraint * chore: gen	2024-10-11 18:00:12 -04:00
Gabe Ruttner	6af75638f2	feat: add helpful context to alert email (#954 )	2024-10-11 09:53:28 -04:00


257 Commits