Commit Graph

10 Commits

Author SHA1 Message Date
Gabe Ruttner 4ea4712d4d refactor: performance and throughput (#756)
Refactors the queueing logic to be fairly balanced between actions, with each action backed as a separate FIFO queue. Also adds support for priority queueing and custom queues, though those aren't exposed on the API layer yet. Improves throughput to be > 5000 tasks/second on a single queue. 

---------

Co-authored-by: Alexander Belanger <alexander@hatchet.run>
2024-08-12 14:38:47 +00:00
abelanger5 7c3ddfca32 feat: api server extensions (#614)
* feat: allow extending the api server

* chore: remove internal packages to pkg

* chore: update db_gen.go

* fix: expose auth

* fix: move logger to pkg

* fix: don't generate gitignore for prisma client

* fix: allow extensions to register their own api spec

* feat: expose pool on server config

* fix: nil pointer exception on empty opts

* fix: run.go file
2024-06-19 09:36:13 -04:00
abelanger5 7b7fbe3668 fix: update Requeue and Reassign logic to fix performance degradation when many events are queued (#310)
Logic for requeueing and reassigning did not limit the number of step runs to requeue, so when events accumulate with no worker present it causes memory to spike along with a very high query latency on the database. This commit limits the number of step runs returned in the requeue and reassign queries, and also properly locks step run rows for these queries so only a step run in a PENDING or PENDING_ASSIGNMENT state can be requeued.

It also improves performance of the `AssignStepRunToWorker` query and ensures that `maxRuns` on workers are always respected through the introduction of a `WorkerSemaphore` model. The value gets decremented when a step run is assigned and incremented when a step run is in a final state. 

Co-authored-by: Luca Steeb <contact@luca-steeb.com>

* Update controller.go

---------

Co-authored-by: steebchen <contact@luca-steeb.com>
2024-04-01 12:33:18 -04:00
Luca Steeb 8183dd509a test(rampup): add load ramp up test (#273)
* test(rampup): add load ramp up test

* disable debug logging

* actual implementation

* refactor

* max acceptable schedule

* check for non-executed events

* fixes

* chore: set log level to error in engine tests

---------

Co-authored-by: abelanger5 <belanger@sas.upenn.edu>
2024-03-31 19:14:30 -04:00
Luca Steeb 9b68115fb5 refactor: cleanup functions in api + worker (#192) 2024-03-02 00:37:02 +07:00
Luca Steeb ae4841031b feat(engine): standalone tests and engine teardown (#172) 2024-02-28 00:15:25 +07:00
abelanger5 df3f540748 feat: add retries to the engine and SDKs (#171)
This PR adds support for retrying failed step runs against the engine and SDKs. This was tested up to 30 retries per step run, with both failure and success at the 30th step run. Each SDK now has a `retries` configurable param for steps when declaring a workflow.
2024-02-16 13:00:22 -05:00
Luca Steeb 00111d823c test(load): add load tests CLI & e2e tests (#157) 2024-02-16 23:47:34 +07:00
abelanger5 52fde1e704 feat: dag-style execution (#108)
* feat: dag-style execution

* docs: update to reflect new context

* ensure no cycles

* remove example cycle

* linting

* lint and small fixes

* update deferred rollback

* last rollback handling

* unset max issues

* fix requeue edge case
2024-01-16 11:31:24 -05:00
abelanger5 ac0c4e934a fix: rabbitmq concurrent processing (#92) 2024-01-09 21:15:19 -05:00