* fix: v1 edge cases on concurrency, go SDK, parent outputs
* fix: overflow on queue metrics
* revert changes to DAG
* fix: remove prefix on error for Result method
* cleanup schema, fix migrations
* fix panic edge case
* wip: api contracts
* feat: implement put workflow version endpoint
* add support for match existing data, get scaffolding in place for additional triggers
* create additional matches
* feat: durable sleep, user event matching
* update protos
* fix: working poc of user events, durable sleep
* add migration
* fix: migration column
* feat: durable event listener
* fix: skip overrides
* fix: input -> output
* wip: backoff state
* fix: retry state and step run start condition
* fix: missing key
* fix: gen
* chore: squash migration
* chore: rm todos
* ops: upgrade proto
* add in the migration for now
* Update step_runs.sql
remove TODO
* change the schema so we don't undo it
* add the migration for step run partition. remove prisma. add a helper task for recreating the db
* do a manual merge of the schema.sql
* add in the serial
* update docs
* PR feedback
* add Identity to all tables that don't have a Bigserial
* do the atlas hash with the new migration
* squash the migrations
---------
Co-authored-by: Sean Reilly <sean@hatchet.run>
* feat: runtime signature
* feat: add sdk runtime to worker model
* feat: post runtime
* feat: expose sdk version on worker
* feat: go inf
* chore: gen
* chore: migrations and generation
* fix: simpler runtime
* feat: hatchet sdk ver
* fix: rm debug line
- Simplifies architecture for splitting engine services into different components. The three supported services are now `grpc-api`, `scheduler`, and `controllers`. The `grpc-api` service is the only one which needs to be exposed for workers. The other two can run as unexposed services.
- Fixes a set of bugs and race conditions in the `v2` scheduler
- Adds a `lastActive` time to the `Queue` table and includes a migration which sets this `lastActive` time for the most recent 24 hours of queues. Effectively this means that the max scheduling time in a queue is 24 hours.
- Rewrites the `ListWorkflowsForEvent` query to improve performance and select far fewer rows.
* feat(throughput): single process per queue
* fix data race
* fix: golint and data race on load test
* wrap up initial v2 scheduler
* fix: more debug logs and tighten channel logic/blocking sends
* improved casing on dispatcher and lease manager
* fix: data race on min id
* increase wait on load test, fix data race
* fix: trylock -> lock
* clean up queue when no longer in set
* fix: clean up cache on exit
* ensure cleanup is only called once
* address review comments
* (wip) handle step run updates without deferred updates
* refactor: buffered writes of step run statuses
* fix: add more safety on tenant pools
* add configurable flush period, remove wait for started
* flush immediately if last flush time plus flush period is in the past
* feat: add configurable flush internal/max items
* fix: add back sem slots, without row contention
* fix: serialize queue step runs to prevent dirty reads
* remove serializable for now
* statement timeouts on create workflow run
* statement timeout for reassign
* proper migration + cleanup
* remove old tables and code
* fix: worker slot state
* remove last unused table from workers
Refactors the queueing logic to be fairly balanced between actions, with each action backed as a separate FIFO queue. Also adds support for priority queueing and custom queues, though those aren't exposed on the API layer yet. Improves throughput to be > 5000 tasks/second on a single queue.
---------
Co-authored-by: Alexander Belanger <alexander@hatchet.run>