Commit Graph

160 Commits

Author SHA1 Message Date
Sean Reilly
5811929928 feat: bulk inserts of events (#887)
* progress commit of bulk inserts

* in_flight: Add changes to metering finish the bulk insert

* remove an attempt to overide enforce limits

* merge in PR fixes

* update docs to add in an additional section in the User guide to describe pushing single events and pushing multiple events

* run lint fix

---------

Co-authored-by: Sean Reilly <sean@hatchet.run>
2024-09-23 09:19:39 -07:00
abelanger5
6e1b6c91d2 fix: color of logs view (#896)
* fix: color of logs view

* remove concurrency from cancel test
2024-09-21 12:27:44 -07:00
abelanger5
d23e5d9963 feat: expression-based concurrency keys (#889)
* feat: expression-based concurrency keys

* fix: build

* fix: typos

* fix: gen

* fix: migration

* fix: remove print statements

* fix: reassignment bugs, retries on closed transport, pr review
2024-09-19 10:32:22 -04:00
Steinway Wu
44d03af852 fix: propagating additional metadata for child workflows (#882)
* fix: propagating additional metadata for child workflows

* add unit test
2024-09-19 13:28:46 +00:00
Gabe Ruttner
4ea4712d4d refactor: performance and throughput (#756)
Refactors the queueing logic to be fairly balanced between actions, with each action backed as a separate FIFO queue. Also adds support for priority queueing and custom queues, though those aren't exposed on the API layer yet. Improves throughput to be > 5000 tasks/second on a single queue. 

---------

Co-authored-by: Alexander Belanger <alexander@hatchet.run>
2024-08-12 14:38:47 +00:00
Gabe Ruttner
b802f9f45f feat: stream by addl meta (#751)
* feat: prop schedule and run

* wip

* fix: filter wfrid

* feat: hangup

* chore: rm debug log

* chore: func name

* fix: cancelled payload

* fix: load

* fix: cleanup the cahce

* fix: single proto

* fix: key -> val

* chore: case

* chore: rm dead code

* chore: rm dead code

* feat: go and docs

* fix: docs
2024-07-29 19:09:51 +00:00
abelanger5
a245151d91 feat: add workflow kind to workflow versions (#750)
* feat: support workflow kinds

* chore: generate
2024-07-29 12:07:34 -07:00
Gabe Ruttner
fd947cb5bc feat: go worker assignment (#741)
* feat: create worker with label

* feat: worker context

* feat: dynamic labels

* feat: affinity

* fix: ptr

* fix: nil labels

* feat: sticky dag

* feat: sticky docs

* feat: sticky children

* chore: lint

* fix: tests

* fix: possibly nil workerId

* chore: cleanup unneeded pointers
2024-07-26 10:19:11 -07:00
Luca Steeb
a51681ddc6 fix(webhooks): use PUT for healthcheck (#644) 2024-06-26 10:39:11 -04:00
Luca Steeb
1490d88954 feat: webhook workers (#542)
Adds serverless support via the concept of webhook workers. Allows any webhook to be registered as a serverless endpoint for executing a step.
2024-06-25 17:06:43 -04:00
abelanger5
7c3ddfca32 feat: api server extensions (#614)
* feat: allow extending the api server

* chore: remove internal packages to pkg

* chore: update db_gen.go

* fix: expose auth

* fix: move logger to pkg

* fix: don't generate gitignore for prisma client

* fix: allow extensions to register their own api spec

* feat: expose pool on server config

* fix: nil pointer exception on empty opts

* fix: run.go file
2024-06-19 09:36:13 -04:00
Gabe Ruttner
b728616161 feat: improve reassign and timeout behavior and visibility (#484)
* feat: create step run event

* fix: shorten reassign heartbeat

* feat: add reassign event

* feat: fail timeout instead of cancel

* chore: squash migration

* chore: clarify copy

* docs: improve timeouts doc

* chore: linting

* chore: generate

* fix: test

* fix: send cancellation signal on timeout failure

* fix: rm retry check

* chore: update migration for release

---------

Co-authored-by: Alexander Belanger <belanger@sas.upenn.edu>
2024-05-14 16:47:00 -04:00
Gabe Ruttner
0586af8e8c Feat add rate limit durations (#466)
* feat: add day, week, month, year durations

* feat: add options to client
2024-05-08 19:35:39 -04:00
Gabe Ruttner
c8b59d1cc1 Feat add additional meta to trigger (#458)
* feat: add additional metadata to sdk

* docs: metadata

* fix: build

* feat: filter gif
2024-05-08 10:07:28 -04:00
Gabe Ruttner
fa07400159 feat: event and workflow run metadata (#446)
Adds additional user-defined metadata to events and workflow runs.
2024-05-06 17:10:33 -04:00
abelanger5
7543a0c2a5 add jobs which always run on failure (#445)
* (wip) prisma schema

* feat: on-failure steps

* chore: address changes from PR review

* chore: bump migration number
2024-05-06 15:39:22 -04:00
abelanger5
4ce1dd8632 feat: multi-workflow runs listener on a single endpoint
* new api-contract for workflow run events

* feat: initial implementation for new subscribe listener

* fix: sync issues and send workflow runs immediately

* refactor: add context to all engine db queries, fix deadlocking query

* fix: use new ctx for deleting dispatcher and ticker

* add cancellation reasons

* fix: docs linting

---------

Co-authored-by: gabriel ruttner <gabriel.ruttner@gmail.com>
2024-04-18 20:55:11 -04:00
Gabe Ruttner
f43f32283c feat(py/go): namespaces (#354)
* feat: namespaced python

* wip: namespaced go

* fix: service name

* fix: tests

* feat: client WithNamespace

* feat: namespace example

* feat: namespaced event triggers

* docs: namespace docs

* chore: linting

---------

Co-authored-by: gabriel ruttner <gabe@hatchet.run>
2024-04-15 11:26:05 -07:00
Alexander Belanger
75172c4a05 fix: retries cause semaphore to go to zero 2024-04-15 13:50:16 -04:00
Luca Steeb
2f7483a3be test(timeout): add test for timeout (#344) 2024-04-11 00:50:24 +07:00
Luca Steeb
32aebd97b4 test(cancellation): add test for cancelled run (#339) 2024-04-10 23:58:18 +07:00
abelanger5
d6004bbe0c fix: stale queries in update semaphore (#333)
* fix: stale queries in update semaphore

* fix: set correct alias for semaphore update
2024-04-04 12:58:32 -04:00
abelanger5
066b3c5b71 feat(engine): initial rate-limiting engine implementation (#324)
* feat(engine): initial rate-limiting engine implementation

* fixes and implement go sdk rate limiting
2024-04-02 10:53:03 -04:00
Gabe Ruttner
d8b6843dec feat: streaming events (#309)
* feat: add stream event model

* docs: how to work with db models

* feat: put stream event

* chore: rm comments

* feat: add stream resource type

* feat: enqueue stream event

* fix: contracts

* feat: protos

* chore: set properties correctly for typing

* fix: stream example

* chore: rm old example

* fix: async on

* fix: bytea type

* fix: worker

* feat: put stream data

* feat: stream type

* fix: correct queue

* feat: streaming payloads

* fix: cleanup

* fix: validation

* feat: example file streaming

* chore: rm unused query

* fix: tenant check and read only consumer

* fix: check tenant-steprun relation

* Update prisma/schema.prisma

Co-authored-by: abelanger5 <belanger@sas.upenn.edu>

* chore: generate protos

* chore: rename migration

* release: 0.20.0

* feat(go-sdk): implement streaming in go

---------

Co-authored-by: gabriel ruttner <gabe@hatchet.run>
Co-authored-by: abelanger5 <belanger@sas.upenn.edu>
2024-04-01 15:46:21 -04:00
abelanger5
7b7fbe3668 fix: update Requeue and Reassign logic to fix performance degradation when many events are queued (#310)
Logic for requeueing and reassigning did not limit the number of step runs to requeue, so when events accumulate with no worker present it causes memory to spike along with a very high query latency on the database. This commit limits the number of step runs returned in the requeue and reassign queries, and also properly locks step run rows for these queries so only a step run in a PENDING or PENDING_ASSIGNMENT state can be requeued.

It also improves performance of the `AssignStepRunToWorker` query and ensures that `maxRuns` on workers are always respected through the introduction of a `WorkerSemaphore` model. The value gets decremented when a step run is assigned and incremented when a step run is in a final state. 

Co-authored-by: Luca Steeb <contact@luca-steeb.com>

* Update controller.go

---------

Co-authored-by: steebchen <contact@luca-steeb.com>
2024-04-01 12:33:18 -04:00
Luca Steeb
8183dd509a test(rampup): add load ramp up test (#273)
* test(rampup): add load ramp up test

* disable debug logging

* actual implementation

* refactor

* max acceptable schedule

* check for non-executed events

* fixes

* chore: set log level to error in engine tests

---------

Co-authored-by: abelanger5 <belanger@sas.upenn.edu>
2024-03-31 19:14:30 -04:00
abelanger5
77e5d2b77c feat(go-sdk): spawnWorkflow method and get up to speed with other sdks (#297)
* feat(go-sdk): spawnWorkflow method and get up to speed with other sdks

* fix: manual trigger example

* fix: linting errors

* fix: double serialization from go sdk

* fix: spawn workflow logic and procedural example

* test(e2e): add procedural test

* fix: panic in e2e test

* fix: e2e test preparation

* fix: api server url in test.yml

* fix: load test server url

* chore: make num children configurable

* address pr review
2024-03-29 14:07:39 -07:00
abelanger5
092f54c64f refactor: separate api and engine repositories, change ticker logic (#281)
* refactor: separate api and engine repositories, change ticker logic

* fix: nil error blocks

* fix: run migration on load test

* fix: generate db package in load test

* fix: test.yml

* fix: add pnpm to load test

* fix: don't lock CTEs with columns that don't get updated

* fix: update heartbeat for worker every 4 seconds, not 5

* chore: remove dead code

* chore: update python sdk

* chore: add back telemetry attributes
2024-03-21 14:10:34 -04:00
abelanger5
65224753c1 fix(go-sdk): support tls strategy of none, with docs (#269)
* fix(go-sdk): support tls strategy of none, with docs

* chore: errorf -> sprintf in examples

* Apply suggestions from code review

Co-authored-by: Luca Steeb <contact@luca-steeb.com>

* fix: remove time from example

---------

Co-authored-by: Luca Steeb <contact@luca-steeb.com>
2024-03-18 14:02:53 -04:00
Luca Steeb
713b8c95c6 fix: eliminate remaining race conditions (#220) 2024-03-02 23:47:50 +07:00
Luca Steeb
9b68115fb5 refactor: cleanup functions in api + worker (#192) 2024-03-02 00:37:02 +07:00
Luca Steeb
ae4841031b feat(engine): standalone tests and engine teardown (#172) 2024-02-28 00:15:25 +07:00
abelanger5
6ea38a99f2 feat: support maxRuns parameter on workers (#195)
* feat: round robin queueing

* feat: configurable max runs per worker

* fix: address PR review

* docs for max runs and group round robin
2024-02-26 00:48:46 -05:00
abelanger5
2d625fec81 feat: round robin queueing (#194) 2024-02-26 00:16:40 -05:00
abelanger5
df3f540748 feat: add retries to the engine and SDKs (#171)
This PR adds support for retrying failed step runs against the engine and SDKs. This was tested up to 30 retries per step run, with both failure and success at the 30th step run. Each SDK now has a `retries` configurable param for steps when declaring a workflow.
2024-02-16 13:00:22 -05:00
Luca Steeb
00111d823c test(load): add load tests CLI & e2e tests (#157) 2024-02-16 23:47:34 +07:00
abelanger5
c2ea09f375 feat: step reruns from the dashboard (#143) 2024-02-03 01:26:09 -05:00
abelanger5
82d7995343 feat: manual triggers and give clients a hook into step run events (#141)
* feat: pubsub for clients, more qol stuff

* fix: generate sqlc files

* chore: linting and comments
2024-02-02 12:52:34 -05:00
abelanger5
aed11c3958 feat: workflow visualization and qol improvements (#140)
* feat: workflow visualization and qol improvements

* fix: npm build
2024-02-02 01:35:05 -05:00
abelanger5
d63b66a837 feat: concurrency groups (#135)
* first pass at moving controllers around

* feat: concurrency limits for strategy CANCEL_IN_PROGRESS

* fix: linting

* chore: bump python sdk version
2024-01-30 00:00:28 -05:00
abelanger5
78685d0098 feat(security): multiple encryption options, API tokens, easier setup (#125)
* (wip) encryption

* feat: api tokens

* chore: add api token generation command

* fix: e2e tests

* chore: set timeout for e2e job

* fix: e2e tests, remove client-side certs

* chore: address PR review comments

* fix: token tests

* chore: address review comments and fix tests
2024-01-26 15:38:36 -05:00
Luca Steeb
8b379ee9d1 feat(events): add workflow filter (#114)
* feat(events): add workflow filter

* cast to uuid

---------

Co-authored-by: Alexander Belanger <belanger@sas.upenn.edu>
2024-01-21 22:33:58 -05:00
abelanger5
52fde1e704 feat: dag-style execution (#108)
* feat: dag-style execution

* docs: update to reflect new context

* ensure no cycles

* remove example cycle

* linting

* lint and small fixes

* update deferred rollback

* last rollback handling

* unset max issues

* fix requeue edge case
2024-01-16 11:31:24 -05:00
abelanger5
752d5b0ab7 feat: support passing inputs to scheduled workflows (#104)
* fix: usage of RegisterAction

* make registered actions callable

* chore: update yaml example

* docs: register action documented

* feat: support input to scheduled workflows

* add worker

* add client

* docs: add input to schedule workflow

* chore: generate with updated protoc

* chore: sqlc generate
2024-01-12 00:27:34 -05:00
abelanger5
1f7baacb94 Fix usage of RegisterAction and update docs (#103)
* fix: usage of RegisterAction

* make registered actions callable

* chore: update yaml example

* docs: register action documented
2024-01-11 18:42:20 -05:00
Luca Steeb
6d8c7ab073 feat(test): introduce basic e2e tests (#97) 2024-01-11 13:36:15 -05:00
abelanger5
7011416cb7 fix: simple example (#95) 2024-01-10 11:05:54 -05:00
abelanger5
ac0c4e934a fix: rabbitmq concurrent processing (#92) 2024-01-09 21:15:19 -05:00
abelanger5
fe76c724d1 feat: add ticker reassignment to engine (#86) 2024-01-08 14:11:30 -05:00
abelanger5
62445dc37f feat: support one-time scheduled workflows (#84)
* feat: support one-time scheduled workflows

* refactor: move schedule out of workflow trigger def

* docs: add scheduling workflows section

* docs: update creating workflow

* only cancel schedules that are in the future
2024-01-08 10:03:32 -05:00