* add multiple rate limiter in grpc using a token bucket
* PR feedback
* add in client retry for go client
* update test files
* remove log line only retry on ResourceExhausted and Unavailable
* add some concurrency limits so we don't swamp ourselves
* add some logging for when we are getting backed up
* lets not queue up when we are too full to prevent OOM problems
* fix spelling
* add config options for maximum concurrent and how long to wait for flush , let the wait for flush setting be used as back pressure and a signal to writers that we are slowing up
* lots of changes to buffering
* fix data race
* add some comments explaing how this works, change errors to be ResourceExhausted now that we have client retry and limit how many gofuncs we can create on cleanup and wait for them to finish before we exit
* hooking up the config values so they go to the right place
* Update config.go to default to 1 ms waitForFlush
* disable grpc_retry for client streams
* explicitly set the limit if it is 0
* weirdness because we were using an older version of the lib
---------
Co-authored-by: Sean Reilly <sean@hatchet.run>
Co-authored-by: Alexander Belanger <alexander@hatchet.run>
* feat: allow extending the api server
* chore: remove internal packages to pkg
* chore: update db_gen.go
* fix: expose auth
* fix: move logger to pkg
* fix: don't generate gitignore for prisma client
* fix: allow extensions to register their own api spec
* feat: expose pool on server config
* fix: nil pointer exception on empty opts
* fix: run.go file
* feat: alerting. implements slack alerting, email, and refactors tenant settings to make them more manageable
* chore: generate
* chore: generate sqlc after migrate
* new api-contract for workflow run events
* feat: initial implementation for new subscribe listener
* fix: sync issues and send workflow runs immediately
* refactor: add context to all engine db queries, fix deadlocking query
* fix: use new ctx for deleting dispatcher and ticker
* add cancellation reasons
* fix: docs linting
---------
Co-authored-by: gabriel ruttner <gabriel.ruttner@gmail.com>
* refactor: separate api and engine repositories, change ticker logic
* fix: nil error blocks
* fix: run migration on load test
* fix: generate db package in load test
* fix: test.yml
* fix: add pnpm to load test
* fix: don't lock CTEs with columns that don't get updated
* fix: update heartbeat for worker every 4 seconds, not 5
* chore: remove dead code
* chore: update python sdk
* chore: add back telemetry attributes