Adds a queue that is triggered whenever a cron is created, updated, or deleted, and that automatically updates the list of crons running in the ticker.
* feat: reduced cold starts for new workers and queues
* address changes from pr review
* fix: data race
* set logs to debug on the harness
* debug for queue level as well
* debug lines for queuer
* fix: add queue notifier to v0 workflow registration
* revert: lease manager interval
* revert log level changes
* add more debug, revert reverts
* more debug
* add debug to lease manager
* do it, try it
* fix: call upsertQueue as part of workflow version put
* change log level to error again
* pr review changes
* fix: add type override in sqlc.yaml
* chore: gen sqlc
* chore: big find and replace
* chore: more
* fix: clean up a bunch of outdated `.Valid` refs
* refactor: remove `sqlchelpers.uuidFromStr()` in favor of `uuid.MustParse()`
* refactor: remove uuidToStr
* fix: lint
* fix: use pointers for null uuids
* chore: clean up more null pointers
* chore: clean up a bunch more
* fix: couple more
* fix: some types on the api
* fix: incorrectly non-null param
* fix: more nullable params
* fix: more refs
* refactor: start replacing tenant id strings with uuids
* refactor: more tenant id uuid casting
* refactor: fix a bunch more
* refactor: more
* refactor: more
* refactor: is that all of them?!
* fix: panic
* fix: rm scans
* fix: unwind some broken things
* chore: tests
* fix: rebase issues
* fix: more tests
* fix: nil checks
* Refactor: Make all UUIDs into `uuid.UUID` (#2897)
* refactor: remove a bunch more string uuids
* refactor: pointers and lists
* refactor: fix all the refs
* refactor: fix a few more
* fix: config loader
* fix: revert some changes
* fix: tests
* fix: test
* chore: proto
* fix: durable listener
* fix: some more string types
* fix: python health worker sleep
* fix: remove a bunch of `MustParse`s from the various gRPC servers
* fix: rm more uuid.MustParse calls
* fix: rm mustparse from api
* fix: test
* fix: merge issues
* fix: handle a bunch more uses of `MustParse` everywhere
* fix: nil id for worker label
* fix: more casting in the oss
* fix: more id parsing
* fix: stringify jwt opt
* fix: couple more bugs in untyped calls
* fix: more types
* fix: broken test
* refactor: implement `GetKeyUuid`
* chore: regen sqlc
* chore: replace pgtype.UUID again
* fix: bunch more type errors
* fix: panic
* feat: recursively split payload list into chunks
* fix: use slices.Chunk and run sequentially
* fix: return error if only one payload
* fix: log error
* fix: couple edge cases
* feat: Send create:user event from OAuth flow
* feat: Implement user and tenant creation events in callbacks
* move callback into cb.Do
---------
Co-authored-by: Alexander Belanger <alexander@hatchet.run>
* Revert "Revert "Feat: Hatchet Metrics Monitoring, I (#2480)" (#2698)"
This reverts commit b87150767a.
* go mod tidy
---------
Co-authored-by: Mohammed Nafees <hello@mnafees.me>
* aggressively log errors when rmq retry more than 5 times
* revisit comments
* new vars and fix integration test
* fix test
* log only after 5 retries
* debug: remove event pub
* add additional spans to publish message
* debug: don't publish payloads
* fix: persistent messages on olap
* add back other payloads
* remove pub buffers temporarily
* fix: correct queue
* hacky partitioning
* add back pub buffers to scheduler
* don't send no worker events
* add attributes for queue name and message id to publish
* add back pub buffers to grpc api
* remove pubs again, no worker writes though
* task processing queue hashes
* remove payloads again
* gzip compression over 5kb
* add back task controller payloads
* add back no worker requeueing event, with expirable lru cache
* add back pub buffers
* remove hash partitioned queues
* small fixes
* ignore lru cache top fn
* config vars for compression, disable by default
---------
Co-authored-by: Alexander Belanger <alexander@hatchet.run>
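
The "gzip compression over 5kb" and "config vars for compression, disable by default" commits above describe a size-gated compression step. A minimal sketch of that idea follows, using only the standard library. The `compressionConfig` struct, its field names, and the choice of 5 * 1024 bytes as the threshold are assumptions made for illustration.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// compressionConfig is a hypothetical config struct; the field names are
// illustrative and not taken from the project.
type compressionConfig struct {
	Enabled   bool // zero value is false, so compression is off by default
	Threshold int  // bodies at or below this size (in bytes) are sent as-is
}

// maybeCompress gzips body only when compression is enabled and the body is
// larger than the threshold. The bool result reports whether it compressed.
func maybeCompress(cfg compressionConfig, body []byte) ([]byte, bool, error) {
	if !cfg.Enabled || len(body) <= cfg.Threshold {
		return body, false, nil
	}
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(body); err != nil {
		return nil, false, err
	}
	if err := zw.Close(); err != nil {
		return nil, false, err
	}
	return buf.Bytes(), true, nil
}

// decompress reverses maybeCompress for bodies that were compressed.
func decompress(data []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	cfg := compressionConfig{Enabled: true, Threshold: 5 * 1024}
	small := []byte("hello")
	large := bytes.Repeat([]byte("a"), 10*1024)

	_, smallCompressed, _ := maybeCompress(cfg, small)
	out, largeCompressed, _ := maybeCompress(cfg, large)
	back, _ := decompress(out)
	fmt.Println(smallCompressed, largeCompressed, bytes.Equal(back, large))
}
```

Returning a flag alongside the bytes lets the publisher mark the message (for example in a header) so the consumer knows whether to decompress; how the real code signals this is not stated in the commit messages.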
* feat: add table for storing payloads
* feat: add payload type enum
* feat: gen sqlc
* feat: initial sql impl
* feat: add payload store repo to shared
* feat: add overwrite
* fix: impl
* feat: bulk op
* feat: initial wiring of inputs for task triggers
* feat: wire up dag matches
* feat: create V1TaskWithPayload and use it everywhere
* fix: couple bugs
* fix: clean up types
* fix: overwrite
* fix: rm input from replay
* fix: move payload store to shared repo
* fix: schema
* refactor: repo setup
* refactor: repos
* fix: gen
* chore: lint
* fix: rename
* feat: naming, write dag inputs
* fix: more naming, trigger bug
* fix: dual writes for now
* fix: pass in tx
* feat: initial work on offloader
* feat: improve external offloader
* fix: some refs
* add withExternalHandler
* fix: improve impl of external store
* feat: implement offloading, fix other impls
* feat: add query to update JSON
* fix: implement offloading + updating records in payloads table
* feat: add WAL table
* feat: add queries for polling WAL and evicting
* feat: wire up writes into WAL
* fix: get job working
* refactor: improve types
* fix: infinite loop
* feat: improve offloading logic to run in two separate txes
* refactor: rework how overrides work
* fix: lint
* fix: migration number
* fix: migration
* fix: migration version
* fix: revert back to reading payloads out
* fix: fall back to previous input, part i
* fix: input fallback
* fix: add back input to replay
* fix: input fallback in dispatcher
* fix: nil check
* feat: advisory locks, part i
* fix: no skip locked
* feat: hash partitioned wal table
* fix: modify queries a bit, tweak crud enum
* fix: pk order, function to find tenants
* feat: wal processing
* fix: only write wal if an external store is enabled, fix offloading logic
* fix: spacing
* feat: schema cleanup
* fix: rm external store loc name
* fix: set content to null when offloading
* fix: cleanup, naming
* fix: pass overwrite payload store along
* debug: add some logging
* Revert "debug: add some logging"
This reverts commit 43e71eadf1.
* fix: typo
* fix: add offloadAt to store opts for offloading
* fix: handle leasing with advisory lock
* fix: struct def
* fix: requeue on payloads not found
* fix: rm hack for triggers
* fix: revert empty input on write
* fix: write input
* feat: env var for enabling / disabling dual writes
* feat: wire up dual writes
* fix: comments
* feat: generics!
* fix: panic from type cast
* fix: migration
* fix: generic
* fix: hack for T key in map
* fix: cleanup
* fix: bug with json parsing failing
* fix: hang up on cancel and fail
* fix: pub stream events even if tenant pubs are disabled
* fix: condition
* fix: eq
* test: improves testing harness for engine
* update CI test
* fix: race condition in test
* make tests more stable
* cleanup pub and sub buffers
* fix: goleak on rampup test
* feat: matrix tests for engine
* clean up rabbit mq session stuff, add a quick ack and error processing for AddMessage
* bit more paranoid about getting stuck in chans
* first pass at locking the message to deal with the failed states better
* clean up the access to ready for the mq
* make sure we don't block sending this ack
- Simplifies the architecture for splitting engine services into different components. The three supported services are now `grpc-api`, `scheduler`, and `controllers`. Only the `grpc-api` service needs to be exposed to workers; the other two can run as unexposed services.
- Fixes a set of bugs and race conditions in the `v2` scheduler.
- Adds a `lastActive` time to the `Queue` table and includes a migration which sets `lastActive` for queues that were active in the most recent 24 hours. Effectively, this means that the maximum scheduling time in a queue is 24 hours.
- Rewrites the `ListWorkflowsForEvent` query to improve performance and select far fewer rows.
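
One way to read the `lastActive` note above is as a cutoff filter: a queue is only considered if its `lastActive` falls within the last 24 hours. The sketch below shows that reading under stated assumptions. The `queue` struct, `activeQueues`, `activeWindow`, and `exampleNow` are hypothetical names, and the real scheduler may apply the window differently (for example, in SQL).

```go
package main

import (
	"fmt"
	"time"
)

// activeWindow mirrors the 24-hour window mentioned in the notes above.
const activeWindow = 24 * time.Hour

// exampleNow is a fixed reference time so the example is deterministic.
var exampleNow = time.Date(2024, 1, 2, 12, 0, 0, 0, time.UTC)

// queue is a hypothetical, trimmed-down view of a row in the Queue table.
type queue struct {
	Name       string
	LastActive time.Time
}

// activeQueues keeps only the queues whose LastActive is no older than
// activeWindow relative to now.
func activeQueues(queues []queue, now time.Time) []queue {
	cutoff := now.Add(-activeWindow)
	var out []queue
	for _, q := range queues {
		if !q.LastActive.Before(cutoff) {
			out = append(out, q)
		}
	}
	return out
}

func main() {
	qs := []queue{
		{Name: "fresh", LastActive: exampleNow.Add(-time.Hour)},
		{Name: "stale", LastActive: exampleNow.Add(-48 * time.Hour)},
	}
	for _, q := range activeQueues(qs, exampleNow) {
		fmt.Println(q.Name)
	}
}
```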