216 Commits

Author SHA1 Message Date
Gabe Ruttner
2fdc47a6af feat: multiple slot types (#2927)
* feat: adds support for multiple slot types, primarily motivated by durable slots

---------

Co-authored-by: mrkaye97 <mrkaye97@gmail.com>
2026-02-17 05:43:47 -08:00
matt
ac98a33992 Fix: Remove null bytes from error message to prevent db crash (#3010)
* fix: replace null byte in error message

* fix: rm space
2026-02-12 18:07:28 -05:00
Mohammed Nafees
5db655e9aa Do not replay invalid tasks (#2976)
* filter valid tasks when replaying

* renamings

* query optim

* slice len

* fix method signature

* PR comments

* rename var

* PR comments
2026-02-10 13:08:38 -05:00
matt
961682d704 Hotfix: More panics (#2945)
* fix: check if dispatcher id is nil

* explicit nil check

* proposal: extra return out of `GetDispatcherIdsForWorkers`
2026-02-04 19:08:17 -05:00
matt
a782d9fd01 Hotfix: UUID Panics (#2944)
* fix: panic

* fix: more panic risks

* fix: two more possible panics
2026-02-04 15:11:08 -05:00
abelanger5
2ddcbd2672 refactor: use typed maps (#2928)
* refactor: use typed maps

* self-review comments
2026-02-03 19:35:09 -05:00
matt
058968c06b Refactor: Attempt II at removing pgtype.UUID everywhere + convert string UUIDs into uuid.UUID (#2894)
* fix: add type override in sqlc.yaml

* chore: gen sqlc

* chore: big find and replace

* chore: more

* fix: clean up bunch of outdated `.Valid` refs

* refactor: remove `sqlchelpers.uuidFromStr()` in favor of `uuid.MustParse()`

* refactor: remove uuidToStr

* fix: lint

* fix: use pointers for null uuids

* chore: clean up more null pointers

* chore: clean up a bunch more

* fix: couple more

* fix: some types on the api

* fix: incorrectly non-null param

* fix: more nullable params

* fix: more refs

* refactor: start replacing tenant id strings with uuids

* refactor: more tenant id uuid casting

* refactor: fix a bunch more

* refactor: more

* refactor: more

* refactor: is that all of them?!

* fix: panic

* fix: rm scans

* fix: unwind some broken things

* chore: tests

* fix: rebase issues

* fix: more tests

* fix: nil checks

* Refactor: Make all UUIDs into `uuid.UUID` (#2897)

* refactor: remove a bunch more string uuids

* refactor: pointers and lists

* refactor: fix all the refs

* refactor: fix a few more

* fix: config loader

* fix: revert some changes

* fix: tests

* fix: test

* chore: proto

* fix: durable listener

* fix: some more string types

* fix: python health worker sleep

* fix: remove a bunch of `MustParse`s from the various gRPC servers

* fix: rm more uuid.MustParse calls

* fix: rm mustparse from api

* fix: test

* fix: merge issues

* fix: handle a bunch more uses of `MustParse` everywhere

* fix: nil id for worker label

* fix: more casting in the oss

* fix: more id parsing

* fix: stringify jwt opt

* fix: couple more bugs in untyped calls

* fix: more types

* fix: broken test

* refactor: implement `GetKeyUuid`

* chore: regen sqlc

* chore: replace pgtype.UUID again

* fix: bunch more type errors

* fix: panic
2026-02-03 11:02:59 -05:00
abelanger5
d56dee4266 feat: durable user event log (#2861)
* placeholder

* feat: db tables for user events (#2862)

* feat: db tables for user events

* move event payloads to payloads table, fix env var loading

* fix: address pr review comments

* missed save

* feat: optimistic scheduling (#2867)

* feat: db tables for user events

* move event payloads to payloads table, fix env var loading

* refactor: small changes to prepare optimistic txs

* feat: optimistic scheduling

* address pr review comments

* rm comments

* fix: rampup test race condition

* fix: goleak

* feat: grpc-side triggers

* fix: config and sem logic

* fix: respect optimistic scheduling env var

* add optimistic to testing matrix, remove pg-only mode

* fix cleanup of pubbuffers

* merge migrations

* last testing fixes
2026-02-02 18:04:02 -05:00
Andrei Gaspar
4dda2b2884 Send create:user Event from OAuth Flow (#2683)
* feat: Send create:user event from OAuth flow

* feat: Implement user and tenant creation events in callbacks

* move callback into cb.Do

---------

Co-authored-by: Alexander Belanger <alexander@hatchet.run>
2026-01-06 14:06:38 -05:00
abelanger5
9f463e92d6 refactor: move v1 packages, remove webhook worker references (#2749)
* chore: move v1 packages, remove webhook worker references

* chore: move msgqueue

* fix: relative paths in sqlc.yaml
2026-01-02 11:42:40 -05:00
abelanger5
f82d3bd071 refactor: consolidate repository methods (#2730)
* refactor: remove v0 paths from codebase

* remove uiVersion references

* refactor: remove v0-exclusive database queries

* remove webhook test

* chore: move api token repository

* chore: move dispatcher repository to v1

* chore: move health repository to v1

* chore: remove event repository

* remove some unused repositories

* chore: move mq implementation to v1

* chore: consolidate rate limit implementations

* chore: move security check to v1 repository

* chore: move slack to v1 repository

* chore: move sns implementation to v1 repository

* clean up step repository

* chore: move tenant invite to v1 repository

* chore: move limits, workers, tenant alerts to v1 repository

* chore: move user, tenant, userSession to v1 repository

* chore: move ticker to v1 repository

* chore: move scheduled workflows to v1 repository

* chore: remove workflows

* fix: remove pointer for limits config file

* propagate cache value to api token

* propagate cache durations
2025-12-31 16:35:46 -05:00
matt
39f5481410 hotfix: handle panic (#2732) 2025-12-31 08:39:31 -07:00
abelanger5
f57ebf7546 refactor: remove v0-exclusive database queries (#2729)
* refactor: remove v0 paths from codebase

* remove uiVersion references

* refactor: remove v0-exclusive database queries

* remove webhook test
2025-12-31 09:36:12 -05:00
abelanger5
dd9c36c315 refactor: remove v0 paths from codebase (#2728)
* refactor: remove v0 paths from codebase

* remove uiVersion references
2025-12-30 09:57:00 -05:00
Mohammed Nafees
88e7a60b83 msgqueue msg IDs as constants for ease of navigation and readability (#2692) 2025-12-25 11:56:07 +01:00
matt
b65c6de53f Feat: Hatchet Metrics Monitoring, I (#2699)
* Revert "Revert "Feat: Hatchet Metrics Monitoring, I (#2480)" (#2698)"

This reverts commit b87150767a.

* go mod tidy

---------

Co-authored-by: Mohammed Nafees <hello@mnafees.me>
2025-12-23 20:14:14 +01:00
matt
b87150767a Revert "Feat: Hatchet Metrics Monitoring, I (#2480)" (#2698)
This reverts commit fdc075ec6f.
2025-12-22 16:26:14 -05:00
matt
fdc075ec6f Feat: Hatchet Metrics Monitoring, I (#2480)
* feat: queries + task methods for oldest running task and oldest task

* feat: worker slot and sdk metrics

* feat: wal metrics

* repository stub

* feat: add meter provider thingy

* pg queries

* fix: add task

* feat: repo methods for worker metrics

* feat: active workers query, fix where clauses

* fix: aliasing

* fix: sql, cleanup

* chore: cast

* feat: olap queries

* feat: olap queries

* feat: finish wiring up olap status update metrics

* chore: lint

* chore: lint

* fix: dupes, other code review comments

* send metrics to OTel collector

* last autovac

* flag

* logging updates

* address PR comments

---------

Co-authored-by: gabriel ruttner <gabriel.ruttner@gmail.com>
Co-authored-by: Mohammed Nafees <hello@mnafees.me>
2025-12-23 01:04:02 +05:30
matt
0a947924fa Feat: Parallelize replication from PG -> External (#2637)
* feat: chunking query

* feat: first pass at range chunking

* fix: bug bashing

* fix: function geq

* fix: use maps.Copy

* fix: olap func

* feat: olap side

* refactor: external id

* fix: order by

* feat: wire up env vars

* fix: pass var through

* fix: naming

* fix: append to returnErr properly

* fix: use eg.Go
2025-12-10 17:11:03 -05:00
matt
9e14814acb Feat: OLAP Payload Cutover Job (#2618)
* feat: migration

* feat: queries

* feat: overwrite queries

* fix: bug

* feat: first pass

* fix: more olap job wiring

* fix: signature

* fix: refs to a bunch of funcs

* feat: job

* fix: table names

* fix: span name

* chore: lint

* fix: redundant error check

* fix: naming

* fix: handle nil external id

* fix: order payload partitions descending

* fix: param for limiting which partitions get processed

* fix: olap
2025-12-09 12:33:07 -05:00
matt
35d1cff963 refactor: simplify external store signature (#2616) 2025-12-08 14:53:52 -05:00
matt
bede3efe0d Feat: Process all old partitions in a loop (#2613)
* feat: process old partitions in a loop

* fix: param

* fix: query return

* feat: add spans

* fix: naming
2025-12-08 11:00:24 -05:00
matt
18940869ae Feat: Job for payload cutovers to external (#2586)
* feat: initial payload cutover job

* refactor: fix a couple things

* feat: start wiring up writes

* feat: only run job if external store is enabled

* fix: add some notes, add loop

* feat: function for reading out payloads

* fix: date handling, logging

* feat: remove wal and immediate offloads

* feat: advisory lock

* feat: partition swap logic

* fix: rm debug

* fix: add todo

* fix: sql cleanup

* fix: sql cleanup, ii

* chore: nuke a bunch of WAL stuff

* chore: more wal

* feat: trigger for crud opts

* feat: drop trigger + function in swapover

* feat: move autovac to later

* feat: use unlogged table initially

* feat: update migration

* fix: drop trigger

* fix: use insert + on conflict

* fix: types

* refactor: clean up a bit

* fix: panic

* fix: detach partition before dropping

* feat: configurable batch size

* feat: offset tracking in the db

* feat: explicitly lock

* fix: down migration

* fix: bug

* fix: offset handling

* fix: try explicit ordering of the insert

* fix: lock location

* fix: do less stuff after locking

* fix: ordering

* fix: dont drop and recreate if temp table exists

* fix: explicitly track completed status

* fix: table name

* fix: dont use unlogged table

* fix: rm todos

* chore: lint

* feat: configurable delay

* fix: use date as pk instead of varchar

* fix: daily job

* fix: hack check constraint to speed up partition attach

* fix: syntax

* fix: syntax

* fix: drop constraint after attaching

* fix: syntax

* fix: drop triggers properly

* fix: factor out insert logic

* refactor: factor out loop logic

* refactor: factor out job preparation work

* fix: ordering

* fix: run the job more often

* fix: use `WithSingletonMode`

* fix: singleton mode sig

* fix: env var cleanup

* fix: overwrite sig

* fix: re-enable immediate offloads with a flag

* fix: order, offload at logic

* feat: add count query to compare

* fix: row-level triggers, partition time bug

* fix: rm todo

* fix: for true

* fix: handle lock not acquired

* fix: handle error

* fix: comment
2025-12-05 10:54:26 -05:00
abelanger5
9dabe7d902 feat: dlq for dispatcher queues (#2600)
* feat: dlq for dispatcher queues

* reduce dispatcher message ttl to 20 seconds

* rename dispatcher queue for clarity

* add error logs when dead lettering

* address comment
2025-12-04 14:19:01 -05:00
matt
7fe9806f5d Feat: Configurable OLAP status update size limits (#2499)
* feat: configurable status updates

* fix: config

* fix: wiring

* feat: export limits from olap

* fix: param drilling
2025-11-06 13:37:40 -05:00
Mohammed Nafees
57ad1af68d fix: deadlocks on trigger, olap prometheus background worker, otel improvements (#2475)
* print error log temporarily

* casing

* only for create-monitoring-event

* rate limit iterator

* add a debugger

* remove rate limiter

* improve otel on trigger

* cache probability stuff

* track misses

* move down one ln

* default

* Fix: Pass tx down into payload retrieve (#2483)

* [Python] Feat: Dataclass Support (#2476)

* fix: prevent lifespan error from hanging worker

* fix: handle cleanup

* feat: dataclass outputs

* feat: dataclasses

* feat: incremental dataclass work

* feat: dataclass tests

* fix: lint

* fix: register wf

* fix: ugh

* chore: changelog

* fix: validation issue

* fix: none check

* fix: lint

* fix: error type

* chore: regenerate examples (#2477)

Co-authored-by: GitHub Action <action@github.com>

* feat: add health and metrics api on typescript sdk worker (#2457)

* feat: add health and metrics api on typescript sdk worker

add: prom-client to fetch metrics data
add: track health status of worker across different states

* refactor: keep prom-client as optional dependency

* refactor: remove async import of prom-client

* chore: update package version for ts sdk

* fix: lint

* fix: lint, const enum

---------

Co-authored-by: mrkaye97 <mrkaye97@gmail.com>

* Update frontend onboarding steps (#2478)

* Update frontend onboarding steps

* Update sidebar as well

* Fix Go SDK cron inputs (#2481)

* cron input in Go SDK

* add example

* fix: pass tx down to retrieve

* fix: attempt 2, another pool use

* fix: spans and debugging for task statuses

* attempted hotfix on olap statuses

* process tenants in parallel in prom worker

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: GitHub Action <action@github.com>
Co-authored-by: Jishnu <jishnun789@gmail.com>
Co-authored-by: Sid Premkumar <sid.premkumar@gmail.com>
Co-authored-by: Mohammed Nafees <hello@mnafees.me>
Co-authored-by: Alexander Belanger <alexander@hatchet.run>

* move debugger package, clean up init

* remove probability factor logic

* remove debug

* fix: debugger instantiation

---------

Co-authored-by: Alexander Belanger <alexander@hatchet.run>
Co-authored-by: gabriel ruttner <gabriel.ruttner@gmail.com>
Co-authored-by: mrkaye97 <mrkaye97@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: GitHub Action <action@github.com>
Co-authored-by: Jishnu <jishnun789@gmail.com>
Co-authored-by: Sid Premkumar <sid.premkumar@gmail.com>
2025-11-04 09:05:44 +01:00
Mohammed Nafees
8412a985e3 increase timeout to 30 seconds (#2449) 2025-10-28 16:50:36 +01:00
Mohammed Nafees
c3a1ac621d fix confusing error (#2447) 2025-10-27 15:58:53 -04:00
abelanger5
e1fdeeaf1c fix: payload performance (#2441)
* change some olap flush settings

* increase timeouts for payload wal

* fix: improve performance of payload wal metrics

* slight updates

* more small tweaks

* undo some olap changes, don't offload some payloads

* remove double reads

* try reducing wal poll limit

* analyze v1_dag

* move partition method
2025-10-23 17:45:49 -04:00
Mohammed Nafees
bd95b78f38 run cleanup job every minute (#2440) 2025-10-23 14:45:10 -04:00
matt
c6e154fd03 Feat: OLAP Payloads (#2410)
* feat: olap payloads table

* feat: olap queue messages for payload puts

* feat: wire up writes on task write

* driveby: add + ignore psql-connect

* fix: down migration

* fix: use external id for pk

* fix: insert query

* fix: more external ids

* fix: bit more cleanup

* feat: dags

* fix: the rest of the refs

* fix: placeholder uuid

* fix: write external ids

* feat: wire up messages over the queue

* fix: panic

* Revert "fix: panic"

This reverts commit c0adccf2ea.

* Revert "feat: wire up messages over the queue"

This reverts commit 36f425f3c1.

* fix: rm unused method

* fix: rm more

* fix: rm cruft

* feat: wire up failures

* feat: start wiring up completed events

* fix: more wiring

* fix: finish wiring up completed event payloads

* fix: lint

* feat: start wiring up external ids in the core

* feat: olap pub

* fix: add returning

* fix: wiring

* debug: log lines for pubs

* fix: external id writes

* Revert "debug: log lines for pubs"

This reverts commit fe430840bd.

* fix: rm sample

* debug: rm pub buffer param

* Revert "debug: rm pub buffer param"

This reverts commit b42a5cacbb.

* debug: stuck queries

* debug: more logs

* debug: yet more logs

* fix: rename BulkRetrieve -> Retrieve

* chore: lint

* fix: naming

* fix: conn leak in putpayloads

* fix: revert debug

* Revert "debug: more logs"

This reverts commit 95da7de64f.

* Revert "debug: stuck queries"

This reverts commit 8fda64adc4.

* feat: improve getters, olap getter

* fix: key type

* feat: first pass at pulling olap payloads from the payload store

* fix: start fixing bugs

* fix: start reworking `includePayloads` param

* fix: include payloads wiring

* feat: analyze for payloads

* fix: simplify writes more + write event payloads

* feat: read out event payloads

* feat: env vars for dual writes

* refactor: clean up task prop drilling a bit

* feat: add include payloads params to python for tests

* fix: tx commit

* fix: dual writes

* fix: not null constraint

* fix: one more

* debug: logging

* fix: more debugging, tweak function sig

* fix: function sig

* fix: refs

* debug: more logging

* debug: more logging

* debug: fix condition

* debug: overwrite properly

* fix: revert debug

* fix: rm more drilling

* fix: comments

* fix: partitioning jobs

* chore: ver

* fix: bug, docs

* hack: dummy id and inserted at for payload offloads

* fix: bug

* fix: no need to handle offloads for task event data

* hack: jitter + current ts

* fix: short circuit

* fix: offload payloads in a tx

* fix: uncomment sampling

* fix: don't offload if external store is disabled

* chore: gen sqlc

* fix: migration

* fix: start reworking types

* fix: couple more

* fix: rm unused code

* fix: drill includePayloads down again

* fix: silence annoying error in some cases

* fix: always store payloads

* debug: use workflow run id for input

* fix: improve logging

* debug: logging on retrieve

* debug: task input

* fix: use correct field

* debug: write even null payloads to limit errors

* debug: hide error lines

* fix: quieting more errors

* fix: duplicate example names, remove print lines

* debug: add logging for olap event writes

* hack: immediate event offloads and cutovers

* fix: rm log line

* fix: import

* fix: short circuit events

* fix: duped names
2025-10-20 09:09:49 -04:00
Mohammed Nafees
e2b1f1353e Fix OTel span attribute naming convention (#2409)
* rename spans according to convention

* low cardinality
2025-10-16 18:43:40 +02:00
Mohammed Nafees
d9268c7270 Cleanup job for old and invalid entries (#2378)
* auto run table cleanup

* batched cleanup of tables

* address PR comments

* fix timeout

* update queries

* fix shouldContinue

* also call cleanup for v1_workflow_concurrency_slot

* fix comment

* comment fix
2025-10-16 16:51:08 +02:00
abelanger5
b16be655be feat: stateful polling intervals (#2417)
* initial pass on stateful intervals

* pr review comments + add evict expired idempotency keys

* fix: goroutine leak and name vars better

* fix some cleanup logic
2025-10-15 11:40:22 -04:00
Mohammed Nafees
a750ce950d Introduce vars to tune ANALYZE job gocron run intervals (#2407)
* introduce vars to tune ANALYZE job gocron run intervals

* update config doc

* fix assignment
2025-10-10 11:02:10 +02:00
matt
c48a3211b5 Feat: Immediate Payload Offloads (#2375)
* feat: modify operations

* feat: attempt 1 at doing the cutover + the offload in the same query

* fix: operation write

* debug: add some print lines

* fix: check constraint

* fix: select records to offload properly

* fix: fn

* feat: add second table to hold queued cutovers

* fix: start reworking queries

* fix: select

* fix: missing cols

* fix: for update

* fix: query name for finalize

* feat: cut over query finalizer

* feat: query for writes into cutover queue

* feat: add query for cut over polling

* feat: add cutover job

* fix: rm operations

* feat: write cutover queue items at the same time as setting payload keys

* fix: simplify into single query

* fix: revert debug

* chore: lint

* fix: don't remove operation column yet

* feat: refactor into struct of opts and make job intervals configurable

* fix: add analyze for payload table

* fix: schema copy paste

* fix: drop fk

* feat: add an index to help with poll performance for a short while

* fix: simplify poll ordering

* fix: simplify more

* fix: ctx

Co-authored-by: Mohammed Nafees <hello@mnafees.me>

* Feat: Task Event and DAG Payloads (#2370)

* feat: initial work on task event payloads

* fix: iterator

* feat: wire up task events

* fix: backwards compat

* fix: migrations

* fix: duplication

* fix: col

* fix: add timestamptz col

* fix: overwrite

* fix: rm debugging

* fix: revert debugging

* fix: rm unused cols

* fix: spelling

* fix: use `current_timestamp` as default

* feat: dual writes for payloads

* fix: improve debug lines

* debug: add log

* debug: always write

* fix: make annoying log debug level

* fix: rm debug lines

* fix: add comment

* feat: dag payloads

* fix: index

* fix: migration ver

* fix: error msg

Co-authored-by: abelanger5 <belanger@sas.upenn.edu>

* fix: create, then set default

* fix: inserted at copy paste

* fix: n+1 query

* fix: another n+1 query

* fix: rm unused singleton retrieve

---------

Co-authored-by: abelanger5 <belanger@sas.upenn.edu>

---------

Co-authored-by: Mohammed Nafees <hello@mnafees.me>
Co-authored-by: abelanger5 <belanger@sas.upenn.edu>
2025-10-08 11:22:34 -04:00
Mohammed Nafees
ed40a82dbb Include tenant_id in OTel spans wherever possible (#2382) 2025-10-03 18:16:16 +02:00
matt
bb1de91254 fix: run analyze every 3 hours (#2380) 2025-10-03 09:49:35 -04:00
matt
8b8ded655d Fix: Update payload properly on replay (#2317)
* fix: overwrite payloads when task is in an initially e.g. cancelled state

* fix: add distinct to payload writes to limit conflict resolution

* feat: first pass at test

* fix: tenant in warning

* fix: lint, more assertions

* fix: bug

* fix: my pet peeve
2025-09-18 20:42:39 -04:00
matt
bdedab653a Fix: WAL partition poll function type (#2301)
* fix: type

* fix: cast to int32

* debug: add logging

* debug: more logs

* Revert "debug: more logs"

This reverts commit 2ff8033f89.

* Revert "debug: add logging"

This reverts commit a7aaa05b9c.

* fix: rm unnecessary generic

* feat: span attrs + names

* fix: span naming, more details
2025-09-16 12:44:55 -04:00
matt
92843bb277 Feat: Payload Store Repository (#2047)
* feat: add table for storing payloads

* feat: add payload type enum

* feat: gen sqlc

* feat: initial sql impl

* feat: add payload store repo to shared

* feat: add overwrite

* fix: impl

* feat: bulk op

* feat: initial wiring of inputs for task triggers

* feat: wire up dag matches

* feat: create V1TaskWithPayload and use it everywhere

* fix: couple bugs

* fix: clean up types

* fix: overwrite

* fix: rm input from replay

* fix: move payload store to shared repo

* fix: schema

* refactor: repo setup

* refactor: repos

* fix: gen

* chore: lint

* fix: rename

* feat: naming, write dag inputs

* fix: more naming, trigger bug

* fix: dual writes for now

* fix: pass in tx

* feat: initial work on offloader

* feat: improve external offloader

* fix: some refs

* add withExternalHandler

* fix: improve impl of external store

* feat: implement offloading, fix other impls

* feat: add query to update JSON

* fix: implement offloading + updating records in payloads table

* feat: add WAL table

* feat: add queries for polling WAL and evicting

* feat: wire up writes into WAL

* fix: get job working

* refactor: improve types

* fix: infinite loop

* feat: improve offloading logic to run in two separate txes

* refactor: rework how overrides work

* fix: lint

* fix: migration number

* fix: migration

* fix: migration version

* fix: revert back to reading payloads out

* fix: fall back to previous input, part i

* fix: input fallback

* fix: add back input to replay

* fix: input fallback in dispatcher

* fix: nil check

* feat: advisory locks, part i

* fix: no skip locked

* feat: hash partitioned wal table

* fix: modify queries a bit, tweak crud enum

* fix: pk order, function to find tenants

* feat: wal processing

* fix: only write wal if an external store is enabled, fix offloading logic

* fix: spacing

* feat: schema cleanup

* fix: rm external store loc name

* fix: set content to null when offloading

* fix: cleanup, naming

* fix: pass overwrite payload store along

* debug: add some logging

* Revert "debug: add some logging"

This reverts commit 43e71eadf1.

* fix: typo

* fix: add offloatAt to store opts for offloading

* fix: handle leasing with advisory lock

* fix: struct def

* fix: requeue on payloads not found

* fix: rm hack for triggers

* fix: revert empty input on write

* fix: write input

* feat: env var for enabling / disabling dual writes

* feat: wire up dual writes

* fix: comments

* feat: generics!

* fix: panic from type cast

* fix: migration

* fix: generic

* fix: hack for T key in map

* fix: cleanup
2025-09-12 09:53:01 -04:00
matt
f385964fcc Fix: Scheduled runs race w/ idempotency key check (#2077)
* feat: create table for storing key

* feat: is_filled col

* feat: idempotency repo

* fix: handle filling

* fix: improve queries

* feat: check if was created already before triggering

* fix: handle partitions

* feat: improve schema

* feat: initial idempotency key claiming impl

* fix: db

* fix: sql fmt

* feat: crazy query

* fix: downstream

* fix: queries

* fix: query bug

* fix: migration rename

* fix: couple small issues

* feat: eviction job

* fix: copilot comments

* fix: index name

* fix: rm comment
2025-09-12 07:54:42 -04:00
Gabe Ruttner
9459dad14d Feat improve auth error handling (#1893)
* common errors

* rate limits

* add IP extractor to api server

* use echo rate limit middleware func

* use rate limit for webhooks as well

---------

Co-authored-by: Mohammed Nafees <hello@mnafees.me>
2025-09-11 18:30:07 +02:00
Mohammed Nafees
1a2891154e Periodically run ANALYZE on v1_task and v1_task_event (#2236)
* analyze v1_task and v1_task_event tables periodically

* copy pasta
2025-09-02 11:07:05 -04:00
abelanger5
f7eda21c10 fix: confusing error message (#2199) 2025-08-26 10:55:23 -04:00
matt
5eab4b74e7 Feat: Run ANALYZE on a few tables once a day (#2163)
* feat: add analyze for a few tables

* feat: run at 5am utc

* fix: add tx, timeout

* fix: 30m timeout
2025-08-19 13:43:27 -04:00
abelanger5
1407594902 fix: move rate limited queue items off the main queue (#2155)
* fix: move rate limited queue items off the main queue

* preserve FIFO behavior on queues

* fix unit tests, address pr comments

* fix: generated

* rename table
2025-08-18 11:31:21 -04:00
matt
ed65e41ff2 Fix: Optimize DAG timing query for Prom (#2102)
* feat: improve dag duration query

* fix: naming

* fix: wiring

* feat: add trace

* fix: add timeouts

* fix: inserted at

* fix: correctness tweak

* fix: try upgrading pino
2025-08-12 08:01:00 -04:00
Gabe Ruttner
c6fd39b4e0 fix: ProcessTaskTimeouts limit and timeout (#2087)
* limit and timeout

* right query

* configurable

* limit

* env vars
2025-08-06 08:42:07 -04:00
Mohammed Nafees
89e6d00a8f Add telemetry around task statuses in controller (#2090)
* add telemetry around task statuses in controller

* fixes

* more fixes
2025-08-06 08:41:54 -04:00