Commit Graph

504 Commits

Author SHA1 Message Date
matt
36924936fa Feat: Webhook fixes / improvements (#2131)
* feat: webhook update

* feat: add headers to cel env

* fix: header casing

* feat: wire up edits

* fix: updates

* fix: finish wiring up updates

* fix: handle save on enter

* fix: lint

* feat: add slack and discord

* feat: initial slack setup

* fix: get slack working

* fix: rm discord for now

* fix: lint

* chore: gen

* fix: explicit save button

* feat: add link to CEL docs

* feat: add callout for reaching out to support

* feat: docs

* refactor: challenge

* fix: naming

* fix: return

* fix: resp codes

* fix: webhooks beta flag

* fix: rm discord

* fix: docs
2025-08-14 10:46:57 -05:00
Mohammed Nafees
51a037d493 Workflow combobox search functionality (#2118)
* add pagination support for trigger workflow dropdown

* fix lint

* message fix

* no pagination but search

* no need for useless allocation

* PR comments
2025-08-12 20:43:30 +02:00
matt
ed65e41ff2 Fix: Optimize DAG timing query for Prom (#2102)
* feat: improve dag duration query

* fix: naming

* fix: wiring

* feat: add trace

* fix: add timeouts

* fix: inserted at

* fix: correctness tweak

* fix: try upgrading pino
2025-08-12 08:01:00 -04:00
matt
d2b60917ee Fix: Waterfall panic + query simplification (#2116)
* fix: simplify query a bunch

* fix: simplify more

* fix: simplify a whole bunch more

* fix: wire up

* fix: query

* fix: the actual bug
2025-08-12 07:56:13 -04:00
matt
4d654f34ec Debug: Fail task tracing (#2101)
* feat: add some span attributes to see how big the batches are

* fix: span naming

* fix: naming

* fix: issues + lint
2025-08-07 12:05:28 -04:00
matt
285f1728d5 Fix: Call PopulateTaskRunData sequentially (#2097)
* fix: call task data lookup query sequentially

* fix: error fmt

* feat: add more span attrs

* fix: end + ctx handling

* fix: int type

* fix: handle dupes, factor out into helper

* fix: naming

* fix: unwind naming change

* fix: naming
2025-08-06 18:40:02 -04:00
Mohammed Nafees
34074affd8 Add contextual data for trigger via events (#2092)
* add contextual data for trigger via events

* fix corrId

* string needed
2025-08-06 16:52:06 -04:00
matt
b233e6a5cb Fix: Improve performance of UpdateTasksToAssigned (#2094)
* fix: prune partitions, join on full PK

* fix: subquery

* fix: check validity
2025-08-06 15:04:00 -04:00
Mohammed Nafees
889210da7c add telemetry to task status repo methods (#2091) 2025-08-06 13:34:59 -04:00
Gabe Ruttner
c6fd39b4e0 fix: ProcessTaskTimeouts limit and timeout (#2087)
* limit and timeout

* right query

* configurable

* limit

* env vars
2025-08-06 08:42:07 -04:00
Mohammed Nafees
0b646316f1 Add GRPC callback interceptor and correlation IDs to respective API and GRPC handlers (#2073)
* chore(deps): bump hatchet-sdk in /examples/python/quickstart (#2070)

Bumps hatchet-sdk from 1.16.3 to 1.16.4.

---
updated-dependencies:
- dependency-name: hatchet-sdk
  dependency-version: 1.16.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps): bump google.golang.org/api from 0.243.0 to 0.244.0 (#2071)

Bumps [google.golang.org/api](https://github.com/googleapis/google-api-go-client) from 0.243.0 to 0.244.0.
- [Release notes](https://github.com/googleapis/google-api-go-client/releases)
- [Changelog](https://github.com/googleapis/google-api-go-client/blob/main/CHANGES.md)
- [Commits](https://github.com/googleapis/google-api-go-client/compare/v0.243.0...v0.244.0)

---
updated-dependencies:
- dependency-name: google.golang.org/api
  dependency-version: 0.244.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* add grpc callback interceptor

* add correlation id to more endpoints

* fix string interpolation payment methods (#2072)

* hotfix: empty scope in OLAP replication (#2068)

* fix lint

* update comment

* feat: activity detection (#2055)

* feat: activity detection

* address comments

* chore(deps): bump github.com/prometheus/client_golang (#2074)

Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.22.0 to 1.23.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/v1.23.0/CHANGELOG.md)
- [Commits](https://github.com/prometheus/client_golang/compare/v1.22.0...v1.23.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-version: 1.23.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps): bump github.com/getsentry/sentry-go from 0.34.1 to 0.35.0 (#2075)

Bumps [github.com/getsentry/sentry-go](https://github.com/getsentry/sentry-go) from 0.34.1 to 0.35.0.
- [Release notes](https://github.com/getsentry/sentry-go/releases)
- [Changelog](https://github.com/getsentry/sentry-go/blob/master/CHANGELOG.md)
- [Commits](https://github.com/getsentry/sentry-go/compare/v0.34.1...v0.35.0)

---
updated-dependencies:
- dependency-name: github.com/getsentry/sentry-go
  dependency-version: 0.35.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* add resource id and type

* update grpc callback middleware

* fix v0 trigger

* use constants

* fix values

* use constants

* use string declared method

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: matt <mrkaye97@gmail.com>
Co-authored-by: Gabe Ruttner <gabriel.ruttner@gmail.com>
2025-08-04 12:29:01 -04:00
matt
13b2e1d26c Fix: Propagate priority through to DAG subtasks (#2078)
* feat: add priority column to v1_match

* feat: wire up writes

* fix: more wiring

* fix: migration name
2025-08-04 12:22:28 -04:00
matt
c44c70bd0c Debug: Add debug logs around put log method (#2079)
* feat: add logger to ingestor

* debug: add a bunch of debug logs

* fix: add prefix for grep

* fix:  copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix: panic

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-08-04 11:19:07 -04:00
matt
8480228d79 Fix: Allow bypassing partitioning for events lookup table (#2054)
* fix: allow olap events lt to not be partitioned manually

* chore: gen

* chore: gen
2025-07-31 18:18:49 -04:00
Mohammed Nafees
cc1331c59f Use PostgreSQL advisory lock to create task table partitions instead of depending on internal tenant (#1991)
* use pg advisory lock for task table partition

* fix lint

* use a separate transaction for advisory lock

* fix lint

* use PrepareTx

* short circuit return fast if partitions already exist

---------

Co-authored-by: mrkaye97 <mrkaye97@gmail.com>
2025-07-31 18:18:15 -04:00
Mohammed Nafees
e6c50ca1a0 Allow member roles to be changed by owners and admins (#2044)
* allow member roles to be changed by owners and admins

* PR comments

* chore: gen

* fix: rm changes to /next/

* chore: gen

---------

Co-authored-by: mrkaye97 <mrkaye97@gmail.com>
2025-07-30 17:42:34 -04:00
matt
392483c5d8 Fix: Weekly partition dropping (#2066)
* fix: check weekly partitions older than a week old

* fix: logic

* chore: gen
2025-07-30 16:28:08 -04:00
matt
d6f8be2c0f Feat: OLAP Table for CEL Eval Failures (#2012)
* feat: add table, wire up partitioning

* feat: wire failures into the OLAP db from rabbit

* feat: bubble failures up to controller

* fix: naming

* fix: hack around enum type

* fix: typo

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix: typos

* fix: migration name

* feat: log debug failure

* feat: pub message from debug endpoint to log failure

* fix: error handling

* fix: use ingestor

* fix: olap suffix

* fix: pass source through

* fix: dont log ingest failure

* fix: rm debug as enum opt

* chore: gen

* Feat: Webhooks (#1978)

* feat: migration + go gen

* feat: non unique source name

* feat: api types

* fix: rm cruft

* feat: initial api for webhooks

* feat: handle encryption of incoming keys

* fix: nil pointer errors

* fix: import

* feat: add endpoint for incoming webhooks

* fix: naming

* feat: start wiring up basic auth

* feat: wire up cel event parsing

* feat: implement authentication

* fix: hack for plain text content

* feat: add source to enum

* feat: add source name enum

* feat: db source name enum fix

* fix: use source name enums

* feat: nest sources

* feat: first pass at stripe

* fix: clean up source name passing

* fix: use unique name for webhook

* feat: populator test

* fix: null values

* fix: ordering

* fix: rm unnecessary index

* fix: validation

* feat: validation on create

* fix: lint

* fix: naming

* feat: wire triggering webhook name through to events table

* feat: cleanup + python gen + e2e test for basic auth

* feat: query to insert webhook validation errors

* refactor: auth handler

* fix: naming

* refactor: validation errors, part II

* feat: wire up writes through olap

* fix: linting, fallthrough case

* fix: validation

* feat: tests for failure cases for basic auth

* feat: expand tests

* fix: correctly return 404 out of task getter

* chore: generated stuff

* fix: rm cruft

* fix: longer sleep

* debug: print name + events to logs

* feat: limit to N

* feat: add limit env var

* debug: ci test

* fix: apply namespaces to keys

* fix: namespacing, part ii

* fix: sdk config

* fix: handle prefixing

* feat: handle partitioning logic

* chore: gen

* feat: add webhook limit

* feat: wire up limits

* fix: gen

* fix: reverse order of generic fallthrough

* fix: comment for potential unexpected behavior

* fix: add check constraints, improve error handling

* chore: gen

* chore: gen

* fix: improve naming

* feat: scaffold webhooks page

* feat: sidebar

* feat: first pass at page

* feat: improve feedback on UI

* feat: initial work on create modal

* feat: change default to basic

* fix: openapi spec discriminated union

* fix: go side

* feat: start wiring up placeholders for stripe and github

* feat: pre-populated fields for Stripe + Github

* feat: add name section

* feat: copy improvements, show URL

* feat: UI cleanup

* fix: check if tenant populator errors

* feat: add comments

* chore: gen again

* fix: default name

* fix: styling

* fix: improve stripe header processing

* feat: docs, part 1

* fix: lint

* fix: migration order

* feat: implement rate limit per-webhook

* feat: comment

* feat: clean up docs

* chore: gen

* fix: migration versions

* fix: olap naming

* fix: partitions

* chore: gen

* feat: store webhook cel eval failures properly

* fix: pk order

* fix: auth tweaks, move fetches out of populator

* fix: pgtype.Text instead of string pointer

* chore: gen

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-30 13:27:38 -04:00
abelanger5
e2af391a9b fix: remove essential pool to prevent bottleneck on heartbeats (#2060) 2025-07-29 17:06:29 -04:00
Mohammed Nafees
793df41ccb Deploy HyperDX locally via docker-compose and add traces to task controller (#2058)
* deploy jaeger locally and add traces to task controller

* use jaeger v2

* add SERVER_OTEL_COLLECTOR_AUTH

* fix PR comments

* fix span name
2025-07-29 16:24:38 +02:00
matt
fc374cb8db hotfix: separate statements for delete-then-insert declarative filters (#2053) 2025-07-25 13:46:56 -04:00
abelanger5
c377e75f61 fix: revert lease updates (#2038) 2025-07-22 12:06:58 +02:00
matt
7295254bfa Fix: filter lookup not retaining scope information (#2036)
* fix: filter lookup not retaining scope information

* fix: copilot suggestion

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix: add DISTINCT to filter query

* fix: struct deduping

* feat: test

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-22 11:59:34 +02:00
Mohammed Nafees
c26ff03dc0 [hotfix] Fix duration calculation of DAGs and single tasks (#2035)
* fix duration for multi tenant

* use external ids

* fix lint
2025-07-21 20:41:23 +02:00
abelanger5
467c6197ba fix: many updates on the lease table (#2034) 2025-07-21 20:40:38 +02:00
matt
3dcd6059c8 Fix: Partition pruning for PreflightCheckTasksForReplay (#2029)
* feat: partition pruning for PreflightCheckTasksForReplay

* fix: use 1d as placeholder

* fix: use current time instead

* fix: pass inserted ats through correctly

* fix: try adding a CTE

* fix: query cleanup
2025-07-21 20:30:59 +02:00
abelanger5
27435a72d6 feat: option to disable logging (#2030) 2025-07-21 16:53:11 +02:00
matt
5bf9f97720 Fix: Validate payloads + metadata and error on illegal unicode (#2023)
* feat: add helper method to repository

* feat: 400 on event pushes with invalid payloads

* fix: pointer

* feat: add to trigger

* feat: error on bulk trigger

* feat: error on schedule

* fix: validate log lines

* feat: validate crons

* feat: fail the task

* fix: rm debug line
2025-07-20 22:44:28 -04:00
matt
c202ec8359 Feat: CEL Debug Endpoint (#2010)
* feat: openapi spec + gen

* feat: scaffold cel service

* feat: impl with discriminated union

* fix: reversed

* chore: gen py

* chore: gen + add cel to hatchet client

* feat: wire up TS CEL client

* chore: versions

* feat: impl for go

* fix: error handling

* feat: python tests
2025-07-20 22:44:08 -04:00
Matt Kaye
7388c6df73 Fix: Improve UpdateDAGStatuses and UpdateTaskStatuses (#2020)
* fix: start improving query

* feat: add helper query for partition pruning

* feat: use helper query

* feat: similar optimizations for tasks query
2025-07-18 08:29:44 -04:00
Mohammed Nafees
c5915a3b14 Add rate limiter around scheduler concurrency (#2021)
* add rate limiter around scheduler concurrency

* have upper limit

* loadtest should pass now
2025-07-18 08:24:57 -04:00
Matt Kaye
48734c8cb8 Fix: Multiple tenants for task & dag status updates (#2019)
* feat: add function to fetch tenants in partition

* feat: update updatedagstatuses query to take list of tenants

* feat: wire tenant id through

* feat: hack string delim to wire writes through

* fix: unnest result of first func

* feat: task updates

* fix: error handling

* fix: one more func + migration

* fix: gen

* fix: concurrent tenant prom metrics and remove tenant operations

---------

Co-authored-by: Alexander Belanger <alexander@hatchet.run>
2025-07-17 14:59:45 -04:00
Matt Kaye
02601fa0ef Fix: Replay bugs (#2001)
* fix: dedupe tasks before replaying

* fix: two toasts

* fix: send workflow run external id through

* fix: send messages to queue immediately

* fix: clean up types

* fix: dedupe

* fix: return task ids instead of workflow ones
2025-07-16 11:42:36 -04:00
Matt Kaye
4676ae8508 Fix: Feedback on replay / cancel (#1997)
* feat: return a response from replays

* fix: return correct thing

* feat: wire up cancel + replay toasts on FE

* fix: naming

* fix: other refs

* fix: linter setup
2025-07-15 13:29:52 -04:00
Matt Kaye
0b21d74712 Fix: Remove internal replay batching for now (#1992)
* fix: remove batching, run replays serially

* proposal: do this at the replay controller level

* Revert "fix: remove batching, run replays serially"

This reverts commit 21a93bb260.

* feat: advisory lock

* fix: add prefix to lock
2025-07-15 13:10:30 -04:00
abelanger5
24b0d0c9d0 fix: panic when rate limit units are passed as nil (#1963) 2025-07-14 13:28:35 -04:00
Mohammed Nafees
c86a65bb0f Add new streaming support to Go SDK (#1955)
* add Go SDK streaming support

* make docs changes for go sdk streaming

* fix git lfs warning

* streaming go example

* fix lint

* fix auto generated snip

* revert poetry lock changes

* some cleanup
2025-07-11 18:00:30 +02:00
Mohammed Nafees
f247a63137 add check for cel input nil (#1977) 2025-07-10 09:44:47 -04:00
abelanger5
53020696e9 fix(go-sdk): v1 rate limit config (#1962) 2025-07-07 16:41:41 -04:00
Mohammed Nafees
33ec5fb7d8 add docs for Go SDK bulk operations (#1954) 2025-07-07 13:04:49 +02:00
abelanger5
6e820a120c feat: waterfall component (#1952)
* tmp: waterfall component

* feat: waterfall component

* address pr review comments
2025-07-04 14:47:30 -04:00
Matt Kaye
7679732b15 Fix: Skipping conditions with multiple parents (#1948)
* fix: skipping bug

* fix: move `waits` -> `conditions`

* fix: refs

* chore: ver

* feat: add skipped task to test

* feat: start implementing or groups in wait for

* feat: test of or groups on durable context

* fix: lint

* chore: gen

* fix: lint

* fix: branching hell
2025-07-03 16:50:57 -04:00
Jean-Baptiste Souvestre
f08c348710 fix(scheduling): negative weights ranks were not excluded from the candidate workers pool (#1941)
Co-authored-by: jbsouvestre <jean-baptiste@ubble.ai>
2025-07-03 09:03:12 -04:00
Mohammed Nafees
2ccd434ebf Add Prometheus metric for reassigned task total (#1943)
* add reassigned total metric

* lint fix
2025-07-03 10:52:20 +02:00
Mohammed Nafees
144b8dce9e make sure to default to QUEUED for new task initial state (#1931) 2025-07-02 14:45:09 +02:00
abelanger5
3468709a23 fix: correct config pt 2 (#1938) 2025-07-01 16:56:13 -04:00
abelanger5
e18b0e8f58 fix: don't print output data in CEL exception (#1936)
* fix: don't print output data in CEL exception

* add tzdata to lite and loadtest dockerfiles too
2025-07-01 16:16:19 -04:00
Matt Kaye
c805a52e38 Fix: Events query performance improvements (#1930)
* fix: split up event queries for perf

* fix: refs

* fix: event join
2025-07-01 11:58:15 -04:00
Matt Kaye
23bdbbd8a3 Feat: Tenant-in-path (#1923)
* chore: gen

* feat: hook for tenant

* feat: add tenanted routes

* fix: no need for v1 prefix

* feat: remove v1 routes

* fix: remove ui version switcher stuff

* fix: more broken redirects

* fix: start using hooks to fetch tenant

* fix: add (commented out) linting rules

* fix: sidebar

* fix: cruft comment

* fix: layout

* fix: collapsibles

* fix: more refs to v1 paths

* fix: more refs to hold hooks

* fix: more refs

* fix: last few

* fix: more redirects

* fix: rm more refs to `useOutletContext`

* fix: rm tenant-as-prop

* fix: small bugs

* fix: revert unintended changes

* fix: couple more

* fix: last few

* fix: last few

* fix: oooone more

* fix: redirects

* fix: add more redirects

* fix: clean up a bunch more redirects

* fix: copy paste

* fix: more redirects

* fix: zero value bug

* hack: don't set query param on v1

* fix: lint

* fix: copy

* fix: copy

* fix: lint

* fix: rm /next redirect

* make default engine version v1

* feat: crons with timezones

* fix: handle case where tenant is in path

* fix: more hard redirects

* fix: delete v0 cancellation test

---------

Co-authored-by: Alexander Belanger <alexander@hatchet.run>
2025-07-01 11:56:54 -04:00
abelanger5
646adda2a8 fix: concurrency timeout from 5s -> 30s (#1926)
* fix: concurrency timeout from 5s -> 30s

* limits in overwrite file too
2025-07-01 08:05:59 -04:00