528 Commits

Author SHA1 Message Date
Mohammed Nafees ebb49cb1a0 error out instead of panic (#2274) 2025-09-09 17:48:55 +02:00
matt cf59a7bcd9 Feat: Worker slot Prom metrics (#2195)
* feat: add slots to prom metrics

* feat: available

* fix: extension instead

* fix: docs

* fix: rm unused query changes

* fix: rm unused struct

* fix: labels

* feat: improve total slots

* fix: pr feedback

* fix: docs

* Revert "fix: docs"

This reverts commit 7fe105da92.

* fix: derive total slots
2025-09-08 14:07:44 -04:00
Mohammed Nafees 9b0ec2618e Go SDK v1 feature client changes (#2160)
* feature client changes

* remove code duplication

* func name should make sense

* add simple compile gh workflow
2025-09-08 17:10:24 +02:00
Mohammed Nafees 03e5b37059 Introduce UI for Organizations (#2247)
* org selector

* org selector and pages

* org page starts to look nice I think

* add mgmt tokens section

* better messaging

* custom auth interface

* add comments

* more modals

* more fixes

* onboarding create tenant for orgs

* use ConfirmDialog

* org invite modal

* org invites work

* email service into pkg

* fix build error

* attempt at creating hook

* address PR comments

* more fixes

* update for org list endpoint
2025-09-05 21:30:37 +02:00
Mohammed Nafees 0abefb18ee make sure to case on err properly (#2248) 2025-09-05 13:25:31 +02:00
Mohammed Nafees 1a2891154e Periodically run ANALYZE on v1_task and v1_task_event (#2236)
* analyze v1_task and v1_task_event tables periodically

* copy pasta
2025-09-02 11:07:05 -04:00
matt 54caf2e68a fix: rm annoying loki logs (#2224) 2025-08-28 20:54:42 -04:00
xcono bb68360959 repo/v1: guard cleanAdditionalMetadata against JSON null; client: avoid null AdditionalMetadata in BulkPush; add regression test (#2191) 2025-08-28 16:33:35 -04:00
abelanger5 2c8ea66a7a fix: remove rate limited items from in memory buffer (#2207) 2025-08-27 14:51:35 -04:00
matt c42d59f5d8 Fix: Remove custom auth (#2203)
* fix: rm custom auth

* fix: change auth strategy
2025-08-26 13:57:24 -04:00
abelanger5 f62142f74d fix: explicit ordering in ReleaseTasks and lock parent slots (#2201)
* fix: explicit ordering in ReleaseTasks and lock parent slots

* fix: IN instead of =

* fix: gen diff
2025-08-26 11:06:55 -04:00
abelanger5 acf7215b3f fix: don't query database when flush is called concurrently (#2202) 2025-08-26 11:00:47 -04:00
matt 80fb7657ed Fix: Child runs not rendering after one day, empty worker ids, additional meta filters not being applied to counts (#2196)
* fix: child runs not rendering b/c they've timed out of the lookback window

* fix: migration version

* fix: dead links

* fix: additional meta filters for status counts

* chore: lint
2025-08-25 18:20:08 -04:00
Gabe Ruttner 59fe6c110e feat: improved onboarding part 1 (#2186)
* feat: analytics events

* improved forms

* store state

* lint

* cleanup tenant name

* nits

* add environment to the form

* environment tag

* include env with tenant

* lint

* fix gen

* address comments

* feedback

* fix: layout

* navigation state

* rm dep

* lint

* address review

* lint

* lint

* fix: build
2025-08-25 11:14:34 -07:00
abelanger5 2a8ba155fa fix: match and cancel newest/in progress deadlocks (#2190) 2025-08-25 12:54:08 -04:00
abelanger5 67aef4fa64 add visibility to stream send event (#2174)
* add visibility to stream send event

* more otel

* track down stream timings

* experimental: use PrepareMsg before writing to the stream

* add control over stream window size, add error to span if large delays in stream sends
2025-08-22 09:51:31 -04:00
Gabe Ruttner f59ebd6c47 feat: analytics events (#2171)
* feat: analytics events

* review comments
2025-08-22 05:41:17 -07:00
abelanger5 8463b2c4a3 limit frequency of updates to rate limits (#2173) 2025-08-21 12:50:22 -04:00
Mohammed Nafees 2603939526 Introduce customAuth to the OpenAPI spec (#2168)
* introduce custom auth

* support optional CustomAuthorizationHandler
2025-08-20 17:05:11 +02:00
matt 5eab4b74e7 Feat: Run ANALYZE on a few tables once a day (#2163)
* feat: add analyze for a few tables

* feat: run at 5am utc

* fix: add tx, timeout

* fix: 30m timeout
2025-08-19 13:43:27 -04:00
matt 355a7f197e Feat: Add Linear to preconfigured webhooks (#2157)
* feat: add linear

* feat: linear fallthrough

* feat: linear

* fix: copy tweak
2025-08-18 12:19:43 -04:00
abelanger5 1407594902 fix: move rate limited queue items off the main queue (#2155)
* fix: move rate limited queue items off the main queue

* preserve FIFO behavior on queues

* fix unit tests, address pr comments

* fix: generated

* rename table
2025-08-18 11:31:21 -04:00
matt c4ad23d92c Fix: Populate DAG Metadata Sequentially (#2156)
* feat: add n+1 query

* feat: finish wiring up n+1 query

* fix: type hack

* fix: comment + partition pruning

* fix: copy paste

* fix: return error

* fix: slight correctness improvement

* fix: handle no rows error
2025-08-18 11:31:07 -04:00
matt 82c9d2d17c Fix: Deadlocking on DAG concurrency (#2111)
* debug: try fixing lock order

* fix: single `FOR UPDATE`

* fix: raw sql

* fix: explicit case handling

* fix: cancel in progress

* fix: query bugs

* fix: one more

* feat: cancel newest

* feat: test for cancel in progress
2025-08-14 15:21:24 -04:00
matt 36924936fa Feat: Webhook fixes / improvements (#2131)
* feat: webhook update

* feat: add headers to cel env

* fix: header casing

* feat: wire up edits

* fix: updates

* fix: finish wiring up updates

* fix: handle save on enter

* fix: lint

* feat: add slack and discord

* feat: initial slack setup

* fix: get slack working

* fix: rm discord for now

* fix: lint

* chore: gen

* fix: explicit save button

* feat: add link to CEL docs

* feat: add callout for reaching out to support

* feat: docs

* refactor: challenge

* fix: naming

* fix: return

* fix: resp codes

* fix: webhooks beta flag

* fix: rm discord

* fix: docs
2025-08-14 10:46:57 -05:00
Mohammed Nafees 51a037d493 Workflow combobox search functionality (#2118)
* add pagination support for trigger workflow dropdown

* fix lint

* message fix

* no pagination but search

* no need for useless allocation

* PR comments
2025-08-12 20:43:30 +02:00
matt ed65e41ff2 Fix: Optimize DAG timing query for Prom (#2102)
* feat: improve dag duration query

* fix: naming

* fix: wiring

* feat: add trace

* fix: add timeouts

* fix: inserted at

* fix: correctness tweak

* fix: try upgrading pino
2025-08-12 08:01:00 -04:00
matt d2b60917ee Fix: Waterfall panic + query simplification (#2116)
* fix: simplify query a bunch

* fix: simplify more

* fix: simplify a whole bunch more

* fix: wire up

* fix: query

* fix: the actual bug
2025-08-12 07:56:13 -04:00
matt 4d654f34ec Debug: Fail task tracing (#2101)
* feat: add some span attributes to see how big the batches are

* fix: span naming

* fix: naming

* fix: issues + lint
2025-08-07 12:05:28 -04:00
matt 285f1728d5 Fix: Call PopulateTaskRunData sequentially (#2097)
* fix: call task data lookup query sequentially

* fix: error fmt

* feat: add more span attrs

* fix: end + ctx handling

* fix: int type

* fix: handle dupes, factor out into helper

* fix: naming

* fix: unwind naming change

* fix: naming
2025-08-06 18:40:02 -04:00
Mohammed Nafees 34074affd8 Add contextual data for trigger via events (#2092)
* add contextual data for trigger via events

* fix corrId

* string needed
2025-08-06 16:52:06 -04:00
matt b233e6a5cb Fix: Improve performance of UpdateTasksToAssigned (#2094)
* fix: prune partitions, join on full PK

* fix: subquery

* fix: check validity
2025-08-06 15:04:00 -04:00
Mohammed Nafees 889210da7c add telemetry to task status repo methods (#2091) 2025-08-06 13:34:59 -04:00
Gabe Ruttner c6fd39b4e0 fix: ProcessTaskTimeouts limit and timeout (#2087)
* limit and timeout

* right query

* configurable

* limit

* env vars
2025-08-06 08:42:07 -04:00
Mohammed Nafees 0b646316f1 Add GRPC callback interceptor and correlation IDs to respective API and GRPC handlers (#2073)
* chore(deps): bump hatchet-sdk in /examples/python/quickstart (#2070)

Bumps hatchet-sdk from 1.16.3 to 1.16.4.

---
updated-dependencies:
- dependency-name: hatchet-sdk
  dependency-version: 1.16.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps): bump google.golang.org/api from 0.243.0 to 0.244.0 (#2071)

Bumps [google.golang.org/api](https://github.com/googleapis/google-api-go-client) from 0.243.0 to 0.244.0.
- [Release notes](https://github.com/googleapis/google-api-go-client/releases)
- [Changelog](https://github.com/googleapis/google-api-go-client/blob/main/CHANGES.md)
- [Commits](https://github.com/googleapis/google-api-go-client/compare/v0.243.0...v0.244.0)

---
updated-dependencies:
- dependency-name: google.golang.org/api
  dependency-version: 0.244.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* add grpc callback interceptor

* add correlation id to more endpoints

* fix string interpolation payment methods (#2072)

* hotfix: empty scope in OLAP replication (#2068)

* fix lint

* update comment

* feat: activity detection (#2055)

* feat: activity detection

* address comments

* chore(deps): bump github.com/prometheus/client_golang (#2074)

Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.22.0 to 1.23.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/v1.23.0/CHANGELOG.md)
- [Commits](https://github.com/prometheus/client_golang/compare/v1.22.0...v1.23.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-version: 1.23.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps): bump github.com/getsentry/sentry-go from 0.34.1 to 0.35.0 (#2075)

Bumps [github.com/getsentry/sentry-go](https://github.com/getsentry/sentry-go) from 0.34.1 to 0.35.0.
- [Release notes](https://github.com/getsentry/sentry-go/releases)
- [Changelog](https://github.com/getsentry/sentry-go/blob/master/CHANGELOG.md)
- [Commits](https://github.com/getsentry/sentry-go/compare/v0.34.1...v0.35.0)

---
updated-dependencies:
- dependency-name: github.com/getsentry/sentry-go
  dependency-version: 0.35.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* add resource id and type

* update grpc callback middleware

* fix v0 trigger

* use constants

* fix values

* use constants

* use string declared method

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: matt <mrkaye97@gmail.com>
Co-authored-by: Gabe Ruttner <gabriel.ruttner@gmail.com>
2025-08-04 12:29:01 -04:00
matt 13b2e1d26c Fix: Propagate priority through to DAG subtasks (#2078)
* feat: add priority column to v1_match

* feat: wire up writes

* fix: more wiring

* fix: migration name
2025-08-04 12:22:28 -04:00
matt c44c70bd0c Debug: Add debug logs around put log method (#2079)
* feat: add logger to ingestor

* debug: add a bunch of debug logs

* fix: add prefix for grep

* fix: copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix: panic

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-08-04 11:19:07 -04:00
matt 8480228d79 Fix: Allow bypassing partitioning for events lookup table (#2054)
* fix: allow olap events lt to not be partitioned manually

* chore: gen

* chore: gen
2025-07-31 18:18:49 -04:00
Mohammed Nafees cc1331c59f Use PostgreSQL advisory lock to create task table partitions instead of depending on internal tenant (#1991)
* use pg advisory lock for task table partition

* fix lint

* use a separate transaction for advisory lock

* fix lint

* use PrepareTx

* short circuit return fast if partitions already exist

---------

Co-authored-by: mrkaye97 <mrkaye97@gmail.com>
2025-07-31 18:18:15 -04:00
Mohammed Nafees e6c50ca1a0 Allow member roles to be changed by owners and admins (#2044)
* allow member roles to be changed by owners and admins

* PR comments

* chore: gen

* fix: rm changes to /next/

* chore: gen

---------

Co-authored-by: mrkaye97 <mrkaye97@gmail.com>
2025-07-30 17:42:34 -04:00
matt 392483c5d8 Fix: Weekly partition dropping (#2066)
* fix: check weekly partitions older than a week old

* fix: logic

* chore: gen
2025-07-30 16:28:08 -04:00
matt d6f8be2c0f Feat: OLAP Table for CEL Eval Failures (#2012)
* feat: add table, wire up partitioning

* feat: wire failures into the OLAP db from rabbit

* feat: bubble failures up to controller

* fix: naming

* fix: hack around enum type

* fix: typo

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix: typos

* fix: migration name

* feat: log debug failure

* feat: pub message from debug endpoint to log failure

* fix: error handling

* fix: use ingestor

* fix: olap suffix

* fix: pass source through

* fix: dont log ingest failure

* fix: rm debug as enum opt

* chore: gen

* Feat: Webhooks (#1978)

* feat: migration + go gen

* feat: non unique source name

* feat: api types

* fix: rm cruft

* feat: initial api for webhooks

* feat: handle encryption of incoming keys

* fix: nil pointer errors

* fix: import

* feat: add endpoint for incoming webhooks

* fix: naming

* feat: start wiring up basic auth

* feat: wire up cel event parsing

* feat: implement authentication

* fix: hack for plain text content

* feat: add source to enum

* feat: add source name enum

* feat: db source name enum fix

* fix: use source name enums

* feat: nest sources

* feat: first pass at stripe

* fix: clean up source name passing

* fix: use unique name for webhook

* feat: populator test

* fix: null values

* fix: ordering

* fix: rm unnecessary index

* fix: validation

* feat: validation on create

* fix: lint

* fix: naming

* feat: wire triggering webhook name through to events table

* feat: cleanup + python gen + e2e test for basic auth

* feat: query to insert webhook validation errors

* refactor: auth handler

* fix: naming

* refactor: validation errors, part II

* feat: wire up writes through olap

* fix: linting, fallthrough case

* fix: validation

* feat: tests for failure cases for basic auth

* feat: expand tests

* fix: correctly return 404 out of task getter

* chore: generated stuff

* fix: rm cruft

* fix: longer sleep

* debug: print name + events to logs

* feat: limit to N

* feat: add limit env var

* debug: ci test

* fix: apply namespaces to keys

* fix: namespacing, part ii

* fix: sdk config

* fix: handle prefixing

* feat: handle partitioning logic

* chore: gen

* feat: add webhook limit

* feat: wire up limits

* fix: gen

* fix: reverse order of generic fallthrough

* fix: comment for potential unexpected behavior

* fix: add check constraints, improve error handling

* chore: gen

* chore: gen

* fix: improve naming

* feat: scaffold webhooks page

* feat: sidebar

* feat: first pass at page

* feat: improve feedback on UI

* feat: initial work on create modal

* feat: change default to basic

* fix: openapi spec discriminated union

* fix: go side

* feat: start wiring up placeholders for stripe and github

* feat: pre-populated fields for Stripe + Github

* feat: add name section

* feat: copy improvements, show URL

* feat: UI cleanup

* fix: check if tenant populator errors

* feat: add comments

* chore: gen again

* fix: default name

* fix: styling

* fix: improve stripe header processing

* feat: docs, part 1

* fix: lint

* fix: migration order

* feat: implement rate limit per-webhook

* feat: comment

* feat: clean up docs

* chore: gen

* fix: migration versions

* fix: olap naming

* fix: partitions

* chore: gen

* feat: store webhook cel eval failures properly

* fix: pk order

* fix: auth tweaks, move fetches out of populator

* fix: pgtype.Text instead of string pointer

* chore: gen

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-30 13:27:38 -04:00
abelanger5 e2af391a9b fix: remove essential pool to prevent bottleneck on heartbeats (#2060) 2025-07-29 17:06:29 -04:00
Mohammed Nafees 793df41ccb Deploy HyperDX locally via docker-compose and add traces to task controller (#2058)
* deploy jaeger locally and add traces to task controller

* use jaeger v2

* add SERVER_OTEL_COLLECTOR_AUTH

* fix PR comments

* fix span name
2025-07-29 16:24:38 +02:00
matt fc374cb8db hotfix: separate statements for delete-then-insert declarative filters (#2053) 2025-07-25 13:46:56 -04:00
abelanger5 c377e75f61 fix: revert lease updates (#2038) 2025-07-22 12:06:58 +02:00
matt 7295254bfa Fix: filter lookup not retaining scope information (#2036)
* fix: filter lookup not retaining scope information

* fix: copilot suggestion

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix: add DISTINCT to filter query

* fix: struct deduping

* feat: test

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-22 11:59:34 +02:00
Mohammed Nafees c26ff03dc0 [hotfix] Fix duration calculation of DAGs and single tasks (#2035)
* fix duration for multi tenant

* use external ids

* fix lint
2025-07-21 20:41:23 +02:00
abelanger5 467c6197ba fix: many updates on the lease table (#2034) 2025-07-21 20:40:38 +02:00
matt 3dcd6059c8 Fix: Partition pruning for PreflightCheckTasksForReplay (#2029)
* feat: partition pruning for PreflightCheckTasksForReplay

* fix: use 1d as placeholder

* fix: use current time instead

* fix: pass inserted ats through correctly

* fix: try adding a CTE

* fix: query cleanup
2025-07-21 20:30:59 +02:00