Commit Graph

160 Commits

Author SHA1 Message Date
Gabe Ruttner
1cd9660835 rip: remove unneeded durable event log update (#3186)
* rip: update

* refactor: start cleaning up proto defs

* refactor: finish cleaning up proto definitions

* fix: rm the kind

* refactor: rewire the server to have different methods for the different paths

* refactor: more intermediate work

* fix: variables

* fix: wire the kind through

* refactor: get it to compile

* chore: start fixing python

* fix: first pass at fixing python

* chore: rm unused sql

* fix: rm invocation count, rework some logic

* fix: alias (why does sqlc not catch this)

* fix: panics

* fix: add faster timeout to durable spawn test

* fix: task id bug

* refactor: more cleanup of types

* refactor: rm stale entries logic

* fix: rework getOrCreate logic

* fix: clean up a bunch more unneeded stuff

* fix: bug

* fix: python code

* fix: dag matches bug

* fix: add parent to dag to make it more broken, add timeout

* fix: more involved tests

* fix: dag waits

* fix: tests

---------

Co-authored-by: mrkaye97 <mrkaye97@gmail.com>
2026-03-07 04:02:48 -08:00
matt
1bffb66bb3 Feat: Branching off branches (#3150)
* feat: add branch point table

* chore: gen

* feat: id for ordering

* feat: check for `isBranchPoint` and handle branching

* feat: wire up branching over the api

* fix: api, gen

* fix: gen

* fix: branch

* feat: remove duped parent node and branch ids

* fix: rm branch count, dupe of latest id

* refactor: resolve naming

* fix: base case

* fix: test, + it's literally always a caching issue omg

* fix: docs

* chore: lint

* refactor: make branch resolution more efficient

* feat: stable sort, add a bunch of tests

* fix: confusing naming

* fix: naming

* chore: gen

* fix: update

* fix: failing test

---------

Co-authored-by: Gabe Ruttner <gabriel.ruttner@gmail.com>
2026-03-05 16:31:01 -05:00
Gabe Ruttner
65e44d6f63 feat: OLAP status priority functions and query updates (#3156)
* feat: OLAP status priority functions and query updates

- Add v1_status_priority / v1_status_from_priority for v1_readable_status_olap
- Use priority-based aggregation in OLAP task status update queries (EVICTED
  below terminal statuses)
- Migration v1_0_84 and schema v1-olap.sql

Made-with: Cursor

* test: durable eviction tests — replay, cancel after eviction, restore idempotency

- test_eviction_plus_replay, test_evictable_cancel_after_eviction, test_restore_idempotency
- Tighter poll interval for faster test runs

Made-with: Cursor

* fix: flakes

* feedback
2026-03-04 11:11:09 -08:00
mrkaye97
e29459f58a chore: merge main 2026-03-04 13:38:37 -05:00
matt
6c29e48204 Feat: Dynamic worker label assign (#3137)
* feat: initial wiring work on desired labels

* feat: initial wiring

* chore: gen python

* fix: use the whole desired label thing instead

* fix: more wiring, improve types

* fix: sql type

* fix: len check

* chore: gen python

* fix: initial plural label work

* fix: store the labels properly on the task

* fix: skip cache on override

* fix: bug

* fix: scoping bug whoops

* chore: lint

* fix: send labels back over the api correctly

* feat: python test

* fix: lint

* fix: comment

* fix: override

* fix: namespaces, ugh

* fix: no need for error here

* chore: version

* feat: ruby, go, ts

* feat: versions

* fix: appease the rubocop

* chore: lint

* chore: bundle install

* fix: tests

* chore: lint

* chore: lint more

* fix: ts test

* fix: rb

* chore: gen

* chore: reset gemfile

* chore: reset changelog

* fix: pgroup

* fix: tests, part i

* Revert "chore: reset changelog"

This reverts commit b63bf7d3e5.

* Revert "chore: reset gemfile"

This reverts commit bb848bb6f0.

* fix: go -> golang mapping hack

* fix: go enums

* fix: appease the cop

* fix: namespace

* chore: gen
2026-03-04 11:03:58 -05:00
Gabe Ruttner
2d57b6793a Feat durable olap refactor (#3115)
* chore: lint

* feat: counting and partitioning

* feat: add reason field to DurableTaskEvictInvocationRequest and update eviction handling

* fix: eviction durable execution race

* chore: generate

* refactor: simplified migration

* refactor: address review

* refactor: analyze parent tables in migration

* fix: migration

* fix: remove no txn

* fix: one statement

* fix: we do in fact need no transaction

* add down/up/down to the online migration test

* fix: or multiple statements

* fix: two migrations...

* chore: rm old migration

* chore: generate

* chore: feedback

* fix: idempotent migration

* refactor: update assertions in durable tests and clean up imports in cache.py

* revert: migration

* chore: wrap down
2026-02-27 10:58:01 -08:00
Gabe Ruttner
daff28dbfe Feat: durable eviction take 2 (#3075)
* feat: simplified eviction feature

* fix: assign new worker id

* test: shorter sleep

* fix: completion race on same worker

* chore: address todo

* chore: lint

* chore: generate

* fix: n+1 queries

* refactor: WasEvicted bool

* feat: evicted state

* chore: generate

* fix: map status

* fix: update PendingCallback structure to include InvocationCount

* revert: comment

* feat: add support for EVICTED status in waterfall component and metrics display

* fix: implicit eviction

* chore: readable cte

* refactor: queued bool

* refactor: rename eviction_policy

* fix: aio only

* chore: example return type

* fix: map

* feat: eviction error cases

* refactor: change external ID maps to use UUID type

* chore: feedback, cleanup

* tests: additional cases

* chore: generate

* chore: lint

* chore: lint generate

* chore: clean up comments to make matt happy

* refactor: more feedback

* chore: add TODO for worker state reconciliation and clean up comments in eviction policy

* tests: fix

* chore: gen

* test: increase ruby timeout...

* fix: invocation count

* fix: test cases

* fix: stale log entry

* chore: lint

* revert: durable tests to use time.time

* chore: lint
2026-02-27 09:25:50 -08:00
mrkaye97
736ecaa3c0 Merge branch 'main' into feat-durable-execution 2026-02-24 13:26:48 -08:00
Mohammed Nafees
9a063f198d Add missing primary key to "WorkflowTriggerCronRef" (#3086)
* add constraint and migration

* comment
2026-02-23 21:05:28 +01:00
matt
6f3f6e08ac Feat: Replay as new (or from a node) (#3055)
* feat: new messages for reset

* chore: gen python

* feat: reset scaffolding

* feat: initial work

* feat: initial e2e wiring of resetting from a specific node

* fix: add branch to pk

* fix: wire up branches

* fix: add branch to awaited entry

* feat: start wiring up reset api

* fix: colname

* fix: add branch id more places

* fix: some bugs

* fix: replay

* fix: replay, simplify

* feat: add parent branch id

* fix: start reworking parent nodes and branches

* fix: parent branch wiring

* fix: start fixing some bugs

* fix: parent branch bug

* fix: advisory lock for locking the log file to prevent concurrent modification

* fix: move claude.md ignore path

* fix: remove eager replays of events

* fix: rm cruft

* fix: cleanup more params and such

* fix: return type

* fix: comment

* fix: comments

* fix: comment

* chore: gen

* chore: gen

* fix: decrease sleep time

* chore: gen again

* fix: add invocation count on event log entries, make it int32, fix toInt

* fix: more wiring

* chore: gen, simplify

* fix: lint

* fix: more zero values, I hate Go

* feat: add `is_durable` to v1_task

* feat: initial work wiring up dispatcher to increment log entry invocation counts

* feat: wire up assigned action

* fix: property

* fix: send is durable through to the engine

* fix: more invoc count wiring

* fix: node resetting

* fix: revert

* fix: import

* chore: gen

* fix: reset -> fork

* fix: rm a bunch of dead code

* fix: api

* fix: repo method

* fix: log file locking using `FOR UPDATE` + atomic compare-and-set update

* fix: move to shared repo

* feat: increment invocation count on the scheduler

* fix: naming

* fix: make test more reliable

* fix: props

* fix: node id reset
2026-02-20 13:01:46 -05:00
mrkaye97
f8e787cd89 Merge branch 'main' into feat-durable-execution 2026-02-19 19:35:40 -05:00
Gabe Ruttner
8c9fa7fd82 feat: add migration for worker slot config index (#3062) 2026-02-19 12:25:36 -08:00
matt
7e3e3b8fc0 Feat: Non-determinism errors (#3041)
* fix: retrieve payloads in bulk

* fix: hash -> idempotency key

* feat: initial hashing work

* feat: check idempotency key if entry exists

* fix: panic

* feat: initial work on custom error for non-determinism

* fix: handle nondeterminism error properly

* feat: add error response, pub message to task controller

* chore: lint

* feat: add node id field to error proto

* chore: rm a bunch of unhelpful cancellation logs

* fix: conflict issues

* fix: rm another log

* fix: send node id properly

* fix: improve what we hash

* fix: improve error handling

* fix: python issues

* fix: don't hash or group id

* fix: rm print

* feat: add python test

* fix: add timeout

* fix: improve handling of non determinism error

* fix: propagate node id through

* fix: types, test

* fix: make serializable

* fix: no need to cancel internally anymore

* fix: hide another internal log

* fix: add link to docs

* fix: copilot

* fix: use sha256

* fix: test cleanup

* fix: add error type enum

* fix: handle exceptions on the worker

* fix: clean up a bunch of cursor imports

* fix: cursor docstring formatting

* fix: simplify idempotency key func

* fix: add back cancellation logs

* feat: tests for idempotency keys

* fix: add a couple more for priority and metadata

* chore: gen

* fix: python reconnect

* fix: noisy error

* fix: improve log

* fix: don't run durable listener if no durable tasks are registered

* fix: non-null idempotency keys
2026-02-18 11:27:02 -05:00
matt
eaf6bba824 Refactor: Remove separate callback table (#3045)
* fix: remove callback table

* fix: type

* fix: type

* fix: wiring everything up

* fix: result payload for replays

* chore: lint

* feat: set fillfactor

* chore: gen

* fix: simplify v1 match changes

* fix: simplify v1 match wiring

* fix: rm print line

* fix: some more wiring

* fix: wiring

* chore: comments

* chore: gen, proto naming

* chore: comments

* fix: rm comment

* fix: broken listener
2026-02-17 13:25:08 -05:00
mrkaye97
7ad5bfdf89 Merge branch 'main' into feat-durable-execution 2026-02-17 08:45:41 -05:00
Gabe Ruttner
2fdc47a6af feat: multiple slot types (#2927)
* feat: adds support for multiple slot types, primarily motivated by durable slots

---------

Co-authored-by: mrkaye97 <mrkaye97@gmail.com>
2026-02-17 05:43:47 -08:00
matt
05399ebf39 Feat: Durable event log wiring (#2956)
* feat: initial protos

* chore: lint

* fix: work on improving naming

* chore: rename session id to invocation count

* feat: scaffold implementation of durabletask rpc

* fix: one more session rename

* feat: initial work on the server scaffolding

* chore: gen protos for python

* feat: initial durable task client

* feat: initial durable context work for python

* fix: pass client through to runner

* fix: clean up type checking errors

* fix: cruft

* feat: initial work wiring up durable events

* fix: get -> getorcreate

* feat: query + wiring for updating latest node id

* fix: simplify, bump latest node ids in the same query

* chore: note

* feat: wire up sleeps with internal signal matches

* chore: gen

* fix: callback data writes

* feat: cache previous events

* fix: wire up external id writes

* feat: got sleeps sorta working!

* fix: tenant and external id wiring

* chore: comments

* fix: clean up some types a bit

* feat: add run triggering params to proto to allow for spawning children

* feat: first pass at child spawning

* feat: start wiring up child spawning

* fix: use `triggerWriter` for spawn

* feat: update trigger proto def

* chore: regen python

* feat: start wiring up spawning correctly with all opts

* refactor: share trigger code

* chore: remove log lines, lint

* fix: add triggered run external id

* feat: start wiring up child key storage better

* chore: gen again

* fix: gen, colname

* fix: trigger opts panicking

* hack: get things working for now

* feat: shared rpc message

* chore: fix imports

* feat: add tenant id to tables

* fix: improve ingest logic

* refactor: shared trigger opt type

* fix: send tenant id through everywhere

* chore: fix log file insert on conflict

* fix: repo

* fix: generate external id upstream

* feat: add columns to the match

* feat: first pass at durable waits on the controllers instead of the dispatcher

* fix: types

* feat: wire up callbacks

* fix: invoc counts

* fix: typing, lint

* driveby: more constants for message ids

* refactor: struct for callback keys everywhere

* fix: bugs, passing tests

* fix: return errnorows

* fix: schema

* fix: remove current callback flow

* feat: new message types

* fix: remove key from callback model

* fix: rm unused queries

* refactor: start reworking flow

* fix: start working on feedback

* fix: query

* fix: wire up external ids

* revert: drive by

* refactor: rm extra interface

* chore: move listener, lint

* refactor: remove old listener, rename

* refactor: consolidate migrations

* fix: immediately send already-satisfied callbacks

* fix: union

* chore: rm unused queries

* fix: check if entry already exists before re-spawning / signaling

* fix: node id incrementation

* fix: rm json dump

* fix: don't pass node id

* fix: store latest invocation, update query

* fix: upsert logic

* Revert "fix: upsert logic"

This reverts commit cf7c609c1d.

* fix: change logic slightly

* fix: split up get and create queries

* fix: err

* fix: pass node ids around properly

* fix: invocation handling

* fix: callback bug

* fix: naming

* fix: rm cruft method, dynamic kind

* fix: wire up memo payload and kind stuff

* fix: propagate trigger opts

* fix: child spawn signaling + olap wiring

* fix: extract output method

* feat: improve test coverage a bit

* fix: child spawning

* feat: another test

* fix: query fixes, overwrite

* fix: match bug

* fix: proto indexes, regen

* fix: eviction comment

* fix: warning for non-async durable tasks

* fix: rm contracts import

* fix: basic locking, rm sync durable tasks

* fix: invocation counts, etc.

* chore: add fixme

* fix: rm unused invocation count param from callback response

* fix: rm dispatcher id from the callback

* fix: di test

* Revert "fix: rm dispatcher id from the callback"

This reverts commit 26e6c82797.

* fix: migration

* fix: use optimistictx

* fix: lift grpc codes out of trigger repo

* fix: span names

* fix: rm comment

* fix: consolidate kind types, batching, not-null kinds

* fix: null bug

* fix: satisfied claim bug, simplify queries

* fix: add back payload storage

* fix: match bug, simplification

* fix: factor out trigger opts to the dispatcher level

* fix: factor out conditions

* fix: rm unused structs

* fix: rm dupes

* fix: migration

* refactor: switch case helpers

* fix: panic

* fix: couple warnings

* fix: lint

* fix: generate external ids properly

* refactor: return trigger task data from helper

* fix: handle matches correctly for dag spawns

* fix: add validators, one more uuid type

* chore: gen

* chore: bump pytest-asyncio to latest

* fix: store the worker instead of the dispatcher, then look up the dispatcher

* fix: store dispatcher id on the worker

* chore: lint
2026-02-16 12:23:58 -05:00
mrkaye97
eaac2b09fb Merge branch 'main' into feat-durable-execution 2026-02-16 07:59:50 -05:00
Gabe Ruttner
7875d78057 Feat: Official Ruby SDK (#3004)
* feat: initial ruby sdk

* fix: run listener

* fix: scope

* feat: rest feature clients

* fix: bugs

* fix: concurrent register

* fix: tests and ergonomics

* docs: all of them

* chore: lint

* feat: add RBS

* feat: add GitHub Actions workflow for Ruby SDK with linting, testing, and publishing steps

* chore: lint

* refactor: simplify load path setup for Hatchet REST client and remove symlink creation

* fix: cert path

* fix: test

* fix: blocking

* fix: ensure Hatchet client is only initialized once across examples

* fix: tests

* remove: unused example

* fix: bubble up errors

* test: skip flaky for now

* remove: lifespans

* fix: durable context bugs

* fix: bulk replay

* fix: tests

* cleanup: generate tooling

* fix: integration test

* chore: lint

* release: 0.1.0

* chore: remove python comments

* refactor: remove OpenTelemetry configuration and related unused options

* fix: default no healthcheck

* chore: lockfile

* feat: register as ruby

* chore: lint

* chore: update py/ts apis to include ruby

* chore: docs pass

* chore: lint

* chore: generate

* chore: cleanup

* chore: generate examples

* tests: add e2e tests

* tests: cache examples dependencies

* fix: namespace

* fix: namespace

* fix: namespaces

* chore: lint

* fix: improve cancellation workflow polling logic and add error handling

* revert: py/ts versions
2026-02-15 14:32:15 -08:00
mrkaye97
740af439ce Merge branch 'main' into feat-durable-execution 2026-02-12 09:27:40 -05:00
Mohammed Nafees
4fd7b94751 Add support for Svix webhooks (#2996)
* support Svix webhooks

* add migration

* use http status codes

* comment fix

* custom svix verification logic

* copilot comments

* copilot comments
2026-02-11 16:41:36 +01:00
mrkaye97
3faa5adfc7 fix: id and inserted at on fields that need payloads 2026-02-06 16:38:58 -05:00
matt
7d66b3806b Feat: Durable event log models (#2940)
* feat: add new models

* feat: migration

* chore: generate

* fix: linter

* fix: couple types

* Feat: Initial work on CRUD operations for durable events (#2943)

* feat: initial query work

* feat: first pass at durable events repo + queries

* feat: add new payload type for durable event data

* chore: gen

* fix: payload key

* fix: lint
2026-02-05 13:58:19 -05:00
Jishnu
ed43cae0a2 feat: Extend webhook support for scope_expression and payload (#2874)
* add: scope_expression and payload columns for v1_webhook

* refactor: insert or update sql cmds for v1_webhook

* feat: update api clients, openapi schema for new webhook body

* refactor: receiver and transformer for v1 webhook

* add: python sdk changes

* feat: ts sdk changes

* feat: add FE for webhook new params

* fix: scope expression empty payload

* add: support for scope and payload for go client

* fix: lint

* fix: error message UI on webhook

* fix: lint

* fix: migration conflict, build failure

* fix: error handling

* update docs, add tests

* fix: lint, test file name
2026-02-04 12:44:52 -05:00
abelanger5
d56dee4266 feat: durable user event log (#2861)
* placeholder

* feat: db tables for user events (#2862)

* feat: db tables for user events

* move event payloads to payloads table, fix env var loading

* fix: address pr review comments

* missed save

* feat: optimistic scheduling (#2867)

* feat: db tables for user events

* move event payloads to payloads table, fix env var loading

* refactor: small changes to prepare optimistic txs

* feat: optimistic scheduling

* address pr review comments

* rm comments

* fix: rampup test race condition

* fix: goleak

* feat: grpc-side triggers

* fix: config and sem logic

* fix: respect optimistic scheduling env var

* add optimistic to testing matrix, remove pg-only mode

* fix cleanup of pubbuffers

* merge migrations

* last testing fixes
2026-02-02 18:04:02 -05:00
matt
a3fe89ef03 Feat: Workflow input JSON schema in trigger preview (#2851)
* feat: add input json schema to workflow version

* feat: add json schema to putworkflow proto

* feat: wire up writes of the json schema

* chore: gen python

* feat: send json schema from the python code

* feat: wiring

* feat: pass json schema into code editor

* feat: pass prop

* fix: clean up validation stuff

* feat: allow zod `input` as optional ts field

* fix: try except logic hack

* fix: rename input -> inputValidator for consistency

* chore: gen api

* fix: improve hack slightly

* chore: changelogs, versions

* feat: zod example

* chore: rework api a bit

* fix: tsc, allow schema to update

* fix: improve loading state

* fix: api cleanup, sqlc cleanup

* fix: initial mount

* chore: lint

* chore: lint

* chore: tsc

* fix: lint

* fix: unwind unneeded change

* [Python] Feat: Default additional metadata (#2876)

* Add doc about autoscaling workers (#2864)

* add doc for autoscaling workers

* oldest also in running stats

* chore(deps): bump google.golang.org/api from 0.262.0 to 0.263.0 (#2869)

Bumps [google.golang.org/api](https://github.com/googleapis/google-api-go-client) from 0.262.0 to 0.263.0.
- [Release notes](https://github.com/googleapis/google-api-go-client/releases)
- [Changelog](https://github.com/googleapis/google-api-go-client/blob/main/CHANGES.md)
- [Commits](https://github.com/googleapis/google-api-go-client/compare/v0.262.0...v0.263.0)

---
updated-dependencies:
- dependency-name: google.golang.org/api
  dependency-version: 0.263.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps): bump github.com/getsentry/sentry-go from 0.41.0 to 0.42.0 (#2870)

Bumps [github.com/getsentry/sentry-go](https://github.com/getsentry/sentry-go) from 0.41.0 to 0.42.0.
- [Release notes](https://github.com/getsentry/sentry-go/releases)
- [Changelog](https://github.com/getsentry/sentry-go/blob/master/CHANGELOG.md)
- [Commits](https://github.com/getsentry/sentry-go/compare/v0.41.0...v0.42.0)

---
updated-dependencies:
- dependency-name: github.com/getsentry/sentry-go
  dependency-version: 0.42.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps): bump hatchet-sdk in /examples/python/quickstart (#2871)

Bumps hatchet-sdk from 1.22.10 to 1.22.11.

---
updated-dependencies:
- dependency-name: hatchet-sdk
  dependency-version: 1.22.11
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* feat: default additional meta

* feat: wiring

* chore: changelog, version

* fix: copy

* feat: add default meta to stubs

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Mohammed Nafees <hello@mnafees.me>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore: migration ver

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Mohammed Nafees <hello@mnafees.me>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-29 11:38:25 -05:00
Gabe Ruttner
b7ec0bc270 fix: big int alignment for cleanup function (#2877)
* fix function

* Update cmd/hatchet-migrate/migrate/migrations/20260128120000_v1_0_71.sql

Co-authored-by: matt <mrkaye97@gmail.com>

* leeeeent

---------

Co-authored-by: matt <mrkaye97@gmail.com>
2026-01-29 05:19:39 -08:00
Mohammed Nafees
6eba6fa91f Billing changes (#2643)
* make changes for billing

* progress around redesign

* meter callback

* modify limits

* upcoming subscription

* fix lint

* fix payment methods

* fix build

* PR comments

* address PR comments

* update cloud contracts

* fix migration name

* fix json serialization error

* loader and fixed for managed compute

* PR comments

* upgrade Go version

* fix migration name

* fix CI

* fix lint CI

* golangci-lint fix

* dedicated subscription
2026-01-19 12:15:11 +01:00
abelanger5
6920ec1b61 revert: UI version removal (#2756)
* revert: UI version removal

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-01-06 14:16:42 -05:00
abelanger5
dd9c36c315 refactor: remove v0 paths from codebase (#2728)
* refactor: remove v0 paths from codebase

* remove uiVersion references
2025-12-30 09:57:00 -05:00
matt
735742c466 Revert "Revert "chore: run list query optimizations (#2670)" (#2708)" (#2720)
This reverts commit 2f301e55cf.
2025-12-26 10:11:02 -07:00
Mohammed Nafees
58758d35b2 Publish COULD_NOT_SEND_TO_WORKER OLAP event due to worker backlog (#2710)
* could not send to worker OLAP event

* fix lint and PR comments

* submodule GHA

* remove submodule

* no gitsubmodule

* fix migration

* revert sdk workflows

* revert sdk workflows

* revert sdk workflows
2025-12-26 09:35:15 -07:00
matt
2f301e55cf Revert "chore: run list query optimizations (#2670)" (#2708)
This reverts commit 87b57febe8.
2025-12-23 17:10:47 -05:00
matt
b65c6de53f Feat: Hatchet Metrics Monitoring, I (#2699)
* Revert "Revert "Feat: Hatchet Metrics Monitoring, I (#2480)" (#2698)"

This reverts commit b87150767a.

* go mod tidy

---------

Co-authored-by: Mohammed Nafees <hello@mnafees.me>
2025-12-23 20:14:14 +01:00
Gabe Ruttner
87b57febe8 chore: run list query optimizations (#2670)
* add missing tenant index

* fix span name

* parallelize

* instrument tenant id attribute

* feedback

* cleanup migrations

* rename migration

* correct version

* cleanup
2025-12-23 08:59:13 -08:00
matt
bc7f341d13 Fix: Dynamically-sized chunks on payload read (#2700)
* feat: function for pulling chunk size in bytes

* chore: schema

* feat: queries

* chore: rework function

* feat: impl

* fix: ptr

* fix: ptr

* feat: olap side

* fix: handle nulls
2025-12-22 18:28:04 -05:00
matt
b87150767a Revert "Feat: Hatchet Metrics Monitoring, I (#2480)" (#2698)
This reverts commit fdc075ec6f.
2025-12-22 16:26:14 -05:00
matt
fdc075ec6f Feat: Hatchet Metrics Monitoring, I (#2480)
* feat: queries + task methods for oldest running task and oldest task

* feat: worker slot and sdk metrics

* feat: wal metrics

* repository stub

* feat: add meter provider thingy

* pg queries

* fix: add task

* feat: repo methods for worker metrics

* feat: active workers query, fix where clauses

* fix: aliasing

* fix: sql, cleanup

* chore: cast

* feat: olap queries

* feat: olap queries

* feat: finish wiring up olap status update metrics

* chore: lint

* chore: lint

* fix: dupes, other code review comments

* send metrics to OTel collector

* last autovac

* flag

* logging updates

* address PR comments

---------

Co-authored-by: gabriel ruttner <gabriel.ruttner@gmail.com>
Co-authored-by: Mohammed Nafees <hello@mnafees.me>
2025-12-23 01:04:02 +05:30
matt
a4e7584c18 Fix: Last bits of payload job cleanup (#2690)
* feat: offset cleanup queries

* feat: wire it up

* fix: extend lease interval

* feat: new function for diffing rows

* feat: queries

* fix: signature, start wiring up

* feat: wire up diffing logic at the end

* chore: longer lease

* chore: fix type

* fix: couple bugs

* fix: columns

* fix: dupe

* chore: rm unused function

* fix: err

* fix: reduce memory footprint by a bunch
2025-12-22 12:43:08 -05:00
abelanger5
fe6583fc41 fix: rare cases of duplicate writes causing stuck updates (#2681)
* fix: rare cases of duplicate writes causing stuck updates

* update to go 1.25

* fix: add version to sdk-go.yml
2025-12-18 13:05:01 -05:00
matt
2226a3eaa4 Chore: Remove payload WAL + corresponding tables 🥳 (#2645)
* chore: nuke the wal and cutover qi tables

* fix: migration name
2025-12-17 11:27:59 -05:00
matt
bf849d415c Fix: Payload List Index Performance (#2669)
* feat: update core func

* feat: migration to update function

* fix: pass batch size through

* fix: limit

* feat: materialized cte

* chore: comment
2025-12-16 12:35:29 -05:00
matt
23db2a4fac Fix: Pagination by bounds (#2654)
* fix: pagination missing rows

* fix: separate functions

* fix: return both bounds from query

* fix: wiring

* fix: func

* fix: order col

* fix: bug

* fix: math is hard

* fix: more math

* fix: math and math and math

* fix: slightly more math

* fix: placeholders 🤦

* fix: where clause

* fix: math!

* fix: schema

* refactor: try with `CEIL`

* fix: mathin up a storm

* fix: I was actually a math major in college, who knew

* fix: copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix: copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-12-15 13:07:51 -05:00
matt
0a947924fa Feat: Parallelize replication from PG -> External (#2637)
* feat: chunking query

* feat: first pass at range chunking

* fix: bug bashing

* fix: function geq

* fix: use maps.Copy

* fix: olap func

* feat: olap side

* refactor: external id

* fix: order by

* feat: wire up env vars

* fix: pass var through

* fix: naming

* fix: append to returnErr properly

* fix: use eg.Go
2025-12-10 17:11:03 -05:00
matt
3ff672ebe4 Fix: Don't reset offset if a new process acquires lease (#2628)
* fix: don't reset offset if a new process acquires the lease

* fix: copy paste

* feat: migration, fix queries

* fix: more queries

* fix: down migration

* fix: comment

* feat: finish wiring up everything else

* fix: placeholder initial type

* fix: zero values everywhere

* fix: param ordering

* fix: handle no rows

* fix: zero values

* fix: limit

* fix: simplify

* fix: better defaults
2025-12-09 19:01:51 -05:00
matt
9e14814acb Feat: OLAP Payload Cutover Job (#2618)
* feat: migration

* feat: queries

* feat: overwrite queries

* fix: bug

* feat: first pass

* fix: more olap job wiring

* fix: signature

* fix: refs to a bunch of funcs

* feat: job

* fix: table names

* fix: span name

* chore: lint

* fix: redundant error check

* fix: naming

* fix: handle nil external id

* fix: order payload partitions descending

* fix: param for limiting which partitions get processed

* fix: olap
2025-12-09 12:33:07 -05:00
matt
7e48ac7d02 Fix: Leasing for payload job (#2609)
* refactor: acquire a lease instead of an advisory lock

* refactor: partition dates

* fix: single query to acquire / extend

* fix: explicit alias

* fix: unwind

* fix: where clause

* fix: handle no rows

* fix: lease bug

* fix: rm debug

* fix: comment for clarity

* fix: syntax that doesn't actually matter

* fix: error
2025-12-05 13:55:59 -05:00
matt
18940869ae Feat: Job for payload cutovers to external (#2586)
* feat: initial payload cutover job

* refactor: fix a couple things

* feat: start wiring up writes

* feat: only run job if external store is enabled

* fix: add some notes, add loop

* feat: function for reading out payloads

* fix: date handling, logging

* feat: remove wal and immediate offloads

* feat: advisory lock

* feat: partition swap logic

* fix: rm debug

* fix: add todo

* fix: sql cleanup

* fix: sql cleanup, ii

* chore: nuke a bunch of WAL stuff

* chore: more wal

* feat: trigger for crud opts

* feat: drop trigger + function in swapover

* feat: move autovac to later

* feat: use unlogged table initially

* feat: update migration

* fix: drop trigger

* fix: use insert + on conflict

* fix: types

* refactor: clean up a bit

* fix: panic

* fix: detach partition before dropping

* feat: configurable batch size

* feat: offset tracking in the db

* feat: explicitly lock

* fix: down migration

* fix: bug

* fix: offset handling

* fix: try explicit ordering of the insert

* fix: lock location

* fix: do less stuff after locking

* fix: ordering

* fix: dont drop and recreate if temp table exists

* fix: explicitly track completed status

* fix: table name

* fix: dont use unlogged table

* fix: rm todos

* chore: lint

* feat: configurable delay

* fix: use date as pk instead of varchar

* fix: daily job

* fix: hack check constraint to speed up partition attach

* fix: syntax

* fix: syntax

* fix: drop constraint after attaching

* fix: syntax

* fix: drop triggers properly

* fix: factor out insert logic

* refactor: factor out loop logic

* refactor: factor out job preparation work

* fix: ordering

* fix: run the job more often

* fix: use `WithSingletonMode`

* fix: singleton mode sig

* fix: env var cleanup

* fix: overwrite sig

* fix: re-enable immediate offloads with a flag

* fix: order, offload at logic

* feat: add count query to compare

* fix: row-level triggers, partition time bug

* fix: rm todo

* fix: for true

* fix: handle lock not acquired

* fix: handle error

* fix: comment
2025-12-05 10:54:26 -05:00
Mohammed Nafees
8842a2a9cf Case on conflict for v1_statuses_olap entry (#2528)
* case on conflict for v1_statuses_olap

* fix sql
2025-11-14 17:18:35 +01:00
abelanger5
e1fdeeaf1c fix: payload performance (#2441)
* change some olap flush settings

* increase timeouts for payload wal

* fix: improve performance of payload wal metrics

* slight updates

* more small tweaks

* undo some olap changes, don't offload some payloads

* remove double reads

* try reducing wal poll limit

* analyze v1_dag

* move partition method
2025-10-23 17:45:49 -04:00