Commit Graph

68 Commits

Author SHA1 Message Date
abelanger5 bfb11cac51 fix: always use retention on queues, optional data/worker (#916) 2024-09-27 14:23:14 -04:00
abelanger5 d23e5d9963 feat: expression-based concurrency keys (#889)
* feat: expression-based concurrency keys

* fix: build

* fix: typos

* fix: gen

* fix: migration

* fix: remove print statements

* fix: reassignment bugs, retries on closed transport, pr review
2024-09-19 10:32:22 -04:00
abelanger5 b5014f6b3d chore: more visibility and debug lines for queues (#836)
* chore: more visibility and debug options for queues

* better debug lines on queue repo

* don't log so much in load test
2024-08-29 14:49:24 -04:00
abelanger5 263eaf069b feat: pass otel through msgqueue (#802)
* feat: pass otel through msgqueue

* feat: more spans on scheduling

* otel increase batch size
2024-08-28 14:45:02 +00:00
abelanger5 6317f86793 refactor: consolidate partition logic (#826)
* refactor: consolidate partition logic

* fix: race on scheduler

* fix: move partition uuid to db query

* fix: generate
2024-08-27 15:28:53 -04:00
abelanger5 2b9121d295 fix: long token expiry for k8s quickstart (#808)
* debug overwrite

* fix: k8s 100 year expiry
2024-08-23 07:43:02 -04:00
abelanger5 93438ce09d feat: adds a k8s helper for easy k8s installation (#806)
* feat: k8s helper script for generating env

* chore: bump 1.21 -> 1.22

* feat: support both secret, configmap and generate api token

* fix: better errors

* fix: upsert logic

* use default seed tenant for token generation
2024-08-22 21:24:16 +00:00
Gabe Ruttner 4ea4712d4d refactor: performance and throughput (#756)
Refactors the queueing logic to balance fairly between actions, with each action backed by a separate FIFO queue. Also adds support for priority queueing and custom queues, though those aren't exposed on the API layer yet. Improves throughput to more than 5000 tasks/second on a single queue.

---------

Co-authored-by: Alexander Belanger <alexander@hatchet.run>
2024-08-12 14:38:47 +00:00
Gabe Ruttner b4670af138 Fix qos otel config (#754)
* feat: otel trace id ratio

* feat: rabbitmq qos

* feat: requeue limit

* fix: tests
2024-07-30 18:11:10 -04:00
Gabe Ruttner b802f9f45f feat: stream by addl meta (#751)
* feat: prop schedule and run

* wip

* fix: filter wfrid

* feat: hangup

* chore: rm debug log

* chore: func name

* fix: cancelled payload

* fix: load

* fix: cleanup the cache

* fix: single proto

* fix: key -> val

* chore: case

* chore: rm dead code

* chore: rm dead code

* feat: go and docs

* fix: docs
2024-07-29 19:09:51 +00:00
Gabe Ruttner ad29edb44f fix: partitioned semaphore resolver (#731)
* fix: partition and improve query

* feat: paginate until done

* chore: address comments

* fix: write partitions
2024-07-18 11:06:25 -04:00
Gabe Ruttner b7cec9ec53 feat: soft delete (#717)
* feat: soft delete workflows and versions

* feat: filter soft deletes wf and wfr

* feat: filter events and step runs

* fix: query

* fix: query

* chore: generate

* wip

* chore: squash migrations

* chore: separate retention into new service

* feat: regularly clean up

* chore: migrations

* fix: tests

* fix: queries

* fix: ambiguous

* fix: refs

* fix: ambiguous id

* fix: remove update from

* fix: soft delete

* fix: cleanup retention scheduler

* fix: has more query

* chore: gen

* fix: query

* fix: table
2024-07-18 09:06:05 -04:00
abelanger5 5d87f380ef feat: managed worker pools (#725)
* change api extension spec to register custom populators

* fix: support only bearer auth

* fix: correct authn logic

* fix: indexes on workflow runs, events

* feat: managed worker pools

* chore: lint fix

* hide workers view when not enabled

* support internal api tokens, minor improvements

* fix: actually write internal

* fix breaking changes

* don't allow revoking internal tokens

* fix: linting and remove metrics view

* fix: token

* address review and add feat flags
2024-07-16 13:33:46 +00:00
abelanger5 8f8f3ad287 fix: reduce max throughput of requeue (#713)
* fix: reduce max throughput of requeue

* fix: reassign query

* fix: move step run timeout to partition model

* fix: partitioning queries and index

* better logs on requeue

* fix: inactive rebalance and get step run for engine query

* fix: correct inactive queries
2024-07-12 14:03:55 -04:00
Gabe Ruttner 8a8a033af6 feat: variable token expiration (#670)
* feat: variable token expiration

* fix: typo

* fix: long tokens for webhook workers

* fix: format
2024-07-01 15:47:32 +00:00
abelanger5 c2debe62d8 fix: add back deprecated service names and fix webhook worker query (#660) 2024-06-27 08:01:02 -04:00
abelanger5 f2c6bc1f44 feat: tenant partitioning (#649)
* feat: tenant partitioning

* fix: rebalance inactive partitions, split into separate partitioner

* fix: shutdown partitioner scheduler properly

* update config options

* fix: config options linting
2024-06-26 21:06:51 +00:00
Gabe Ruttner a8d42819ea feat: check security service (#639)
* feat: check security service

* feat: propagate version

* feat: with ident

* fix: lint

* chore: generate

* fix: change domain

* fix: panic recover

* fix: migrations

* fix: hash

* fix: don't check in tests
2024-06-26 16:26:29 -04:00
abelanger5 d19e299d1e refactor: make engine runnable with config instead of loader (#640)
* refactor: make hatchet-engine runnable programmatically

* feat: export teardown name and fn
2024-06-26 08:14:30 -04:00
Luca Steeb 1490d88954 feat: webhook workers (#542)
Adds serverless support via the concept of webhook workers. Allows any webhook to be registered as a serverless endpoint for executing a step.
2024-06-25 17:06:43 -04:00
abelanger5 5538196169 fix: correct lengths on random.Generate (#638) 2024-06-25 15:12:59 -04:00
Gabe Ruttner 697757879f feat: billing (#624)
* feat: init lago client

* feat: billable meter

* feat: db persistence

* wip: expose sub

* feat: rename page

* wip: billing section

* wip: lago integration

* feat: separate plan and period

* wip: webhook

* feat: improve empty state

* feat: update limits on plan changes

* feat: can change plans

* feat: change plan loading state

* feat: yearly filter

* feat: billing clarification

* fix: treatment

* feat: filter plans

* feat: prevent non-owner from changing plan

* fix: loading state

* fix: jit portal link

* fix: rm import

* fix: build errors

* fix: default to free

* fix: wrong files

* fix: select or insert customer

* fix: note

* feat: upgrade dependent on payment method state

* fix: dedupe

* chore: remove github-app from core

* chore: port to cloud

* chore: port to cloud

* chore: port to cloud

* chore: port to cloud

* chore: port to cloud

* add new components, repository callbacks

* chore: rm unused packages

* chore: fix generation

* chore: gen

* fix: cloud api references

* debug

* debug

* fix: actually set plans

* chore: rm debug

* fix: build

* feat: callbacks

* fix: add generated code

* chore: group cloud components

* chore: group by feature

* feat: alert change

* feat: confirm

* fix: confirm modal

* fix: ui

* fix: remove arrears

* fix: open in same tab

* fix: wan alert

* fix: call the callback

* fix: callback obj

* fix: disable if no cloud meta

---------

Co-authored-by: Alexander Belanger <alexander@hatchet.run>
2024-06-25 13:57:16 -04:00
abelanger5 771054a401 fix: make rabbitmq connection optional and disable for token generation (#635) 2024-06-25 15:53:55 +00:00
Luca Steeb b6dcb4e7e9 refactor(random): refactor random string generation (#633) 2024-06-24 23:44:03 +01:00
abelanger5 7c3ddfca32 feat: api server extensions (#614)
* feat: allow extending the api server

* chore: remove internal packages to pkg

* chore: update db_gen.go

* fix: expose auth

* fix: move logger to pkg

* fix: don't generate gitignore for prisma client

* fix: allow extensions to register their own api spec

* feat: expose pool on server config

* fix: nil pointer exception on empty opts

* fix: run.go file
2024-06-19 09:36:13 -04:00
Gabe Ruttner bbc4e58dd9 feat: limits (#559)
* feat: workflow run limits

* fix: resource exhausted 429

* feat: event limit

* feat: worker limit

* fix: sensible error

* fix: pb

* feat: expose limits api

* feat: default limits

* feat: add enable alert option

* feat: slack and email alerts

* fix: cron interval

* feat: make metered util

* wip: schedules and crons

* chore: squash migration

* fix: select or insert

* fix: remove unfinished meter

* chore: atlas migration

* fix: template format

* fix: shared ErrResourceExhausted

* feat: cache

* fix: limit can be nil

* fix: clarification

* fix: close meter ticker

* fix: friendly error for child workflows
2024-06-07 10:57:57 -07:00
abelanger5 b0b2e26952 feat: hatchet-lite (#560)
* feat: hatchet-lite mvp

* fix: init shadow db

* fix: install atlas

* fix: correct env

* fix: wait for db ready

* fix: remove name flag

* fix: add hatchet-lite to build
2024-06-06 14:03:53 -04:00
Luca Steeb d1a4d35830 chore(pre-commit): lint whitespace (#494)
Adds a whitespace linter to the pre-commit hook to ensure consistent formatting.
It also enables linting of other SQL files, such as those for sqlc queries.
2024-05-16 09:17:01 -04:00
abelanger5 68a79fe071 fix: handle nil input more gracefully (#486) 2024-05-13 13:07:41 -04:00
abelanger5 b50ed62924 feat: alerting from slack and email (#461)
* feat: alerting. implements slack alerting, email, and refactors tenant settings to make them more manageable

* chore: generate

* chore: generate sqlc after migrate
2024-05-08 10:04:58 -04:00
abelanger5 7543a0c2a5 add jobs which always run on failure (#445)
* (wip) prisma schema

* feat: on-failure steps

* chore: address changes from PR review

* chore: bump migration number
2024-05-06 15:39:22 -04:00
abelanger5 4ce1dd8632 feat: multi-workflow runs listener on a single endpoint
* new api-contract for workflow run events

* feat: initial implementation for new subscribe listener

* fix: sync issues and send workflow runs immediately

* refactor: add context to all engine db queries, fix deadlocking query

* fix: use new ctx for deleting dispatcher and ticker

* add cancellation reasons

* fix: docs linting

---------

Co-authored-by: gabriel ruttner <gabriel.ruttner@gmail.com>
2024-04-18 20:55:11 -04:00
abelanger5 e0d363e796 chore: intercept grpc errors and don't send internal to client (#370) 2024-04-10 19:03:18 -04:00
Gabe Ruttner d8b6843dec feat: streaming events (#309)
* feat: add stream event model

* docs: how to work with db models

* feat: put stream event

* chore: rm comments

* feat: add stream resource type

* feat: enqueue stream event

* fix: contracts

* feat: protos

* chore: set properties correctly for typing

* fix: stream example

* chore: rm old example

* fix: async on

* fix: bytea type

* fix: worker

* feat: put stream data

* feat: stream type

* fix: correct queue

* feat: streaming payloads

* fix: cleanup

* fix: validation

* feat: example file streaming

* chore: rm unused query

* fix: tenant check and read only consumer

* fix: check tenant-steprun relation

* Update prisma/schema.prisma

Co-authored-by: abelanger5 <belanger@sas.upenn.edu>

* chore: generate protos

* chore: rename migration

* release: 0.20.0

* feat(go-sdk): implement streaming in go

---------

Co-authored-by: gabriel ruttner <gabe@hatchet.run>
Co-authored-by: abelanger5 <belanger@sas.upenn.edu>
2024-04-01 15:46:21 -04:00
abelanger5 092f54c64f refactor: separate api and engine repositories, change ticker logic (#281)
* refactor: separate api and engine repositories, change ticker logic

* fix: nil error blocks

* fix: run migration on load test

* fix: generate db package in load test

* fix: test.yml

* fix: add pnpm to load test

* fix: don't lock CTEs with columns that don't get updated

* fix: update heartbeat for worker every 4 seconds, not 5

* chore: remove dead code

* chore: update python sdk

* chore: add back telemetry attributes
2024-03-21 14:10:34 -04:00
abelanger5 d9360520de chore: add better telemetry to database (#268)
* chore: add better telemetry to database

* fix: span end on query
2024-03-15 15:08:40 -04:00
Luca Steeb d577b5f34c fix(cli): make server config cleanup properly (#267) 2024-03-14 17:46:01 +07:00
abelanger5 d7e6e4d8c6 fix: worker locking on requeues (#265)
* fix: worker locking on requeues

* chore: add alerter to dispatcher
2024-03-13 21:50:02 -04:00
abelanger5 c66f97c856 fix: deadlocks on workers and tickers (#241)
* chore: add sentry support to engine

* fix: deadlocks on workers and tickers

* refactor: reduce prisma calls in engine

* trigger

* fix: remove some tenant lookups

* feat: dlx and renamed taskqueue -> msgqueue

* refactor: get group key run logic

* fix: retry counts on messages and concurrency edge cases

* fix: rabbitmq integration tests

* feat: add consumer timeouts

---------

Co-authored-by: Luca Steeb <contact@luca-steeb.com>
2024-03-12 00:45:18 -04:00
Luca Steeb c7cdc8aa5d fix(engine/health): listen before serving (#243) 2024-03-08 14:46:47 +07:00
abelanger5 105aa08f3f chore: add sentry support to engine (#237)
* chore: add sentry support to engine

* chore: address PR comments
2024-03-06 11:50:49 -05:00
abelanger5 f256b258d8 feat: logging from step run executions (#217)
* fix: job cancellations with shared ctx

* fix: found the bug

* fix: all job runs were getting cancelled

* feat: support logging from executions

* fix: place logging in background

* add back split screen
2024-03-01 17:55:31 -05:00
Luca Steeb f5a6e80fc7 fix(engine): add --no-graceful-shutdown flag for nodemon (#221) 2024-03-02 00:38:39 +07:00
Luca Steeb 9b68115fb5 refactor: cleanup functions in api + worker (#192) 2024-03-02 00:37:02 +07:00
Luca Steeb 0d503288ba fix(engine): cleanup dispatcher and grpc server in parallel (#207) 2024-03-01 01:32:24 +07:00
Luca Steeb 577f432218 feat(engine): readiness & liveness probes (#197) 2024-02-28 00:28:11 +07:00
Luca Steeb ae4841031b feat(engine): standalone tests and engine teardown (#172) 2024-02-28 00:15:25 +07:00
abelanger5 3743746657 feat: github app integration (#163)
* feat: github app integration

* chore: proto

* fix: migrate instead of push

* fix: db migrate -> migrate

* fix: migrate again

* remove skip-generate

* add back generate

* setup pnpm
2024-02-13 21:34:16 -05:00
Luca Steeb ab9f8e6c47 fix(cli): log errors via stderr and always exit (#139) 2024-02-01 00:12:22 +07:00
abelanger5 52ba01bf06 chore: qol improvements (#137) 2024-01-30 00:08:52 -05:00