7 Commits

Author SHA1 Message Date
Mohammed Nafees
793df41ccb Deploy HyperDX locally via docker-compose and add traces to task controller (#2058)
* deploy Jaeger locally and add traces to task controller

* use Jaeger v2

* add SERVER_OTEL_COLLECTOR_AUTH

* fix PR comments

* fix span name
2025-07-29 16:24:38 +02:00
Mohammed Nafees
ef498a6235 Introduce tenant Prometheus metrics (#1875)
* introduce tenant workflow completed metric

* expose tenant prom metrics via handler

* fix workflow and worker id in metrics

* correctly add workflow metrics from workflow controller

* use olap DB to gather information for workflow completion

* fix prom metrics endpoint for tenant

* workflow name from external id

* simplify tenant registry based metrics

* add docs for prometheus metrics

* fix docs lint

* run prettier fix

* WIP metrics work

* use federate prom server URL to proxy metrics

* implement workflow duration histogram metric

* separate prom stack docker compose

* fix duration metrics calls

* move scheduler metrics to prom tenant specific file

* update docs for prom metrics

* fix lint

* use proper indices to query for durations

* reorg tenant metrics

* fix lint for doc

* update docs with promql examples and casing around prom metrics enabled

* update prom server url

* fix lint

* enabled prom metrics for v1 only from controller
2025-06-27 11:46:31 -04:00
abelanger5
9aead7ab68 feat: global prometheus metrics (#1568)
* feat: global prometheus metrics

* configure prom with env vars, clean up metrics

* add histogram and docs

* update port
2025-04-17 15:11:38 -04:00
abelanger5
a9936ef687 fix: set otel insecure flag for all telemetry instantiations (#999)
2024-10-30 17:34:36 -04:00
abelanger5
3d218302ff fix: internal queue items performance and race conditions (#943)
* fix: don't use xmin hack

* fix: assign not append

* refactor: parallel step run updates via hashes

* fix: intermittent double execution of child step runs

* fix: rollback rate limits

* fix: bulk event writes from single buffer

* expose cleanup

* fix: race conditions on failures and cancellations

* change logger defaults to warn and console
2024-10-07 11:16:53 -04:00
Gabe Ruttner
b4670af138 Fix qos otel config (#754)
* feat: otel trace id ratio

* feat: rabbitmq qos

* feat: requeue limit

* fix: tests
2024-07-30 18:11:10 -04:00
abelanger5
7c3ddfca32 feat: api server extensions (#614)
* feat: allow extending the api server

* chore: remove internal packages to pkg

* chore: update db_gen.go

* fix: expose auth

* fix: move logger to pkg

* fix: don't generate gitignore for prisma client

* fix: allow extensions to register their own api spec

* feat: expose pool on server config

* fix: nil pointer exception on empty opts

* fix: run.go file
2024-06-19 09:36:13 -04:00