As implied by this comment:
> this implementation is not supposed to be bullet-proof for race conditions (nor is it cross-platform)... it's
> just a small check to prevent the regularly occurring cases:
> * starting a second runsnappea in development
> * running 2 separate instances of bugsink on a single machine without properly distinguishing them
but this "small check" gets in the way sometimes, so it's better to be able to turn it off.
See #99
This commit fixes 3 related issues with the way runtime_limit was administered,
which could lead to race conditions (and hence: the wrong runtime_limit
applying at some point in time). Post-fix, the following holds (a rough sketch
of the resulting shape follows the list):
1. We use thread_locals to store this info, since there are at least 2 sources of
threaded code that touch this (snappea's workers and the django debugserver)
2. We distinguish between the "from connection settings" timeout and the
"temporarily overridden" ones, since we cannot assume
connection-initialization happens first (as per the comment in base.py)
3. We store runtime-limits per alias ('using'). Needed for [2] (each connection
may have a different moment-of-initialization, clobbering CM-set values from
the other connection) and also needed once you realize there may be
different defaults for the timeouts.
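Sketch, for reference — the names, defaults and exact structure here are
illustrative, not the actual Bugsink code:

```python
import threading
from contextlib import contextmanager

_local = threading.local()  # per-thread storage, as per [1]

FALLBACK_LIMIT = 5.0  # stand-in for a per-connection default, in seconds


def _state(using):
    # one record per (thread, alias), as per [3]: the connection-settings
    # value plus a stack of temporary overrides
    if not hasattr(_local, "per_alias"):
        _local.per_alias = {}
    return _local.per_alias.setdefault(
        using, {"from_settings": None, "overrides": []})


def set_from_connection_settings(using, limit):
    # called on connection initialization; kept separate from the overrides,
    # as per [2], because initialization may happen _after_ a context manager
    # has already pushed a temporary value
    _state(using)["from_settings"] = limit


def get_runtime_limit(using="default"):
    state = _state(using)
    if state["overrides"]:
        return state["overrides"][-1]
    if state["from_settings"] is not None:
        return state["from_settings"]
    return FALLBACK_LIMIT


@contextmanager
def different_runtime_limit(limit, using="default"):
    # temporary override; unwinds correctly even when nested
    state = _state(using)
    state["overrides"].append(limit)
    try:
        yield
    finally:
        state["overrides"].pop()
```

The F5/^R test described below is then just two of these context managers
nested; the inner value should win until it unwinds.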
General context: I've recently started using the 'different runtime' helper
quite a bit more, and across connections (snappea!), which raised more and
more doubts about whether it actually works as advertised.
Thoughts on "using" being required. I used to think "you can reason about a
global timeout value, and the current transaction makes clear what you're
actually doing", but as per the notes above that doesn't really work.
Thoughts on reproducing: reproducing race-condition problems is always hairy,
so in the end I settled on a solution that's hopefully easy to reason about,
even if it's verbose.
When I started work on this commit, I focussed on thread-safety; "proving the
problem" consisted of F5/^R on a web page with 2 context managers with different
timeouts, hoping to show that the stack unrolling didn't work properly.
However, during those "tests" I noticed quite a few resets-to-5s (from the
connection defaults), which prompted fix [2] from above.
This will hopefully help when getting issue-reports from people who
have not set up dogfooding.
See [Dogfooding Bugsink](https://www.bugsink.com/docs/dogfooding/)
Triggered by issue_event_list taking more than 5s on "emu" (my 1,500,000-event
test machine). Reason: sorting those events on a non-indexed field. Switching
to a field with an index solved it.
I then analysed (grepped) for "ordering" and "order_by" and set indexes
accordingly and more or less indiscriminately (i.e. even on tables that are
assumed to have relatively few rows, such as Project & Team).
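Roughly, each such change is just a Meta-level index on the field that
order_by uses; a made-up example (model and field names here are not the
actual ones):

```python
from django.db import models


class Event(models.Model):  # illustrative stand-in, not the actual model
    ingested_at = models.DateTimeField()  # hypothetical order_by field

    class Meta:
        indexes = [
            # with this index the ORDER BY can be served from the index
            # instead of sorting the whole table
            models.Index(fields=["ingested_at"]),
        ]
```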
As discussed in #11, there are scenarios (e.g. misconfiguration) where snappea
does not pick up the tasks. Events not showing up in Bugsink, without any
indication of why, leaves people confused. Better to warn explicitly in that
case.
Using a pid-file that's implied by the ingestion directory.
We do this in `get_pc_registry`, i.e. on the first request. This means the
failure surfaces on the first request handled by the 2nd process.
Why not on startup? Because we don't have a configtest or generic on-startup location
(yet). Making _that_ could be another source of fragility, and getting e.g. the number
of processes might be non-trivial / config-dependent.
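A minimal sketch of the kind of check this is (file name, exact behaviour and
wording are assumptions on my part, not the actual implementation):

```python
import logging
import os

logger = logging.getLogger("bugsink")  # illustrative


def warn_if_snappea_not_running(ingest_dir):
    # read the pid-file that lives in the ingestion directory; if no live
    # process is behind it, warn explicitly instead of letting events
    # silently pile up unprocessed
    pid_file = os.path.join(ingest_dir, "snappea.pid")  # hypothetical name

    try:
        with open(pid_file) as f:
            pid = int(f.read().strip())
        os.kill(pid, 0)  # signal 0: existence check only; not cross-platform
    except (OSError, ValueError):
        # no pid-file, garbage in it, or no such process
        logger.warning(
            "snappea does not appear to be running; tasks will not be picked "
            "up and events will not show up in Bugsink")
```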
Exposed when playing around with arbitrary Tasks in a shell; this created
workers that could not run, which would put the foreman in a 'waiting for
available threads' mode.
I briefly looked at the rest of that loop to see whether more exception handling
is necessary, but TBH I don't think we can reasonably recover from e.g. task.delete()
failing (or at least I don't want to think about it now).
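The shape of the fix, sketched (the real foreman code differs; the semaphore
and the task-resolution details here are assumptions):

```python
import importlib
import logging

logger = logging.getLogger("snappea")  # illustrative


def run_in_worker(task, worker_semaphore):
    # whatever goes wrong while resolving or running an arbitrary Task (e.g.
    # one created by hand in a shell), the worker slot must be released again,
    # otherwise the foreman ends up 'waiting for available threads' forever
    try:
        module_name, function_name = task.task_name.rsplit(".", 1)
        function = getattr(importlib.import_module(module_name), function_name)
        function(*task.get_args(), **task.get_kwargs())  # hypothetical accessors
    except Exception:
        logger.exception("task %s failed", task.task_name)
    finally:
        worker_semaphore.release()
```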
* recommend just running in the home dir
* don't use private tmp
The trouble was: when set up using private tmp, the 2 processes
cannot communicate with each other.
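Assuming the setup in question is a systemd unit (an assumption on my part),
the relevant fragment looks something like this; PrivateTmp gives each service
its own private /tmp, so two services set up that way cannot share files:

```ini
# example unit fragment; paths and user are made up
[Service]
User=bugsink
WorkingDirectory=/home/bugsink
# keep this off (or simply omit it): with PrivateTmp=true each service gets
# its own /tmp, and the two processes can no longer see each other's files
PrivateTmp=false
```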