38 Commits

Author SHA1 Message Date
Klaas van Schelven
fd80eff7ca Migration fix: delete TurningPoints w/ project=None
Fix #155
2025-07-17 14:41:27 +02:00
Klaas van Schelven
28b2ce0eaf Various models: .project SET_NULL => DO_NOTHING
Like e45c61d6f0, but for .project.

I originally thought `SET_NULL` would be a good way to "do stuff later", but
that's only so to the degree that [1] updates are cheaper than deletes and [2]
2nd-order effects (further deletes in the dep-tree) are avoided.

Now that we have explicit Project-deletion (deps-first, delayed, properly batched)
the SET_NULL behavior is always a no-op (but with cost in queries).

As a result, in the test for project deletion (which has deletes for many
of the altered models), the following 12 queries are no longer done:

```
SELECT "projects_project"."id", [..many fields..] FROM "projects_project" WHERE "projects_project"."id" = 1
DELETE FROM "projects_projectmembership" WHERE "projects_projectmembership"."project_id" IN (1)
DELETE FROM "alerts_messagingserviceconfig" WHERE "alerts_messagingserviceconfig"."project_id" IN (1)
UPDATE "releases_release" SET "project_id" = NULL WHERE "releases_release"."project_id" IN (1)
UPDATE "issues_issue" SET "project_id" = NULL WHERE "issues_issue"."project_id" IN (1)
UPDATE "issues_grouping" SET "project_id" = NULL WHERE "issues_grouping"."project_id" IN (1)
UPDATE "events_event" SET "project_id" = NULL WHERE "events_event"."project_id" IN (1)
UPDATE "tags_tagkey" SET "project_id" = NULL WHERE "tags_tagkey"."project_id" IN (1)
UPDATE "tags_tagvalue" SET "project_id" = NULL WHERE "tags_tagvalue"."project_id" IN (1)
UPDATE "tags_eventtag" SET "project_id" = NULL WHERE "tags_eventtag"."project_id" IN (1)
UPDATE "tags_issuetag" SET "project_id" = NULL WHERE "tags_issuetag"."project_id" IN (1)
```
2025-07-03 21:49:49 +02:00
Klaas van Schelven
6b9e4d8011 Project.delete_deferred(): first version (WIP)
Implemented using a batch-wise dependency-scanner in delayed
(snappea) style.
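
A minimal sketch of the "deps-first, batched" idea, using plain `sqlite3` rather than the actual Bugsink models or the snappea task runner. The table names, the `BATCH_SIZE` value and both helper functions are hypothetical; the real implementation scans dependencies and runs delayed, whereas this sketch runs synchronously with a hard-coded order:

```python
import sqlite3

BATCH_SIZE = 2  # deliberately tiny so that the batching is visible


def delete_in_batches(conn, table, project_id, batch_size=BATCH_SIZE):
    """Delete the rows of `table` for one project in small batches.

    `table` comes from a hard-coded tuple below, never from user input.
    Returns the number of non-empty batches that were deleted.
    """
    batches = 0
    while True:
        cur = conn.execute(
            f"DELETE FROM {table} WHERE rowid IN ("
            f"SELECT rowid FROM {table} WHERE project_id = ? LIMIT ?)",
            (project_id, batch_size),
        )
        conn.commit()  # every batch is its own short transaction
        if cur.rowcount == 0:
            return batches
        batches += 1


def delete_project_deps_first(conn, project_id):
    # Dependents first (event -> issue), the project row itself last.
    batches = {}
    for table in ("event", "issue"):
        batches[table] = delete_in_batches(conn, table, project_id)
    conn.execute("DELETE FROM project WHERE id = ?", (project_id,))
    conn.commit()
    return batches


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the FK constraints
conn.executescript("""
    CREATE TABLE project (id INTEGER PRIMARY KEY);
    CREATE TABLE issue (
        id INTEGER PRIMARY KEY,
        project_id INTEGER NOT NULL REFERENCES project(id)
    );
    CREATE TABLE event (
        id INTEGER PRIMARY KEY,
        issue_id INTEGER NOT NULL REFERENCES issue(id),
        project_id INTEGER NOT NULL REFERENCES project(id)
    );
    INSERT INTO project (id) VALUES (1);
    INSERT INTO issue (id, project_id) VALUES (1, 1), (2, 1), (3, 1);
    INSERT INTO event (issue_id, project_id)
        VALUES (1, 1), (1, 1), (2, 1), (3, 1), (3, 1);
""")

batches = delete_project_deps_first(conn, 1)
remaining = [
    conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0]
    for t in ("project", "issue", "event")
]
```

Because every dependent row is gone before its parent is touched, the final `DELETE` on `project` has nothing left to cascade to or null out.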

* no real point-of-entry in the (regular, non-admin) UI yet.
* no hiding of Projects which are delete-in-progress from the UI

* lack of DRY
* some unnecessary work (needed in the Issue-context, but not here)
  is still being done.

See #50
2025-07-03 21:01:28 +02:00
Klaas van Schelven
e45c61d6f0 Various models: .issue and .grouping; SET_NULL => DO_NOTHING
I originally thought `SET_NULL` would be a good way to "do stuff later", but
that's only so to the degree that [1] updates are cheaper than deletes and [2]
2nd-order effects (further deletes in the dep-tree) are avoided.

Now that we have explicit Issue-deletion (deps-first, delayed, properly batched)
the SET_NULL behavior is always a no-op (but with cost in queries).

As a result, in the test for issue deletion (which has deletes for many
of the altered models), the following 8 queries are no longer done:

```
SELECT "issues_grouping"."id", [..many fields..] FROM "issues_grouping" WHERE "issues_grouping"."id" IN (1)
UPDATE "events_event" SET "grouping_id" = NULL WHERE "events_event"."grouping_id" IN (1)

[.. a few moments later..]

SELECT "issues_issue"."id", [..many fields..] FROM "issues_issue" WHERE "issues_issue"."id" = 'uuid'
UPDATE "issues_grouping" SET "issue_id" = NULL WHERE "issues_grouping"."issue_id" IN ('uuid')
UPDATE "issues_turningpoint" SET "issue_id" = NULL WHERE "issues_turningpoint"."issue_id" IN ('uuid')
UPDATE "events_event" SET "issue_id" = NULL WHERE "events_event"."issue_id" IN ('uuid')
UPDATE "tags_eventtag" SET "issue_id" = NULL WHERE "tags_eventtag"."issue_id" IN ('uuid')
UPDATE "tags_issuetag" SET "issue_id" = NULL WHERE "tags_issuetag"."issue_id" IN ('uuid')
```

(breaks the tests b/c of constraints and not always using factories; will fix next)
2025-07-03 11:33:58 +02:00
Klaas van Schelven
e5dbeae514 Issue.delete_deferred(): first version (WIP)
Implemented using a batch-wise dependency-scanner in delayed
(snappea) style.

* no tests yet.
* no real point-of-entry in the (regular, non-admin) UI yet.
* no hiding of Issues which are delete-in-progress from the UI
* file storage not yet cleaned up
* project issue counts not yet updated
* dangling tag values: no cleanup mechanism yet.

See #50
2025-06-27 12:52:59 +02:00
Klaas van Schelven
aad0f624f9 Fix: issue-list indexes must have project first
because we always filter by project before ordering;

the now-removed first_seen index was simply unused
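
A small sketch of why the column order matters, using SQLite's `EXPLAIN QUERY PLAN`. The table layout, the `last_seen` column and the index name are assumptions for illustration, not the actual Bugsink schema, and other databases may plan differently:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE issue (
        id INTEGER PRIMARY KEY,
        project_id INTEGER NOT NULL,
        last_seen TEXT NOT NULL
    );
    -- project first, then the column that the list is ordered by
    CREATE INDEX issue_project_last_seen ON issue (project_id, last_seen);
""")

# The shape of an issue-list query: filter by project, then order.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT id FROM issue WHERE project_id = ? ORDER BY last_seen DESC LIMIT 25",
    (1,),
).fetchall()

# The last column of each row is the human-readable plan step.
plan = " | ".join(row[-1] for row in plan_rows)
print(plan)
```

With the project column leading, the planner can use the one index for both the filter and the ordering, so no separate sort step appears in the plan.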
2025-05-06 22:19:31 +02:00
Klaas van Schelven
49e6700d4a Grouping.grouping_key: hash it for the index 2025-05-06 11:32:19 +02:00
Klaas van Schelven
392f5a30be Add index for Grouping.grouping_key (and project) 2025-05-05 22:45:33 +02:00
Klaas van Schelven
a097e25310 issue.stored_event_count: consequences for 'irrelevance'
document & assert
2025-03-31 15:25:59 +02:00
Klaas van Schelven
9b1911aded Fix issue.stored_event_count for eviction/retention 2025-03-31 14:51:58 +02:00
Klaas van Schelven
381a5caae4 Issue.calculated_* fields: fix lengths
as in a717dd7374, but for Issue as well as Event.
The need for this was exposed by running the testsuite
against mysql; this commit fixes the tests.
2025-03-05 11:14:19 +01:00
Klaas van Schelven
10f8e10607 DB indexes for the issue-list (including filters)
simply by reasoning about what they should be; no performance testing (on the issue-list
and on the event-ingestion) was done for these
2025-02-18 10:32:06 +01:00
Klaas van Schelven
615d2da4c8 Cache stored_event_count (on Issue and Project)
"possibly expensive" turned out to be "actually expensive". On 'emu', with 1.5M
events, the counts take 85 and 154 ms for Project and Issue respectively;
bottlenecking our digestion to ~3 events/s.

Note: this is single-issue, single-project (presumably, the cost would be lower
for more spread-out cases)

Note on indexes: Event already has indexes for both Project & Issue (though as
the first item in a multi-column index). Without checking further: that appears
to not "magically solve counting".

This commit also optimizes the .count() on the issue-detail event list (via
Paginator).

This commit also slightly changes the value passed as `stored_event_count` to
be used for `get_random_irrelevance` to be the post-eviction value. That won't
matter much in practice, but is slightly more correct IMHO.
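
A minimal sketch of the caching idea in plain Python, not the actual Django models: the count is stored next to the data and adjusted on every digest and eviction, so reading it never requires counting rows. The `Issue` class and its methods here are hypothetical stand-ins:

```python
from dataclasses import dataclass, field


@dataclass
class Issue:
    events: list = field(default_factory=list)
    stored_event_count: int = 0  # denormalized copy of len(events)

    def digest(self, event):
        self.events.append(event)
        self.stored_event_count += 1

    def evict(self, n):
        """Drop the n oldest events and keep the cached count in sync."""
        evicted = self.events[:n]
        del self.events[:n]
        self.stored_event_count -= len(evicted)
        return evicted


issue = Issue()
for i in range(5):
    issue.digest(f"event-{i}")
issue.evict(2)
post_eviction_count = issue.stored_event_count  # the value available after eviction
```

The price of the cache is that every code path that adds or removes events must also update the counter.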
2025-02-06 16:24:25 +01:00
Klaas van Schelven
0b42d3ff1e Semi-manual squash-migrations
## Goal

Reduce the number of migrations for _fresh installs_ of Bugsink. This implies: squash as
broadly as possible.

## How?

"throw-away-and-rerun". In particular, for a given app:

* throw away the migrations from some starting point up until and including the last one.
* run "makemigrations" for that app. Django will see what's missing and just redo it
* rename to 000n_b_squashed or similar.
* manually set a `replaces` list on the migration to the just-removed migrations
* manually check dependencies; check that they are:
    * as low as possible, e.g. an FK should only depend on existence. this reduces the
      risk of circular dependencies.
    * pointing to "original migrations", i.e. not to a just-created squashed migration.
      because the squashed migrations "contain a lot" they increase the risk of circular
      dependencies.
* restore (git checkout) the thrown-away migrations
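
The `replaces` step above can be sketched as a small helper that turns the names of the thrown-away migration files into the `(app_label, migration_name)` tuples Django expects. `build_replaces` is a hypothetical helper, not part of Django or of this repository, and the `0005_...` filename is made up for illustration:

```python
from pathlib import Path


def build_replaces(app_label, migration_filenames, first_number):
    """Return (app_label, migration_name) tuples for all migrations
    numbered >= first_number, in migration order."""
    replaces = []
    for filename in sorted(migration_filenames):
        name = Path(filename).stem  # strip the ".py" suffix
        if int(name[:4]) >= first_number:
            replaces.append((app_label, name))
    return replaces


# Filenames as they were before being thrown away (e.g. captured via git).
thrown_away = [
    "0001_initial.py",
    "0004_event_irrelevance_for_retention.py",
    "0005_hypothetical_later_change.py",  # made-up name
]

replaces = build_replaces("events", thrown_away, first_number=4)
```

The resulting list would be pasted into the squashed migration as its `replaces` attribute; everything before the chosen starting point stays untouched.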

## Further tips:

* "Some starting point" is often not 0000, but some higher number (see e.g. the outcome
  in the present commit). Leaving the migrations for creation of base models (Event,
  Issue, Project) in place saves you from a lot of circular dependency problems.
* Move db.sqlite3 out of the way to avoid superfluous warnings.

## RunPython worries

I grepped for RunPython in the replaced migrations, with the following results:

* phonehome's create_installation_id was copied-over to the squashed migration.
* all others were ignored, because:
    * they "do something with events", i.e. only when events are present will they have
      an effect. This means they are no-ops for _new installs_.
    * for existing installs, for any given app, they will only be missed (replaced) when
      the first replaced migration is not yet executed.

I used the following command (reading from the bottom) to establish that this means only
people that did a fresh install after 8ad6059722 (June 14, 2024), but before
c01d332e18 (July 16) _and then never did any upgrades_ would be affected. There are no
such people.

git log --name-only \
    events/migrations/0004_event_irrelevance_for_retention.py \
    issues/migrations/0004_rename_event_count_issue_digested_event_count.py \
    phonehome/migrations/0001_initial.py \
    projects/migrations/0002_initial.py \
    teams/migrations/0001_initial.py

Note that the above observation will still be true for the next squashmigration (assuming
squashing starting at the same starting migrations).

## Cleanup of the replaced migrations

Django says:

> Once you’ve squashed your migration, you should then commit it alongside the
> migrations it replaces and distribute this change to all running instances of your
> application, making sure that they run migrate to store the change in their database.

Given that I'm not in control of all running instances of my application, this means the
cleanup must not happen "too soon", and only after announcing a migration path ("update
to version X before updating to version Y").

## Roads not taken

Q: Why not just do squashmigrations? A: It didn't work reliably (for me), presumably b/c
of the high number of strongly interdependent apps in combination with some RunPython.

Seen after I was mostly done, not explored seriously (yet):

* https://github.com/3YOURMIND/django-replace-migrations
* https://pypi.org/project/django-squash/
* https://django-extensions.readthedocs.io/en/latest/delete_squashed_migrations.html
2025-02-03 16:06:17 +01:00
Klaas van Schelven
0ec809cbb3 Simplify migration deps and document them 2025-02-03 14:04:44 +01:00
Klaas van Schelven
6497f482ae Correctly order Turningpoints (as per comment) 2024-12-16 22:04:03 +01:00
Klaas van Schelven
65ea181f37 vbc-unmute: reduce calls to the expensive check
as done in the previous commit for project quota
2024-07-17 15:33:15 +02:00
Klaas van Schelven
c01d332e18 Rename ingest_order to digest_order and clarify event_count
* issue.event_count to digested_event_count
* event.ingest_order to event.digest_order
* issue.ingest_order to digest_order

This is generally more correct/explicit, and is also in preparation
of doing work on-digest (which may or may not happen)
2024-07-16 15:23:40 +02:00
Klaas van Schelven
5e2cc0575f Retention, small fixes (from Friday) 2024-06-23 22:20:18 +02:00
Klaas van Schelven
8ad6059722 Complete migration reset 2024-06-14 10:29:10 +02:00
Klaas van Schelven
d2ba9b9ddb Add missing migration 2024-05-17 10:14:09 +02:00
Klaas van Schelven
280bd2172b History page: 'mostly done' (a first setup) 2024-04-12 16:07:25 +02:00
Klaas van Schelven
4dfefec468 denormalize/cache last_frame_* and transaction on Event and Issue
for performance, but also fixes:

* not just the 'last frame' but the 'last relevant frame' (in-app)
* truncation is properly done (matching the DB size, and for each of the fields)
2024-04-10 09:12:15 +02:00
Klaas van Schelven
d46cb7f6e8 DB: unique_together and PositiveIntegerField 2024-04-09 12:34:29 +02:00
Klaas van Schelven
21c4904524 Implement friendly_id 2024-04-09 11:09:31 +02:00
Klaas van Schelven
652823f8c3 Store calculated type and value on issue and event and use these values in the templates 2024-04-08 15:30:41 +02:00
Klaas van Schelven
48307daa0f Introduce 'Grouping' data-modeling 2024-04-08 11:41:15 +02:00
Klaas van Schelven
3aae32b54f blank=True; as implied by the default='' 2024-03-30 20:50:02 +01:00
Klaas van Schelven
28bf2f383e Store fixed_at/events_at newline-terminated
easier to do 'contains' on later
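
A small sketch of the motivation, assuming the field holds release names, one per line. The helper names are hypothetical and this is plain string handling, not the actual query code:

```python
def mark_fixed_at(fixed_at, release):
    """Append a release to the newline-terminated `fixed_at` string."""
    return fixed_at + release + "\n"


def is_fixed_at(fixed_at, release):
    # Searching for the terminated form avoids matching a longer release
    # name that merely starts with the same characters.
    return (release + "\n") in fixed_at


fixed_at = mark_fixed_at(mark_fixed_at("", "1.10"), "2.0")

naive_hit = "1.1" in fixed_at                  # matches inside "1.10"
terminated_hit = is_fixed_at(fixed_at, "1.1")  # "1.1\n" is not present
real_hit = is_fixed_at(fixed_at, "2.0")
```

Note that the terminator only guards the end of a name: a search for `"10\n"` would still match inside `"1.10\n"`, so the start of the name needs its own check.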
2024-03-20 19:12:30 +01:00
Klaas van Schelven
d5e9aa07ca Issue.fixed_at and Issue.events_at: bracketless
for easier qs-based updates (later/soon)
2024-03-20 17:42:59 +01:00
Klaas van Schelven
af8bff3799 unmute_after: implement the setting side
at least from the list-view
2024-03-08 20:29:38 +01:00
Klaas van Schelven
20361ce75a Date/issue_count correct in list_view 2024-02-20 17:49:48 +01:00
Klaas van Schelven
94661b4bb8 Swap FK event<->issue 2024-01-05 22:38:59 +01:00
Klaas van Schelven
bc849874f1 Missing migrations 2024-01-05 20:31:08 +01:00
Klaas van Schelven
99ac06a0d8 Releases, events, issues: WIP 2023-12-14 19:57:06 +01:00
Klaas van Schelven
725822ce3d Events: some modelling and a command to ingest JSONs from other projects as examples 2023-11-11 21:13:15 +01:00
Klaas van Schelven
972fd99697 Issue model introduced and used 2023-11-05 17:43:05 +01:00
Klaas van Schelven
1a5bf7d56c The ugliest thing that could get a stacktrace on screen 2023-11-04 21:02:04 +01:00