I've done a full grep on Issue.objects, Project.objects and get_object_or_404
equivalents, and applied some common sense. The goal: avoid having
confusing/half-broken pages in the UI.
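For context, a minimal sketch of the kind of filtering this grep was meant to
ensure; `is_deleted` is the field discussed below, the import path and helper
names are assumptions, not the actual code:
```
# Minimal sketch, not the actual codebase: keep soft-deleted Issues out of
# list and detail views. Import path and helper names are assumptions.
from django.http import Http404
from django.shortcuts import get_object_or_404

from issues.models import Issue  # app path assumed


def open_issues_for(project):
    # list views: filter soft-deleted rows out as a final step
    return Issue.objects.filter(project=project, is_deleted=False)


def get_issue_or_404(pk):
    # detail views: a soft-deleted Issue should 404 rather than render a
    # confusing/half-broken page
    issue = get_object_or_404(Issue, pk=pk)
    if issue.is_deleted:
        raise Http404("Issue has been deleted")
    return issue
```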
On index-usage: I've decided not to update the indexes. The assumption is:
`is_deleted` items will be a tiny minority of items in general, so the
cost/benefit analysis doesn't turn out favorably (just scanning them out as a
final step is more efficient). Also: SQLite is able to use the correct index
without adding a special one, proof:
```
EXPLAIN QUERY PLAN SELECT [..] WHERE ("issues_issue"."project_id" = 1 AND "issues_issue"."is_muted" = (0) AND "issues_issue"."is_resolved" = (0)) ORDER BY "issues_issue"."last_seen" DESC LIMIT 250;
QUERY PLAN
`--SEARCH issues_issue USING INDEX issue_list_open (project_id=? AND is_resolved=? AND is_muted=?)
EXPLAIN QUERY PLAN SELECT [..] WHERE ("issues_issue"."project_id" = 1 AND "issues_issue"."is_muted" = (0) AND "issues_issue"."is_resolved" = (0) AND "issues_issue"."is_deleted" = 0) ORDER BY "issues_issue"."last_seen" DESC LIMIT 250;
QUERY PLAN
`--SEARCH issues_issue USING INDEX issue_list_open (project_id=? AND is_resolved=? AND is_muted=?)
```
See #139 for the 0/1 notation in the above.
(Project-indexes: not an issue, the scale is "below relevance for indexes")
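For reference, a sketch of what the index named in the query plan above
presumably corresponds to in Django terms; field names are taken from the
queries, the actual declaration (including the on_delete) may differ:
```
# Sketch only: the composite index from the EXPLAIN output above, expressed
# as a Django Meta.indexes entry. The real model has many more fields.
from django.db import models


class Issue(models.Model):
    project = models.ForeignKey("projects.Project", on_delete=models.DO_NOTHING)
    is_resolved = models.BooleanField(default=False)
    is_muted = models.BooleanField(default=False)
    is_deleted = models.BooleanField(default=False)
    last_seen = models.DateTimeField()

    class Meta:
        indexes = [
            # is_deleted deliberately not part of the index: SQLite still
            # picks this index and scans the few deleted rows out at the end
            models.Index(fields=["project", "is_resolved", "is_muted"],
                         name="issue_list_open"),
        ]
```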
Implement delete functionality with confirmation modals for projects.
cherry picked (by Klaas) from commit 6764fbf343fb; but:
* projects only
* `delete_deferred`
* flake8
See #84 for the original PR
Removes the following 2 redundant queries from the deletion process:
```
SELECT "tags_tagkey"."id" FROM "tags_tagkey" WHERE "tags_tagkey"."project_id" IN (1) ORDER BY "tags_tagkey"."project_id" ASC, "tags_tagkey"."id" ASC LIMIT 498
UPDATE "projects_project" SET "stored_event_count" = ("projects_project"."stored_event_count" - 1) WHERE "projects_project"."id" = 1
```
Like e45c61d6f0, but for .project.
I originally thought `SET_NULL` would be a good way to "do stuff later", but
that's only true to the degree that [1] updates are cheaper than deletes and [2]
2nd-order effects (further deletes in the dep-tree) are avoided.
Now that we have explicit Project-deletion (deps-first, delayed, properly batched),
the SET_NULL behavior is always a no-op (but with a cost in queries).
As a result, in the test for project deletion (which exercises deletes of many
of the altered models), the following queries are no longer done:
```
SELECT "projects_project"."id", [..many fields..] FROM "projects_project" WHERE "projects_project"."id" = 1
DELETE FROM "projects_projectmembership" WHERE "projects_projectmembership"."project_id" IN (1)
DELETE FROM "alerts_messagingserviceconfig" WHERE "alerts_messagingserviceconfig"."project_id" IN (1)
UPDATE "releases_release" SET "project_id" = NULL WHERE "releases_release"."project_id" IN (1)
UPDATE "issues_issue" SET "project_id" = NULL WHERE "issues_issue"."project_id" IN (1)
UPDATE "issues_grouping" SET "project_id" = NULL WHERE "issues_grouping"."project_id" IN (1)
UPDATE "events_event" SET "project_id" = NULL WHERE "events_event"."project_id" IN (1)
UPDATE "tags_tagkey" SET "project_id" = NULL WHERE "tags_tagkey"."project_id" IN (1)
UPDATE "tags_tagvalue" SET "project_id" = NULL WHERE "tags_tagvalue"."project_id" IN (1)
UPDATE "tags_eventtag" SET "project_id" = NULL WHERE "tags_eventtag"."project_id" IN (1)
UPDATE "tags_issuetag" SET "project_id" = NULL WHERE "tags_issuetag"."project_id" IN (1)
```
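A hedged sketch of the FK-level change this implies; whether the codebase
actually switched to `DO_NOTHING` or to some other mechanism is an assumption
on my part:
```
# Hedged sketch: with explicit, deps-first, batched Project-deletion in
# place, Django no longer needs to fix up dependent rows itself, so the
# SET_NULL (and CASCADE) work can be dropped. DO_NOTHING is my assumption
# for what replaces it.
from django.db import models


class Release(models.Model):
    # before: models.ForeignKey(..., null=True, on_delete=models.SET_NULL)
    #   -> one UPDATE ... SET project_id = NULL per dependent table on delete
    # after: deletion order is handled explicitly, so no automatic fix-up
    project = models.ForeignKey(
        "projects.Project", null=True, on_delete=models.DO_NOTHING)
```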
Implemented using a batch-wise dependency-scanner in delayed
(snappea) style (sketched below). Known gaps:
* no real point-of-entry in the (regular, non-admin) UI yet
* no hiding of Projects which are delete-in-progress from the UI
* lack of DRY
* some unnecessary work (needed in the Issue-context, but not here)
  is still being done
See #50
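A hedged sketch of the overall shape of that scanner; the model list is
derived from the tables listed above, and the snappea `shared_task`/`delay`
API is assumed to mirror Celery's:
```
# Hedged sketch, not the actual implementation: batch-wise, deps-first
# deletion of a Project as a delayed (snappea) task that re-enqueues itself
# per batch. Import paths, batch size and the shared_task/delay API are
# assumptions.
from snappea.decorators import shared_task  # assumed import path

from events.models import Event
from issues.models import Grouping, Issue
from projects.models import Project
from releases.models import Release
from tags.models import EventTag, IssueTag, TagKey, TagValue

BATCH_SIZE = 500

# dependencies first; the Project row itself is deleted last
DEPS_FIRST = [EventTag, IssueTag, TagValue, TagKey, Event, Grouping, Issue, Release]


@shared_task
def delete_deferred(project_id):
    for model in DEPS_FIRST:
        pks = list(
            model.objects.filter(project_id=project_id)
            .values_list("pk", flat=True)[:BATCH_SIZE])
        if pks:
            model.objects.filter(pk__in=pks).delete()
            # one batch per task invocation keeps each write-transaction short
            delete_deferred.delay(project_id)
            return

    Project.objects.filter(pk=project_id).delete()
```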
Fixes #56
Looked into this for a while, but I think it was simply an oversight in
the logic-as-programmed: if you're part of a team, you should be able
to just click 'join' on any of that team's projects and become a project-member.
Somewhere in here there's a case to be made that such settings need a different
level of authorization (just because you're the admin of a project doesn't mean
you should be able to do things that affect the system as a whole), but we don't
have such a level ATM.
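A hedged sketch of the corrected rule as I read it; the model and field names
are guesses, not the codebase's:
```
# Hedged sketch of the rule described above; model/field names are guesses.
from django.core.exceptions import PermissionDenied


def can_join(user, project):
    # being a member of the project's team is enough to join any of that
    # team's projects; the old logic was (by oversight) stricter than this
    return project.team.members.filter(pk=user.pk).exists()


def join_project(user, project):
    if not can_join(user, project):
        raise PermissionDenied("not a member of this project's team")
    project.projectmembership_set.get_or_create(user=user)
```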
"possibly expensive" turned out to be "actually expensive". On 'emu', with 1.5M
events, the counts take 85 and 154 ms for Project and Issue respectively;
bottlenecking our digestion to ~3 events/s.
Note: this is single-issue, single-project (presumably, the cost would be lower
for more spread-out cases)
Note on indexes: Event already has indexes for both Project & Issue (though as
the first item in a multi-column index). Without checking further: that appears
to not "magically solve counting".
This commit also optimizes the .count() on the issue-detail event list (via
Paginator).
This commit also slightly changes the value passed as `stored_event_count`
(used for `get_random_irrelevance`) to be the post-eviction value. That won't
matter much in practice, but is slightly more correct IMHO.
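A hedged sketch of the Paginator-side part of this; the field used for the
known count and the exact wiring are assumptions:
```
# Hedged sketch: skip the expensive COUNT(*) on the issue-detail event list
# by handing the Paginator a count we already maintain.
from django.core.paginator import Paginator


class KnownCountPaginator(Paginator):
    """Paginator that uses a precomputed count instead of issuing COUNT(*)."""

    def __init__(self, object_list, per_page, known_count, **kwargs):
        super().__init__(object_list, per_page, **kwargs)
        self._known_count = known_count

    @property
    def count(self):
        return self._known_count


# usage (field name assumed):
# paginator = KnownCountPaginator(
#     issue.event_set.all(), 250, known_count=issue.stored_event_count)
```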
Triggered by issue_event_list taking more than 5s on "emu" (my 1,500,000-event
test-machine). Reason: sorting those events on a non-indexed field. Switching
to a field-with-index solved it.
I then analysed (grepped) for "ordering" and "order_by" and set indexes
accordingly and more or less indiscriminately (i.e. even on tables that are
assumed to have relatively few rows, such as Project & Team).
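The pattern, sketched; the field names below are illustrative, not the actual
schema:
```
# Hedged sketch of the pattern applied here: any field combination that shows
# up in Meta.ordering or in .order_by() calls gets a matching index.
from django.db import models


class Event(models.Model):
    issue = models.ForeignKey("issues.Issue", on_delete=models.DO_NOTHING)
    digest_order = models.PositiveIntegerField()  # stand-in for the sort field

    class Meta:
        ordering = ["digest_order"]
        indexes = [
            # matches the per-issue list's order_by, so the DB can read rows
            # in index order instead of sorting the whole result set
            models.Index(fields=["issue", "digest_order"]),
        ]
```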
## Goal
Reduce the number of migrations for _fresh installs_ of Bugsink. This implies: squash as
broadly as possible.
## How?
"throw-away-and-rerun". In particular, for a given app:
* throw away the migrations from some starting point up to and including the last one.
* run "makemigrations" for that app. Django will see what's missing and just redo it
* rename to 000n_b_squashed or similar.
* manually set a `replaces` list on the migration to the just-removed migrations
* manually check dependencies; check that they are:
* as low as possible, e.g. an FK should only depend on the existence of the
referenced model. This reduces the risk of circular dependencies.
* pointing to "original migrations", i.e. not to a just-created squashed migration.
because the squashed migrations "contain a lot" they increase the risk of circular
dependencies.
* restore (git checkout) the thrown-away migrations (existing installs still need them)
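A sketch of what the resulting hand-made squashed migration looks like; the
migration names, dependencies and the operation are illustrative, only the
shape matters:
```
# Hedged sketch of a hand-made "squashed" migration produced by the steps
# above. Migration names, dependencies and the operation are illustrative.
from django.db import migrations, models


class Migration(migrations.Migration):

    # the just-removed migrations this one stands in for; installs that have
    # already run them will treat this migration as applied
    replaces = [
        ("issues", "0005_example"),
        ("issues", "0006_example"),
    ]

    dependencies = [
        # keep these as low as possible, and point at original migrations,
        # never at another just-created squashed migration
        ("projects", "0002_initial"),
    ]

    operations = [
        migrations.AddField(
            model_name="issue",
            name="is_deleted",
            field=models.BooleanField(default=False),
        ),
    ]
```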
## Further tips:
* "Some starting point" is often not 0000, but some higher number (see e.g. the outcome
in the present commit). Leaving the migrations for creation of base models (Event,
Issue, Project) in place saves you from a lot of circular dependency problems.
* Move db.sqlite3 out of the way to avoid superfluous warnings.
## RunPython worries
I grepped for RunPython in the replaced migrations, with the following results:
* phonehome's create_installation_id was copied-over to the squashed migration.
* all others were ignored, because:
* they "do something with events", i.e. only when events are present will they have
an effect. This means they are no-ops for _new installs_.
* for existing installs, for any given app, they will only be missed (replaced) when
the first replaced migration is not yet executed.
I used the following command (reading from the bottom) to establish that this means only
people that did a fresh install after 8ad6059722 (June 14, 2024), but before
c01d332e18 (July 16) _and then never did any upgrades_ would be affected. There are no
such people.
```
git log --name-only \
    events/migrations/0004_event_irrelevance_for_retention.py \
    issues/migrations/0004_rename_event_count_issue_digested_event_count.py \
    phonehome/migrations/0001_initial.py \
    projects/migrations/0002_initial.py \
    teams/migrations/0001_initial.py
```
Note that the above observation will still be true for the next squashmigration
(assuming squashing starts from the same starting migrations).
## Cleanup of the replaced migrations
Django says:
> Once you’ve squashed your migration, you should then commit it alongside the
> migrations it replaces and distribute this change to all running instances of your
> application, making sure that they run migrate to store the change in their database.
Given that I'm not in control of all running instances of my application, this means the
cleanup must not happen "too soon", and only after announcing a migration path ("update
to version X before updating to version Y").
## Roads not taken
Q: Why not just do squashmigrations? A: It didn't work reliably (for me), presumably b/c
of the high number of strongly interdependent apps in combination with some RunPython.
Seen after I was mostly done, not explored seriously (yet):
* https://github.com/3YOURMIND/django-replace-migrations
* https://pypi.org/project/django-squash/
* https://django-extensions.readthedocs.io/en/latest/delete_squashed_migrations.html
during user-testing, it was revealed that people think there is something
wrong when they see 'DivisionByZero' when this is in fact precisely what
was intended. Hopefully the new text removes this confusion.
Fixes #7
I could not find any documentation on what the "standard" is, but I know that
we'll pick it up just fine either way (because Django's UUID field does the
magic for us).
Given that it's impossible to set up your JS client with the dashes, they should
simply be removed.
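A quick check of the "either way" claim (Django's UUIDField parses values
through Python's `uuid.UUID`, which accepts both notations):
```
# Both notations parse to the same UUID; Django's UUIDField goes through
# uuid.UUID() and so accepts the dashless form just as happily.
import uuid

dashed = uuid.UUID("12345678-1234-5678-1234-567812345678")
dashless = uuid.UUID("12345678123456781234567812345678")
assert dashed == dashless
assert str(dashless) == "12345678-1234-5678-1234-567812345678"
```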
This was probably about making Project.objects.get(id, sentry_key)
more efficient, but:
* I don't have an indication it's a bottleneck
* it may very well be turned into a get-by-id, check-for-key idiom (sketched
  below), in which case the index won't help.
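The idiom referred to above, sketched; import paths and field names are
assumptions:
```
# Hedged sketch of the get-by-id, check-for-key idiom; the PK lookup is
# already covered by the primary-key index, so a composite (id, sentry_key)
# index would not be used. Import paths and field names are assumptions.
import uuid

from django.core.exceptions import PermissionDenied
from django.shortcuts import get_object_or_404

from projects.models import Project  # app path assumed


def get_project_for_dsn(project_id, sentry_key):
    project = get_object_or_404(Project, pk=project_id)  # PK lookup only
    # uuid.UUID() normalizes dashed and dashless notations alike
    if project.sentry_key != uuid.UUID(sentry_key):
        raise PermissionDenied("sentry_key mismatch")
    return project
```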
For now, sticking to "how it actually works" is more important than "what we
might need in the future". We don't have an admin-group with actually handed-out
perms yet, and we do have the recommendation to create a superuser in the
installation docs. So let's just explicitly check for that.
(I found these checks while working on actual breakage in the previous commit)
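A sketch of what "explicitly check for that" can look like; the view name is
illustrative:
```
# Hedged sketch: gate system-level views on is_superuser directly, rather
# than on admin-group permissions that are never actually handed out.
from django.contrib.auth.decorators import user_passes_test
from django.http import HttpResponse

superuser_required = user_passes_test(lambda u: u.is_superuser)


@superuser_required
def system_settings(request):  # illustrative view name
    return HttpResponse("only superusers get here")
```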
In the previous commit I put the code for a small performance-experiment.
The results are (very) obvious: don't do this. Response times go through
the roof, and more importantly, the server becomes unreliable. Reason:
time-outs caused by waiting for the write-lock.