fix(group_model): optimize group.filter_to_team query to not do a full table scan #88393

Merged: 4 commits, Apr 1, 2025

Conversation

MichaelSun48 (Member) commented Mar 31, 2025

Fixes SENTRY-2HWB

Optimizes the group manager's filter_to_team query to not do a full table scan on the GroupAssignee table.

It seems like Postgres was unable to use a combination of the team and user indexes when filtering for teams and users, and instead used the Group index, which caused a full table scan.

MichaelSun48 requested a review from JoshFerge on Mar 31, 2025, 21:28
MichaelSun48 requested a review from a team as a code owner on Mar 31, 2025, 21:28
github-actions bot added the "Scope: Backend" label (automatically applied to PRs that change backend components) on Mar 31, 2025
JoshFerge requested a review from a team on Mar 31, 2025, 21:30
JoshFerge (Member) commented

Fixes SENTRY-2HWB

JoshFerge (Member) commented

should also fix SENTRY-3JZW

Q(team=team) | Q(user_id__in=user_ids)
).values_list("group_id", flat=True)

assigned_groups = GroupAssignee.objects.filter(team=team).values_list(

i think if you do

assigned_groups = (
    GroupAssignee.objects.filter(team=team) | 
    GroupAssignee.objects.filter(user_id__in=user_ids)
).values_list("group_id", flat=True)

it can do it in one query instead of two?

Comment on lines 511 to 514
assigned_groups = (
    GroupAssignee.objects.filter(team=team)
    | GroupAssignee.objects.filter(user_id__in=user_ids)
).values_list("group_id", flat=True)

print((GroupAssignee.objects.filter(user_id__in=[2,3,4]) | GroupAssignee.objects.filter(team_id=1)).values_list("group_id", flat=True).query)

produces:

SELECT "sentry_groupasignee"."group_id"
FROM "sentry_groupasignee"
WHERE ("sentry_groupasignee"."team_id" = 1 OR "sentry_groupasignee"."user_id" IN (2, 3, 4))

Which is the same query as before, I think?

The problem you're trying to avoid is the OR I think, since it's breaking index usage? You probably want a union

print(GroupAssignee.objects.filter(user_id__in=[2,3,4]).union(GroupAssignee.objects.filter(team_id=1)).values_list("group_id", flat=True).query)

produces

(
    SELECT "sentry_groupasignee"."group_id" AS "col1" 
    FROM "sentry_groupasignee" 
    WHERE "sentry_groupasignee"."user_id" IN (2, 3, 4)
) 
UNION 
(
    SELECT "sentry_groupasignee"."group_id" AS "col1" 
    FROM "sentry_groupasignee" 
    WHERE "sentry_groupasignee"."team_id" = 1
)
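
For anyone who wants to poke at the two query shapes without a Django setup, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names mirror the SQL above, but the schema, the two index names, and the sample rows are assumptions made up for illustration. SQLite's planner is not Postgres's, so this only shows that the OR form and the UNION form return the same group ids; it says nothing about which plan Postgres chooses.

```python
import sqlite3

# Hypothetical, simplified stand-in for the GroupAssignee table; the real
# Sentry schema has more columns and constraints.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sentry_groupasignee (
        id INTEGER PRIMARY KEY,
        group_id INTEGER NOT NULL,
        user_id INTEGER,
        team_id INTEGER
    );
    -- Index names are illustrative assumptions.
    CREATE INDEX idx_assignee_user ON sentry_groupasignee (user_id);
    CREATE INDEX idx_assignee_team ON sentry_groupasignee (team_id);
    """
)
conn.executemany(
    "INSERT INTO sentry_groupasignee (group_id, user_id, team_id) VALUES (?, ?, ?)",
    [
        (10, 2, None),   # assigned to user 2
        (11, None, 1),   # assigned to team 1
        (12, 5, None),   # assigned to an unrelated user
        (13, None, 7),   # assigned to an unrelated team
        (14, 3, None),   # assigned to user 3
    ],
)

# Shape produced by combining the two filters with OR.
OR_QUERY = """
    SELECT group_id FROM sentry_groupasignee
    WHERE team_id = 1 OR user_id IN (2, 3, 4)
"""

# Shape produced by a UNION of two single-condition queries.
UNION_QUERY = """
    SELECT group_id FROM sentry_groupasignee WHERE user_id IN (2, 3, 4)
    UNION
    SELECT group_id FROM sentry_groupasignee WHERE team_id = 1
"""

or_groups = sorted(row[0] for row in conn.execute(OR_QUERY))
union_groups = sorted(row[0] for row in conn.execute(UNION_QUERY))

print(or_groups)     # [10, 11, 14]
print(union_groups)  # [10, 11, 14]
```

The first shape corresponds to what `qs1 | qs2` printed above, and the second to what `qs1.union(qs2)` printed. Each branch of the UNION filters on a single column, so each can be served by its own single-column index.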


ah, i thought .union and | were equivalent.


JoshFerge merged commit e6c31b5 into master on Apr 1, 2025
47 checks passed
JoshFerge deleted the msun/optimizeFilterToTeamQuery branch on Apr 1, 2025, 18:15
MichaelSun48 added a commit that referenced this pull request Apr 2, 2025
andrewshie-sentry pushed a commit that referenced this pull request Apr 8, 2025
…l table scan (#88393)

Co-authored-by: Josh Ferge
andrewshie-sentry pushed a commit that referenced this pull request Apr 8, 2025
3 participants