feat: Add support for percent of totals in MetricsViewAggregation #5348

Merged
AdityaHegde merged 9 commits into main from adityahegde/percent-of-totals-backend on Aug 12, 2024

Conversation

AdityaHegde
Collaborator

@AdityaHegde AdityaHegde commented Jul 30, 2024

To support filter criteria based on percent of totals in alerts, we need to add percent of totals as a variant of a measure.

This PR adds that support to MetricsViewAggregation by calculating the totals and rewriting each percent-of-total measure as measure/<calculated_total>.
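
As a rough illustration of the arithmetic only (not the PR's implementation): once the total for a measure is known, each row's percent of total is value*100/total. The `row` type, the `percentOfTotal` helper and the sample values below are made up for this example.

```go
package main

import "fmt"

// row is a hypothetical result row: one dimension value and one measure value.
type row struct {
	Dim   string
	Value float64
}

// percentOfTotal returns value*100/total for each row, keyed by dimension value.
// The total is passed in rather than summed from the rows, because it is
// assumed to come from a separate totals query.
func percentOfTotal(rows []row, total float64) map[string]float64 {
	out := make(map[string]float64, len(rows))
	for _, r := range rows {
		out[r.Dim] = r.Value * 100 / total
	}
	return out
}

func main() {
	rows := []row{{"US", 50}, {"IN", 30}, {"DK", 20}}
	pct := percentOfTotal(rows, 200)
	for _, r := range rows {
		fmt.Printf("%s: %.1f%%\n", r.Dim, pct[r.Dim])
	}
}
```

The total is deliberately an input here: the visible rows may be filtered or limited, so summing them would not necessarily give the same number as a dedicated totals query.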

@@ -389,6 +389,7 @@ message MetricsViewAggregationMeasure {
   MetricsViewAggregationMeasureComputeComparisonValue comparison_value = 7;
   MetricsViewAggregationMeasureComputeComparisonDelta comparison_delta = 8;
   MetricsViewAggregationMeasureComputeComparisonRatio comparison_ratio = 9;
+  MetricsViewAggregationMeasurePercentOfTotal percent_of_total = 10;
Contributor

nit: put "compute" in the name like for the other computed measure types: MetricsViewAggregationMeasureComputePercentOfTotal

@AdityaHegde AdityaHegde marked this pull request as draft July 30, 2024 13:23
@begelundmuller
Contributor

On reflection, I think this might be simpler to solve with an additional query.

I think there might actually be a simple way to achieve that. The new metricsview package supports Query-level and AST-level rewrites. This might be achieved with a Query-level rewrite that looks for percent-of-total measures, and if it finds any, computes the total and rewrites the compute measure to contain the actual float value to divide by.
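
A minimal sketch of what such a query-level rewrite could look like. All types here (`Query`, `Measure`, `PercentOfTotal`) and the `totalsFunc` hook are simplified, hypothetical stand-ins, not the real metricsview package API; the point is only the shape: scan for percent-of-total measures, fetch the totals once, and store the concrete value on each measure.

```go
package main

import (
	"errors"
	"fmt"
)

// PercentOfTotal, Measure and Query are simplified, hypothetical stand-ins
// for the real metricsview types.
type PercentOfTotal struct {
	Measure string   // name of the base measure
	Total   *float64 // nil until the rewrite fills it in
}

type Measure struct {
	Name           string
	PercentOfTotal *PercentOfTotal // nil for plain measures
}

type Query struct {
	Measures []Measure
}

// totalsFunc is a hypothetical hook that runs the extra totals query and
// returns one total per requested base measure.
type totalsFunc func(measures []string) (map[string]float64, error)

// rewritePercentOfTotals looks for percent-of-total measures and, if it finds
// any, fetches the totals once and stores the concrete value on each measure.
func rewritePercentOfTotals(q *Query, totals totalsFunc) error {
	var names []string
	for _, m := range q.Measures {
		if m.PercentOfTotal != nil {
			names = append(names, m.PercentOfTotal.Measure)
		}
	}
	if len(names) == 0 {
		return nil // nothing to rewrite, so no extra query is made
	}

	res, err := totals(names)
	if err != nil {
		return err
	}
	for i := range q.Measures {
		pt := q.Measures[i].PercentOfTotal
		if pt == nil {
			continue
		}
		t, ok := res[pt.Measure]
		if !ok {
			return errors.New("missing total for measure " + pt.Measure)
		}
		pt.Total = &t
	}
	return nil
}

func main() {
	q := &Query{Measures: []Measure{
		{Name: "revenue"},
		{Name: "revenue_pct", PercentOfTotal: &PercentOfTotal{Measure: "revenue"}},
	}}
	// fake stands in for the separate totals query.
	fake := func([]string) (map[string]float64, error) {
		return map[string]float64{"revenue": 200}, nil
	}
	if err := rewritePercentOfTotals(q, fake); err != nil {
		panic(err)
	}
	fmt.Println(*q.Measures[1].PercentOfTotal.Total)
}
```

Returning early when no percent-of-total measure is present keeps the common case free of the extra query.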

For inspiration, you can see this rewriter which turns relative time ranges into fixed time ranges, sometimes making a separate query to get the min/max timestamps of the metrics view:

// rewriteQueryTimeRanges rewrites the time ranges in the query to fixed start/end timestamps.
func (e *Executor) rewriteQueryTimeRanges(ctx context.Context, qry *Query, executionTime *time.Time) error {
tz := time.UTC

@AdityaHegde AdityaHegde marked this pull request as ready for review August 9, 2024 06:32
@AdityaHegde AdityaHegde force-pushed the adityahegde/percent-of-totals-backend branch from d72a32b to 87d22a5 Compare August 9, 2024 06:39
Comment on lines 383 to 384
Expression: fmt.Sprintf("%s*100/%f", a.dialect.EscapeIdentifier(m.Name), qm.Compute.PercentOfTotal.Total),
Type: runtimev1.MetricsViewSpec_MEASURE_TYPE_DERIVED,
Contributor

  1. I wonder if we could make it simple and inject the expression directly? I.e. use something like fmt.Sprintf("(%s)/%f", m.Expression, qm.Compute.PercentOfTotal.Total)? I think it would work even with subquery measures, at least select (select 100)/10 works in both DuckDB and Druid.
  2. Not sure, but maybe the %f should be %#f, which always adds a decimal point? Otherwise, if both operands are whole numbers, could it end up doing integer division? (Not sure about the SQL semantics here.)

Collaborator Author

  1. We are calculating the measure value itself in a subquery since it might be used for comparison columns as well. So adding the expression here will recompute it, won't it?
  2. Updated to be safer.

Contributor

Okay, if you think it will mostly be queried in requests that also query the plain measure value, then I agree the derived approach is better.

Collaborator Author

Ya. I would assume all instances of totals in the dashboard right now can be replaced with this.
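
The two options discussed in this thread can be sketched side by side. This is an illustration only: `escapeIdentifier` is a hypothetical stand-in for the dialect's escaping, and the measure name and expression are invented. The derived form refers to the already-computed measure column by name, while the inline form repeats the measure's expression (shown here with the *100 factor so both produce a percentage).

```go
package main

import (
	"fmt"
	"strings"
)

// escapeIdentifier is a hypothetical stand-in for a dialect's identifier
// escaping; it double-quotes the name and doubles any embedded quotes.
func escapeIdentifier(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

// derivedExpr references the already-computed measure column by name, so the
// measure itself is evaluated once in the inner query.
func derivedExpr(name string, total float64) string {
	return fmt.Sprintf("%s*100/%#f", escapeIdentifier(name), total)
}

// inlineExpr repeats the measure's own expression, so it is evaluated again
// wherever it is injected.
func inlineExpr(expr string, total float64) string {
	return fmt.Sprintf("(%s)*100/%#f", expr, total)
}

func main() {
	fmt.Println(derivedExpr("revenue", 200))   // "revenue"*100/200.000000
	fmt.Println(inlineExpr("sum(price)", 200)) // (sum(price))*100/200.000000
}
```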

runtime/metricsview/executor.go (outdated, resolved)
runtime/metricsview/ast.go (outdated, resolved)
runtime/metricsview/query.go (outdated, resolved)
runtime/metricsview/executor.go (outdated, resolved)
Comment on lines 30 to 48
totalsQry := &Query{
	MetricsView: qry.MetricsView,
	Measures:    measures,
	TimeRange:   qry.TimeRange,
	Where:       qry.Where,
	TimeZone:    qry.TimeZone,
}

e, err := NewExecutor(ctx, e.rt, e.instanceID, mv, sec, e.priority)
if err != nil {
	return err
}
defer e.Close()

res, err := e.Query(ctx, totalsQry, nil)
if err != nil {
	return err
}
defer res.Close()
Contributor

The exactify rewriter does a couple things differently that I think we should also do here:

  1. Explicitly set all members of the Query{} struct to reduce the chance of forgetting to handle one if it's changed in the future.
  2. Explicitly call NewAST and e.olap.Execute to a) prevent any chance of infinite recursion, and b) only apply the rewrites that make sense in this context (I think that would be only rewriteApproxComparisons and rewriteDruidGroups).
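
A hedged sketch of both suggestions, using invented types. `Query`, `rewrite`, `totalsQuery` and `applyRewrites` are hypothetical and much smaller than the real code, and the actual NewAST and e.olap.Execute calls are not modeled. It shows the inner totals query naming every field, and only an explicit list of rewrites being applied to it, so the percent-of-totals rewrite cannot end up invoking itself.

```go
package main

import "fmt"

// Query is a hypothetical, trimmed-down query type.
type Query struct {
	MetricsView string
	Dimensions  []string
	Measures    []string
	Where       string
	Limit       int
}

// totalsQuery builds the inner query and names every field, including the
// ones left at their zero value, so that a field added to Query later is
// more likely to be noticed here. Go does not enforce this; it is a convention.
func totalsQuery(q *Query, measures []string) *Query {
	return &Query{
		MetricsView: q.MetricsView,
		Dimensions:  nil, // totals are not grouped by any dimension
		Measures:    measures,
		Where:       q.Where,
		Limit:       0, // no limit on the single totals row
	}
}

// rewrite is a hypothetical named rewrite step.
type rewrite struct {
	name string
	fn   func(*Query) error
}

// applyRewrites runs exactly the rewrites it is given, in order, and reports
// which ones ran.
func applyRewrites(q *Query, steps ...rewrite) ([]string, error) {
	var ran []string
	for _, s := range steps {
		if err := s.fn(q); err != nil {
			return ran, fmt.Errorf("%s: %w", s.name, err)
		}
		ran = append(ran, s.name)
	}
	return ran, nil
}

func main() {
	noop := func(*Query) error { return nil }
	approx := rewrite{name: "rewriteApproxComparisons", fn: noop}
	druid := rewrite{name: "rewriteDruidGroups", fn: noop}

	outer := &Query{
		MetricsView: "ads",
		Dimensions:  []string{"country"},
		Measures:    []string{"revenue"},
		Where:       "country != 'XX'",
		Limit:       10,
	}
	inner := totalsQuery(outer, []string{"revenue"})

	// Only the two rewrites named in the review are applied to the inner
	// query. The percent-of-totals rewrite is left out of the list.
	ran, err := applyRewrites(inner, approx, druid)
	if err != nil {
		panic(err)
	}
	fmt.Println(ran, len(inner.Dimensions), inner.Where)
}
```

The cost, as raised in the reply that follows, is that each call site carries its own list of rewrites and has to be updated by hand when a new one is added.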

Collaborator Author

Will this not add some complexity? Any new rewrite needs to be added in multiple places, and the author needs to be aware of it. Not sure if it can be solved as is right now.

Contributor

Yeah, it's a little annoying, but I think it's best to explicitly add the rewrites each time at this point, both to avoid recursion dangers and because the rewrites are a very mixed bag (some require a specific order of other rewrites, some are expensive to run, some don't make sense in the context of certain calls).

If we end up adding more rewrites, it may make sense to refactor how they're configured to avoid the repetition.

runtime/metricsview/executor_rewrite_percent_of_totals.go (outdated, resolved)
@AdityaHegde AdityaHegde force-pushed the adityahegde/percent-of-totals-backend branch from 70f31ac to 5dd3ae8 Compare August 9, 2024 12:35
@AdityaHegde AdityaHegde merged commit acf8195 into main Aug 12, 2024
7 checks passed
@AdityaHegde AdityaHegde deleted the adityahegde/percent-of-totals-backend branch August 12, 2024 05:28