Query cache interface implementation for timeseries & metrics views #1446

Merged
merged 30 commits into from
Dec 19, 2022

Conversation

egor-ryashin
Contributor

No description provided.

@egor-ryashin
Contributor Author

cc @AdityaHegde

@egor-ryashin egor-ryashin marked this pull request as ready for review December 14, 2022 17:19
@egor-ryashin
Contributor Author

I've added caching for metrics view queries. It's complete. cc @AdityaHegde

@egor-ryashin egor-ryashin force-pushed the query-cache-interface-impl-timeseries branch from 3110831 to 40c7c27 Compare December 15, 2022 14:22
@egor-ryashin egor-ryashin changed the title Query cache interface implementation for timeseries Query cache interface implementation for timeseries & metrics views Dec 15, 2022
Collaborator

@nishantmonu51 nishantmonu51 left a comment

Verified and tested the changes at a high level.
Performance-wise, most of the cache lookups resolve in ~10-15ms, which gives a very nice boost to query profiling.

Collaborator

@nishantmonu51 nishantmonu51 left a comment

Glanced through at a high level and tested the changes by running from the branch; overall, things seem to be working fine.
Left comments on a few methods that seem generic and might be pulled into utility classes; feel free to ignore those comments.
Didn't look into the detailed API logic itself, as I believe it is mostly a refactor of the current logic.

* Importantly, this function runs very fast. For more information about the original M4 method,
* see http://www.vldb.org/pvldb/vol7/p797-jugel.pdf
*/
func (q *ColumnTimeseries) createTimestampRollupReduction( // metadata: DatabaseMetadata,
Collaborator

Remove the commented-out metadata parameter?

return m
}

func convertToDateTruncSpecifier(specifier runtimev1.TimeGrain) string {
Collaborator

I wonder if we need a similar method in other APIs as well. If yes, we can move it to timeutil.

Contributor

Looks like it's only used for this particular query.

But shared SQL-building util functions (i.e. those used by multiple queries) should live in a central place. Maybe in queries/sqlutil.go?
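As a rough illustration of what such a shared helper could look like, here is a minimal sketch. The `TimeGrain` enum below is a hypothetical local stand-in for `runtimev1.TimeGrain`, and the function name `DateTruncSpecifier` and the specifier strings are assumptions, not the code from this PR.

```go
package main

import "fmt"

// TimeGrain is a hypothetical local enum standing in for runtimev1.TimeGrain.
type TimeGrain int

const (
	TimeGrainUnspecified TimeGrain = iota
	TimeGrainHour
	TimeGrainDay
	TimeGrainMonth
	TimeGrainYear
)

// DateTruncSpecifier maps a time grain to the string passed to date_trunc.
// Unknown grains are reported as errors instead of silently producing bad SQL.
func DateTruncSpecifier(g TimeGrain) (string, error) {
	switch g {
	case TimeGrainHour:
		return "hour", nil
	case TimeGrainDay:
		return "day", nil
	case TimeGrainMonth:
		return "month", nil
	case TimeGrainYear:
		return "year", nil
	default:
		return "", fmt.Errorf("unsupported time grain %d", g)
	}
}

func main() {
	spec, err := DateTruncSpecifier(TimeGrainDay)
	if err != nil {
		panic(err)
	}
	// The column name is illustrative only.
	fmt.Printf("date_trunc('%s', %s)\n", spec, `"created_at"`)
}
```

Keeping the mapping in one place means every query that truncates timestamps would agree on the same specifiers.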


// valToPB converts any value to a google.protobuf.Value. It's similar to
// structpb.NewValue, but adds support for a few extra primitive types.
func valToPB(v any) (*structpb.Value, error) {
Collaborator

Looks like this belongs in a protobuf util package, where it might be reused?

Contributor

Yes, this is now replicated here and in runtime/server/query.go. Maybe put it (and related functions) in runtime/server/pbutil as ToValue(v any) (*structpb.Value, error)?

return res
}

func protobufValueToAny(val *structpb.Value) (any, error) {
Collaborator

Move to a util package?

Contributor

Related to my comment about valToPB, this could become pbutil.FromValue(v *structpb.Value) (any, error)

Contributor

@begelundmuller begelundmuller left a comment

Only the questions around "ts" and tsAlias are important; the rest of the suggestions are nits. Once we address "ts", we can merge and fix the other comments in a follow-up PR.

In general, I found the logic in column_timeseries.go pretty intricate. It would be good to improve the structure and commenting in there to make it easier to understand. But I also know parts of it can be merged with the metrics views logic, so let's do that first.

@@ -99,7 +99,7 @@ type SamplePolicy struct {
 func (s *Source) Validate() error {
 	connector, ok := Connectors[s.Connector]
 	if !ok {
-		return fmt.Errorf("connector: not found")
+		return fmt.Errorf("connector: not found " + s.Connector)
Contributor

Consider doing return fmt.Errorf("connector: not found %q", s.Connector) (will insert value in quotes, so empty strings are clearer in output)
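To make the suggestion concrete, here is a small sketch of the `%q` form. The `connectors` map and `validateConnector` function are hypothetical stand-ins for the real `Connectors` registry and `Validate` method. Besides quoting, passing the name as an argument also avoids treating a `%` inside the connector name as a format verb, which the concatenated version would do.

```go
package main

import "fmt"

// connectors is a hypothetical registry standing in for the Connectors map.
var connectors = map[string]struct{}{"s3": {}, "gcs": {}}

// validateConnector returns an error that quotes the unknown connector name,
// so an empty name shows up as "" rather than as trailing whitespace.
func validateConnector(name string) error {
	if _, ok := connectors[name]; !ok {
		return fmt.Errorf("connector: not found %q", name)
	}
	return nil
}

func main() {
	fmt.Println(validateConnector(""))      // connector: not found ""
	fmt.Println(validateConnector("mysql")) // connector: not found "mysql"
	fmt.Println(validateConnector("s3"))    // <nil>
}
```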

func (q *ColumnTimeseries) Key() string {
r, err := json.Marshal(q)
if err != nil {
panic(fmt.Errorf("ColumnTimeseries: failed to marshal: %w", err))
Contributor

no need to change, just fyi: panics have stack traces and should mainly be dev errors (any error that's expected in production should be handled as an error object, e.g. returned to the user), so it's usually fine to just do panic(err)
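For readers following along, a cache key derived from the JSON encoding of the query might look roughly like the sketch below. The `columnTimeseries` struct is a trimmed-down, hypothetical version of the real query type, and the `"ColumnTimeseries:"` key prefix and the map-based cache are assumptions for illustration only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// columnTimeseries is a hypothetical, trimmed-down stand-in for the query struct.
type columnTimeseries struct {
	TableName           string   `json:"table_name"`
	TimestampColumnName string   `json:"timestamp_column_name"`
	Measures            []string `json:"measures"`
}

// Key serializes the query parameters into a cache key. Struct fields are
// encoded in declaration order, so equal queries produce equal keys.
func (q *columnTimeseries) Key() string {
	r, err := json.Marshal(q)
	if err != nil {
		// A marshalling failure here would be a developer error, so a plain
		// panic(err) with its stack trace is enough.
		panic(err)
	}
	return "ColumnTimeseries:" + string(r)
}

func main() {
	cache := map[string]string{} // stand-in for the real query cache

	q := &columnTimeseries{TableName: "events", TimestampColumnName: "created_at", Measures: []string{"count"}}
	cache[q.Key()] = "cached result"

	same := &columnTimeseries{TableName: "events", TimestampColumnName: "created_at", Measures: []string{"count"}}
	_, hit := cache[same.Key()]
	fmt.Println("cache hit:", hit)
}
```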

return fmt.Errorf("not available for dialect '%s'", olap.Dialect())
}

timeRange, err := q.normaliseTimeRange(ctx, rt, instanceID, priority)
Contributor

normaliseTimeRange sounds like a util function, but it actually runs queries. Consider resolveNormalizedTimeRange or something like that.

Contributor Author

It's subjective.

Comment on lines 73 to 74
}
var measures = normaliseMeasures(q.Measures)
Contributor

nit: newline

Comment on lines +75 to +76
var timestampColumn = q.TimestampColumnName
var tableName = q.TableName
Contributor

Remove these (e.g. just use q.TableName directly)

Comment on lines +522 to +533
switch x := v.(type) {
case int32:
value.Records[k] = float64(x)
case int64:
value.Records[k] = float64(x)
case float32:
value.Records[k] = float64(x)
case float64:
value.Records[k] = x
default:
return nil, fmt.Errorf("unknown type %T ", v)
}
Contributor

This cast doesn't feel safe, since any measure expression can be passed. For now, we can add *big.Int case and we'll probably be safe. But when merging this query with the metrics view code, this should be addressed more generally.

Contributor Author

I guess that requires changes on the UI side.

Contributor

We would need to cast *big.Int to a float for the UI (and the result would not be exact if the value exceeds what a float can represent exactly). It's bad practice, but that's what the frontend currently wants (we do it in other places as well).

There are also int8, int16, uint8, uint16, and maybe other measure result types I'm not thinking of.
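One way the broader conversion could be sketched is shown below. The helper names `toFloat64` and `bigFromString` are hypothetical, and the list of cases is an assumption about which types a measure might return, not an exhaustive answer. Large integers are converted lossily, which matches the trade-off described above.

```go
package main

import (
	"fmt"
	"math/big"
)

// toFloat64 converts common numeric result types to float64.
// Values beyond float64's exact integer range lose precision.
func toFloat64(v any) (float64, error) {
	switch x := v.(type) {
	case int:
		return float64(x), nil
	case int8:
		return float64(x), nil
	case int16:
		return float64(x), nil
	case int32:
		return float64(x), nil
	case int64:
		return float64(x), nil
	case uint8:
		return float64(x), nil
	case uint16:
		return float64(x), nil
	case uint32:
		return float64(x), nil
	case uint64:
		return float64(x), nil
	case float32:
		return float64(x), nil
	case float64:
		return x, nil
	case *big.Int:
		if x == nil {
			return 0, fmt.Errorf("nil *big.Int")
		}
		// The accuracy result is ignored on purpose: the caller accepts rounding.
		f, _ := new(big.Float).SetInt(x).Float64()
		return f, nil
	default:
		return 0, fmt.Errorf("unknown type %T", v)
	}
}

// bigFromString is a small helper for building *big.Int values in examples.
func bigFromString(s string) *big.Int {
	n, ok := new(big.Int).SetString(s, 10)
	if !ok {
		panic("invalid integer literal: " + s)
	}
	return n
}

func main() {
	values := []any{int8(-3), uint16(7), float32(1.5), bigFromString("18446744073709551616"), "oops"}
	for _, v := range values {
		f, err := toFloat64(v)
		fmt.Println(f, err)
	}
}
```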


-- does not have that value.
SELECT
` + getCoalesceStatementsMeasures(measures) + `,
template.` + tsAlias + ` as ts from template
Contributor

Is it safe to select this using as ts given that a measure might be named ts? Shouldn't it also use tsAlias?

Contributor Author

Yes, it will be safer.

Comment on lines 518 to 519
value.Ts = row["ts"].(time.Time).Format(IsoFormat)
delete(row, "ts")
Contributor

Shouldn't this pass in and use tsAlias?

Contributor Author

No, because in the final output we rename tsAlias to ts; please see:

        -- does not have that value.
        SELECT 
          ` + getCoalesceStatementsMeasures(measures) + `,
          template.` + tsAlias + ` as ts from template
        LEFT OUTER JOIN series ON template.` + tsAlias + ` = series.` + tsAlias + `
        ORDER BY template.` + tsAlias + `

Contributor

What if getCoalesceStatementsMeasures contains a measure named ts?
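The collision being asked about can be sketched as follows. This is only an illustration of the concern, not the fix that was merged: `uniqueAlias` is a hypothetical helper, and how the real `tsAlias` is generated is not shown in this thread. The idea is to pick a timestamp alias that cannot clash with any measure name and to use that same alias both in the SQL and when reading the row.

```go
package main

import (
	"fmt"
	"strings"
)

// uniqueAlias returns base, or base with underscores appended, so that the
// result does not match any taken name (compared case-insensitively).
func uniqueAlias(base string, taken []string) string {
	used := make(map[string]bool, len(taken))
	for _, name := range taken {
		used[strings.ToLower(name)] = true
	}
	alias := base
	for used[strings.ToLower(alias)] {
		alias += "_"
	}
	return alias
}

func main() {
	measures := []string{"ts", "count"} // a measure is literally named "ts"
	alias := uniqueAlias("ts", measures)

	// With a fixed "ts" output column, the measure and the timestamp would
	// share one key in the row map. With a unique alias they stay separate.
	row := map[string]any{
		alias:   "2022-12-19T00:00:00Z",
		"ts":    42.0,
		"count": 7.0,
	}
	fmt.Println("timestamp column:", alias)
	fmt.Println("timestamp value:", row[alias])
	fmt.Println("measure ts:", row["ts"])
}
```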

@egor-ryashin egor-ryashin merged commit 3584a6f into main Dec 19, 2022
@egor-ryashin egor-ryashin deleted the query-cache-interface-impl-timeseries branch December 19, 2022 11:54
@begelundmuller begelundmuller mentioned this pull request Dec 19, 2022
begelundmuller pushed a commit that referenced this pull request Dec 22, 2022
* addressing review comments

* addressing review comments

* addressing review comments

* addressing review comments

* addressing review comments

* addressing review comments

* addressing review comments

* addressing review comments

* merge fix

* addressing review comments

* addressing review comments

* addressing review comments

* addressing review comments

* addressing review comments

* addressing review comments

* addressing review comments

* addressing review comments

Co-authored-by: egor-ryashin <[email protected]>
djbarnwal pushed a commit that referenced this pull request Aug 3, 2023
…1446)


* caching: metricsview timeseries

Co-authored-by: egor-ryashin <[email protected]>
djbarnwal pushed a commit that referenced this pull request Aug 3, 2023
* addressing review comments

* addressing review comments

* addressing review comments

* addressing review comments

* addressing review comments

* addressing review comments

* addressing review comments

* addressing review comments

* merge fix

* addressing review comments

* addressing review comments

* addressing review comments

* addressing review comments

* addressing review comments

* addressing review comments

* addressing review comments

* addressing review comments

Co-authored-by: egor-ryashin <[email protected]>
3 participants