[ES|QL] Support some stats on aggregate_metric_double #120343
a3123a7
to
c12cf21
Compare
Adds support for min, max, sum, and count
c12cf21
to
ae16694
Compare
Thanks Larisa, this looks good! I left a few comments.
About the failing yamlRestCompatTest test suite, the following error is returned by the aggregate double metric field mapper: Must have all subfields to use aggregate double metric in ESQL
The yamlRestCompatTest test suite runs 8.x versions of the same yaml test against current main in this branch. This error is now returned because previously only the min and max metrics were configured to be stored. This fails the assumption in the aggregate double metric field mapper. Maybe be less strict here (see comment in AggregateDoubleMetricFieldMapper)?
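One possible reading of "less strict" is to tolerate missing subfields and surface them as nulls instead of failing. The sketch below only illustrates that idea; `SubfieldKind` and `LenientSubfields` are hypothetical names and do not reflect the actual code in AggregateDoubleMetricFieldMapper.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical enum of the four subfields; not the Elasticsearch Metric type.
enum SubfieldKind { MIN, MAX, SUM, COUNT }

final class LenientSubfields {
    private LenientSubfields() {}

    // Returns an entry for every subfield. Subfields that were not configured
    // (and therefore not stored) map to null instead of triggering an error.
    static Map<SubfieldKind, Double> read(Map<SubfieldKind, Double> stored) {
        Map<SubfieldKind, Double> out = new EnumMap<>(SubfieldKind.class);
        for (SubfieldKind kind : SubfieldKind.values()) {
            out.put(kind, stored.get(kind)); // null when the subfield is absent
        }
        return out;
    }
}
```

With a mapping that stores only min and max, `read` would still return four entries, with null for sum and count.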
x-pack/plugin/esql-core/src/main/java/org/elasticsearch/xpack/esql/core/type/DataType.java
.../compute/src/main/java/org/elasticsearch/compute/data/AggregateDoubleMetricBlockBuilder.java
...gin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/aggregate/Count.java
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/planner/AggregateMapper.java
x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/data/CompositeBlock.java
...g/elasticsearch/xpack/esql/expression/function/scalar/convert/FromAggregateDoubleMetric.java
...in/java/org/elasticsearch/xpack/aggregatemetric/mapper/AggregateDoubleMetricFieldMapper.java
I've left some comments, but the approach looks good. Thanks Larisa!
x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/data/CompositeBlock.java
.../compute/src/main/java/org/elasticsearch/compute/data/AggregateDoubleMetricBlockBuilder.java
...g/elasticsearch/xpack/esql/expression/function/scalar/convert/FromAggregateDoubleMetric.java
...in/java/org/elasticsearch/xpack/aggregatemetric/mapper/AggregateDoubleMetricFieldMapper.java
0064bc5
to
a5be73b
Compare
I left some comments and questions. Thanks Larisa for iterating on this.
.../compute/src/main/java/org/elasticsearch/compute/data/AggregateDoubleMetricBlockBuilder.java
x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/data/CompositeBlock.java
@@ -233,6 +233,17 @@ private static Block constantBlock(BlockFactory blockFactory, ElementType type,
    case BYTES_REF -> blockFactory.newConstantBytesRefBlockWith(toBytesRef(val), size);
    case DOUBLE -> blockFactory.newConstantDoubleBlockWith((double) val, size);
    case BOOLEAN -> blockFactory.newConstantBooleanBlockWith((boolean) val, size);
    case COMPOSITE -> {
Hmm, a composite type can be more than aggregated_metric_double? Can we just leave it unsupported here?
I had to add this to be able to support the unit tests and wasn't really sure of a way to work around it
If composite can be more than just aggregated_metric_double, then should aggregated_metric_double have its own element type?
I'm not sure it really needs its own type. Maybe there's something funny around constants though?
Maybe we instanceof on the val and do an AggregateMetricDouble if it's one of those constants. Again, this feels like it's the kind of thing we'd use for just tests and ROW. Which is ok.
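The instanceof idea above could look roughly like the following. This is a self-contained sketch, not the PR's code: `MetricLiteral`, `ConstBlock`, `DoubleConstBlock`, `MetricConstBlock`, and `ConstantBlocks` are hypothetical stand-ins for the real block and literal classes.

```java
// Hypothetical stand-ins for illustration; these are not the Elasticsearch classes.
record MetricLiteral(Double min, Double max, Double sum, Integer count) {}

interface ConstBlock {}

record DoubleConstBlock(double value, int size) implements ConstBlock {}

record MetricConstBlock(MetricLiteral value, int size) implements ConstBlock {}

final class ConstantBlocks {
    private ConstantBlocks() {}

    // Dispatch on the runtime type of the value rather than on a generic
    // "composite" element type, so only known literals are accepted.
    static ConstBlock constantBlock(Object val, int size) {
        if (val instanceof MetricLiteral literal) {
            return new MetricConstBlock(literal, size);
        }
        if (val instanceof Double d) {
            return new DoubleConstBlock(d, size);
        }
        throw new UnsupportedOperationException("can't build a constant block for [" + val + "]");
    }
}
```

Checking the value's type keeps other composite values unsupported, which matches the concern that a composite type may be more than an aggregate metric double.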
package org.elasticsearch.compute.data;

public class AggregateMetricDoubleLiteral {
Did we introduce this because of tests?
Yeah, there's a couple of other places in tests where the need for something like this also popped up in the first attempt at implementing aggregate_metric_double
Maybe I'm misreading, but is this only needed for tests? If so should this be moved to the test sources?
Yeah it's only used in the tests so far but one function that required it is outside of the test code
That code was originally for ROW but we've used it more in tests since. I think, conceptually at least, we could use this thing for ROW support for aggregate metric double. It's a lot more convenient than a Map representation or something.
...sticsearch/xpack/esql/expression/function/scalar/convert/FromAggregateDoubleMetricTests.java
x-pack/plugin/src/yamlRestTest/resources/rest-api-spec/test/esql/40_unsupported_types.yml
@@ -501,4 +503,10 @@ interface SingletonOrdinalsBuilder extends Builder {
         */
        SingletonOrdinalsBuilder appendOrd(int value);
    }

    interface AggregateMetricDoubleBuilder extends Builder {
Interesting! This is not something in server but we have to add it here for this. Huh. I suppose that's ok.
server/src/main/java/org/elasticsearch/index/mapper/BlockLoader.java
.../esql/compute/src/main/java/org/elasticsearch/compute/data/AggregateMetricDoubleLiteral.java
- case UNSUPPORTED, OBJECT, DOC_DATA_TYPE, TSID_DATA_TYPE, PARTIAL_AGG -> throw new IllegalArgumentException(
-     "can't make random values for [" + type.typeName() + "]"
- );
+ case UNSUPPORTED, OBJECT, DOC_DATA_TYPE, TSID_DATA_TYPE, PARTIAL_AGG, AGGREGATE_METRIC_DOUBLE ->
At some point we'll have to be able to make random AggregateMetricDoubles. But later is fine.
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/PositionToXContent.java
...g/elasticsearch/xpack/esql/expression/function/scalar/convert/FromAggregateMetricDouble.java
server/src/main/java/org/elasticsearch/index/mapper/BlockLoader.java
...in/java/org/elasticsearch/xpack/aggregatemetric/mapper/AggregateDoubleMetricFieldMapper.java
I'm happy when the others are happy.
@@ -141,6 +144,9 @@ protected TypeResolution resolveType() {
    public Expression surrogate() {
        var s = source();
        var field = field();
        if (field.dataType() == DataType.AGGREGATE_METRIC_DOUBLE) {
            return new Sum(s, FromAggregateMetricDouble.withMetric(source(), field, AggregateMetricDoubleBlockBuilder.Metric.COUNT));
This is a really clear explanation of what this field is: COUNT(aggregate_metric_double) is a SUM of the preaggregated counts. It's really quite educational.
It does ask "should we have methods to pick apart the sub-fields?" like GET_COUNT(aggregate_metric_double). Or something. Not now, but eventually? Like, if we want to treat the field like it's just a container of numbers.
Which is odd. We don't have container type fields until this one.
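To make the "COUNT is a SUM of the preaggregated counts" point concrete, here is a small standalone sketch. `PreAggregated` and `MetricStats` are hypothetical names used only for illustration; the real implementation works on blocks through the surrogate expression shown in the diff.

```java
import java.util.List;
import java.util.OptionalDouble;

// Hypothetical stand-in for one pre-aggregated document value.
record PreAggregated(double min, double max, double sum, int count) {}

final class MetricStats {
    private MetricStats() {}

    // COUNT adds up the stored counts; it is not the number of documents.
    static long count(List<PreAggregated> docs) {
        return docs.stream().mapToLong(PreAggregated::count).sum();
    }

    // SUM adds up the stored sums.
    static double sum(List<PreAggregated> docs) {
        return docs.stream().mapToDouble(PreAggregated::sum).sum();
    }

    // MIN is the smallest stored min; empty when there are no documents.
    static OptionalDouble min(List<PreAggregated> docs) {
        return docs.stream().mapToDouble(PreAggregated::min).min();
    }

    // MAX is the largest stored max; empty when there are no documents.
    static OptionalDouble max(List<PreAggregated> docs) {
        return docs.stream().mapToDouble(PreAggregated::max).max();
    }
}
```

For two documents with stored counts 3 and 5, `count` returns 8 even though only two documents exist.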
Wait, you already wrote it and called it FromAggregateMetricDouble. Of course! Do we plug it in? It looks like not. That's fine and good. But we should talk about if it's appropriate to expose it in the language as a function one day.
We plugged it in, but it is not yet exposed at the language level. I think we should consider exposing this method so that users can retrieve values from aggregate_metric_double individually, as we currently don't have a good way to return all values at once. Let's address this in a follow-up.
@@ -372,7 +372,7 @@ private PhysicalOperation planTopN(TopNExec topNExec, LocalExecutionPlannerConte
  case GEO_POINT, CARTESIAN_POINT, GEO_SHAPE, CARTESIAN_SHAPE, COUNTER_LONG, COUNTER_INTEGER, COUNTER_DOUBLE, SOURCE ->
      TopNEncoder.DEFAULT_UNSORTABLE;
  // unsupported fields are encoded as BytesRef, we'll use the same encoder; all values should be null at this point
- case PARTIAL_AGG, UNSUPPORTED -> TopNEncoder.UNSUPPORTED;
+ case PARTIAL_AGG, UNSUPPORTED, AGGREGATE_METRIC_DOUBLE -> TopNEncoder.UNSUPPORTED;
This means you can't SORT if one of these is used later. That's fine for now, but may not be appropriate later.
Yeah I think that's something we want to support later on/not in this PR and might require some discussion since I feel like it's not clear what we would sort on exactly
@@ -288,7 +291,7 @@ public AggregateDoubleMetricFieldType(String name) {
  }

  public AggregateDoubleMetricFieldType(String name, Map<String, String> meta, MetricType metricType) {
-     super(name, true, false, false, TextSearchInfo.SIMPLE_MATCH_WITHOUT_TERMS, meta);
+     super(name, true, false, true, TextSearchInfo.SIMPLE_MATCH_WITHOUT_TERMS, meta);
Were we just incorrectly saying "no, this doesn't have doc values" before? That feels silly.
    }
}

private void copyDoubleValuesToBuilder(Docs docs, BlockLoader.DoubleBuilder builder, NumericDocValues values)
👍
I suppose you could reuse one of the other builders somehow maybe? Like the one that numerics uses for doubles. It has code just like this, right? Too tricky to share?
Great work. Thanks Larisa!
LGTM
    private final Double sum;
    private final Integer count;

    public AggregateMetricDoubleLiteral(Double min, Double max, Double sum, Integer count) {
I think this can be turned into a record?
Yeah I think it can; I just did that and moved it to be under AggregateMetricDoubleBlockBuilder
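For reference, a record form of the class could be as short as the sketch below. The component names and types mirror the constructor shown in the diff above; the final nested record under AggregateMetricDoubleBlockBuilder may differ, so treat this as an assumption rather than the merged code.

```java
// Sketch only: components mirror the constructor in the diff; the merged record may differ.
// A record generates the constructor, accessors, equals, hashCode, and toString.
record AggregateMetricDoubleLiteral(Double min, Double max, Double sum, Integer count) {}
```

Boxed types are kept here, as in the original constructor, so that an individual metric can be null.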
💔 Backport failed
You can use sqren/backport to manually backport by running |
Adds non-grouping support for min, max, sum, and count, using CompositeBlock as the underlying block type and an internal FromAggregateMetricDouble function to handle converting from CompositeBlock to the correct metric subfields. Closes elastic#110649
@@ -212,6 +212,9 @@ private static Stream<AggDef> groupingAndNonGrouping(Tuple<Class<?>, Tuple<Strin
    if (tuple.v1().isAssignableFrom(Rate.class)) {
        // rate doesn't support non-grouping aggregations
        return Stream.of(new AggDef(tuple.v1(), tuple.v2().v1(), tuple.v2().v2(), true));
    } else if (tuple.v2().v1().equals("AggregateMetricDouble")) {
I am a bit puzzled by this condition.
I think tuple.v2().v1() (related pr: #121542) corresponds to extra configs. I do not think they are ever set to AggregateMetricDouble.
I also do not think we enter this branch in CsvTests.
Could you please help me understand when this is happening?
Okay, it looks like we define case AGGREGATE_METRIC_DOUBLE -> "AggregateMetricDouble"; below, but I do not think dataTypeToString is called when building AggDef here
You're correct we don't enter this branch; this PR was a 2nd/3rd iteration on adding aggregate metric double to ES|QL and it was necessary in an older iteration, and I mistakenly left it in in this PR. I plan to remove this in the next phase/PR
Thank you!