Skip to content

ESQL: Fix case insensitive comparisons to "" #127532

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Apr 30, 2025
Merged

Conversation

nik9000
Copy link
Member

@nik9000 nik9000 commented Apr 29, 2025

This fixes the compute engine side of case insensitive ==. You can trigger it by writing:

FROM foo
| WHERE TO_LOWER(field) == ""

But only when we can't push the comparison to lucene - like if field is not indexed or is a text field.

Closes #127431

This fixes the compute engine side of case insensitive ==. You can
trigger it by writing:
```
FROM foo
| WHERE TO_LOWER(field) == ""
```

But *only* when we can't push the comparison to lucene - like if `field`
is not indexed or is a `text` field.

Closes elastic#127431
@nik9000 nik9000 requested a review from luigidellaquila April 29, 2025 20:39
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Apr 29, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Copy link
Collaborator

Hi @nik9000, I've created a changelog YAML for you.

@@ -83,6 +85,10 @@ protected TypeResolution resolveType() {
}

public static Automaton automaton(BytesRef val) {
if (val.length == 0) {
// toCaseInsensitiveString doesn't match empty strings properly so let's do it ourselves
return Automata.makeEmptyString();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_search works around this as well with:

        // check if valueForSearch is the same as an empty string
        // if we have a length of zero, just do a regular term query
        if (valueForSearch.length == 0) {
            return termQuery(value, context);
        }

Which makes sense. And it's slightly faster. Here, instead of doing a term query I just make an automata for the empty string. We are already pushing a proper term query using the _search infrastructure when possible.

Copy link
Contributor

@luigidellaquila luigidellaquila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! LGTM

@nik9000
Copy link
Member Author

nik9000 commented Apr 30, 2025

Good catch! LGTM

It was all randomized testing!

@nik9000 nik9000 merged commit 4494fdc into elastic:main Apr 30, 2025
17 checks passed
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request May 28, 2025
This fixes the compute engine side of case insensitive ==. You can
trigger it by writing:
```
FROM foo
| WHERE TO_LOWER(field) == ""
```

But *only* when we can't push the comparison to lucene - like if `field`
is not indexed or is a `text` field.

Closes elastic#127431
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request May 28, 2025
This fixes the compute engine side of case insensitive ==. You can
trigger it by writing:
```
FROM foo
| WHERE TO_LOWER(field) == ""
```

But *only* when we can't push the comparison to lucene - like if `field`
is not indexed or is a `text` field.

Closes elastic#127431
@nik9000
Copy link
Member Author

nik9000 commented May 28, 2025

Backporting: #128578 #128580

elasticsearchmachine pushed a commit that referenced this pull request May 28, 2025
This fixes the compute engine side of case insensitive ==. You can
trigger it by writing:
```
FROM foo
| WHERE TO_LOWER(field) == ""
```

But *only* when we can't push the comparison to lucene - like if `field`
is not indexed or is a `text` field.

Closes #127431
elasticsearchmachine pushed a commit that referenced this pull request May 28, 2025
This fixes the compute engine side of case insensitive ==. You can
trigger it by writing:
```
FROM foo
| WHERE TO_LOWER(field) == ""
```

But *only* when we can't push the comparison to lucene - like if `field`
is not indexed or is a `text` field.

Closes #127431
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.19.0 v9.0.1 v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CI] PushQueriesIT testPushCaseInsensitiveEqualityOnDefaults failing
4 participants