Apply default k for knn query eagerly #118774
Pinging @elastic/es-search-relevance (Team:Search Relevance) |
Hi @benwtrent, I've created a changelog YAML for you. |
LGTM
Looks good!
I think we need to change the docs as well, to something like:
`k`
(Optional, integer) The number of nearest neighbors to return from each shard. Elasticsearch collects `k` results from each shard, then merges them to find the global top results. This value must be less than or equal to `num_candidates`. Defaults to the search request `size`.
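To make the proposed wording concrete, here is a toy sketch of "collect `k` per shard, then merge". It is not Elasticsearch code: the function names `shard_top_k` and `global_top`, and the `(doc_id, score)` tuples, are made up for illustration.

```python
import heapq

def shard_top_k(shard_hits, k):
    """Return the k highest-scoring (doc_id, score) pairs from one shard."""
    return heapq.nlargest(k, shard_hits, key=lambda hit: hit[1])

def global_top(shards, k, size):
    """Collect k hits per shard, then merge and keep the best `size` overall."""
    collected = [hit for shard in shards for hit in shard_top_k(shard, k)]
    return heapq.nlargest(size, collected, key=lambda hit: hit[1])

# Two hypothetical shards with pre-computed similarity scores.
shards = [
    [("a", 0.9), ("b", 0.4), ("c", 0.8)],
    [("d", 0.95), ("e", 0.1), ("f", 0.7)],
]
result = global_top(shards, k=2, size=3)
print([doc for doc, _ in result])
```

Each shard contributes at most `k` hits, so the merge step never sees more than `k * number_of_shards` candidates.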
💔 Backport failed
You can use sqren/backport to manually backport by running |
💚 All backports created successfully
Questions? Please refer to the Backport tool documentation
When originally added, the knn query didn't apply `top-k` restrictions to the query. Instead, it allowed the resulting `num_candidates` to be combined with sibling queries without restricting to `top-size` results ahead of time. This is confusing behavior and has led to bugs in understanding how it all works. This commit addresses this by eagerly gathering only `size` results when `k == null`, before combining with other queries. To achieve the previous behavior, set `k` equal to `num_candidates` directly in the query. (cherry picked from commit c18b48d)
When originally added, the knn query didn't apply `top-k` restrictions to the query. Instead, it allowed the resulting `num_candidates` to be combined with sibling queries without restricting to `top-size` results ahead of time.

This is confusing behavior and has led to bugs in understanding how it all works.

This commit addresses this by eagerly gathering only `size` results when `k == null`, before combining with other queries.

To achieve the previous behavior, set `k` equal to `num_candidates` directly in the query.
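The difference between the new default and the previous behavior can be sketched with a small simulation. This is only an illustration under stated assumptions: the helper names (`effective_k`, `knn_clause_hits`, `combine_with_sibling`) are hypothetical, and summing scores across clauses is a simplification of how a compound query might combine them, not the actual Elasticsearch implementation.

```python
def effective_k(k, size):
    """Default k to the request size when it is not set (k == null)."""
    return size if k is None else k

def knn_clause_hits(candidates, k, size):
    """Keep only the top effective-k hits from the gathered candidates.

    `candidates` stands in for the num_candidates nearest neighbors,
    given as (doc_id, score) pairs.
    """
    ranked = sorted(candidates, key=lambda hit: hit[1], reverse=True)
    return ranked[:effective_k(k, size)]

def combine_with_sibling(knn_hits, sibling_hits, size):
    """Assumed combination rule: sum scores per document, keep the top `size`."""
    scores = {}
    for doc, score in knn_hits + sibling_hits:
        scores[doc] = scores.get(doc, 0.0) + score
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:size]

# Hypothetical data: four knn candidates (num_candidates = 4) and a sibling query.
candidates = [("v1", 0.875), ("v2", 0.75), ("v3", 0.25), ("v4", 0.125)]
sibling = [("v4", 1.0), ("t1", 0.5)]
size = 2

# New default: k is unset, so only `size` knn hits reach the combination step.
new_default = combine_with_sibling(knn_clause_hits(candidates, None, size), sibling, size)

# Previous behavior, restored by setting k equal to num_candidates.
previous = combine_with_sibling(knn_clause_hits(candidates, len(candidates), size), sibling, size)

print(new_default)
print(previous)
```

In this toy data, `v4` falls outside the top `size` knn hits, so under the new default it receives only its sibling-query score. With `k` equal to `num_candidates`, its knn score is added as well.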