Fix async stop sometimes not properly collecting result #121843

Merged
merged 8 commits into from
Feb 8, 2025

smalyshev
Contributor

@smalyshev smalyshev commented Feb 5, 2025

If the request finishes before stop is called but the result has not been stored yet, we rely on the async functionality to collect the response. I thought we were guaranteed that the data would be stored before the final handler completes, but it looks like that is not always the case. So this change explicitly checks that the task is finished before trying to collect the result.

Fixes #121249
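
To illustrate the race described above, here is a minimal Java sketch. It is not the Elasticsearch code: `AsyncStopSketch`, `finish`, `collectUnchecked`, and `collectAfterFinished` are hypothetical names, and an in-memory map stands in for the async results index. It only shows the general idea of waiting for a "finished" signal before reading the stored result.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch; all names are illustrative assumptions.
final class AsyncStopSketch {
    // Stands in for the async results index.
    private final ConcurrentHashMap<String, String> store = new ConcurrentHashMap<>();
    // One signal per task, completed only after the result has been written.
    private final ConcurrentHashMap<String, CompletableFuture<Void>> stored = new ConcurrentHashMap<>();

    private CompletableFuture<Void> storedSignal(String id) {
        return stored.computeIfAbsent(id, k -> new CompletableFuture<>());
    }

    // Called when the query has produced its result: store first, then signal.
    void finish(String id, String result) {
        store.put(id, result);
        storedSignal(id).complete(null);
    }

    // Racy variant: reads the store without checking that the write happened.
    Optional<String> collectUnchecked(String id) {
        return Optional.ofNullable(store.get(id));
    }

    // Safer variant: waits (up to a timeout) for the task to be finished before collecting.
    Optional<String> collectAfterFinished(String id, long timeoutMillis) {
        try {
            storedSignal(id).get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            return Optional.empty();
        }
        return Optional.ofNullable(store.get(id));
    }

    public static void main(String[] args) {
        AsyncStopSketch sketch = new AsyncStopSketch();
        sketch.finish("q1", "rows");
        System.out.println(sketch.collectAfterFinished("q1", 1000));
    }
}
```

In this sketch the unchecked read can return an empty result if it runs between the query finishing and the write completing, while the checked read blocks until the write has been signalled.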

@elasticsearchmachine
Collaborator

Hi @smalyshev, I've created a changelog YAML for you.

@elasticsearchmachine
Collaborator

Hi @smalyshev, I've updated the changelog YAML for you.

@smalyshev smalyshev marked this pull request as ready for review February 5, 2025 23:32
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Feb 5, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@dnhatn
Member

dnhatn commented Feb 6, 2025

@smalyshev
I think the problem is that we wrap the listener with runAfter to remove the async listener too early.

This is because the listener does not include storing results to the index:

wrapStoringListener(searchTask, waitForCompletionTimeout, keepAlive, keepOnCompletion, listener)

We could avoid this issue if the runAfter wrap happened before dispatching the request to asyncTaskManagementService. However, at that point, we don't yet have the async ID or executionInfo.

I'm a bit hesitant about the proposal in this PR because it would duplicate the get-result logic further.
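
The ordering concern in this comment can be sketched in plain Java. This is a simplified model, not the Elasticsearch classes: `runAfter` and `storing` here are hypothetical helpers built on `Consumer`, and the assumption that the storing wrapper writes the result after notifying its delegate is made only to show how the position of the cleanup wrapper changes the order of events.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch; not the real listener or task-management classes.
final class ListenerOrderSketch {
    // Runs `after` once `delegate` has handled the value.
    static <T> Consumer<T> runAfter(Consumer<T> delegate, Runnable after) {
        return value -> {
            try {
                delegate.accept(value);
            } finally {
                after.run();
            }
        };
    }

    // Assumed storing wrapper: notifies the delegate, then records a "store" event.
    static <T> Consumer<T> storing(Consumer<T> delegate, List<String> events) {
        return value -> {
            delegate.accept(value);
            events.add("store");
        };
    }

    // Cleanup wraps only the inner listener, so it runs before the result is stored.
    static List<String> cleanupInside() {
        List<String> events = new ArrayList<>();
        Consumer<String> respond = v -> events.add("respond");
        storing(runAfter(respond, () -> events.add("cleanup")), events).accept("result");
        return events;
    }

    // Cleanup wraps the storing listener, so it runs only after the store.
    static List<String> cleanupOutside() {
        List<String> events = new ArrayList<>();
        Consumer<String> respond = v -> events.add("respond");
        runAfter(storing(respond, events), () -> events.add("cleanup")).accept("result");
        return events;
    }

    public static void main(String[] args) {
        System.out.println(cleanupInside());  // [respond, cleanup, store]
        System.out.println(cleanupOutside()); // [respond, store, cleanup]
    }
}
```

Under these assumptions, wrapping the cleanup around the outermost listener is what guarantees the stored result exists before the async listener is removed.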

@smalyshev
Contributor Author

@dnhatn We don't change much logic-wise: we fetch the task ID anyway, and the only change is that we'd wait for it to finish on one of the branches. As you note, if we move the listener wrap earlier, we'd also have to move the execution info creation, and that may cause other problems?

@smalyshev smalyshev added the auto-backport Automatically create backport pull requests when merged label Feb 7, 2025
Member

@dnhatn dnhatn left a comment

One ask, but LGTM. Thanks @smalyshev.

}));
}

private EsqlQueryTask getEsqlQueryTask(AsyncExecutionId asyncId) {
Member

Since this is related to security, can we expose and use AsyncTaskIndexService#getTaskAndCheckAuthentication instead of duplicating it here?

Contributor Author

Oh, I didn't notice this exists. This is even better.

@smalyshev smalyshev merged commit d11dad4 into elastic:main Feb 8, 2025
17 checks passed
@elasticsearchmachine
Collaborator

💔 Backport failed

Status Branch Result
8.18 Commit could not be cherrypicked due to conflicts
8.x Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 121843

smalyshev added a commit to smalyshev/elasticsearch that referenced this pull request Feb 8, 2025
* Fix async stop sometimes not properly collecting result

(cherry picked from commit d11dad4)

# Conflicts:
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plugin/TransportEsqlQueryAction.java
@smalyshev
Contributor Author

💚 All backports created successfully

Status Branch Result
8.x
9.0
8.18

Questions ?

Please refer to the Backport tool documentation

smalyshev added a commit to smalyshev/elasticsearch that referenced this pull request Feb 8, 2025
* Fix async stop sometimes not properly collecting result

(cherry picked from commit d11dad4)

# Conflicts:
#	muted-tests.yml
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plugin/TransportEsqlQueryAction.java
smalyshev added a commit to smalyshev/elasticsearch that referenced this pull request Feb 8, 2025
* Fix async stop sometimes not properly collecting result

(cherry picked from commit d11dad4)
smalyshev added a commit to smalyshev/elasticsearch that referenced this pull request Feb 8, 2025
* Fix async stop sometimes not properly collecting result

(cherry picked from commit d11dad4)
elasticsearchmachine pushed a commit that referenced this pull request Feb 8, 2025
…22113)

* Fix async stop sometimes not properly collecting result

(cherry picked from commit d11dad4)

# Conflicts:
#	muted-tests.yml
elasticsearchmachine pushed a commit that referenced this pull request Feb 8, 2025
…22114)

* Fix async stop sometimes not properly collecting result

(cherry picked from commit d11dad4)
elasticsearchmachine pushed a commit that referenced this pull request Feb 8, 2025
…22115)

* Fix async stop sometimes not properly collecting result

(cherry picked from commit d11dad4)
@smalyshev smalyshev deleted the fix-async-stop-result branch February 10, 2025 16:31
Labels
:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.18.1 v8.19.0 v9.0.1 v9.1.0
Development

Successfully merging this pull request may close these issues.

[CI] RemoteClusterSecurityEsqlIT testCrossClusterAsyncQueryStop failing
3 participants