Skip to content

fix: Race conditions and performance issues #237

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Sep 14, 2021
Merged

Conversation

dpcollins-google
Copy link
Collaborator

There are two main retrying_connection race conditions fixed here:

  1. Improper handling of cancelled write tasks can cause set_exception to be called when the task is already cancelled, which raises an InvalidStateError which is never caught by the existing code.
  2. There is a race where if reinitialize() is called after queues are cycled, meaning a poller from the old instance of the class can add a message to the new queues. This has been fixed by splitting the ConnectionReinitializer interface into "stop_processing" and "reinitialize" parts.

Also fix other performance issues identified in profiles.

Something close to this code has been running for an hour or two on compute engine with no hangs.

There are two main retrying_connection race conditions fixed here:

1) Improper handling of cancelled write tasks can cause set_exception to be called when the task is already cancelled, which raises an InvalidStateError which is never caught by the existing code.
2) There is a race where if reinitialize() is called after queues are cycled, meaning a poller from the old instance of the class can add a message to the new queues. This has been fixed by splitting the ConnectionReinitializer interface into "stop_processing" and "reinitialize" parts.

Also fix other performance issues identified in profiles.
There are two main retrying_connection race conditions fixed here:

1) Improper handling of cancelled write tasks can cause set_exception to be called when the task is already cancelled, which raises an InvalidStateError which is never caught by the existing code.
2) There is a race where if reinitialize() is called after queues are cycled, meaning a poller from the old instance of the class can add a message to the new queues. This has been fixed by splitting the ConnectionReinitializer interface into "stop_processing" and "reinitialize" parts.

Also fix other performance issues identified in profiles.
@dpcollins-google dpcollins-google requested review from a team as code owners September 14, 2021 18:33
@product-auto-label product-auto-label bot added the api: pubsublite Issues related to the googleapis/python-pubsublite API. label Sep 14, 2021
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Sep 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: pubsublite Issues related to the googleapis/python-pubsublite API. cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants