Fix an inherent race in execution vs. destruction. #1150
While I do like this cleanup in that it makes things much more consistent, I realized that there is (likely) another way to fix this. The handler function has a problem where the |
Just confirming that I did test this patch and it seemed to fix what I observed in #1147
If one of those entities is destroyed after the collection but before we attempt to "take" it, then we can end up attempting to __enter__ a Destroyable-derived class that has already been destroyed. The Destroyable will then raise an InvalidHandle error.
One thing to clarify: since the Handler is given a reference to the original entity, we can't get to this case when the entity is garbage collected. It will only happen when the entity is explicitly destroyed.
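A minimal sketch of that point, assuming nothing about rclpy internals: `Entity` and `make_handler` are hypothetical stand-ins, not rclpy names. It shows that while a handler holds a strong reference, dropping the user's own reference cannot cause garbage collection, so only an explicit `destroy()` call can invalidate the entity underneath the handler.

```python
import gc
import weakref


class Entity:
    """Hypothetical stand-in for an executor entity (not an rclpy class)."""

    def __init__(self):
        self.destroyed = False

    def destroy(self):
        # Explicit destruction, as a user might request it.
        self.destroyed = True


def make_handler(entity):
    # The closure keeps a strong reference to the entity.
    def handler():
        return 'skipped' if entity.destroyed else 'executed'
    return handler


entity = Entity()
probe = weakref.ref(entity)      # lets us observe whether the object is alive
handler = make_handler(entity)

# Dropping the user's reference does not free the object: the handler
# still refers to it, so garbage collection cannot trigger the race.
del entity
gc.collect()
alive_after_del = probe() is not None
result_before_destroy = handler()

# Explicit destruction is the only way to invalidate it under the handler.
probe().destroy()
result_after_destroy = handler()

print(alive_after_del, result_before_destroy, result_after_destroy)
```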
Fix this by explicitly catching the InvalidHandle error that can be raised in all of the Destroyable-derived entities. If we do catch it, then we actually let the machinery continue but tell things to just not execute; in a subsequent executor iteration, the entity will be destroyed and hence not looked at anymore.
I think ignoring the work when InvalidHandle is raised is the right idea.
The rclpy executor collects all of the entities in one pass, then creates async tasks for each of the ready ones and attempts to "take" and execute them. If one of those entities is destroyed after the collection but before we attempt to "take" it, then we can end up attempting to __enter__ a Destroyable-derived class that has already been destroyed. The Destroyable will then raise an InvalidHandle error. Fix this by explicitly catching the InvalidHandle error that can be raised in all of the Destroyable-derived entities. If we do catch it, then we actually let the machinery continue but tell things to just not execute; in a subsequent executor iteration, the entity will be destroyed and hence not looked at anymore. This seems to fix the race in my testing. Signed-off-by: Chris Lalancette <[email protected]>
b79dafa to 55d1fe6
lgtm
Since CI is green, this is approved, and this fixes a real user-reported bug, I'm going to go ahead and merge this. @sloretz if you have any further feedback please feel free to leave it and I'll address it in a follow-up PR. Thanks!
The rclpy executor collects all of the entities in one pass, then creates async tasks for each of the ready ones and attempts to "take" and execute them. If one of those entities is destroyed after the collection but before we attempt to "take" it, then we can end up attempting to __enter__ a Destroyable-derived class that has already been destroyed. The Destroyable will then raise an InvalidHandle error.
Fix this by explicitly catching the InvalidHandle error that can be raised in all of the Destroyable-derived entities. If we do catch it, then we actually let the machinery continue but tell things to just not execute; in a subsequent executor iteration, the entity will be destroyed and hence not looked at anymore. This seems to fix the race in my testing.
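The race and the fix described above can be sketched roughly as follows. This is a toy model, not the rclpy code: `InvalidHandle` and `Destroyable` here are simplified stand-ins, and `make_handler`, `spin_once`, and `destroy_between` are hypothetical names used only for illustration.

```python
class InvalidHandle(Exception):
    """Stand-in for the error raised when a destroyed handle is used."""


class Destroyable:
    """Toy stand-in for a Destroyable-derived entity (not the rclpy class)."""

    def __init__(self, name):
        self.name = name
        self._valid = True

    def destroy(self):
        self._valid = False

    def __enter__(self):
        # Entering a destroyed entity is what triggers the error.
        if not self._valid:
            raise InvalidHandle(f'{self.name} was already destroyed')
        return self

    def __exit__(self, exc_type, exc, tb):
        return False

    def take(self):
        return f'data from {self.name}'


def make_handler(entity, executed):
    """Build a handler that skips execution if the entity is gone."""
    def handler():
        try:
            with entity:
                data = entity.take()
        except InvalidHandle:
            # Destroyed between collection and take: do nothing this round.
            return
        executed.append(data)
    return handler


def spin_once(entities, executed, destroy_between=None):
    # Pass 1: collect a handler for every ready entity.
    handlers = [make_handler(e, executed) for e in entities]
    # Simulate user code destroying an entity before its handler runs.
    if destroy_between is not None:
        destroy_between.destroy()
    # Pass 2: attempt to take and execute each one.
    for handler in handlers:
        handler()


sub_a = Destroyable('sub_a')
sub_b = Destroyable('sub_b')
executed = []
spin_once([sub_a, sub_b], executed, destroy_between=sub_b)
print(executed)  # ['data from sub_a']
```

In this sketch the destroyed entity is skipped rather than crashing the loop; a later iteration would no longer collect it at all.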
Fixes #1147
@sloretz I'm particularly looking for input from you, since I think you have the best handle on what is going on in rclpy. Still a draft until I get that feedback.