fix: no-op when signal handlers are called on threads not managed by … #9766
This should also fix this issue: #5591 (comment)

As expected, when Zend Signals is enabled, the caught signal is lost and not dispatched to other signal handlers that parent programs may have registered. It's still better than a segmentation fault, but not ideal.

When Zend Signals is disabled, this patch also fixes the segfaults related to SIGPROF handling, but not this separate issue: https://bugs.php.net/bug.php?id=79464

In #5591, removing Zend Signals was considered; is that still the case? Removing it may fix many issues related to signal handling in ZTS and embed builds.
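The no-op behaviour described in this comment can be sketched in isolation. This is a minimal illustration, not the code from this PR: `thread_is_managed`, `guarded_handler`, and `install_guarded_handler` are hypothetical names, and a plain thread-local flag stands in for whatever check the runtime uses to decide whether a thread is managed. Reading thread-local storage from a signal handler is assumed to be safe here, which holds for a statically linked main executable on common platforms but is not guaranteed by the C standard.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical stand-in for "this thread has runtime resources set up".
 * A real runtime would consult its own thread registry instead. */
static _Thread_local bool thread_is_managed = false;

/* Counts signals that were actually processed (not ignored). */
static volatile sig_atomic_t handled_count = 0;

static void guarded_handler(int signo)
{
    (void)signo;

    if (!thread_is_managed) {
        /* No per-thread state to touch: report on stderr and do nothing.
         * write() is async-signal-safe; fprintf() is not. */
        static const char msg[] =
            "signal received on a thread not managed by the runtime, ignoring\n";
        ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
        (void)ignored;
        return;
    }

    /* Managed thread: safe to use per-thread state. */
    handled_count++;
}

/* Installs guarded_handler for signo; returns 0 on success, -1 on error. */
static int install_guarded_handler(int signo)
{
    struct sigaction sa;

    sa.sa_handler = guarded_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(signo, &sa, NULL);
}
```

Note the trade-off mentioned above: on the unmanaged path the signal is simply dropped, so any handler registered earlier by the host program never sees it.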
Looks good to me, but should we print a warning when this happens?

Regarding Zend Signals, the PR seems to have gone stale because there is a significant performance impact in the original proposal, and race conditions in the optimized version. I ran some benchmarks with the original change, and there is indeed a negative impact: https://gist.github.com/arnaud-lb/32ae56eef8f2c85a144149fe7b8228af. A change could be accepted if it addressed both the performance impact and the race conditions.
ca08a15 to 0e41146
Message on stderr added (let me know if you can think of a better message; mine may be a bit verbose). Regarding Zend Signals, do you have an idea of what's causing the performance regression?
0e41146 to 1875b1e
Thank you! The error message is perfect. The performance regression must be caused by the extra sigprocmask() syscall in ZEND_SIGNAL_BLOCK_INTERRUPTIONS(). Before the change, ZEND_SIGNAL_BLOCK_INTERRUPTIONS() was just a variable increment.
This should be asserted by a test.
Fixes phpGH-8789. Fixes phpGH-10015.

This is one small part of the underlying bug for phpGH-10737: in my attempts to reproduce that issue, I hit this crash constantly. (The fix for the other underlying issue for that bug will follow soon.)

It's possible that a signal arrives at a thread that never handled a PHP request before. Accessing the signal globals then dereferences a NULL pointer, because the TSRM pointers for the thread aren't set up to point to the thread resources yet. PR phpGH-9766 previously fixed this for master by ignoring the signal if the thread hadn't handled a PHP request yet. While this fixes the crash, I think the solution is suboptimal for 3 reasons:

1) The signal is ignored and a message is printed saying there is a bug. However, this is not a bug at all. For example in Apache, the signal setup happens on child process creation, and the thread resource creation happens lazily when the first request is handled by the thread. Hence, the fact that the thread resources aren't set up yet is not actually buggy behaviour.
2) I believe that, because this was thought to be buggy behaviour, that fix was only applied to master, so 8.1 & 8.2 keep on crashing.
3) We can do better than ignoring the signal. By just initialising the resources if they don't exist, in the very same way the request handler in Apache does, we can handle the signal gracefully.
…bals

Fixes GH-8789. Fixes GH-10015.

This is one small part of the underlying bug for GH-10737: in my attempts to reproduce that issue, I hit this crash constantly. (The fix for the other underlying issue for that bug will follow soon.)

It's possible that a signal arrives at a thread that never handled a PHP request before. Accessing the signal globals then dereferences a NULL pointer, because the TSRM pointers for the thread aren't set up to point to the thread resources yet. PR GH-9766 previously fixed this for master by ignoring the signal if the thread hadn't handled a PHP request yet. While this fixes the crash, I think the solution is suboptimal for 3 reasons:

1) The signal is ignored and a message is printed saying there is a bug. However, this is not a bug at all. For example in Apache, the signal setup happens on child process creation, and the thread resource creation happens lazily when the first request is handled by the thread. Hence, the fact that the thread resources aren't set up yet is not actually buggy behaviour.
2) I believe that, because this was thought to be buggy behaviour, that fix was only applied to master, so 8.1 & 8.2 keep on crashing.
3) We can do better than ignoring the signal, by acting in the same way as if the signals aren't active. This means we need to take the same path as if the TSRM had already shut down.

Closes GH-10861.
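The approach in point 3 can be sketched as follows. This is an illustration under stated assumptions, not the actual patch: it assumes that the "TSRM already shut down" path amounts to forwarding the signal to whatever handler was installed before the runtime's own, and every name here (`signal_globals`, `thread_globals`, `runtime_handler`, `previous_handler`, and so on) is hypothetical. A NULL thread-local pointer stands in for a thread whose resources have not been created yet.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Hypothetical per-thread signal state; NULL until the thread's
 * resources are created lazily on its first request. */
typedef struct {
    volatile sig_atomic_t active; /* non-zero once signal handling is on */
    volatile sig_atomic_t seen;   /* signals processed by the runtime */
} signal_globals;

static _Thread_local signal_globals *thread_globals = NULL;

/* Disposition that was in place before the runtime installed its own. */
static struct sigaction previous_action;

/* Demo-only handler playing the role of the host program's handler. */
static volatile sig_atomic_t forwarded_count = 0;
static void previous_handler(int signo)
{
    (void)signo;
    forwarded_count++;
}

/* Hands the signal to the earlier handler. This sketch only covers
 * plain sa_handler dispositions (no SA_SIGINFO). */
static void forward_to_previous(int signo)
{
    if (previous_action.sa_handler != SIG_DFL &&
        previous_action.sa_handler != SIG_IGN) {
        previous_action.sa_handler(signo);
    }
}

static void runtime_handler(int signo)
{
    signal_globals *g = thread_globals;

    /* No globals for this thread, or signals not active: never
     * dereference g, just behave as if the runtime were not involved. */
    if (g == NULL || !g->active) {
        forward_to_previous(signo);
        return;
    }

    g->seen++;
}

/* Installs runtime_handler and remembers the previous disposition. */
static int install_runtime_handler(int signo)
{
    struct sigaction sa;

    sa.sa_handler = runtime_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(signo, &sa, &previous_action);
}
```

Compared with ignoring the signal, this keeps the host program's handler working on threads that have not served a request yet, and it needs no warning message because nothing abnormal has happened.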
Closes #9649.
Follows the strategy discussed here: #9649 (comment)