Remove Zend signal handling #5591
Can you please check how many sigprocmask syscalls we need per included file in opcache (assuming hot cache)? I'm fine with the general direction here on the assumption that switching to sigprocmask doesn't introduce undue overhead. cc @dstogov |
Zend signal handling allows signal masking/delaying without sigprocmask() syscalls and corresponding user<->kernel context switches. It would be good to check the performance impact of this removal on some application with big code base (many included files) and minimal execution time. |
You may consider using the Symfony Demo application for these performance tests. It's a real-world PHP app with lots of files and it's super easy to download and run it:
|
Certainly, I'll do so. |
Just trying to reproduce CI failures locally. While originally working on this code, I was green on Travis CI and Appveyor before making some final changes and opening the PR... but can't figure out just now what I did to break things. Still working on it. |
Travis CI failure is |
This exact same commit had also passed on Appveyor before: https://ci.appveyor.com/project/alexdowad/php-src/builds/32945360 |
Travis failure is spurious, AppVeyor failure is a failure on master, looks like file cache got broken. |
Thanks. I'll keep working to figure it out. Will proceed with performance assessment after this issue is clear. |
Hmm. Tried with current The funny thing is that on Windows, the macros which were removed all expanded to nothing, and the ones which have been added expand to Then there is also the fact that the exact same commit was green on a different Appveyor test run. I'm keeping my mind open to all possibilities, but trying more test runs with the same commit. |
@alexdowad To be clear, the AppVeyor failure is not related to your changes. Maybe @cmb69 can take a look at why this happens. |
When running simple test files at the CLI with OPCache enabled, the macros in this patch make 10 syscalls per script. |
Here is what I am using to measure
|
When I run the I'm running it with:
Any suggestions of better .INI settings to use, to make it a better test? |
Did a little microbenchmark calling |
@nikic, the test fails, because as of yesterday's changes JIT may be enabled for file_cache_only, while formerly it never was in that case, so |
@dstogov Can you comment on how file cache and JIT are supposed to interact? It probably does not make sense to cache JITed code in file cache right now (we probably don't generate PIC code), but shouldn't file cache still work for caching opcodes, even if JIT is enabled? |
Currently, scripts are stored in file cache after they are already cached in SHM and JIT-ed, and unfortunately, file cache can't store opcode handlers overridden by JIT. So file caching is disabled. In general, it's possible to perform JIT after file caching, but this would require writing to file cache under an exclusive lock (zend_shared_alloc_unlock). @nikic if you have ideas on how to fix this, please let me know. |
OK, I think I have fixed the bugs now. Trying again to assess impact on performance. Generated 1000 random source files, each of which just does a few arithmetic ops. Then another which I may have been printing out the counter from the wrong place before. Now doing it from a shutdown function for opcache extension. On the test script which includes 1000 other source files, it shows 4010 syscalls. Just running it at the shell with Of course, I could set up a test harness to do it many times and run some stats on the results... though personally, it already seems fairly clear that the impact on performance is negligible... |
Hopefully this should be ready to merge now. |
I don't understand the reason for this removal. Zend signals seem to work fine and the implementation is not complex. Moving "HANDLE_BLOCK_INTERRUPTIONS from hot path" may cause race conditions. |
@dstogov Thanks for raising these good questions. The reason for submitting this PR is to (hopefully):
My reason for preferring smaller LOC (when the extra lines are not needed) is to make the codebase more maintainable, more readable, make it easier for new developers to get started, reduce the number of places where bugs can "hide", etc. As for config parameters, every added config parameter increases the number of possible combinations exponentially (2^n), which makes it impossible to test every possible build in CI. So reducing unneeded config parameters is also helpful. By itself, the removal of Zend signal handling would not be transformative, but I think there may be a lot of other places where the codebase can be simplified and trimmed down without losing functionality. Taken together, these simplifications could have a major impact. We are all aware that PHP has a large number of known bugs, and any reduction in complexity is a step towards getting that open bug count down (and keeping it down). If moving If we have users on rare flavors of Unix, I certainly wouldn't want to break things for them either. What did you feel would present such a risk? Do some of these Unixes not support |
Just did a bit of reading in the developer docs for AIX. They say that multi-threaded programs (using However, it looks like AIX also supports So unless I'm missing something, it is hard to see how this PR could break PHP on AIX. |
Maintainers, any further comments on this proposed change? I have another bugfix PR which is stalled waiting on this one (because I based it on this branch). However, if this change is not wanted, I can still rebase that other PR on |
ping @dstogov perhaps you can come up with some more input here in response to @alexdowad. As for the branch, you should always base your branch on the lowest target branch in |
@alexdowad I'm also for simplification, but zend signals is really not a complex piece of code. Removing it may cause breaks, new bugs and possible performance loss. |
@dstogov Fair enough. 😄 I'll close the PR. Thanks for reviewing. Just as a side point: please note that if removing ZS may cause bugs or breaks, that means that the With that in mind: How would you feel about removing the Zend signal handling depends on If the maintainers are willing to have a look, I can also submit some refactorings to Zend signal handling -- for example, correcting erroneous code comments, removing Thanks again for the review and for the comments already shared. |
Removing --{enable,disable}-zend-signals may make sense. |
@dstogov I'll give it a try. |
@dstogov, may I ask what the purpose of clearing SIGG(reset) during OPCache preloading is? Could anything "bad" happen if the Zend SH deferred handlers were installed during preloading? |
Preloading is done in the context of a "virtual" request at the end of PHP initialization. |
Hmm. This is interesting. The question I'm trying to figure out now is: If |
Another comment: After looking more at the issue of This means that as long as Zend SH is enabled, each new request will always install the deferred signal handlers. What problem could it cause if this is also done during preloading? I really can't see anything (but would love to be proved wrong). |
I suppose SIGG(reset) was introduced for "bad" extensions that might override signal handler during request processing. |
Aha... OK, this is interesting. Was the idea that we want to keep signal handlers set by "bad" extensions? Or that we want to replace them with the original handlers? |
I think we should reconsider this decision. The problem is that zend_signals requires us to control all code that is registering signals. However, we cannot really do this with 3rd party libraries we depend on. An instance of this I'm hitting right now, is that I'm seeing Zend signals related test failures in ext/readline using libedit, because apparently it registers some signal handlers if specific features are used. This is something we don't have control over. Doing this using sigprocmask instead will work even if signal handlers are installed by libraries. |
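The difference described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not code from php-src: `library_handler` stands in for a handler that a library such as libedit might install, `deferring_handler` stands in for a userland deferring handler, and every function name here is hypothetical. It assumes a single-threaded process.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t in_critical = 0;
static volatile sig_atomic_t lib_ran_in_critical = 0;

/* Hypothetical stand-in for a handler installed by a third-party library. */
static void library_handler(int signo)
{
    (void)signo;
    if (in_critical) {
        lib_ran_in_critical = 1;
    }
}

/* Hypothetical stand-in for a userland handler that would queue the signal. */
static void deferring_handler(int signo)
{
    (void)signo;
}

static int install(void (*fn)(int))
{
    struct sigaction sa;

    sa.sa_handler = fn;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Returns 1 if the library handler ran inside the critical section. */
int userland_deferral_bypassed(void)
{
    install(deferring_handler);
    install(library_handler);   /* the library replaces our handler */

    lib_ran_in_critical = 0;
    in_critical = 1;
    raise(SIGUSR1);             /* delivered straight to the library handler */
    in_critical = 0;
    return lib_ran_in_critical;
}

/* Same scenario, but the critical section is protected by the signal mask. */
int kernel_mask_bypassed(void)
{
    sigset_t block, old;

    install(library_handler);
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);

    lib_ran_in_critical = 0;
    sigprocmask(SIG_BLOCK, &block, &old);
    in_critical = 1;
    raise(SIGUSR1);             /* stays pending while SIGUSR1 is masked */
    in_critical = 0;
    sigprocmask(SIG_SETMASK, &old, NULL);   /* library handler runs here */
    return lib_ran_in_critical;
}
```

In the first function the replacement handler runs in the middle of the critical section, because the deferral logic lived in a handler that is no longer installed. In the second, the mask holds the signal back no matter which handler is registered.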
Yeah, good point. Note that the issue you have identified is still a problem if Zend SH is selected using a config parameter -- because users who install PHP as a binary package (with Zend SH built in) may have problems using libraries that expect to set their own signal handlers. I think the very fact that And IMHO the performance argument is also very weak. In the above microbenchmark done by @nikic, there was only a clear performance win from Zend SH when loading a million empty source files. I doubt that any PHP application ever written has ever done that. As such, if there are concerns that moving On the other hand, if the maintainers want to keep Zend SH, I have a series of about 30 commits ready which refactor it into very nice shape. |
I would be in favor of removing Zend Signals. In our use case (ZTS embed), signals do not work very well with threads. Therefore, if there is no major benefit in keeping them, I would be in favor of removing them to resolve the current problems in ZTS embed environments. |
@paresy Could you please explain what issues you saw with ZTS and Zend signals? (Note that the timeout issue is unrelated, that's a general signal handling problem, regardless of whether Zend signals are used or not.) |
My app catches the SIGINT signal to perform a proper shutdown. When Zend Signals is enabled, the signal handlers somehow get messed up (or not properly restored; I can only assume due to some race condition while threading) and my own SIGINT handler is never called. I can try to build a demo app if this would help. The same happens with other signals (if I recall correctly, curl used some), which also got lost. This could be the reason why the curl extension disables all curl signal handling on ZTS. See here: Line 1861 in 653e4ea
This commit added CURLOPT_NOSIGNAL to the code base. d81f2e5 Unfortunately no bug report / further information is given. |
Hi from the much less popular mainframe OS BS2000 (namely its POSIX subsystem). I second paresy. On the recent PHP 7.4.8, Zend signals break a lot of tests on BS2000 (with them disabled, ~600 additional tests pass). Moreover, with Zend signals some phpdbg tests loop infinitely. I truss'ed one of them and the loop was:
I think it is very difficult, if not impossible, to implement universal Zend signals which behave correctly on all kinds of signal system implementations. So please leave it at least configurable (--disable-zend-signals). |
Zend signal handling was added in PHP 5.4 to protect against signal handlers running at inopportune times and causing bugs. The details are in this RFC: https://wiki.php.net/rfc/zendsignals
In short, the handlers for 7 signals of concern are saved and replaced with a generic handler which delegates to the specific handlers. Inside 'critical sections', however, the generic handler puts all information regarding a signal on a queue and just returns. At the end of the critical section, all pending signals on the queue are processed and the specific handlers are called.
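The queueing scheme summarized above can be sketched roughly as follows. This is a simplified illustration, not the actual Zend implementation: it handles a single signal (SIGUSR1) rather than seven, queues only the signal number rather than all of the signal information, and all names (`generic_handler`, `critical_begin`, `critical_end`, `deferred_demo`) are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

#define QUEUE_MAX 8

static volatile sig_atomic_t depth = 0;          /* critical-section nesting */
static volatile sig_atomic_t queued = 0;         /* number of queued signals */
static volatile sig_atomic_t queue[QUEUE_MAX];   /* queued signal numbers */
static volatile sig_atomic_t calls = 0;          /* times the real handler ran */

/* The "specific" handler that the generic handler delegates to. */
static void specific_handler(int signo)
{
    (void)signo;
    calls++;
}

/* Generic handler: queue inside a critical section, otherwise delegate. */
static void generic_handler(int signo)
{
    if (depth > 0) {
        if (queued < QUEUE_MAX) {
            queue[queued++] = signo;
        }
        return;
    }
    specific_handler(signo);
}

void critical_begin(void)
{
    depth++;
}

void critical_end(void)
{
    sigset_t all, old;
    int i, n;

    if (depth == 0 || --depth > 0) {
        return;   /* unbalanced call, or still inside an outer section */
    }

    /* Hold signals back while draining so the queue cannot change under us. */
    sigfillset(&all);
    sigprocmask(SIG_BLOCK, &all, &old);
    n = queued;
    queued = 0;
    for (i = 0; i < n; i++) {
        specific_handler(queue[i]);
    }
    sigprocmask(SIG_SETMASK, &old, NULL);
}

/* Returns 1 if the signal was deferred until the section ended. */
int deferred_demo(void)
{
    struct sigaction sa;
    int during;

    sa.sa_handler = generic_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, NULL) != 0) {
        return -1;
    }

    calls = 0;
    critical_begin();
    raise(SIGUSR1);      /* generic handler runs, but only queues */
    during = calls;
    critical_end();      /* the queued signal is dispatched here */

    return during == 0 && calls == 1;
}
```

Note that even this small sketch has to mask signals while it drains its own queue, so the userland scheme does not avoid the kernel mechanism entirely.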
However, in PHP 7.1, the vm_interrupt flag was added, which also protects against script execution timeouts, etc. occurring at the wrong times. This eliminated most of the use cases of Zend signal handling. The one which has remained until now is accessing shared memory in OPCache. By eliminating the use of Zend signal handling there, there will be no need for Zend signal handling any more and a subsystem can be removed. This will make the codebase smaller and easier to understand.
The funny thing about the whole idea of Zend signal handling is... it seems to duplicate what Unix kernels already do. Each process/thread in Unix already has a signal mask which can be used to block signals from being delivered at inopportune times. If a signal arrives when it is masked, the kernel will store it and only deliver it once it is unmasked. So rather than storing signals on a queue and unqueueing them later, we can just let the kernel do its job.
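As a minimal sketch of the kernel behaviour described here (not code from this PR), the following C function masks SIGUSR1 around a stand-in critical section, raises the signal inside it, and checks that the handler only runs once the mask is restored. The function and handler names are hypothetical, and a single-threaded process is assumed; a threaded build would presumably use pthread_sigmask instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Set by the handler; sig_atomic_t is safe to write from a signal handler. */
static volatile sig_atomic_t handled = 0;

static void on_usr1(int signo)
{
    (void)signo;
    handled = 1;
}

/*
 * Returns 1 if the signal was held back while masked and delivered
 * after unmasking, 0 otherwise.
 */
int masked_signal_is_deferred(void)
{
    struct sigaction sa;
    sigset_t block, old, pending;
    int seen_inside, was_pending;

    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, NULL) != 0) {
        return 0;
    }

    handled = 0;
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);

    /* Enter the critical section: the kernel now holds SIGUSR1 back. */
    if (sigprocmask(SIG_BLOCK, &block, &old) != 0) {
        return 0;
    }

    raise(SIGUSR1);
    seen_inside = handled;   /* the handler has not run yet */

    sigpending(&pending);
    was_pending = sigismember(&pending, SIGUSR1);

    /* Leave the critical section: restoring the mask delivers the signal. */
    sigprocmask(SIG_SETMASK, &old, NULL);

    return seen_inside == 0 && was_pending == 1 && handled == 1;
}
```

Each critical section costs two sigprocmask calls in this form, which is the overhead the syscall counts earlier in the thread are trying to quantify.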