Add FPM early bootstrapping mode #6772
@bukka you probably know a few additional problems that I haven't thought of yet.
@beberlei love this! Regarding these two questions, I would personally suggest introducing new PHP ini settings for each to control if they should be buffered and output by the FPM child workers, much like display_startup_errors, with the default being for them to simply be discarded.
Just to clear out my ignorance, how would this interact with opcache.preload?
Ben, scenario: you have this bootstrapping code in memory, and then do a release and want to reload this bootstrap. What's the DX here for doing so? Have you thought about race conditions with requests in flight when you flush it? The framework bootstrap in here will have to be linked to the app code just deployed.
@Slamdunk I added an explanation of the relation to Opcache Preloading to the original issue; the question was asked a few times outside this ticket as well.
@dragoonis Similarly to Preloading you need to reload the PHP-FPM process to be safe. Otherwise you are right, you could run the primary script at the current release with a boot script that was loaded in the previous release. Good catch!
@beberlei I think this is a very nice idea! My feedback on some of the open points that you mentioned:
In case of error/exception I think we should abort the execution of FPM. If I understood the proposal, FPM executes the bootstrap file and stores it in memory to be ready for all the upcoming requests. If the code generates an error, this should be noticed in advance.
We can store the output in a log file. This way we get feedback from the execution of the bootstrap.
The memory consumed by the bootstrap file should not be counted for the
The bootstrap file should follow the same rules and restrictions configured in the PHP settings. I think the I was thinking that can be useful to offer also a
Just leaving this link here, in which mnapoli visualized the idea: https://twitter.com/matthieunapoli/status/1371380248558907393
Shouldn't it be optimized (or some parts cached...) instead of displacing the problem to a hidden place, as this feature seems to do? Moreover, if 100ms is too much, the waste of resources will still be present despite being hidden in some circumstances. I'm afraid it would be easy to lose sight of this issue and eventually end up with our backs to the wall when the 100ms becomes 200ms or during a peak traffic period. This feature could also make debugging / tracing more difficult. For instance, pre-request code will not be able to add request-related information to the logging context.
I'm curious, what can make the
I like the idea in general. The tricky part is to get all edge cases sorted out because it's changing some expectations for the process managers.
It might also make sense to introduce some time limit option, as for some less busy apps, and for example with the static pm, the delay between the bootstrap and the execution of the script could be quite big. So maybe something that would re-bootstrap if it took too long and the code is sensitive. Not sure about the use case though, as the only thing that came to my mind is opening a connection, which should probably not be done in the bootstrap anyway. Mainly wanted to point out that there might be a significant delay.
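The time-limit idea above could be expressed as a small staleness check that a worker runs right after it accepts a connection. This is only a sketch under assumptions: `bootstrap_is_stale` and the `max_age_seconds` limit are hypothetical names, not part of this PR or of php-src, and a value of 0 is assumed to mean "no limit".

```c
#include <stdbool.h>
#include <time.h>

/*
 * Hypothetical helper (not in php-src): decide whether the state built by
 * the bootstrap file is too old to be reused for the request just accepted.
 */
static bool bootstrap_is_stale(time_t bootstrapped_at, time_t accepted_at,
                               long max_age_seconds)
{
    if (max_age_seconds <= 0) {
        return false; /* assumed convention: 0 (or less) disables the limit */
    }
    /* difftime() returns accepted_at - bootstrapped_at in seconds */
    return difftime(accepted_at, bootstrapped_at) > (double) max_age_seconds;
}
```

A worker that sees `true` here would tear down the request state and run the bootstrap again before executing the primary script, which reintroduces the bootstrap latency for that one request.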
There's probably more things that I haven't considered yet.
@@ -1392,6 +1393,7 @@ PHP_INI_BEGIN()
STD_PHP_INI_BOOLEAN("fastcgi.logging", "1", PHP_INI_SYSTEM, OnUpdateBool, fcgi_logging, php_cgi_globals_struct, php_cgi_globals)
STD_PHP_INI_ENTRY("fastcgi.error_header", NULL, PHP_INI_SYSTEM, OnUpdateString, error_header, php_cgi_globals_struct, php_cgi_globals)
STD_PHP_INI_ENTRY("fpm.config", NULL, PHP_INI_SYSTEM, OnUpdateString, fpm_config, php_cgi_globals_struct, php_cgi_globals)
STD_PHP_INI_ENTRY("fpm.bootstrap_file", NULL, PHP_INI_SYSTEM, OnUpdateString, fpm_bootstrap, php_cgi_globals_struct, php_cgi_globals)
All FPM configuration should go to the FPM config and it should be pool specific in this case. This will be more consistent with other settings like status path for example.
while (1) {
// moving this before init_request_info will break with dtrace
// support in php_request_startup(), can we remove?
if (UNEXPECTED(fpm_bootstrapped && php_request_startup() == FAILURE)) {
NIT: you should probably move this to the condition below (no need to check fpm_bootstrapped twice)
if (fpm_bootstrapped) {
zend_stream_init_filename(&bootstrap_file, CGIG(fpm_bootstrap));

if (zend_execute_scripts(ZEND_REQUIRE, NULL, 1, &bootstrap_file) == FAILURE) {
So here is the main missing piece that I see from my quick review. You are starting potentially longer work without changing the request_stage (updating scoreboard). That might result in various problems with the dynamic and ondemand process managers because the FPM master will think that there is no work going on and consider the process idle. It means it might not scale properly, which might be problematic. It will probably cause a shorter clean up for ondemand, as it currently bases the last idle time on the end of the request. I think it would make sense to introduce a new request stage, but it probably requires a bit more thinking and carefully considering all edge cases.
Another thing to consider are reloads that usually wait for children to finish the request so it can nicely restart. The question is what should happen for bootstrap and if there's any point to wait.
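To make the scoreboard concern concrete, here is a minimal sketch of how a dedicated stage would keep a bootstrapping child from being counted as idle. The enum and function names are hypothetical and deliberately do not mirror the real FPM scoreboard structures; only the idea of an extra stage comes from the comment above.

```c
#include <stdbool.h>

/* Illustrative stages only; not the actual php-src request stage enum. */
enum worker_stage {
    STAGE_ACCEPTING,     /* bootstrapped (or not) and waiting for a connection */
    STAGE_BOOTSTRAPPING, /* hypothetical new stage: running the bootstrap file */
    STAGE_EXECUTING,     /* running the primary script */
    STAGE_FINISHED       /* request done, about to be recycled */
};

/* A child only counts as idle while it is waiting for a connection. */
static bool worker_is_idle(enum worker_stage stage)
{
    return stage == STAGE_ACCEPTING;
}

/* What a master-side check could do when deciding whether to scale. */
static int count_idle_workers(const enum worker_stage *stages, int count)
{
    int idle = 0;
    for (int i = 0; i < count; i++) {
        if (worker_is_idle(stages[i])) {
            idle++;
        }
    }
    return idle;
}
```

Without the extra stage, a child that is busy bootstrapping would be indistinguishable from one waiting in accept, which is the mis-scaling risk described for dynamic and ondemand.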
We could allow this option only for static pools. It does make some sense for dynamic, but not for ondemand, I guess.
Except I guess not many users really use static, so it could end up being a bit of a useless feature :) It's a bit of a stupid PM, so I'd really like not to give it a preference. The dynamic is the default, so I think this one will be the most used, and from the bug reports it looks like people are using ondemand as well. The dynamic is doable, so it really needs to be done.
The ondemand might be a bit tricky, but it's related in some way to the reloads, which should still be done cleanly. We might need to figure out how to do a proper clean up because otherwise it might end without calling php_request_shutdown, leaving a hanging scoreboard proc (needed for dynamic) and some other things.
I think the solution might be introducing an event loop (epoll) in the worker which could hook up the signal handling (checking a pipe where the signal handler writes) and then do a clean shutdown if SIGTERM is received, for example. It would help with other things as well - see for example the discussion in #4101. I think this might be a prerequisite for this feature.
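The "signal handler writes to a pipe, worker checks the pipe" idea is the classic self-pipe pattern. The sketch below is an assumption-laden illustration, not FPM code: it uses poll() instead of epoll to stay portable, and `worker_signals_init` and `worker_wait` are hypothetical names. A negative `listen_fd` is ignored by poll(), which makes the function easy to try in isolation.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static int signal_pipe[2] = { -1, -1 };

/* Async-signal-safe handler: only forwards one byte into the pipe. */
static void on_sigterm(int signo)
{
    int saved_errno = errno;
    unsigned char byte = (unsigned char) signo;
    ssize_t ignored = write(signal_pipe[1], &byte, 1);
    (void) ignored;
    errno = saved_errno;
}

/* Creates the non-blocking pipe and installs the SIGTERM handler. */
static int worker_signals_init(void)
{
    struct sigaction sa;

    if (pipe(signal_pipe) == -1) {
        return -1;
    }
    for (int i = 0; i < 2; i++) {
        int flags = fcntl(signal_pipe[i], F_GETFL);
        if (flags == -1 || fcntl(signal_pipe[i], F_SETFL, flags | O_NONBLOCK) == -1) {
            return -1;
        }
    }
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigterm;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGTERM, &sa, NULL);
}

/* Returns 1 on SIGTERM, 2 if listen_fd is readable, 0 on timeout, -1 on error. */
static int worker_wait(int listen_fd, int timeout_ms)
{
    struct pollfd fds[2];

    for (;;) {
        fds[0].fd = signal_pipe[0]; fds[0].events = POLLIN; fds[0].revents = 0;
        fds[1].fd = listen_fd;      fds[1].events = POLLIN; fds[1].revents = 0;

        int ready = poll(fds, 2, timeout_ms);
        if (ready == -1 && errno == EINTR) {
            continue; /* interrupted by the signal itself: poll again */
        }
        if (ready <= 0) {
            return ready;
        }
        if (fds[0].revents & POLLIN) {
            unsigned char byte;
            while (read(signal_pipe[0], &byte, 1) == 1) { /* drain the pipe */ }
            return 1; /* caller can now shut the request down cleanly */
        }
        return 2;
    }
}
```

With something like this in place, a bootstrapped child blocked before accept could notice SIGTERM and run its normal request shutdown instead of being killed mid-state.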
Nice one. We use a similar approach with Swoole / RoadRunner. The disadvantage is that you have to be more careful with state: https://github.com/yiisoft/docs/blob/master/guide/en/tutorial/using-with-event-loop.md. The benefit is 1-2 ms and lower response time.
Would it be possible to debug with Xdebug?
@joshuaadickerson yes, but only when automatically connecting to the debugger or using an environment variable as a trigger. Not with a request-based trigger.
@samdark not in this case. This is not an event loop or a long-running process. Maybe this diagram can help.
@mnapoli ah! Got the idea. Quite interesting 👍
What about early bootstrapping mode for Apache module?
I like the idea, and I really think it's worth exploring. It should be noted, however, that most PHP applications/frameworks set up their object graph in a rather static fashion due to PHP's by-request nature. I'm not sure that bootstrapping into a state that allows you to process certain requests faster would allow processing of very different requests (that require a completely different object graph). So either one had to cut back to "just some" bootstrapping, which would be against the spirit of the whole idea, or one would have to make sure that one FPM process only gets to process certain types of requests. I personally don't think that most frameworks/applications as they are today would be able to "just" pre-bootstrap and be able to work a lot faster. To really achieve that, substantial design changes to application and/or framework might be required. This is not to say that for specific use cases, I can see a lot of value in this approach.
I have a slightly off topic question. Is it possible to make preloading or early bootstrapping per process pool instead of global so that we can have one fpm server serving multiple sites with different preloading/early bootstrapping configs?
The idea is really interesting and could really speed up applications. With this approach you could also redesign parts of an application and pre-calculate some things by moving the logic to the early bootstrapping mode. I just have some opinions to share: 1. Don't break the shared nothing architecture 2. Handling of output and errors 3. Extensions which may break
This implementation is really shared nothing, there is no choice for users. Early bootstrapping is always repeated before every request.
I believe it gets discarded, however this is different because preloading happens in MINIT before the worker children are spawned.
Yes, profilers and monitoring like Xdebug or my own extension Tideways do this. I assume Blackfire as well.
What about not exiting at the startup - https://github.com/php/php-src/pull/6772/files#diff-0354097cf69667d6a030457e558abb7f479910bf797bc019124738f1fb3edd93R1880 - and saving the output to display it later at the standard request?
@tpetry no, that is not possible within FPM's architecture. It requires something like RoadRunner.
@beberlei I am not sure if there has been some miscommunication or if it is really not solvable. But RoadRunner and Swoole are completely different architectures than what I meant; they remove the shared nothing concept to be more performant.
So I did not mean to keep the state of an executed request and have it available for further requests like RoadRunner is doing, which breaks PHP's shared nothing architecture. I mean that only the state of the bootstrap script is available again and again without any changes to it, whether that's done by unserializing the serialized state of the bootstrap again and again, just cloning the internal memory model, or forking the php-fpm worker for every request to execute the request. But to be honest, I only did simple experiments with the PHP implementation in the past, so I am not aware whether keeping the state of the bootstrap and reusing it again and again without any modifications (to be shared nothing) is really feasible. Nevertheless, I just wanted to suggest an extension to the current idea which would enable much larger performance improvements, as the bootstrapping would not have to be re-executed again and again. This would make php-fpm as performant as Swoole or RoadRunner while still adhering to PHP's shared nothing architecture, which makes developing so simple and worry-free.
@tpetry To clarify for others, it seems like you're proposing something similar to V8 snapshots:
The Atom text editor (and other Electron apps) actually use this to speed up their startup times:
I think the storing (and restoring) is the main issue here, as there would have to be a way to create a snapshot, which would probably mean some sort of serialization of all zvals. Here is the first big problem, because there are lots of things that cannot be serialized, like for example streams and lots of internal and extension classes. Even if this were somehow resolved, all global state would have to be stored and restored too, which would be difficult as well because it includes the global state of external extensions... There are likely more things that I'm missing now, but as you can see, just the mentioned ones are possibly pretty big obstacles.
Is serialization the only option? Can't the snapshot live in memory?
Well, you would need to have a mapping from memory to zvals, so it's kind of serialization. As I said, even if you overcome this bit, you then have to deal with resources (e.g. streams that just can't be stored) and internal / extension classes that have got their own memory, so some sort of API for that and for globals would be required. I'm most likely missing other things as well. V8 doesn't have to deal with extensions and doesn't have so much global state that needs storing, from what I know, which is not much. 😃 Using
If you fork it after each request, then you basically use a sort of CGI, creating and killing lots of new processes, which will be much slower and not good for resources. I'm not even thinking about how this would screw up master process resources, monitoring and basically everything else... 😄
Just another note here that this won't really have any effect for keep alive connections because it will always block the connection for the whole time of bootstrapping. Same reason as why keepalive does not work properly with
I'm going to close this as this solution is unworkable for persistent connections, so it's not something that we could introduce in anything close to the current form. I can see a way to do this as part of the pool manager, but it's significantly more complex - more info can be found in #11723 (comment)
This is a very early prototype to gather feedback on an idea @mnapoli came up with and told me about at PHP Barcelona 2019. It blew my mind then and has been in the back of my mind ever since.
Essentially this introduces a new fpm.bootstrap_file setting with a path to a script to be executed before the FPM child starts listening to connections. This script shares the PHP memory and state with the primary script that is executed once the FPM child accepts a connection.
The theory is that with an application bootstrap taking a few milliseconds and a pool that is large enough to contain more children than requests being processed, every request is already bootstrapped when it gets run.
If the bootstrapping of an application takes 100ms, then this will not affect the user anymore unless PHP-FPM is near capacity.
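As a back-of-the-envelope model of that claim (a hypothetical helper, not code from the PR): the part of the bootstrap a client still waits for is whatever has not finished by the time the child picks up the connection.

```c
/*
 * Hypothetical model: bootstrap_ms is how long the bootstrap file takes,
 * ms_since_bootstrap_start is how long ago the child began bootstrapping
 * when the connection arrived. The result is the bootstrap time the
 * client still observes.
 */
static long visible_bootstrap_wait_ms(long bootstrap_ms, long ms_since_bootstrap_start)
{
    long remaining = bootstrap_ms - ms_since_bootstrap_start;
    return remaining > 0 ? remaining : 0;
}
```

With a 100ms bootstrap, a child that started bootstrapping 5 seconds before the connection arrived hides the whole cost, while a child that started only 30ms earlier (a pool near capacity) still exposes 70ms of it.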
I ran this with a small sample: https://gist.github.com/beberlei/19e58717d0fbf1e95685ff22255fa8d7 where clearly the time generated in boot.php is many seconds before the actual request, waiting to accept a connection in between.
While this might significantly reduce latency for users, there are obviously some risky edge cases in an architecture like this. For example:
$_SERVER['REQUEST_START']?
One easily explainable downside of this kind of approach is open connections to databases. For example, if the boot script opens a MySQL connection that then idles for some time before handling a request, the well known "MySQL server has gone away" error will appear. In general, holding any network connection during the idling time is a really bad idea.
These questions are rather tricky, so the question is:
Would this be something worth pursuing regardless of the hurdles to fit into PHP's process model?
Appendix
Relationship with Opcache Preloading
Preloading runs once during PHP-FPM startup, before the worker children are spawned. That is why it makes the startup slower. The loaded classes and functions are then stored in Opcache and do not need to be loaded in individual requests anymore (they are copied from shared memory).
Early Bootstrapping would target another stage of the request, notably the part of the application bootstrapping that executes the same code regardless of the next request's details. This includes loading up objects into memory and assembling them together.
One notable difference to preloading is that the bootstrapping step will be run before every request.
As an example for the familiar, in Symfony this could allow splitting up the Kernel::boot phase and the Kernel::handle($request) phase of the script execution.
Relationship with Shared Nothing Architecture
This feature would not break the shared nothing architecture of PHP-FPM, it only moves the application bootstrapping to a different step of the request/response cycle. Notably the PHP memory state that is created in the bootstrap file will be destroyed alongside all memory that the primary script allocates and uses. memory_get_usage() will record the amount of memory used in the boot and the primary script combined. Essentially fpm.bootstrap_file works sort of like auto_prepend_file.
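The difference from auto_prepend_file is only where the prepended code sits relative to accepting the connection. The sketch below lists the step order of one worker cycle in both modes. It is a simplified illustration with hypothetical names, assuming (as the diff comment about moving php_request_startup() suggests) that request startup moves ahead of accept in the early mode.

```c
enum cycle_step {
    STEP_ACCEPT,           /* wait for and accept a connection */
    STEP_REQUEST_STARTUP,  /* set up per-request PHP state */
    STEP_BOOT_CODE,        /* bootstrap file or auto_prepend_file */
    STEP_PRIMARY_SCRIPT,   /* the script the request maps to */
    STEP_REQUEST_SHUTDOWN  /* destroy all per-request memory */
};

/* Fills `out` with the order of steps in one cycle and returns the count. */
static int worker_cycle_steps(int early_bootstrap, enum cycle_step out[5])
{
    int n = 0;

    if (early_bootstrap) {
        out[n++] = STEP_REQUEST_STARTUP;
        out[n++] = STEP_BOOT_CODE;   /* runs while no client is waiting */
        out[n++] = STEP_ACCEPT;
    } else {
        out[n++] = STEP_ACCEPT;
        out[n++] = STEP_REQUEST_STARTUP;
        out[n++] = STEP_BOOT_CODE;   /* runs on the client's clock */
    }
    out[n++] = STEP_PRIMARY_SCRIPT;
    out[n++] = STEP_REQUEST_SHUTDOWN; /* boot state is discarded in both modes */
    return n;
}
```

In both orderings the cycle ends with the request shutdown, which is why state built by the boot code never survives into the next request.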