
Add FPM early bootstrapping mode #6772

Closed
wants to merge 1 commit into from

beberlei
Contributor

@beberlei beberlei commented Mar 14, 2021

This is a very early prototype to gather feedback on an idea that @mnapoli came up with and told me about at PHP Barcelona 2019. It blew my mind then and has been in the back of my mind ever since.

Essentially this introduces a new fpm.bootstrap_file setting with a path to a script to be executed before the FPM child starts listening to connections. This script shares the PHP memory and state with the primary script that is executed once the FPM child accepts a connection.

The theory is that with an application bootstrap taking a few milliseconds and a pool large enough to contain more children than there are requests being processed concurrently, every request is already bootstrapped when it gets run.

If the bootstrapping of an application takes 100ms, then this will not affect the user anymore unless PHP-FPM is near capacity.
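The latency argument above can be put into numbers. The following is a minimal sketch of a simplified model, not part of the PR: it assumes a worker starts booting the moment it becomes free and that the next request reaches it `idle_ms` later. The function names and the model itself are illustrative assumptions.

```c
/* Simplified latency model for early bootstrapping (illustrative only;
 * these functions are hypothetical and not part of php-src). */

/* Latency seen by the client when the application boots inside the
 * request, as in classic FPM: boot time plus handling time. */
double latency_classic_ms(double boot_ms, double handle_ms)
{
    return boot_ms + handle_ms;
}

/* Latency seen by the client when the worker began booting idle_ms
 * before the request arrived. Only the part of the boot that has not
 * finished yet is still visible to the client. */
double latency_early_boot_ms(double boot_ms, double handle_ms, double idle_ms)
{
    double remaining = boot_ms - idle_ms;
    if (remaining < 0.0) {
        remaining = 0.0; /* boot finished before the request arrived */
    }
    return remaining + handle_ms;
}
```

With a 100ms boot and 20ms of handling, a worker that idled for 500ms answers in 20ms, while a worker that gets a request immediately (pool at capacity) still needs the full 120ms.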

I ran this with a small sample: https://gist.github.com/beberlei/19e58717d0fbf1e95685ff22255fa8d7 where the time generated in boot.php is clearly many seconds before the actual request, with the child waiting to accept a connection in between.

While this might significantly reduce latency for users, there are obviously some risky edge cases in an architecture like this. For example:

  • what to do when boot script causes an error / exception? should the primary script still run and be expected to boot the application again?
  • What to do with output from boot? i would probably discard it
  • What about $_SERVER['REQUEST_START']?
  • What about extensions that expect the request data to be present in RINIT already?
  • Are we ok with an FPM child consuming potentially "memory_limit" amount of data while idling for a request?
  • bootstrap_file script runs outside of a cwd, what does that mean for open_basedir? php.ini settings per-dir and so on.

One easily explainable downside of this kind of approach is open connections to databases. For example, if the boot script opens a MySQL connection that then idles for some time before handling a request, the well-known "MySQL server has gone away" error will appear. In general, holding any network connection during the idle time is a really bad idea.

These questions are rather tricky, so the question is:

Would this be something worth pursuing regardless of the hurdles to fit into PHP's process model?

Appendix

Relationship with Opcache Preloading

Preloading runs once during PHP-FPM process initialization, before the worker children are spawned. That is why it makes the startup slower. The loaded classes and functions are then stored in Opcache and do not need to be loaded into individual requests anymore (copied from shared memory).

Early Bootstrapping would target another stage of the request, notably the part of the application bootstrapping that executes the same code regardless of the details of the next request. This includes loading objects into memory and wiring them together.
One notable difference from preloading is that the bootstrapping step will be run before every request.

As an example for those familiar with it: in Symfony this could allow splitting up the Kernel::boot phase and the Kernel::handle($request) phase of the script execution.

Relationship with Shared Nothing Architecture

This feature would not break the shared-nothing architecture of PHP-FPM; it only moves the application bootstrapping to a different step of the request/response cycle. Notably, the PHP memory state that is created in the bootstrap file will be destroyed alongside all memory that the primary script allocates and uses. memory_get_usage() will record the amount of memory used in the boot and the primary script combined. Essentially, fpm.bootstrap_file works sort of like auto_prepend_file.

@beberlei
Contributor Author

@bukka you probably know a few additional problems that I haven't thought of yet.

@aran112000

aran112000 commented Mar 15, 2021

@beberlei love this!

Regarding these two questions, I would personally suggest introducing new PHP ini settings for each to control if they should be buffered and output by the FPM child workers, much like display_startup_errors, with the default being for them to simply be discarded.

  • what to do when boot script causes an error / exception? should the primary script still run and be expected to boot the application again?
  • What to do with output from boot? i would probably discard it

@Slamdunk

Just to clear up my ignorance, how would this interact with opcache.preload?

@dragoonis
Contributor

Ben,

Scenario: you have this bootstrapping code in memory, and then do a release and want to reload this bootstrap.

What's the DX here for doing so?

Have you thought about race conditions with requests in flight when you flush it?

The framework bootstrap in here will have to be linked to the app code just deployed.

@beberlei
Contributor Author

@Slamdunk I added an explanation of the relation to Opcache Preloading to the original issue; the question was asked a few times outside this ticket as well.

@beberlei
Contributor Author

@dragoonis Similarly to Preloading you need to reload the PHP-FPM process to be safe. Otherwise you are right, you could run the primary script at the current release with a boot script that was loaded in the previous release. Good catch!

@ezimuel
Contributor

ezimuel commented Mar 15, 2021

@beberlei I think this is a very nice idea!

My feedback on some of the open points that you mentioned:

what to do when boot script causes an error / exception? should the primary script still run and be expected to boot the application again?

In case of error/exception I think we should abort the execution of FPM. If I understood the proposal, FPM executes the bootstrap file and stores it in memory to be ready for all the upcoming requests. If the code generates an error, this should be noticed in advance.

What to do with output from boot? i would probably discard it

We can store the output in a log file. In this way we can get feedback from the execution of the bootstrap.

What about $_SERVER['REQUEST_START']?

$_SERVER should be empty when FPM is loading. Anyway, I think $_SERVER['REQUEST_START'] should be null at that time.

Are we ok with an FPM child consuming potentially "memory_limit" amount of data while idling for a request?

The memory consumed by the bootstrap file should not be counted towards the memory_limit size.

bootstrap_file script runs outside of a cwd, what does that mean for open_basedir? php.ini settings per-dir and so on.

The bootstrap file should follow the same rules and restrictions configured in the PHP settings. I think fpm.bootstrap_file can be used only with a single PHP application, like opcache.preload.

I was thinking that it could also be useful to offer a get_fpm_boostrap(): string function to check whether a bootstrap file is already loaded. Or maybe we can just read it from the settings with ini_get('fpm.bootstrap_file').

@staabm
Contributor

staabm commented Mar 15, 2021

Just leaving this link here, in which mnapoli visualized the idea:

https://twitter.com/matthieunapoli/status/1371380248558907393

@NoiseByNorthwest

If the bootstrapping of an application takes 100ms

Shouldn't it be optimized (or some parts cached...) instead of displacing the problem to a hidden place, as this feature seems to do?

Moreover, if 100ms is too much, the waste of resources will still be present despite being hidden in some circumstances. I'm afraid it would be easy to lose sight of this issue and eventually end up with our backs to the wall when the 100ms becomes 200ms or during a peak traffic period.

This feature could also make debugging / tracing more difficult. For instance, pre-request code will not be able to add request-related information to the logging context.

@NoiseByNorthwest

As an example for the familiar, in Symfony this could allow splitting up the Kernel::boot

I'm curious, what can make Kernel::boot() significantly slow in production mode, with no room for optimization in userland?

Member

@bukka bukka left a comment

I like the idea in general. The tricky part is to get all edge cases sorted out because it changes some expectations of the process managers.

It might also make sense to introduce some time limit option, because for some less busy apps and, for example, the static pm, the delay between bootstrap and execution of the script could be quite big. So maybe something that would re-bootstrap if it took too long and the code is sensitive to that. I'm not sure about the use case though, as the only thing that came to my mind is opening a connection, which probably should not be done in bootstrap anyway. I mainly wanted to point out that there might be a significant delay.

There are probably more things that I haven't considered yet.
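The time limit idea above could look roughly like the following sketch. It is an assumption made for illustration only: neither the `bootstrap_state` struct nor a max-age setting exists in php-src or in this PR, and the worker would still need a way to act on the result (for example by shutting the request down and booting again).

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical per-worker bookkeeping for a bootstrap time limit.
 * Neither this struct nor the setting it models exists in php-src. */
typedef struct {
    time_t bootstrapped_at;  /* when the boot script finished */
    time_t max_age_seconds;  /* 0 or less disables the check */
} bootstrap_state;

/* Returns true when the bootstrapped state has been idle for longer
 * than the configured limit and should be rebuilt before use. */
bool bootstrap_is_stale(const bootstrap_state *s, time_t now)
{
    if (s->max_age_seconds <= 0) {
        return false;
    }
    return (now - s->bootstrapped_at) > s->max_age_seconds;
}
```

A worker would call this right after accepting a connection and before running the primary script.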

@@ -1392,6 +1393,7 @@ PHP_INI_BEGIN()
 STD_PHP_INI_BOOLEAN("fastcgi.logging", "1", PHP_INI_SYSTEM, OnUpdateBool, fcgi_logging, php_cgi_globals_struct, php_cgi_globals)
 STD_PHP_INI_ENTRY("fastcgi.error_header", NULL, PHP_INI_SYSTEM, OnUpdateString, error_header, php_cgi_globals_struct, php_cgi_globals)
 STD_PHP_INI_ENTRY("fpm.config", NULL, PHP_INI_SYSTEM, OnUpdateString, fpm_config, php_cgi_globals_struct, php_cgi_globals)
+STD_PHP_INI_ENTRY("fpm.bootstrap_file", NULL, PHP_INI_SYSTEM, OnUpdateString, fpm_bootstrap, php_cgi_globals_struct, php_cgi_globals)
Member

All FPM configuration should go to the FPM config and it should be pool specific in this case. This will be more consistent with other settings like status path for example.

while (1) {
    // moving this before init_request_info will break with dtrace
    // support in php_request_startup(), can we remove?
    if (UNEXPECTED(fpm_bootstrapped && php_request_startup() == FAILURE)) {
Member

NIT: you should probably move this to the condition below (no need to check fpm_bootstrapped twice)

if (fpm_bootstrapped) {
    zend_stream_init_filename(&bootstrap_file, CGIG(fpm_bootstrap));

    if (zend_execute_scripts(ZEND_REQUIRE, NULL, 1, &bootstrap_file) == FAILURE) {
Member

So here is the main missing piece that I see from my quick review. You are starting potentially longer work without changing the request_stage (updating the scoreboard). That might result in various problems with the dynamic and ondemand process managers because the FPM master will think that there is no work going on and consider the process idle. It means it might not scale properly, which might be problematic. It will probably cause a shorter clean up for ondemand, as it currently bases the last idle time on the end of the request. I think it would make sense to introduce a new request stage, but it probably requires a bit more thinking and careful consideration of all edge cases.

Another thing to consider is reloads, which usually wait for children to finish the request so that they can restart cleanly. The question is what should happen for bootstrap and whether there's any point in waiting.
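To make the "new request stage" suggestion concrete, here is a minimal sketch of one possible policy. The enum and its names are hypothetical and deliberately do not mirror FPM's real scoreboard definitions; treating a bootstrapping worker as busy is an assumption, not something decided in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical worker stages; illustrative names only, not FPM's
 * actual scoreboard enum. */
typedef enum {
    STAGE_BOOTSTRAPPING, /* the proposed new stage: boot script is running */
    STAGE_ACCEPTING,     /* booted and waiting for a connection */
    STAGE_EXECUTING,     /* running the primary script */
    STAGE_FINISHED       /* request done, about to loop again */
} worker_stage;

/* One possible policy: only a worker that is booted and waiting is
 * counted as idle when the master decides whether to scale the pool. */
bool worker_counts_as_idle(worker_stage stage)
{
    return stage == STAGE_ACCEPTING;
}

/* Count idle workers in a snapshot of the pool's stages. */
size_t count_idle_workers(const worker_stage *stages, size_t n)
{
    size_t idle = 0;
    for (size_t i = 0; i < n; i++) {
        if (worker_counts_as_idle(stages[i])) {
            idle++;
        }
    }
    return idle;
}
```

Whether a reload should wait for a worker in the bootstrapping stage remains the open question raised above; this sketch does not answer it.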

Contributor Author

We could allow this option only for static pools; it does make some sense for dynamic, but not for ondemand I guess.

Member

Except I guess not many users really use static, so it could end up being a somewhat useless feature :) It's a bit of a stupid PM, so I'd really like not to give it a preference. Dynamic is the default, so I think it will be the most used one, and from the bug reports it looks like people are using ondemand as well. Dynamic is doable, so it really needs to be done.

Ondemand might be a bit tricky, but it's related in some way to the reloads, which should still be done cleanly. We might need to figure out how to do a proper clean up because otherwise it might end without calling php_request_shutdown, leaving a hanging scoreboard proc (needed for dynamic) and some other things.

I think the solution might be introducing an event loop (epoll) in the worker which could hook up the signal handling (checking a pipe that the signal handler writes to) and then do a clean shutdown if SIGTERM is received, for example. It would help with other things as well - see for example the discussion in #4101. I think this might be a prerequisite for this feature.
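The pipe-based signal handling described above is commonly known as the self-pipe trick. Below is a minimal sketch of it, using portable poll(2) instead of epoll; it is not FPM code, and the function names `worker_signals_init` and `worker_wait` are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static int sig_pipe[2] = { -1, -1 };

/* Signal handler: only calls write(2), which is async-signal-safe. */
static void on_sigterm(int signo)
{
    int saved_errno = errno;
    unsigned char b = (unsigned char)signo;
    if (write(sig_pipe[1], &b, 1) < 0) {
        /* pipe full or closed: nothing safe to do inside a handler */
    }
    errno = saved_errno;
}

/* Create the pipe and install the SIGTERM handler.
 * Returns 0 on success, -1 on error. */
int worker_signals_init(void)
{
    struct sigaction sa;

    if (pipe(sig_pipe) != 0) {
        return -1;
    }
    /* Non-blocking, so neither the handler nor the reader can stall. */
    if (fcntl(sig_pipe[0], F_SETFL, O_NONBLOCK) < 0 ||
        fcntl(sig_pipe[1], F_SETFL, O_NONBLOCK) < 0) {
        return -1;
    }
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigterm;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGTERM, &sa, NULL);
}

/* Wait for a connection on listen_fd or for SIGTERM.
 * Returns 1 if SIGTERM arrived, 2 if listen_fd is readable,
 * 0 on timeout, -1 on error. A negative listen_fd is ignored by poll.
 * The timeout restarts if poll is interrupted by a signal. */
int worker_wait(int listen_fd, int timeout_ms)
{
    struct pollfd fds[2];
    unsigned char b;
    int n;

    fds[0].fd = sig_pipe[0];
    fds[0].events = POLLIN;
    fds[0].revents = 0;
    fds[1].fd = listen_fd;
    fds[1].events = POLLIN;
    fds[1].revents = 0;

    do {
        n = poll(fds, 2, timeout_ms);
    } while (n < 0 && errno == EINTR);

    if (n < 0) {
        return -1;
    }
    if (n == 0) {
        return 0;
    }
    if (fds[0].revents & POLLIN) {
        /* Drain the pipe so the next wait starts clean. */
        while (read(sig_pipe[0], &b, 1) == 1) {
        }
        return 1;
    }
    if (fds[1].revents & POLLIN) {
        return 2;
    }
    return 0;
}
```

A worker that has finished bootstrapping could sit in `worker_wait` instead of blocking in accept, and run its normal request shutdown path when the function returns 1.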

@samdark

samdark commented Mar 15, 2021

Nice one. We use a similar approach with Swoole / RoadRunner. The disadvantage is that you have to be more careful with state: https://github.com/yiisoft/docs/blob/master/guide/en/tutorial/using-with-event-loop.md. The benefit is response times of 1-2 ms and lower.

@joshuaadickerson

Would it be possible to debug with Xdebug?

@beberlei
Contributor Author

@joshuaadickerson yes, but only when automatically connecting to the debugger or using an environment variable as a trigger. Not with a request-based trigger.

@mnapoli

mnapoli commented Mar 16, 2021

The disadvantage is that you have to be more careful with state

@samdark not in this case. This is not an event loop or a long-running process. Maybe this diagram can help.

@samdark

samdark commented Mar 16, 2021

@mnapoli ah! Got the idea. Quite interesting 👍

@@ -1392,6 +1393,7 @@ PHP_INI_BEGIN()
 STD_PHP_INI_BOOLEAN("fastcgi.logging", "1", PHP_INI_SYSTEM, OnUpdateBool, fcgi_logging, php_cgi_globals_struct, php_cgi_globals)
 STD_PHP_INI_ENTRY("fastcgi.error_header", NULL, PHP_INI_SYSTEM, OnUpdateString, error_header, php_cgi_globals_struct, php_cgi_globals)
 STD_PHP_INI_ENTRY("fpm.config", NULL, PHP_INI_SYSTEM, OnUpdateString, fpm_config, php_cgi_globals_struct, php_cgi_globals)
+STD_PHP_INI_ENTRY("fpm.bootstrap_file", NULL, PHP_INI_SYSTEM, OnUpdateString, fpm_bootstrap, php_cgi_globals_struct, php_cgi_globals)
Contributor

What about early bootstrapping mode for Apache module?

@spriebsch

I like the idea, and I really think it's worth exploring. It should be noted, however, that most PHP applications/frameworks set up their object graph in a rather static fashion due to PHP's by-request nature. I'm not sure that bootstrapping into a state that allows you to process certain requests faster would allow processing of very different requests (that require a completely different object graph).

So either one had to cut back to "just some" bootstrapping, which would be against the spirit of the whole idea, or one would have to make sure that one FPM process only gets to process certain types of requests. I personally don't think that most frameworks/applications as they are today would be able to "just" pre-bootstrap and be able to work a lot faster. To really achieve that, substantial design changes to application and/or framework might be required.

This is not to say the approach has no value: for specific use cases, I can see a lot of value in it.

@halaei

halaei commented Mar 17, 2021

I have a slightly off topic question. Is it possible to make preloading or early bootstrapping per process pool instead of global so that we can have one fpm server serving multiple sites with different preloading/early bootstrapping configs?

@tpetry

tpetry commented Apr 17, 2021

The idea is really interesting and could really speed up applications. With this approach you could also redesign parts of an application and pre-calculate some things by moving the logic to the early bootstrapping mode. I just have some opinions to share:

1. Don't break the shared nothing architecture
Should the bootstrapping really happen only once for all requests of an fpm worker? I don't share the opinion that this is still a shared-nothing architecture. It can be, but that really depends on the implementation. If the state of the bootstrap is forked for every request then you are right: it's a tremendous improvement for the PHP ecosystem, with the ability to make PHP a lot faster than the opcache preloading improvements did. But if the bootstrapped state is mutable by every request, it's no longer a shared-nothing architecture, and you would have to check whether every dependency you are using is really compatible with the new approach. This is, in my opinion, a problem of the new Laravel Octane approach, because all libraries in the past have been built on a shared-nothing concept. They may work in a new mutable ecosystem, but they may also fail. You would have to check every library, which is mostly impossible, or just hope everything works correctly. I guess we don't want to have a problem like GitHub had recently, where user sessions were used by other requests, if libraries are able to change a mutable state.

2. Handling of output and errors
What's the current behaviour of opcache preloading? The early bootstrapping mode could use the same error handling behaviour, as the two features could be declared similar in their behaviour.

3. Extensions which may break
Are there really any extensions depending on request data already available in their RINIT phase? Maybe we could list them and if some are found decide on how to solve the problem? I guess a solution to this problem is really dependent on which extensions currently depend on this behaviour and how it can be solved for these extensions or if it is really not solvable and the early bootstrapping should just be deactivated.

@beberlei
Contributor Author

The idea is really interesting and could really speed up applications. With this approach you could also redesign parts of an application and pre-calculate some things by moving the logic to the early bootstrapping mode. I just have some opinions to share:

1. Don't break the shared nothing architecture

Should the bootstrapping really happen only once for all requests of an fpm worker? I don't share the opinion that this is still a shared-nothing architecture. It can be, but that really depends on the implementation. If the state of the bootstrap is forked for every request then you are right: it's a tremendous improvement for the PHP ecosystem, with the ability to make PHP a lot faster than the opcache preloading improvements did. But if the bootstrapped state is mutable by every request, it's no longer a shared-nothing architecture, and you would have to check whether every dependency you are using is really compatible with the new approach. This is, in my opinion, a problem of the new Laravel Octane approach, because all libraries in the past have been built on a shared-nothing concept. They may work in a new mutable ecosystem, but they may also fail. You would have to check every library, which is mostly impossible, or just hope everything works correctly. I guess we don't want to have a problem like GitHub had recently, where user sessions were used by other requests, if libraries are able to change a mutable state.

This implementation is really shared nothing; there is no choice for users. Early bootstrapping is always repeated before every request.

2. Handling of output and errors

What's the current behaviour of opcache preloading? The early bootstrapping mode could use the same error handling behaviour, as the two features could be declared similar in their behaviour.

I believe it gets discarded, however this is different because preloading happens in MINIT before the worker children are spawned.

3. Extensions which may break

Are there really any extensions depending on request data already available in their RINIT phase? Maybe we could list them and if some are found decide on how to solve the problem? I guess a solution to this problem is really dependent on which extensions currently depend on this behaviour and how it can be solved for these extensions or if it is really not solvable and the early bootstrapping should just be deactivated.

Yes, profilers and monitoring tools like Xdebug or my own extension Tideways do this. I assume Blackfire does as well.

@mvorisek
Contributor

2. Handling of output and errors
What's the current behaviour of opcache preloading? The early bootstrapping mode could use the same error handling behaviour, as the two features could be declared similar in their behaviour.

I believe it gets discarded, however this is different because preloading happens in MINIT before the worker children are spawned.

What about not exiting at startup - https://github.com/php/php-src/pull/6772/files#diff-0354097cf69667d6a030457e558abb7f479910bf797bc019124738f1fb3edd93R1880 - and saving the output to display it later during the standard request?

@tpetry

tpetry commented Apr 17, 2021

This implementation is really shared nothing, there is no choice for users. Early bootstrapping is always repeated before every request.

Would it be feasible to do the bootstrapping once for all further requests and just clone the bootstrapped state? Repeating the bootstrapping is a nice performance win when the number of requests is low, but for high-usage systems there will not be a big win, as these hypothetical 100ms of bootstrapping need to be done all the time. By only bootstrapping once and cloning/forking the bootstrapped state, these 100ms are only spent once, but the biggest improvement would be that much longer bootstrapping phases become possible. In node/ruby/python/... you can easily do multiple seconds of bootstrapping once to save a lot of time later. This would be a speed win for users (like the current concept) and would even save a lot of processor time, as the bootstrapping does not need to be repeated again and again, which means big applications can use fewer servers.

@beberlei
Contributor Author

@tpetry no, that is not possible within FPM's architecture. It requires something like RoadRunner.

@tpetry

tpetry commented Apr 18, 2021

@tpetry no, that is not possible within FPM's architecture. It requires something like RoadRunner.

@beberlei I am not sure if there has been some miscommunication or if it is really not solvable. But RoadRunner and Swoole are completely different architectures from what I meant; they remove the shared-nothing concept to be more performant.
Opcache preloading works (going by the documentation) by running a custom script once for the php-fpm binary or for every worker; it's not stated exactly. The loaded classes will then be available for every request without loading them again. What I wanted to add to the discussion was the same concept, just extended to keep the state. So for my simple example I will assume the preloading happens in the php-fpm workers:

  1. php fpm worker is started
  2. opcache.preload file is executed and classes are kept in some form of memory
  3. bootstrap file is executed and complete state after running script is not garbage collected but stored
  4. request is received, stored state of 3 is cloned and "prepended"
  5. request is finished, state of request is garbage collected as has been all the time
  6. request is received, stored state of 3 is cloned and "prepended"
  7. request is finished, state of request is garbage collected as has been all the time
  8. ...
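As a toy illustration of steps 3 to 7, the sketch below keeps one immutable snapshot and hands every request a private copy. Everything in it is hypothetical, and it only works because the stand-in state is plain data that can be copied byte for byte; real PHP state with streams, extension objects and global state is a different matter.

```c
#include <string.h>

/* Toy stand-in for "application state after bootstrap".
 * Plain data only; hypothetical, not a PHP structure. */
typedef struct {
    int  routes_loaded;
    char current_user[32];
} app_state;

/* Step 3: run the bootstrap once and keep its result. */
app_state bootstrap_once(void)
{
    app_state s;
    memset(&s, 0, sizeof(s));
    s.routes_loaded = 42; /* pretend the router was built here */
    return s;
}

/* Steps 4 to 7: each request works on its own clone of the snapshot.
 * The request-local state is written to *out so the caller can inspect
 * it; the snapshot itself is never modified. */
void handle_request(const app_state *snapshot, const char *user, app_state *out)
{
    app_state local = *snapshot; /* clone by value */
    strncpy(local.current_user, user, sizeof(local.current_user) - 1);
    *out = local;
}
```

Handling a request for "alice" and then one for "bob" leaves the snapshot's `current_user` empty both times, which is the shared-nothing property being asked for.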

So I did not mean to keep the state of an executed request and have it available for further requests, as RoadRunner does, which breaks PHP's shared-nothing architecture. I mean that only the state of the bootstrap script is available again and again, without any changes to it. This could be done by unserializing the serialized state of the bootstrap again and again, by cloning the internal memory model, or by forking the php-fpm worker for every request to execute the request.

But to be honest, I have only done simple experiments with the PHP implementation in the past, so I am not aware whether keeping the state of the bootstrap and reusing it again and again without any modifications (to be shared nothing) is really feasible. Nevertheless, I wanted to suggest an extension to the current idea that would enable much greater performance improvements, as the bootstrapping would not have to be re-executed again and again. This would make php-fpm as performant as Swoole or RoadRunner while still adhering to PHP's shared-nothing architecture, which makes developing so simple and worry-free.

@JosephSilber

@tpetry To clarify for others, it seems like you're proposing something similar to V8 snapshots:

embedders can utilize snapshotting to skip over the startup time

The Atom text editor (and other Electron apps) actually use this to speed up their startup times:

In order to improve startup time, when Atom is built we create a V8 snapshot in which we preload core services and packages.

@bukka
Member

bukka commented Apr 19, 2021

bootstrap file is executed and complete state after running script is not garbage collected but stored

I think the storing (and restoring) is the main issue here, as there would have to be a way to create a snapshot, which would probably mean some sort of serialization of all zvals. Here is the first big problem: there are lots of things that cannot be serialized, for example streams and lots of internal and extension classes. Even if this were somehow resolved, all global state would have to be stored and restored too, which would be difficult as well because it includes the global state of external extensions... There are likely more things that I'm missing now, but as you can see, just the ones mentioned are possibly pretty big obstacles.

@JosephSilber

Is serialization the only option?

Can't the snapshot live in memory?

@tpetry

tpetry commented Apr 20, 2021

@tpetry To clarify for others, it seems like you're proposing something similar to V8 snapshots:

That's an excellent explanation of the concept I had in mind.

Can't the snapshot live in memory?

That's what I had in mind. Like the fork() operation, to just continue the operation in a new copy. But I guess the new JIT functionality will be problematic in this case.

@bukka
Member

bukka commented Apr 20, 2021

Well, you would need to have a mapping of memory to zvals, so it's kind of serialization. As I said, even if you overcome this bit, you then have to deal with resources (e.g. streams that just can't be stored) and internal / extension classes that have their own memory, so some sort of API for that and for globals would be required. I'm most likely missing other things as well. V8 doesn't have to deal with extensions and doesn't have so much global state that needs storing, from what I know, which is not much. 😃

Using fork before each request doesn't make any sense in the FPM model, as we are talking about FPM children. Those are forked from the master and need to be controlled by the master. The whole point of fastcgi is that you can re-use the process, so its flow is basically like:

accept -> execute request -> accept -> execute another request

If you fork it after each request, then you are basically using a sort of CGI, creating and killing lots of new processes, which will be much slower and not good for resources. I'm not even thinking about how this would screw up master process resources, monitoring and basically everything else... 😄

@bukka
Member

bukka commented Apr 19, 2024

Just another note here that this won't really have any effect for keep alive connections because it will always block the connection for the whole time of bootstrapping. Same reason as why keepalive does not work properly with fastcgi_finish_request: see #10335 (comment) for more info

@bukka bukka mentioned this pull request Mar 2, 2025
@bukka
Member

bukka commented Mar 2, 2025

I'm going to close this, as this solution is unworkable for persistent connections, so it's not something that we could introduce in anything close to the current form. I can see a way to do this as part of a pool manager, but it's significantly more complex - more info can be found in #11723 (comment)

@bukka bukka closed this Mar 2, 2025