JIT: Hibernate JIT to disk for use with pre-built containers #16484

Closed
iamacarpet opened this issue Oct 17, 2024 · 19 comments

Comments

@iamacarpet
Contributor

Description

Idea:

Support hibernating the JIT cache to disk.

Why?:

When building containers that contain both the PHP runtime and application code (that isn’t going to change once built into the container), a lot of CPU cycles are wasted re-warming the opcache/JIT every time a version of the container starts, even though the code will always remain the same.

In a large distributed container based platform like Lambda, App Runner, App Engine or Cloud Run, there are significant performance gains & cost savings to be realised by being able to pre-warm the JIT at container build time, then load it from disk at container startup.

How?:

Would it be possible to dump the JIT/opcache memory to disk?

While also having the option to load that dump at startup?

It could be controlled via ini options, which could be set on the PHP CLI as arguments when dumping the memory at build time.

@iluuu1994
Member

Hi @iamacarpet. Opcache can be persisted to disk, it works in the way you describe. See:

https://www.php.net/manual/en/opcache.configuration.php#ini.opcache.file-cache

That's not currently an option for jitted code though. I'll defer to @dstogov on whether that is possible. Serializing assembly would be platform specific, so potentially something that would need to be added to dstogov/ir.

@iamacarpet
Contributor Author

iamacarpet commented Oct 17, 2024

Thanks @iluuu1994 ,

I was just reading that; I spotted it right after opening the issue :).

Thanks so much for confirming it doesn’t dump the JIT’d code - that wasn’t clear anywhere.

While we wait for @dstogov , is there any benefit from using this for opcache, if PHP will then use JIT at runtime? i.e. is it layered and opcache is the first layer, then the JIT runs - or does one supersede the other?

@iluuu1994
Member

iluuu1994 commented Oct 17, 2024

is there any benefit from using this for opcache, if PHP will then use JIT at runtime?

Yes. Before opcodes can be jitted, they first need to be built by compiling the source files. So you're at least speeding up the first half.

While we wait for dshafik

Think you linked the wrong person there.

@iamacarpet
Contributor Author

Think you linked the wrong person there.

Oh nuts! Thank you for spotting that! Edited :).

Yes. Before opcodes can be jitted, they first need to be built by compiling the source files. So you're at least speeding up the first half.

Perfect, I’ll give it a try and see if we can measure an increase in performance - if we can, hopefully bringing the JIT into the equation is something we can do in a future release.

Thanks for the help so far!

@cmb69
Member

cmb69 commented Oct 17, 2024

For what it's worth, a couple of years ago I measured file_cache vs no OPcache on Windows/NTFS, and there was no performance difference. That might be different for other platforms/filesystems, though.

Anyhow, I think you're actually looking for ahead-of-time compilation.

@iluuu1994
Member

iluuu1994 commented Oct 17, 2024

@cmb69 What did you measure? Cold start? I think it's safe to say PHP will never have ahead-of-time compilation.

@kocoten1992

I think Facebook calls it Jump-Start.

@cmb69
Member

cmb69 commented Oct 17, 2024

What did you measure?

That was a full benchmark against a couple of downstream projects, similar to Máté's benchmark. There had been a warm-up phase upfront. Main problem on Windows is reading (small) files – super slow. WinCache might have sped that up, but never tried it.

@iluuu1994
Member

iluuu1994 commented Oct 17, 2024

Does that mean files were already read from file cache into shm? Or did you disable shm and compare the performance of file cache w/o shm vs. no cache at all?

@cmb69
Member

cmb69 commented Oct 17, 2024

Or did you disable shm and compare the performance of file cache w/o shm vs. no cache at all?

That. (opcache.file_cache_only=1)

@iluuu1994
Member

iluuu1994 commented Oct 17, 2024

Thank you! While I'm surprised that's not faster on Windows, it might be on Linux. But I haven't measured it myself.

@dstogov
Member

dstogov commented Oct 18, 2024

iamacarpet we don't plan to serialize JIT code to disc.

cmb69 I think file_cache_only vs no-opcache should give a speed improvement on Windows as well. It may be insignificant for a single request, but for a number of requests to some fat Symfony app running on a FastCGI PHP server I would expect good results.

@cmb69
Member

cmb69 commented Oct 18, 2024

The general test environment consisted of 4 VMs:

  • a database server (MySQL) which had already prepared demo databases
  • a Webserver (IIS) with a couple of demo applications
  • 2 clients

The clients used wcat to perform the tests. That included a warmup phase (in seconds) and an actual measurement phase (in seconds). That was typically run to test performance of RC1 vs latest GA, and GA vs RC1. The results were then published, e.g. https://windows.php.net/downloads/snaps/ostc/pftt/perf/7.4.12-7.4.13RC1.html. As you can see, SHM OPcache made a serious perf improvement (WP got almost 10 times as fast). I no longer have the results of that single file_cache test I did (I don't even know which PHP version it used; likely some PHP 7.4.x), but there were no notable performance differences (at least not outside of the usual outliers/instability).

I still think the problem is the slow file system operations on Windows/NTFS (which eat up the performance benefits of pre-compilation). Even after OPcache became generally available with PHP 5.5, WinCache still offered a file cache and a resolve file path cache (both SHM) to improve this. Unfortunately, that extension was abandoned a couple of years ago. Oh, apparently it has been revived. I think I'll have a closer look.

we don't plan to serialize JIT code to disc

Okay, then this ticket can be closed.

@cmb69 closed this as not planned on Oct 18, 2024
@iamacarpet
Contributor Author

I want to update on my status trialling this, and a few things I've learnt:

  • Part of the opcache file path on disk is the zend_system_id, which in my testing only stays the same for the exact same build of PHP. So if your service restarts to install a PHP update, the existing opcache files are no longer valid anyway. That isn't necessarily a problem for Docker containers, which stay static until updated as a whole.
  • On Linux with Docker, I do see quite a drastic improvement in serving the first request with a pre-warm'd opcache.
  • It isn't working when Docker has a read-only root file-system, and the opcache dir is read-only as a result.
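
To illustrate the first point: as far as I can tell from zend_file_cache.c, the cache file name is built from the cache directory, the system ID, the full script path and a `.bin` suffix. Treat that exact layout as an assumption on my part. `build_cache_path` below is a hypothetical helper written for this comment, not the php-src function, and any ID passed to it here is made up.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not php-src code): joins the cache directory, a
 * build-specific system ID and the absolute script path into one file name.
 * Assumed layout: <cache_dir>/<system_id><script_path>.bin
 * Returns 0 on success, -1 if the result did not fit into `out`.
 */
int build_cache_path(char *out, size_t out_size, const char *cache_dir,
                     const char *system_id, const char *script_path)
{
    int n = snprintf(out, out_size, "%s/%s%s.bin",
                     cache_dir, system_id, script_path);

    /* snprintf reports the length it wanted to write, so n >= out_size
     * means the path was truncated. */
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Calling it for the same script with two different IDs gives two different paths, which is why a cache warmed by one PHP build is simply never found by another build.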

Benchmarks:

First request without opcache warm'd:

$ curl -w "@curl-format.txt" -o /dev/null -s http://localhost:32768/_ah/warmup
     time_namelookup:  0.000032s
        time_connect:  0.000160s
     time_appconnect:  0.000000s
    time_pretransfer:  0.000193s
       time_redirect:  0.000000s
  time_starttransfer:  0.221376s
                     ----------
          time_total:  0.221478s

First request with opcache pre-warm'd:

$ curl -w "@curl-format.txt" -o /dev/null -s http://localhost:32768/_ah/warmup
     time_namelookup:  0.000029s
        time_connect:  0.000164s
     time_appconnect:  0.000000s
    time_pretransfer:  0.000198s
       time_redirect:  0.000000s
  time_starttransfer:  0.053059s
                     ----------
          time_total:  0.053171s
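
As a quick worked figure from the two time_total values above (one local sample each, so only indicative):

```latex
\[
\text{speedup} = \frac{0.221478\,\text{s}}{0.053171\,\text{s}} \approx 4.2,
\qquad
\text{time saved} = 0.221478\,\text{s} - 0.053171\,\text{s} = 0.168307\,\text{s}
\approx 76\%\ \text{of the cold first request}
\]
```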

The CPU on the machine I'm testing from is a lot more powerful than the CPU allocation for containers in production (it's lots of small instances that scale horizontally), so I'm expecting the improvement to be even more dramatic there.

Struggling to test it properly though, as prod/pre-prod for serverless all has a read-only root FS.

I think the problem with it not working with a read-only file system / directory is related to this:

https://github.com/php/php-src/blob/master/ext/opcache/zend_file_cache.c#L1836

Potentially it's failing to get a shared lock on the cache file, and silently bombing out - I'm going to investigate further.
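
One way to check the locking theory in isolation is a minimal sketch like the following. It assumes the lock in question behaves like a plain POSIX-style `flock()` shared lock on a descriptor opened read-only. `can_share_lock` is a hypothetical helper, not php-src code, and whether `flock()` succeeds on a particular read-only mount can depend on the filesystem, so it needs to be run on the target system.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper (not php-src code): returns 0 if `path` can be opened
 * read-only and a shared lock can be taken on it, -1 otherwise.
 */
int can_share_lock(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;              /* cannot even open the file for reading */
    }

    /* LOCK_NB makes the call fail instead of blocking if another process
     * currently holds an exclusive lock. */
    int rc = flock(fd, LOCK_SH | LOCK_NB);
    if (rc == 0) {
        flock(fd, LOCK_UN);     /* release again, this is only a probe */
    }

    close(fd);
    return rc == 0 ? 0 : -1;
}
```

If this returns 0 for a cache file inside the read-only container, the lock itself is probably not the culprit and the failure is somewhere earlier.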

@iamacarpet
Contributor Author

iamacarpet commented Oct 21, 2024

Ok, I managed to turn on opcache debugging via opcache.log_verbosity_level=4, which gave me this:

Warning opcache.file_cache must be a full path of accessible directory.

On searching the codebase, that led me here:

https://github.com/php/php-src/blob/master/ext/opcache/zend_accelerator_module.c#L179

Above that log line, I can see that on Linux it requires read, write and execute permissions on the directory, or it won't even try to load anything from there.

static ZEND_INI_MH(OnUpdateFileCache)
{
	if (new_value) {
		if (!ZSTR_LEN(new_value)) {
			new_value = NULL;
		} else {
			zend_stat_t buf = {0};

		    if (!IS_ABSOLUTE_PATH(ZSTR_VAL(new_value), ZSTR_LEN(new_value)) ||
			    zend_stat(ZSTR_VAL(new_value), &buf) != 0 ||
			    !S_ISDIR(buf.st_mode) ||
#ifndef ZEND_WIN32
				access(ZSTR_VAL(new_value), R_OK | W_OK | X_OK) != 0) {
#else
				_access(ZSTR_VAL(new_value), 06) != 0) {
#endif
				zend_accel_error(ACCEL_LOG_WARNING, "opcache.file_cache must be a full path of accessible directory.\n");
				new_value = NULL;
			}
		}
	}
	OnUpdateString(entry, new_value, mh_arg1, mh_arg2, mh_arg3, stage);
	return SUCCESS;
}

So, while my original request was to serialize the JIT to disk, it seems the most useful thing is being able to serialize the opcache to disk (for the JIT to further optimize at runtime): would it be possible to make some modifications to allow this to happen, even when the disk is read only?

In this situation, we'd read existing cache files (where applicable), but not try to write anything new.
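
To make the proposal concrete, here is a minimal sketch of what a relaxed directory check could look like. `cache_dir_usable` and its `read_only` flag are hypothetical names, not part of php-src, and the sketch only mirrors the POSIX branch of the check quoted above. The idea is simply to stop asking for `W_OK` when the cache is meant to be read-only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not php-src code): returns true when `path` is an
 * absolute path to a directory that can serve as a file cache.
 * With `read_only` set, write permission is not required.
 */
bool cache_dir_usable(const char *path, bool read_only)
{
    struct stat st;

    /* POSIX-style absolute paths only, like the non-Windows branch above. */
    if (path == NULL || path[0] != '/') {
        return false;
    }
    if (stat(path, &st) != 0 || !S_ISDIR(st.st_mode)) {
        return false;
    }

    /* X_OK on a directory means permission to traverse (search) it. */
    int mode = read_only ? (R_OK | X_OK) : (R_OK | W_OK | X_OK);
    return access(path, mode) == 0;
}
```

With `read_only` set, a cache directory baked into a read-only image passes as long as it is readable and traversable; with it cleared, the permission mask is the same as in the current non-Windows check.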

What do you think? @iluuu1994 @cmb69 @dstogov

@dstogov
Member

dstogov commented Oct 21, 2024

would it be possible to make some modifications to allow this to happen, even when the disk is read only?

This is possible (you can do this), but it is unlikely to ever be merged into the main PHP branches.

@iamacarpet
Contributor Author

iamacarpet commented Oct 21, 2024

Does that mean you wouldn't be willing to accept it via a PR @dstogov ?

Our prod usage is via the vendor maintained runtimes on one of the large cloud providers, so unless the change could make it into a release, we wouldn't be able to use it.

I'm happy to spend some time putting together a PR if it's something you'll consider for acceptance.

EDIT:

I can see with some further research that something like this should be proposed as an RFC via the mailing list, is that right?

If so, unless you are very opposed to me putting this forward, I'm happy to follow the process - please excuse my ignorance.

@dstogov
Member

dstogov commented Oct 21, 2024

This may be accepted only as a carefully designed new feature. This would require an RFC, discussion, voting, etc. If accepted, it's going to be merged only into the master branch and then released with PHP-8.5.0 or PHP-9.0.0 in November 2025.

I'm not totally opposed. The idea of a pre-initialized read-only file cache may make sense, but it makes sense only when all PHP scripts are also read-only; timestamp validation and manual file cache invalidation should be disabled. I'm not sure about other problems...

@iamacarpet
Contributor Author

Thanks for your guidance @dstogov

but it makes sense only when all PHP scripts are also read-only; timestamp validation and manual file cache invalidation should be disabled.

I agree, and thank you, as I hadn't considered making it a validation type check.

You'll see I've put together a draft PR: before I approach the mailing list, if it isn't too much trouble, would you mind taking a look and giving me a code review please?

C isn't a language I work in frequently, so I'd like to get some validation before approaching anyone else.

iamacarpet added a commit to iamacarpet/php-src that referenced this issue Oct 22, 2024
iamacarpet added a commit to iamacarpet/php-src that referenced this issue Oct 23, 2024
iamacarpet added a commit to iamacarpet/php-src that referenced this issue Oct 24, 2024
iamacarpet added a commit to iamacarpet/php-src that referenced this issue Oct 25, 2024
iamacarpet added a commit to iamacarpet/php-src that referenced this issue Oct 29, 2024
iamacarpet added a commit to iamacarpet/php-src that referenced this issue Oct 30, 2024
iamacarpet added a commit to iamacarpet/php-src that referenced this issue Nov 5, 2024