Remove unnecessary memory clearing in virtual_file_ex() #10963
@nielsdos thanks for this and all your other contributions. For your consideration, and in case you didn't know it already, the Symfony project publishes a demo application that you can use for things like this too. See https://github.com/symfony/demo
Correct me if I'm wrong, can't this line read from uninitialized memory if we don't initialize resolved_path?
php-src/Zend/zend_virtual_cwd.c, line 1114 in 2f309de:
while (!IS_SLASH(resolved_path[start])) {
As IS_UNC_PATH only checks for len >= 2.
@javiereguiluz Thanks for pointing that out. This patch also likely improves that, because in general this patch will give an improvement if a lot of files are loaded (e.g. web framework files). I'll take that repo into account for future measurements.
dc497f5 to 94e380c
Alright I pushed the requested change. Thanks for reviewing! :)
94e380c to 5aefccb
I checked a simple Laravel CRUD application's home page under Callgrind and found that the line:
char resolved_path[MAXPATHLEN] = {0};
took up about 0.95% of the spent instruction count. This is because when opcache revalidates the timestamps, it has to go through the function virtual_file_ex() which contains that line. That line will memset 4096 bytes on my system to all zeroes. This is bad for the data cache and for the runtime.

I found that this memsetting is unnecessary in most cases, and that we can fix the one remaining case:

* Lines 1020-1027 don't do anything with resolved_path, so that's okay.
* Lines 1033-1098:
  - The !IS_ABSOLUTE_PATH branch will always result in a memcpy from path to resolved_path (+ sometimes an offset) with the total copied amount equal to path_length+1, so that includes a NUL byte.
  - The else branch either takes the WIN32 path or the non-WIN32 path.
    ° WIN32: There's a copy from path+2 with length path_length-1. Note that we chop off the first 2 bytes, so this also includes the NUL byte.
    ° Non-WIN32: Copies path_length+1 bytes, so that includes a NUL byte.

At this point we know that resolved_path ends in a NUL byte. Going further in the code:

* Lines 1100-1106 don't write to resolved_path, so no NUL byte is removed.
* Lines 1108-1136:
  - The IS_UNC_PATH branch:
    ° Lines 1111-1112 don't overwrite the NUL byte, because we know the path length is at least 2 due to the IS_UNC_PATH check.
    ° Both while loops uppercase the path until a slash is found. If a NUL byte was found then it jumps to verify. Therefore, no NUL byte can be overwritten. Furthermore, Lines 1121 and 1129 cannot overwrite a NUL byte because the check at lines 1115 and 1123 would've jumped to verify when a NUL byte would be encountered.
    Therefore, the IS_UNC_PATH branch cannot overwrite a NUL byte, so the NUL byte we know we already got stays in place.
  - The else branch:
    ° We know the path length is at least 2 due to IS_ABSOLUTE_PATH. That means the earliest NUL byte can be at index 2, which can be overwritten on line 1133. We fix this by adding one byte write if the length is 2.

All uses of resolved_path in lines 1139-1141 have a NUL byte at the end now.
Lines 1154-1164 do a bunch of post-processing but line 1164 will make sure resolved_path still ends in a NUL byte.

So therefore I propose to remove the huge memset, and add a single byte write in that one else branch I mentioned earlier.

Looking at Callgrind, the instruction count before this patch for 200 requests is 14,264,569,942; and after the patch it's 14,129,358,195 (averaged over a handful of runs).
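The one remaining case described above (a two-character drive path whose NUL byte at index 2 gets overwritten by the slash) can be illustrated with a minimal sketch. This is not the code from php-src: the helper name sketch_normalize_drive, the constant SKETCH_MAXPATHLEN, and the use of a backslash as the separator are assumptions made for illustration only. The buffer is deliberately left without a zero initializer, as the patch proposes.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Assumed buffer size; the description reports 4096 bytes on the author's system. */
#define SKETCH_MAXPATHLEN 4096

/*
 * Hypothetical helper (not a php-src function): copies a drive-letter path
 * such as "c:" or "c:/dir" into `resolved`, uppercases the drive letter and
 * forces a backslash at index 2. `resolved` is never zero-filled; the only
 * terminator comes from the memcpy plus the single-byte fix-up below.
 * Returns the length of the resulting string.
 */
size_t sketch_normalize_drive(const char *path, char *resolved, size_t cap)
{
    size_t len = strlen(path);

    assert(len >= 2 && path[1] == ':'); /* stand-in for an absolute-path check */
    assert(len + 2 <= cap);             /* room for the slash and the NUL byte */

    /* Copy len + 1 bytes, so the NUL byte of `path` comes along. */
    memcpy(resolved, path, len + 1);

    resolved[0] = (char)toupper((unsigned char)resolved[0]);
    resolved[2] = '\\'; /* when len == 2 this overwrites the copied NUL byte */

    if (len == 2) {
        /* Single-byte write that replaces the 4096-byte memset. */
        resolved[3] = '\0';
    }
    return strlen(resolved);
}
```

With a buffer pre-filled with junk to stand in for uninitialized stack memory, calling the helper on "c:" leaves a properly terminated three-character string, and longer inputs such as "d:/www" keep the terminator that memcpy copied.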
5aefccb to 9a654a4
result may be a slot in op2. In that case SEPARATE_ARRAY() will change both result and the slot in op2. Looping over op2 and inserting the element results in both reference-less recursion which we don't allow, and increasing the refcount to 2, failing any further insertions into the array. Avoid this by copying result into a temporary zval and performing separation there instead. Fixes phpGH-10963