Fix GH-10737: PHP 8.1.16 segfaults on line 597 of sapi/apache2handler/sapi_apache2.c #10863

Closed
wants to merge 4 commits into from

@nielsdos commented Mar 16, 2023

It was actually a TSRM issue after all. Elliot confirmed this patch fixes the problem.

The TSRM keeps a hashtable mapping thread IDs to thread resource pointers. It's possible that a thread disappears without us knowing, and that another thread gets spawned some time later with the same ID as the disappeared thread. Note that since it's a new thread, the TSRM key pointer and cached pointer will be NULL.

The Apache request handler php_handler() will try to fetch some fields from the SAPI globals. It allocates the thread resource lazily by calling ts_resource(0). This allocates a thread resource and sets up the TSRM pointers if they haven't been set up yet.

At least, that's what's supposed to happen. But since we are in a situation where the thread ID still has the resources of the old thread associated in the hashtable, the loop in ts_resource_ex will find that thread resource and assume the thread has already been set up. This is not the case: the thread is actually a new one that merely reuses the ID of the old one, without any relation whatsoever to the old thread. Because of this assumption, the TSRM pointers will not be set up, leading to a NULL pointer dereference when trying to access the SAPI globals.
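
To make the failure mode concrete, here is a minimal single-threaded model of it. It is a sketch under stated assumptions, not the TSRM code: `entry`, `model_thread`, `table` and `lookup_trusting` are hypothetical names, and the `tls` field merely stands in for the thread-local pointer that is NULL in a freshly spawned thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names come from the TSRM sources. */
typedef struct entry {
    unsigned long thread_id;  /* ID of the thread that created the entry */
    struct entry *next;       /* hash bucket chain */
} entry;

typedef struct {
    unsigned long id;  /* what the OS reports as the thread ID */
    entry *tls;        /* models the thread-local pointer; NULL in a new thread */
} model_thread;

#define BUCKETS 4
static entry *table[BUCKETS];

/* Lazy lookup with the flaw described above: a hit in the table is trusted,
 * so the thread-local pointer is only set on the "not found" path. */
static entry *lookup_trusting(model_thread *t)
{
    if (t->tls != NULL) {
        return t->tls;                /* fast path */
    }
    entry **slot = &table[t->id % BUCKETS];
    for (entry *e = *slot; e != NULL; e = e->next) {
        if (e->thread_id == t->id) {
            return e;                 /* possibly stale hit: t->tls stays NULL */
        }
    }
    entry *fresh = calloc(1, sizeof(*fresh));
    assert(fresh != NULL);
    fresh->thread_id = t->id;
    fresh->next = *slot;
    *slot = fresh;
    t->tls = fresh;                   /* the only place the pointer is set */
    return fresh;
}
```

In this model, a second `model_thread` with the same `id` and a NULL `tls` gets the first thread's entry back and still has a NULL `tls` afterwards, which corresponds to the state that later leads to the NULL dereference.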

We can easily detect this scenario: if we're in the fallback path, the pointer is NULL, and we're looking for our own thread resource, we know we're actually reusing a thread ID. In that case, we free the old thread resources gracefully (gracefully because there might still be open resources, such as database connections, that need to be shut down cleanly). After freeing them, we create the new resources for this thread as if the stale resources never existed in the first place. From that point forward, it is as if the situation never occurred. The fact that this situation happens isn't that bad, because the SAPI will eventually respawn a child process containing threads anyway, so the stale thread resources won't remain forever.
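
The detection and recovery steps can be sketched roughly as follows. This is a simplified, single-threaded model rather than the actual patch: all names (`entry`, `model_thread`, `lookup_checked`, `release_entry`, `handles_closed`) are hypothetical, and the model only covers lookups for the calling thread's own ID, so that part of the condition is implicit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names come from the TSRM sources. */
typedef struct entry {
    unsigned long thread_id;
    int open_handles;         /* models resources that need a clean shutdown */
    struct entry *next;
} entry;

typedef struct {
    unsigned long id;
    entry *tls;               /* models the thread-local pointer */
} model_thread;

#define BUCKETS 4
static entry *table[BUCKETS];
static int handles_closed;    /* counts handles released during cleanup */

static entry *allocate_entry(model_thread *t)
{
    entry *fresh = calloc(1, sizeof(*fresh));
    assert(fresh != NULL);
    fresh->thread_id = t->id;
    fresh->next = table[t->id % BUCKETS];
    table[t->id % BUCKETS] = fresh;
    t->tls = fresh;           /* the new thread gets its pointer set up */
    return fresh;
}

static void release_entry(entry *e)
{
    handles_closed += e->open_handles;  /* "graceful" part of the cleanup */
    free(e);
}

static entry *lookup_checked(model_thread *t)
{
    if (t->tls != NULL) {
        return t->tls;        /* fast path: already set up */
    }
    /* Fallback path with a NULL pointer: an entry carrying our own ID can
     * only be left over from a vanished thread, so unlink and release it. */
    entry **link = &table[t->id % BUCKETS];
    while (*link != NULL) {
        entry *e = *link;
        if (e->thread_id == t->id) {
            *link = e->next;
            release_entry(e);
            break;
        }
        link = &e->next;
    }
    return allocate_entry(t); /* proceed as if the stale entry never existed */
}
```

After a stale entry has been replaced, the new thread's `tls` is non-NULL, so subsequent lookups from that thread take the fast path.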

Note that we can't simply assign our own TSRM pointers to the existing thread resource for our ID, since it actually belonged to a different thread (just with the same ID!). Furthermore, the dynamically loaded extensions have their own pointer, which is only set when their constructor is called, so we'd have to call their constructor anyway...
I also tried calling the dtor and then the ctor again for those resources on the pre-existing thread resource to reuse its storage, but that didn't work properly: other code doesn't expect something like that to happen, so its assumptions break, which in turn caused Valgrind to (rightfully) complain about memory bugs.

Note 2: I also had to fix a bug in the core globals destruction because it always assumed that the thread destroying them was the owning thread, which on TSRM shutdown isn't always the case. A similar bug was fixed recently with the JIT globals.
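
As a rough illustration of the kind of bug described in Note 2, the sketch below contrasts a destructor that implicitly works on the running thread's globals with one that only touches the globals it is handed. It is a hypothetical model: `model_globals`, `current_thread_globals` and both destructor names are invented here and do not correspond to the actual core globals code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real core globals. */
typedef struct {
    int tick_functions;   /* models per-thread state released on shutdown */
} model_globals;

/* Models "the globals of whichever thread is running right now". */
static model_globals *current_thread_globals;

/* Flawed shape: ignores its argument and cleans up the caller's own globals,
 * which is only correct when the caller is the owning thread. */
static void globals_dtor_assuming_owner(model_globals *g)
{
    (void)g;
    current_thread_globals->tick_functions = 0;
}

/* Safer shape: only touches the globals it was asked to destroy. */
static void globals_dtor_explicit(model_globals *g)
{
    assert(g != NULL);
    g->tick_functions = 0;
}
```

When one thread destroys another thread's globals on shutdown, the first variant leaves the target untouched and clears the caller's state instead, while the second cleans up exactly what it was given.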


(Also: it seems like we don't have a TSRM label?)

Also a huge thanks to @ElliotNB for constantly testing my debug patches and reporting back with traces and logs. I wouldn't have been able to solve this issue without his help.

After this is merged I'll check the other Apache reports, because there are probably ones with a similar root cause.

@nielsdos nielsdos marked this pull request as draft March 16, 2023 22:04
@nielsdos

Marking as draft because a remaining issue was found. Will update this once we figure it out.

No semantic changes.
The condition is misleading because thread_resources can never be NULL.
What we actually want to do is check whether we found the resources
corresponding to the current thread ID.

Suggested-by: ylavic
@nielsdos

I believe I fixed all remaining issues. The side-quest was to find the places where module destruction happened in non-reverse order. This messed up dependencies: e.g. the reporter noticed a crash when php_pcre tried to call efree after the zend_alloc globals were already freed.
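
Why destruction order matters can be shown with a small model. This is an illustrative sketch only: the registry, the two shutdown functions and the counters are hypothetical and merely mimic a dependent module that still needs the allocator while shutting down.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MODULES 8

typedef void (*shutdown_fn)(void);

/* Hypothetical registry: modules are appended in startup order. */
static shutdown_fn registry[MAX_MODULES];
static size_t registered;

static int allocator_alive;
static int frees_after_allocator_shutdown;  /* counts would-be crashes */

static void allocator_shutdown(void)
{
    allocator_alive = 0;
}

/* Models a module that releases memory through the allocator on shutdown. */
static void dependent_shutdown(void)
{
    if (!allocator_alive) {
        frees_after_allocator_shutdown++;
    }
}

static void register_module(shutdown_fn fn)
{
    assert(registered < MAX_MODULES);
    registry[registered++] = fn;
}

/* Shut modules down in reverse registration order, so that a module is
 * destroyed before anything it depends on. */
static void shutdown_all_reverse(void)
{
    for (size_t i = registered; i-- > 0; ) {
        registry[i]();
    }
    registered = 0;
}
```

With the allocator registered first and the dependent module second, the reverse walk runs `dependent_shutdown` while the allocator is still alive. A forward walk over the same registry would hit the counter instead.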

The last commit purely cleans up the while loop to be less misleading: the loop condition would always be true, and the thread ID check is the real condition. If such a change is not acceptable for stable versions, I can drop that commit for stable and put it in master only.
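
The shape of that cleanup can be sketched on a plain linked chain. The code below is an assumption-laden illustration, not the committed change: `node` and `find_or_append` are hypothetical, and the caller passes in a spare node instead of allocating one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical chain node; not the real thread resource structure. */
typedef struct node {
    unsigned long thread_id;
    struct node *next;
} node;

/* The cursor is never NULL inside the loop, so testing it would always
 * succeed. The ID comparison is the real exit condition and is therefore
 * written as the loop condition itself. */
static node *find_or_append(node *head, unsigned long id, node *spare)
{
    assert(head != NULL);
    node *cur = head;
    while (cur->thread_id != id) {
        if (cur->next == NULL) {
            /* End of chain: link in the spare node for this ID. */
            spare->thread_id = id;
            spare->next = NULL;
            cur->next = spare;
        }
        cur = cur->next;
    }
    return cur;
}
```

Written this way, the loop states its intent directly: keep walking until the node for the requested ID is found, appending one if the chain ends first.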

Upon merging, I'll squash all commits into one, but I thought it was best for easier review to not force-push things.

@HeenaBansal2009

@nielsdos, kudos to you! This is great news. Quite a tricky issue to solve.
Any plans to backport this fix to PHP 7.*?

@nielsdos

Any plans to backport this fix to PHP 7.* ?

7.x is out of support; the lowest version still receiving bug fixes is 8.1.x.
May I ask what's preventing you from upgrading?
You can also always check whether this PR (and the other ZTS fix: #10861) applies to the version you're using.
Hope this helps! :)

@ElliotNB commented Apr 7, 2023

Eagerly awaiting the review and merge of this patch :-)

@nielsdos @arnaud-lb Any indication when this might get code reviewed?

Kudos to the PHP maintainers, I greatly appreciate all you do 👍

@nielsdos commented Apr 7, 2023

I don't know when this can be reviewed 😅
Either we wait for Arnaud or someone else to review this :) There is no dedicated maintainer for TSRM or Apache AFAIK
There is still a little bit of time until the next patch release though

@arnaud-lb arnaud-lb left a comment

Thank you!

This looks good to me

@nielsdos nielsdos closed this in 51faf04 Apr 8, 2023
@nielsdos commented Apr 8, 2023

Thanks!


Successfully merging this pull request may close these issues.

PHP 8.1.16 segfaults on line 597 of sapi/apache2handler/sapi_apache2.c