fix: support for timeouts with ZTS on Linux #10141

Merged
36 commits merged into php:PHP-8.1 from fix/zts-timeout-linux
Mar 3, 2023

Conversation

dunglas
Member

@dunglas dunglas commented Dec 20, 2022

Closes https://bugs.php.net/bug.php?id=79464, #9738 and golang/go#56260.
Follows the plan described in this mail: https://externals.io/message/118859.

Currently, timeouts don't work at all on ZTS builds because signals are emitted by setitimer(), which is per-process. This patch fixes the problem on Linux by using timer_create() with SIGEV_THREAD_ID, as suggested by @nikic.
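For readers unfamiliar with the mechanism, here is a minimal sketch of a per-thread timer built on timer_create() with SIGEV_THREAD_ID. It is not the code from this patch: the `demo_*` names, the choice of CLOCK_MONOTONIC, and the global flag are assumptions made for illustration, and the `sigev_notify_thread_id` fallback macro is there only because some glibc versions do not expose that field name.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Some glibc versions lack this accessor name; fall back to the raw field. */
#ifndef sigev_notify_thread_id
# define sigev_notify_thread_id _sigev_un._tid
#endif

/* Hypothetical flag standing in for whatever per-request state a runtime keeps. */
static volatile sig_atomic_t demo_timed_out = 0;

static void demo_on_sigio(int signo, siginfo_t *si, void *ctx)
{
    (void) signo;
    (void) ctx;
    /* React only to signals generated by a POSIX timer expiration. */
    if (si->si_code == SI_TIMER) {
        demo_timed_out = 1;
    }
}

/* Arm a one-shot timer whose signal is delivered to the calling thread only.
 * Returns 0 on success, -1 on failure. */
static int demo_arm_thread_timer(timer_t *timer, long millis)
{
    struct sigaction sa;
    struct sigevent sev;
    struct itimerspec its;

    assert(timer != NULL);

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = demo_on_sigio;
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGIO, &sa, NULL) != 0) {
        return -1;
    }

    memset(&sev, 0, sizeof(sev));
    sev.sigev_notify = SIGEV_THREAD_ID;   /* target one kernel thread */
    sev.sigev_signo = SIGIO;
    sev.sigev_value.sival_ptr = timer;
    sev.sigev_notify_thread_id = (pid_t) syscall(SYS_gettid);
    if (timer_create(CLOCK_MONOTONIC, &sev, timer) != 0) {
        return -1;
    }

    memset(&its, 0, sizeof(its));         /* it_interval = 0: fire once */
    its.it_value.tv_sec = millis / 1000;
    its.it_value.tv_nsec = (millis % 1000) * 1000000L;
    return timer_settime(*timer, 0, &its, NULL);
}
```

Each thread that wants its own timeout would call `demo_arm_thread_timer()` itself, since the target thread ID is read at creation time, and release the timer with timer_delete() when done.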

Also, this patch uses SIGIO instead of SIGPROF to prevent messing with profilers. This also improves the compatibility of PHP with Go (which is important for FrankenPHP). Why hijack SIGIO? Because there is no good way to choose this signal, and SIGIO is probably the safest signal in our case (SIGURG is already used internally by Go for its non-cooperative preemption feature).

This fix only works for Linux because Mac and BSD don't implement timer_create(). It may be possible to use libdispatch on Mac and FreeBSD to achieve something similar, but it's out of scope for this patch.

It's safe to merge this in 8.1 because the new system is only enabled for Linux ZTS builds, and timeouts are currently totally broken for this platform.

I wrote a test in FrankenPHP proving that this patch fixes the problem: dunglas/frankenphp#128
I'm not sure how (and if it's necessary) to write a test for this in PHP. I could add an ad-hoc C program doing something similar as in FrankenPHP's test, WDYT?

TODO:

  • add a ./configure argument to enable or disable Zend Timers

@dunglas dunglas force-pushed the fix/zts-timeout-linux branch from 78d13eb to ca660b6 Compare December 20, 2022 18:27
Member

@arnaud-lb arnaud-lb left a comment

The choice of SIGIO seems reasonable based on the mentioned heuristics. The fact that the handler is able to ignore SIGIOs that were not caused by this specific timer is a good thing too (can we fall back to a previously set handler in this case?).

The first heuristic (being passed through by debuggers) is much less relevant for this use case than for Go's use case - does it make new choices available if we ignore this heuristic?

It could be worth commenting on the reasons that led to SIGIO, so that we don't revert to SIGURG or SIGPROF in the future.

> I'm not sure how (and if it's necessary) to write a test for this in PHP. I could add an ad-hoc C program doing something similar as in FrankenPHP's test, WDYT?

If the required effort is moderate, it would be worth it. You could add a function in ext/zend_test to execute a script in a separate thread.

Pinging @morrisonlevi @tstarling who showed interest

Contributor

@tstarling tstarling left a comment

Concept reviewed, looks like it should work. Needs a line-by-line close review but that can happen after some cleanups are done.

@dunglas
Member Author

dunglas commented Dec 24, 2022

> The first heuristic (being passed through by debuggers) is much less relevant for this use case than for Go's use case - does it make new choices available if we ignore this heuristic?

If we want debuggers to handle the timeout signal, maybe we could also use SIGLOST or SIGSTKFLT. These signals are unused and unlikely to be triggered by real applications.

@dunglas dunglas force-pushed the fix/zts-timeout-linux branch 9 times, most recently from 6805b90 to 18ffe66 Compare December 27, 2022 13:57
@dunglas
Member Author

dunglas commented Dec 27, 2022

@arnaud-lb @tstarling the patch is ready for another round of reviews, if you don't mind.

I'm not fond of this:

/* Don't trigger an error here because the timer may not be initialized when PHP fails early, and on threads created by PHP but not managed by it. */
# ifdef TIMER_DEBUG
fprintf(stderr, "Could not delete timer %#jx that has not been created on thread %d\n", (uintmax_t) timer, (pid_t) syscall(SYS_gettid));
# endif
return;

I would prefer throwing an error, but if I do, ext/opcache/tests/preload_parse_error.phpt fails: timer_delete() is called before timer_create() (I'm not sure why), and generating the error message segfaults on *filename = ZSTR_KNOWN(ZEND_STR_UNKNOWN_CAPITALIZED); in get_filename_lineno() (called by zend_error_noreturn()).

I'm trying to reproduce the FPM errors, but they are most likely related to the calls to fork(): children don't inherit timers (as you can see in the patch, I already fixed this issue for pcntl and ext/opcache).

@dunglas dunglas force-pushed the fix/zts-timeout-linux branch 2 times, most recently from bbea10d to 7761962 Compare December 30, 2022 15:37
@dunglas
Member Author

dunglas commented Dec 30, 2022

Tests are now green and the code looks good to me. Could you check whether the latest version of this patch is OK for you?

I'm undecided about the naming: "Zend Timer" or "Zend Timers"?

@morrisonlevi
Contributor

FYI, I will take a look at this tomorrow. I think the plural "Zend Timers" is better, personally.

Contributor

@morrisonlevi morrisonlevi left a comment

I haven't run the code yet; this review is just from reading through it. I plan to check out the branch later today to test it under Apache Event MPM.

@dunglas
Member Author

dunglas commented Jan 3, 2023

@morrisonlevi it looks like this MPM uses both fork() and threads internally. We'll have to call zend_timer_create() in ap_hook_child_init() to have proper support in this SAPI because of the call to fork(). I'll update the PR.

@morrisonlevi
Contributor

morrisonlevi commented Jan 3, 2023

> @morrisonlevi it looks like this MPM uses both fork() and threads internally. We'll have to call zend_timer_create() in ap_hook_child_init() to have proper support in this SAPI because of the call to fork(). I'll update the PR.

What part of the sapi_module_struct does ap_hook_child_init correspond to? Do we already have a hook in the right place?

Edit: Ah, found it here: https://heap.space/xref/php-src/sapi/apache2handler/sapi_apache2.c?r=aef7d810#758.

@dunglas dunglas force-pushed the fix/zts-timeout-linux branch 4 times, most recently from 58a0a04 to 7ac86b6 Compare January 10, 2023 09:32
@dunglas
Member Author

dunglas commented Jan 10, 2023

  • Wall-clock time is now used instead of CPU time
  • The API is now more consistent with the API of Zend Signals
  • I fixed the Apache SAPI
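To illustrate the difference between the two time bases, here is a small sketch that measures how far a clock advances while the thread sleeps. A wall-clock source such as CLOCK_MONOTONIC keeps running during the sleep, while a CPU-time clock such as CLOCK_THREAD_CPUTIME_ID barely moves. The helper names are hypothetical, and the clocks were picked for illustration; they are not necessarily the ones the patch uses.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Read `clock` and return its value in nanoseconds. */
static int64_t demo_now_ns(clockid_t clock)
{
    struct timespec ts;
    int rc = clock_gettime(clock, &ts);
    assert(rc == 0);
    (void) rc;
    return (int64_t) ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Sleep for `millis` milliseconds and report how far `clock` advanced (ns). */
static int64_t demo_elapsed_while_sleeping(clockid_t clock, long millis)
{
    struct timespec req;
    int64_t before;

    req.tv_sec = millis / 1000;
    req.tv_nsec = (millis % 1000) * 1000000L;
    before = demo_now_ns(clock);
    /* Resume the sleep with the remaining time if a signal interrupts it. */
    while (nanosleep(&req, &req) != 0 && errno == EINTR) {
    }
    return demo_now_ns(clock) - before;
}
```

Calling `demo_elapsed_while_sleeping(CLOCK_MONOTONIC, 100)` should report roughly 100 ms or more, whereas the same call with `CLOCK_THREAD_CPUTIME_ID` should report only a small fraction of that. A timeout based on CPU time therefore would not fire for a script that is blocked on I/O or sleeping.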

My tests with Apache Event MPM were successful.

@morrisonlevi @arnaud-lb would you mind reviewing again? Thanks!

@arnaud-lb
Member

This looks good to me, but it's unfortunate that we have to handle forks.

EG(timer) is used only during execution, and is initialized in init_executor(), so we could let EG(timer) be initialized again by init_executor() if the forked process executes a PHP script.

If we store the process id at the same time we initialize EG(timer), we can find whether the timer was created by the current process or not in zend_timer_shutdown().

pcntl_fork() would still need zend_timer_init(), though.
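The idea of recording the process ID next to the timer could be sketched roughly as follows. This is only an illustration under assumptions: `demo_timer_state` and the `demo_*` functions are hypothetical stand-ins, not the patch's EG(timer) or zend_timer_* code, and the timer is created but never armed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#ifndef sigev_notify_thread_id
# define sigev_notify_thread_id _sigev_un._tid
#endif

typedef struct {
    timer_t timer;
    pid_t   owner_pid;  /* process that called timer_create() */
    int     created;
} demo_timer_state;

/* Create the timer and remember which process owns it. */
static int demo_timer_init(demo_timer_state *st)
{
    struct sigevent sev;
    assert(st != NULL);
    memset(&sev, 0, sizeof(sev));
    sev.sigev_notify = SIGEV_THREAD_ID;
    sev.sigev_signo = SIGIO;
    sev.sigev_notify_thread_id = (pid_t) syscall(SYS_gettid);
    if (timer_create(CLOCK_MONOTONIC, &sev, &st->timer) != 0) {
        return -1;
    }
    st->owner_pid = getpid();
    st->created = 1;
    return 0;
}

/* Delete the timer only if this process created it: POSIX timers are not
 * inherited across fork(), so a copied handle is stale in the child. */
static void demo_timer_shutdown(demo_timer_state *st)
{
    if (st->created && st->owner_pid == getpid()) {
        timer_delete(st->timer);
    }
    st->created = 0;
}

/* Fork; the child detects the stale timer via the stored pid and creates
 * its own. Returns 0 if the child succeeded, non-zero otherwise. */
static int demo_fork_and_check(demo_timer_state *st)
{
    int status = 0;
    pid_t child = fork();
    if (child < 0) {
        return -1;
    }
    if (child == 0) {
        int stale = st->created && st->owner_pid != getpid();
        int ok = stale && demo_timer_init(st) == 0 && st->owner_pid == getpid();
        demo_timer_shutdown(st);
        _exit(ok ? 0 : 1);
    }
    if (waitpid(child, &status, 0) != child) {
        return -1;
    }
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}
```

The pid comparison lets shutdown code skip a handle copied from the parent without needing an explicit hook at every fork() call site.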

@dunglas dunglas force-pushed the fix/zts-timeout-linux branch from 7ac86b6 to c0bef12 Compare January 15, 2023 22:43
@dunglas dunglas force-pushed the fix/zts-timeout-linux branch from e79e9ff to fdcd57e Compare February 27, 2023 15:47
@arnaud-lb arnaud-lb merged commit ad85e71 into php:PHP-8.1 Mar 3, 2023
@arnaud-lb
Member

Thank you!

arnaud-lb added a commit that referenced this pull request Mar 3, 2023
* PHP-8.1:
  [ci skip] NEWS
  fix: support for timeouts with ZTS on Linux (#10141)
arnaud-lb added a commit that referenced this pull request Mar 3, 2023
* PHP-8.2:
  [ci skip] NEWS
  [ci skip] NEWS
  fix: support for timeouts with ZTS on Linux (#10141)
@dunglas dunglas deleted the fix/zts-timeout-linux branch March 3, 2023 14:03
@morrisonlevi
Contributor

Thanks for your work on this!

petk added a commit to petk/php-src that referenced this pull request Jul 7, 2023
In configure.ac check for function strerror_r defines constant
HAVE_STRERROR_R.
@petk petk mentioned this pull request Jul 7, 2023
dunglas pushed a commit to dunglas/php-src that referenced this pull request Aug 5, 2023
In configure.ac check for function strerror_r defines constant
HAVE_STRERROR_R.
@melroy89

Is this documented or not?


5 participants