Delayed notice again #12805

Draft pull request: 2 commits to be merged into base: master

iluuu1994
Member

This is another attempt to delay warnings (see #12090).

This attempt implements an idea from @bwoebi to completely avoid the RT_CONSTANT issue by copying constant zvals to EG. This solution simplifies the implementation a lot and makes this feasible again.

@dstogov Are you happier with this approach? If so, I will see what changes are necessary for JIT.

nikic and others added 2 commits November 28, 2023 00:34
This is a prototype for fixing a long-standing source of interrupt
vulnerabilities: A notice is emitted during execution of an opcode,
resulting in an error handler being run. The error handler modifies
some data structure the opcode is working on, resulting in UAF or
other memory corruption.

The idea here is to instead collect notices and only process them
after the opcode. This is implemented similarly to exception
handling, by switching to a ZEND_HANDLE_DELAYED_ERROR opcode,
which will then switch back to the normal opcode stream.

Unfortunately, what this prototype implements is not sufficient.
Opcodes that acquire direct (INDIRECT) references to zvals require
that no interrupts occur between the producing and the consuming
opcode. Chains of W/RW opcodes should be executed without interrupt.
Currently, the notice is only delayed until after the first opcode,
which still results in an illegal interrupt (bug78598.phpt shows
a UAF with this change).

I'm not sure how to best handle that issue.
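
The collect-then-process idea from the commit message can be sketched with a toy model. This is only an illustration under stated assumptions: `toy_vm`, `toy_delay_notice`, `toy_flush_notices`, `toy_fetch`, and `toy_counting_handler` are hypothetical names, not part of the Zend Engine, and the sketch ignores the W/RW chain problem described above as well as handlers that queue further notices while the queue is being flushed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy model, NOT the Zend API: every name here is hypothetical. */
#define TOY_MAX_DELAYED 8

typedef void (*toy_notice_handler)(const char *message);

typedef struct {
    const char *pending[TOY_MAX_DELAYED]; /* notices collected while an opcode runs */
    size_t count;
    toy_notice_handler handler;           /* stands in for the user error handler */
} toy_vm;

/* Example handler: prints the notice and counts how often it was called. */
static size_t toy_handler_calls = 0;

static void toy_counting_handler(const char *message)
{
    printf("Notice: %s\n", message);
    toy_handler_calls++;
}

/* Called from inside an opcode: record the notice instead of running the handler. */
static void toy_delay_notice(toy_vm *vm, const char *message)
{
    assert(vm->count < TOY_MAX_DELAYED);
    vm->pending[vm->count++] = message;
}

/* Called after the opcode has finished: only now does the handler run.
 * Returns the number of notices that were passed to the handler. */
static size_t toy_flush_notices(toy_vm *vm)
{
    size_t flushed = vm->count;
    for (size_t i = 0; i < flushed; i++) {
        vm->handler(vm->pending[i]);
    }
    vm->count = 0;
    return flushed;
}

/* A toy "opcode" that reads array[index] and delays a notice on a miss. */
static int toy_fetch(toy_vm *vm, const int *array, size_t len, size_t index)
{
    if (index >= len) {
        toy_delay_notice(vm, "Undefined array key");
        return 0; /* the opcode still completes, using a default value */
    }
    return array[index];
}
```

In this model the handler cannot observe or modify the data an opcode is working on, because it only runs once `toy_fetch` has returned and the caller invokes `toy_flush_notices`.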
info->message = zend_vstrpprintf(0, format, args);
va_end(args);

zend_hash_next_index_insert_ptr(&EG(delayed_errors), info);
Member

I guess a singly linked list in zend_error_info would be more straightforward than using a hashtable for that.

Member Author

It's used as a stack, essentially, and avoids an allocation for each warning. Should I use zend_stack instead?
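
For comparison, the singly linked list suggested above could look roughly like the following. This is a minimal sketch, assuming an intrusive `next` pointer is added to the error record; `toy_error_info`, `toy_errors_push`, and `toy_errors_pop` are hypothetical names, and the real zend_error_info has different fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record; the real zend_error_info has different fields. */
typedef struct toy_error_info {
    int type;
    const char *message;
    struct toy_error_info *next; /* intrusive link: no separate container node */
} toy_error_info;

/* Push onto the head of the list; the caller owns the record's storage. */
static void toy_errors_push(toy_error_info **head, toy_error_info *info)
{
    assert(info != NULL);
    info->next = *head;
    *head = info;
}

/* Pop the most recently pushed record, or return NULL when the list is empty. */
static toy_error_info *toy_errors_pop(toy_error_info **head)
{
    toy_error_info *info = *head;
    if (info != NULL) {
        *head = info->next;
        info->next = NULL;
    }
    return info;
}
```

Pushing at the head gives stack (LIFO) order. If errors must be emitted in the order they were raised, a tail pointer or a reversal step would be needed; that choice is not settled in this thread.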

break;
}

// FIXME: Is this guaranteed to be there?
Member

For the locations where a delayed error may be thrown currently, yes.
We might want to simply ZEND_ASSERT(next_op < EX(func)->op_array.opcodes + EX(func)->op_array.last)?

Member Author

It's probably OK to just check; this handler is part of a slow path and is not intended to be particularly fast.
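
The difference between asserting and checking at runtime can be sketched as follows. The types and helpers (`toy_op`, `toy_op_array`, `toy_op_in_bounds`, `toy_next_op`) are hypothetical stand-ins that only mirror the `opcodes`/`last` layout used in the assertion quoted above; this is not the handler's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for zend_op and zend_op_array. */
typedef struct { int opcode; } toy_op;
typedef struct { toy_op *opcodes; size_t last; } toy_op_array;

/* True if `op` points at an opcode inside the array. */
static bool toy_op_in_bounds(const toy_op_array *arr, const toy_op *op)
{
    return op >= arr->opcodes && op < arr->opcodes + arr->last;
}

/* Runtime check: return the opcode following `current`, or NULL if
 * `current` is the last one. The assert documents the precondition that
 * `current` itself is valid; the bounds check on `next` stays active in
 * release builds, which is affordable on a slow path. */
static const toy_op *toy_next_op(const toy_op_array *arr, const toy_op *current)
{
    assert(toy_op_in_bounds(arr, current));
    const toy_op *next = current + 1;
    return toy_op_in_bounds(arr, next) ? next : NULL;
}
```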

Member

dstogov left a comment

I don't see how this will work with JIT.
Can this approach solve ALL the user_error handler related problems?

Comment on lines +3 to +4
--INI--
opcache.jit=0
Member

Why do you disable JIT? Is it not supported yet?

Member Author

Yes, JIT doesn't seem to work but I have not looked into where it goes wrong.


if (EG(current_execute_data)->opline != EG(delayed_error_op)) {
EG(opline_before_exception) = EG(current_execute_data)->opline;
EG(current_execute_data)->opline = EG(delayed_error_op);
Member

How would you implement this branch in JIT?
Note that you might need to save all data kept in CPU registers before branching.
The jump back from delayed warning to normal control flow is not possible at all.

Member Author

Oh... I assumed JIT handled exceptions the same way as the VM, checking EX(opline) rather than EG(exception). It also calls the EG(exception_op)->handler directly, rather than looking up EX(opline)->handler. That indeed will not work here.

> The jump back from delayed warning to normal control flow is not possible at all.

I will have to look more closely at how this works for exceptions to understand what changes are required. I assumed exceptions trigger deoptimization, but I guess that's not correct.

Comment on lines +8125 to +8126
zend_op *delayed_op = &EG(delayed_error_op)[0];
*delayed_op = *next_op;
Member

If you have a few FETCH instructions in a row and several of them produce warnings, then EG(delayed_error_op)[0] is going to be overwritten several times. Is that OK?
Will this work in conjunction with magic __get() and different warnings in main code and magic methods?

Member Author

> If you have a few FETCH instructions in a row and several of them produce warnings, then EG(delayed_error_op)[0] is going to be overwritten several times. Is that OK?

Yes. The idea is that each instruction producing indirect values only delays to the next opcode. If that opcode also produces an indirect value, then it will once again set EG(delayed_error_op)[0] to the instruction after that, until eventually the indirect value is used and the error can be emitted.

Note that this happens only if the first FETCH emits a warning. The error is still delayed until the fetch+assign chain.
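
One way to read the re-arming rule is sketched below. This is a toy model based on an interpretation of the description, not the handler's actual logic: `toy_chain_op`, its `produces_indirect` flag, and `toy_emit_point` are hypothetical, and the opcode names used with it are illustrative labels only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical opcode descriptor for illustration only. */
typedef struct {
    const char *name;
    bool produces_indirect; /* result is an INDIRECT value used by a later opcode */
} toy_chain_op;

/* Given that ops[warned_at] delayed a warning, return the index of the
 * opcode after which the pending warnings may be emitted. While the
 * current opcode produces an indirect value, the delay is re-armed to
 * cover the following opcode as well. */
static size_t toy_emit_point(const toy_chain_op *ops, size_t count, size_t warned_at)
{
    assert(warned_at < count);
    size_t i = warned_at;
    while (ops[i].produces_indirect && i + 1 < count) {
        i++; /* re-arm: delay past the next opcode too */
    }
    return i;
}
```

For a chain such as FETCH_W, FETCH_W, ASSIGN, a warning delayed by either fetch is emitted only after the ASSIGN at index 2 in this model, so no handler runs while the indirect value is still live.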

> Will this work in conjunction with magic __get() and different warnings in main code and magic methods?

I didn't test this, but I think one unexpected thing could be that we don't associate EG(delayed_errors) with a particular stack frame. If the magic method produces a warning, pending warnings from the outer VM call will also be handled. This could be solved by storing the execute_data on the delayed error, and only handling those belonging to the current one. Apart from that I believe magic methods should work correctly with this approach.
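
The per-frame association proposed here could be sketched like this. It is a minimal illustration, assuming each delayed error records the frame that raised it; `toy_frame` stands in for zend_execute_data, and all other names are hypothetical. The sketch also assumes the `emit` callback does not modify the queue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_MAX_ERRORS 16

typedef struct { int id; } toy_frame; /* stands in for zend_execute_data */

typedef struct {
    const toy_frame *frame; /* frame that was executing when the error was delayed */
    const char *message;
} toy_delayed_error;

typedef struct {
    toy_delayed_error items[TOY_MAX_ERRORS];
    size_t count;
} toy_error_queue;

/* Example emitter: prints the warning. */
static void toy_print_error(const char *message)
{
    printf("Warning: %s\n", message);
}

/* Record a delayed error together with the frame it belongs to. */
static void toy_queue_delay(toy_error_queue *q, const toy_frame *frame, const char *message)
{
    assert(q->count < TOY_MAX_ERRORS);
    q->items[q->count].frame = frame;
    q->items[q->count].message = message;
    q->count++;
}

/* Emit only the errors that belong to `frame`; errors from other frames
 * stay queued in their original order. Returns the number emitted. */
static size_t toy_queue_flush_frame(toy_error_queue *q, const toy_frame *frame,
                                    void (*emit)(const char *message))
{
    size_t kept = 0, emitted = 0;
    size_t count = q->count;
    for (size_t i = 0; i < count; i++) {
        if (q->items[i].frame == frame) {
            emit(q->items[i].message);
            emitted++;
        } else {
            q->items[kept++] = q->items[i];
        }
    }
    q->count = kept;
    return emitted;
}
```

With this filter, a warning raised inside a magic method flushes only that method's own pending errors, and the outer call's errors wait until the outer frame reaches its own emit point.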

Member Author

iluuu1994 left a comment

> Can this approach solve ALL the user_error handler related problems?

I think this should solve all the issues we've experienced lately, for every warning that is actually delayed. I'm not sure yet which warnings require delaying. There's the obvious BC break where previously a handler registered with set_error_handler could abort the operation half-way through, whereas after this PR it could only abort control flow once the current opcode handler has finished. Of course, there are some warnings where this is undesired (e.g. function deprecations) that should continue being emitted before the operation starts.

Depending on whether you think JIT can be implemented without massive effort, I will take a look at what warnings need to be changed.

4 participants