Mark unterminated INIT_CALL and SEND_VAL instructions as NOP when removing unreachable basic blocks #5358

@iluuu1994
Member

iluuu1994 commented Apr 7, 2020

@nikic @dstogov

DISCLAIMER: This is a "fix" for the memory leak in #5279. I don't expect this PR to be usable but maybe it can demonstrate the problem so you can point me to the right solution. But it does indeed solve the problem.

try {
    throw new Exception(throw new Exception('foo'));
} catch (Exception $ex) {}

generates these opcodes:

$_main:
     ; (lines=11, args=0, vars=1, tmps=4)
     ; (before optimizer)
     ; /tmp/php-src/Zend/tests/throw/001.php:1-124
     ; return  [] RANGE[0..0]
0000 V1 = NEW 1 string("Exception")
0001 V2 = NEW 1 string("Exception")
0002 SEND_VAL_EX string("foo") 1
0003 DO_FCALL
0004 THROW V2
0005 SEND_VAL_EX bool(true) 1
0006 DO_FCALL
0007 THROW V1
0008 JMP 0010
0009 CV0($e) = CATCH string("Exception")
0010 RETURN int(1)
LIVE RANGES:
     1: 0001 - 0007 (new)
     2: 0002 - 0004 (new)
EXCEPTION TABLE:
     0000, 0009, -, -

which are then optimized to:

$_main:
     ; (lines=7, args=0, vars=1, tmps=1)
     ; (after optimizer)
     ; /tmp/php-src/Zend/tests/throw/001.php:1-124
0000 V1 = NEW 1 string("Exception")
0001 V1 = NEW 1 string("Exception")
0002 SEND_VAL_EX string("foo") 1
0003 DO_FCALL
0004 THROW V1
0005 CV0($e) = CATCH string("Exception")
0006 RETURN int(1)
LIVE RANGES:
     1: 0002 - 0004 (new)
EXCEPTION TABLE:
     0000, 0005, -, -

The problem here is that instruction 0000 isn't removed. zend_default_exception_new assumes that DO_FCALL will be executed at some point to release the memory it has allocated, which never happens. Here is a different example:

var_dump(
    var_dump('foo'),
    exit()
);

Unoptimized:

$_main:
     ; (lines=9, args=0, vars=0, tmps=2)
     ; (before optimizer)
     ; /Users/ilijatovilo/Developer/php-src/ext/opcache/tests/unterminated_init_fn_bug.php:1-8
     ; return  [] RANGE[0..0]
0000 INIT_FCALL 2 112 string("var_dump")
0001 INIT_FCALL 1 96 string("var_dump")
0002 SEND_VAL string("foo") 1
0003 V0 = DO_ICALL
0004 SEND_VAR V0 1
0005 EXIT
0006 SEND_VAL bool(true) 2
0007 DO_ICALL
0008 RETURN int(1)
string(3) "foo"

Optimized:

$_main:
     ; (lines=6, args=0, vars=0, tmps=1)
     ; (after optimizer)
     ; /Users/ilijatovilo/Developer/php-src/ext/opcache/tests/unterminated_init_fn_bug.php:1-8
0000 INIT_FCALL 2 112 string("var_dump")
0001 INIT_FCALL 1 96 string("var_dump")
0002 SEND_VAL string("foo") 1
0003 V0 = DO_ICALL
0004 SEND_VAL null 1
0005 EXIT
string(3) "foo"

Instructions 0000 and 0004 should be removed. They don't cause any problems, but they are useless. This PR marks unterminated INIT_CALLs and SEND_VALs of previous basic blocks as NOPs.
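To make the idea concrete, here is a minimal sketch of the matching logic on a toy model. It is not the PR's code: opcodes are plain strings rather than zend_op structures, the helper name nopUnterminatedCalls is hypothetical, and the input is assumed to be a single basic block that ends in EXIT or THROW. The sketch deliberately ignores any side effects the removed instructions might have.

```php
<?php
// Toy model only: opcode names as strings, not real zend_op structures.
// nopUnterminatedCalls() is a hypothetical helper, not a php-src function.
function nopUnterminatedCalls(array $ops): array
{
    $frames = []; // stack of open calls; each entry lists the INIT/SEND indexes of one call

    foreach ($ops as $i => $name) {
        if (strpos($name, 'INIT_') === 0) {
            $frames[] = [$i];                    // a new call begins
        } elseif (strpos($name, 'SEND_') === 0 && $frames) {
            $frames[count($frames) - 1][] = $i;  // argument of the innermost open call
        } elseif (strpos($name, 'DO_') === 0 && $frames) {
            array_pop($frames);                  // the innermost call completes
        }
    }

    // Whatever is still open was never terminated by a DO_* instruction.
    foreach ($frames as $frame) {
        foreach ($frame as $i) {
            $ops[$i] = 'NOP';
        }
    }

    return $ops;
}

// Opcode names taken from the optimized var_dump()/exit() dump above.
$block = ['INIT_FCALL', 'INIT_FCALL', 'SEND_VAL', 'DO_ICALL', 'SEND_VAL', 'EXIT'];
$result = nopUnterminatedCalls($block);

echo implode("\n", $result), "\n";
```

In this model the outer INIT_FCALL at index 0 and the SEND_VAL at index 4 stay on the stack when the block ends, so they are the two entries rewritten to NOP.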

@dstogov
Member

dstogov commented Apr 8, 2020

Converting THROW into an expression changed the optimizer's assumptions, and it now optimizes it incorrectly. Actually, it already optimizes exit() incorrectly: it doesn't keep the live range for the result of the first NEW.

<?php
class A {
	function __construct($a, $b){
	}
}
new A(new A(1,2), exit(0));

$ sapi/cli/php -d opcache.opt_debug_level=0x20000 leak2.php

$_main:
     ; (lines=7, args=0, vars=0, tmps=1)
     ; (after optimizer)
     ; /home/dmitry/php/php-master/CGI-DEBUG/leak2.php:1-7
0000 V0 = NEW 2 string("A")
0001 V0 = NEW 2 string("A")
0002 SEND_VAL int(1) 1
0003 SEND_VAL int(2) 2
0004 DO_FCALL
0005 SEND_VAR V0 1
0006 EXIT int(0)
LIVE RANGES:
     0: 0002 - 0005 (new)

@iluuu1994 I think your approach of removing INIT_FCALL and SEND_VAL is incorrect.
The simplest fix is probably to add FREE before EXIT/THROW in the compiler (similar to RETURN).

@nikic ?

@nikic
Member

nikic commented Apr 8, 2020

Yes, I don't think this problem really has anything to do with calls; it will probably also appear with other typical live-range uses like ($ary + [1]) + throw new Exception.

Handling EXIT/THROW similar to RETURN in the compiler makes sense to me, I'll take a look at that.

@iluuu1994
Member Author

I see, thanks guys for your assessment! Regardless, wouldn't it be better to optimize away unused INIT_CALL and SEND_VAL instructions when DO_CALL will never be executed?

@nikic
Member

nikic commented Apr 8, 2020

> I see, thanks guys for your assessment! Regardless, wouldn't it be better to optimize away unused INIT_CALL and SEND_VAL instructions when DO_CALL will never be executed?

INIT_CALL and SEND_VAL can have a lot of side effects (e.g., the function does not exist, a constant is passed by reference, etc.), so only very limited cases can be removed. It's unlikely to be worthwhile.
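The "function does not exist" case can be observed from userland. The following is a small sketch, assuming a PHP 7+ runtime; the function name undefined_function_for_demo is deliberately made up, and recordArgument is a hypothetical helper used only to detect whether the argument expression ran.

```php
<?php
// Records what actually executed.
$log = [];

// Hypothetical helper: leaves a trace if the argument expression is evaluated.
function recordArgument(array &$log): int
{
    $log[] = 'argument evaluated';
    return 1;
}

try {
    // undefined_function_for_demo() intentionally does not exist.
    // The lookup fails when the call is initialized, before the argument runs.
    undefined_function_for_demo(recordArgument($log));
} catch (Error $e) {
    $log[] = get_class($e) . ': ' . $e->getMessage();
}

print_r($log);
```

Because the Error is raised while the call is being set up, the log should contain only the Error entry and no 'argument evaluated' entry. Dropping the call initialization would therefore change observable behaviour.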

> Handling EXIT/THROW similar to RETURN in the compiler makes sense to me, I'll take a look at that.

Hm, this doesn't look simple. The problem is that we currently only track loop vars on the loop/finally stack, on the premise that these are the only ones that can occur on a statement level. In this case we would need to know all the currently open live ranges, which are not so easy to compute (this would require part of the logic from zend_calc_live_ranges).

We protect against the removal of loop var frees using ZEND_BB_UNREACHABLE_FREE, which is not easy to extend to this case either.
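As a rough illustration of what "all the currently open live ranges" means, here is a sketch over a toy live-range table. It assumes the table already exists, which is exactly what the compiler does not have at the point where it emits EXIT/THROW, and it assumes a range covers oplines start <= n < end. The helper openLiveRangesAt and the array layout are hypothetical, not the real zend_live_range structure.

```php
<?php
// Toy model of a live-range table; not the real zend_live_range structure.
// Assumption: a range covers oplines start <= n < end.
function openLiveRangesAt(array $liveRanges, int $opline): array
{
    $open = [];
    foreach ($liveRanges as $range) {
        if ($range['start'] <= $opline && $opline < $range['end']) {
            $open[] = $range['var'];
        }
    }
    return $open;
}

// Ranges copied from the unoptimized dump of the nested-throw example.
$liveRanges = [
    ['var' => 'V1', 'start' => 1, 'end' => 7],
    ['var' => 'V2', 'start' => 2, 'end' => 4],
];

// Temporaries that a FREE would have to release before THROW V2 at opline 0004.
$toFree = openLiveRangesAt($liveRanges, 4);

print_r($toFree);
```

Under these assumptions only V1 is still open at opline 0004: V2 is consumed by the THROW itself, while V1 holds the outer object whose constructor call never completes.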

@iluuu1994
Member Author

That's interesting, I wasn't aware INIT_CALL checks if the function exists. I assumed that would be done in DO_CALL.

I checked your example. It does indeed leak memory.

$ary = [];
($ary + [1]) + throw new Exception();

php-src/Zend/zend_hash.c(2050) : Freeing 0x000000010c0034e0 (56 bytes)
php-src/Zend/zend_hash.c(131) : Freeing 0x000000010c05f3c0 (264 bytes)

@iluuu1994
Member Author

I'll close this PR as this isn't the right approach. @nikic Do you have some solution in mind or should I investigate further?

@TysonAndre
Contributor

If this can't be fixed cleanly, would it be possible to use opline->extended_value for ZEND_EXIT and ZEND_THROW to indicate that exit or throw is used as a (possible) expression instead of a statement?
Or, if extended_value is already used in a conflicting way, add brand-new statement types such as ZEND_EXIT_EXPR and ZEND_THROW_EXPR.

  • Or is it possible to use the RETVAL_USED macros to help with that?

Then maybe the optimizer could act as though ZEND_EXIT_EXPR/ZEND_THROW_EXPR weren't the end of a basic block, and avoid these memory leaks.

(I guess there'd still need to be the following ZEND_JMP for conditionals just so that control flow and ranges could be analyzed)

@TysonAndre
Contributor

Then again, my idea has the drawback that opcache wouldn't infer that the block containing throw is unreachable, so it would merge the placeholder type of ZEND_THROW_EXPR in code with conditionals that can branch, probably resulting in overly broad types for the examples in https://wiki.php.net/rfc/throw_expression

// $value is non-nullable.
$value = $nullableValue ?? throw new InvalidArgumentException();

@TysonAndre
Contributor

If I remember correctly, some phases of the optimizer were turned off when a function body contains catch statements.

Another possible workaround would be to disable the optimizer (or just the problematic phases of the optimizer) if there is a throw expression (not a statement) anywhere in the function scope, maybe with exceptions for ||, &&, and fn() => throw (an implicit return throw). I do hope it gets solved cleanly, though. I'm unfamiliar with the live-range variable tracking implementation, especially in how it deals with return/throw.

> The problem is that we currently only track loop vars on the loop/finally stack, on the premise that these are the only ones that can occur on a statement level. In this case we would need to know all the currently open live ranges, which are not so easy to compute (this would require part of the logic from zend_calc_live_ranges).
