Mark unterminated INIT_CALL and SEND_VAL instructions as NOP when removing unreachable basic blocks #5358

iluuu1994 · 2020-04-07T21:41:11Z

DISCLAIMER: This is a "fix" for the memory leak in #5279. I don't expect this PR to be usable but maybe it can demonstrate the problem so you can point me to the right solution. But it does indeed solve the problem.

try {
    throw new Exception(throw new Exception('foo'));
} catch (Exception $ex) {}

generates these opcodes:

$_main:
     ; (lines=11, args=0, vars=1, tmps=4)
     ; (before optimizer)
     ; /tmp/php-src/Zend/tests/throw/001.php:1-124
     ; return  [] RANGE[0..0]
0000 V1 = NEW 1 string("Exception")
0001 V2 = NEW 1 string("Exception")
0002 SEND_VAL_EX string("foo") 1
0003 DO_FCALL
0004 THROW V2
0005 SEND_VAL_EX bool(true) 1
0006 DO_FCALL
0007 THROW V1
0008 JMP 0010
0009 CV0($e) = CATCH string("Exception")
0010 RETURN int(1)
LIVE RANGES:
     1: 0001 - 0007 (new)
     2: 0002 - 0004 (new)
EXCEPTION TABLE:
     0000, 0009, -, -

which are then optimized to:

$_main:
     ; (lines=7, args=0, vars=1, tmps=1)
     ; (after optimizer)
     ; /tmp/php-src/Zend/tests/throw/001.php:1-124
0000 V1 = NEW 1 string("Exception")
0001 V1 = NEW 1 string("Exception")
0002 SEND_VAL_EX string("foo") 1
0003 DO_FCALL
0004 THROW V1
0005 CV0($e) = CATCH string("Exception")
0006 RETURN int(1)
LIVE RANGES:
     1: 0002 - 0004 (new)
EXCEPTION TABLE:
     0000, 0005, -, -

The problem here is that instruction 0000 isn't removed. zend_default_exception_new assumes that DO_FCALL will be called at some point to release the memory it has allocated, which it never does. Here is a different example:

var_dump(
    var_dump('foo'),
    exit()
);

Unoptimized:

$_main:
     ; (lines=9, args=0, vars=0, tmps=2)
     ; (before optimizer)
     ; /Users/ilijatovilo/Developer/php-src/ext/opcache/tests/unterminated_init_fn_bug.php:1-8
     ; return  [] RANGE[0..0]
0000 INIT_FCALL 2 112 string("var_dump")
0001 INIT_FCALL 1 96 string("var_dump")
0002 SEND_VAL string("foo") 1
0003 V0 = DO_ICALL
0004 SEND_VAR V0 1
0005 EXIT
0006 SEND_VAL bool(true) 2
0007 DO_ICALL
0008 RETURN int(1)
string(3) "foo"

Optimized:

$_main:
     ; (lines=6, args=0, vars=0, tmps=1)
     ; (after optimizer)
     ; /Users/ilijatovilo/Developer/php-src/ext/opcache/tests/unterminated_init_fn_bug.php:1-8
0000 INIT_FCALL 2 112 string("var_dump")
0001 INIT_FCALL 1 96 string("var_dump")
0002 SEND_VAL string("foo") 1
0003 V0 = DO_ICALL
0004 SEND_VAL null 1
0005 EXIT
string(3) "foo"

The instructions 0000 and 0004 should be removed. They don't cause any problems here, but they are useless. This PR marks unterminated INIT_CALLs and SEND_VALs of previous basic blocks as NOPs when removing unreachable basic blocks.
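As a rough illustration of what "unterminated" means here, the following toy PHP model (not the PR's actual C change) walks a flat opcode list and reports call-initialising oplines that have no matching DO_* opline. The function name findUnterminatedCalls and the string-based opcode representation are assumptions made for this sketch; the real optimizer works on zend_op structures and basic blocks.

```php
<?php
/**
 * Toy model only: returns the indexes of call-initialising oplines
 * (NEW, INIT_*) that are never closed by a DO_* opline.
 * A stack is used so that nested calls are paired correctly.
 */
function findUnterminatedCalls(array $opcodes): array
{
    $open = [];
    foreach ($opcodes as $index => $opcode) {
        if ($opcode === 'NEW' || strpos($opcode, 'INIT_') === 0) {
            $open[] = $index;      // a call frame is opened here
        } elseif (strpos($opcode, 'DO_') === 0) {
            array_pop($open);      // the innermost open call is completed
        }
    }
    return $open;
}

// Opcode names taken from the optimized var_dump/exit dump above.
$optimized = ['INIT_FCALL', 'INIT_FCALL', 'SEND_VAL', 'DO_ICALL', 'SEND_VAL', 'EXIT'];

// Only the outer INIT_FCALL at index 0 is left without a DO_ICALL.
print_r(findUnterminatedCalls($optimized));
```

The same function reports index 0 for the optimized throw example, where the outer NEW never reaches its DO_FCALL.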

dstogov · 2020-04-08T06:58:28Z

Converting THROW into an expression changed the Optimizer's assumptions, and it now optimizes it incorrectly. Actually, it already optimizes exit() incorrectly: it doesn't keep the live range for the result of the first NEW.

<?php
class A {
	function __construct($a, $b){
	}
}
new A(new A(1,2), exit(0));

$ sapi/cli/php -d opcache.opt_debug_level=0x20000 leak2.php 

$_main:
     ; (lines=7, args=0, vars=0, tmps=1)
     ; (after optimizer)
     ; /home/dmitry/php/php-master/CGI-DEBUG/leak2.php:1-7
0000 V0 = NEW 2 string("A")
0001 V0 = NEW 2 string("A")
0002 SEND_VAL int(1) 1
0003 SEND_VAL int(2) 2
0004 DO_FCALL
0005 SEND_VAR V0 1
0006 EXIT int(0)
LIVE RANGES:
     0: 0002 - 0005 (new)

@iluuu1994 I think your approach of removing INIT_FCALL and SEND_VAL is incorrect.
The simplest fix is probably to add a FREE before EXIT/THROW in the compiler (similar to RETURN).

@nikic ?

nikic · 2020-04-08T07:56:27Z

Yes, I don't think this problem really has anything to do with calls; it will probably also appear with other typical live-range uses like ($ary + [1]) + throw new Exception.

Handling EXIT/THROW similar to RETURN in the compiler makes sense to me, I'll take a look at that.

iluuu1994 · 2020-04-08T08:03:35Z

I see, thanks guys for your assessment! Regardless, wouldn't it be better to optimize away unused INIT_CALL and SEND_VAL instructions when DO_CALL will never be executed?

nikic · 2020-04-08T08:12:40Z

> I see, thanks guys for your assessment! Regardless, wouldn't it be better to optimize away unused INIT_CALL and SEND_VAL instructions when DO_CALL will never be executed?

INIT_CALL and SEND_VAL can have a lot of side-effects (e.g., function does not exist, passing constant by reference, etc), so only very limited cases can be removed. It's unlikely to be worthwhile.
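One way to observe the first side effect mentioned above from userland: on PHP 7 and later, calling a function that does not exist raises an Error while the call is being initialised, before its arguments are evaluated. A minimal sketch, where undefined_function_for_demo and track are hypothetical names introduced for this example:

```php
<?php
// Records which steps actually ran, so the evaluation order is visible.
$log = [];

function track(array &$log, string $label): string
{
    $log[] = $label;
    return $label;
}

try {
    // undefined_function_for_demo is deliberately never defined.
    undefined_function_for_demo(track($log, 'argument evaluated'));
} catch (Error $e) {
    $log[] = 'error: ' . $e->getMessage();
}

// Because the lookup fails during call initialisation, the entry
// 'argument evaluated' never appears in the log.
print_r($log);
```

Dropping the call initialisation here would change observable behaviour, which is why only very limited cases could be removed safely.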

> Handling EXIT/THROW similar to RETURN in the compiler makes sense to me, I'll take a look at that.

Hm, this doesn't look simple. The problem is that we currently only track loop vars on the loop/finally stack, on the premise that these are the only ones that can occur on a statement level. In this case we would need to know all the currently open live ranges, which are not so easy to compute (this would require part of the logic from zend_calc_live_ranges).

We protect against the removal of loop var frees using ZEND_BB_UNREACHABLE_FREE, which is not easy to extend to this case either.
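To make "all the currently open live ranges" concrete, here is a toy PHP sketch that takes the LIVE RANGES table from the first unoptimized dump and lists the temporaries still open at a given opline. The function name openLiveRanges, the array layout, and the half-open [start, end) interpretation are assumptions of this sketch, not the logic of zend_calc_live_ranges.

```php
<?php
/**
 * Toy model: returns the temporaries whose live range covers $opNum.
 * Assumes a range is open from 'start' (inclusive) to 'end' (exclusive),
 * i.e. the consuming opline itself no longer counts as open.
 */
function openLiveRanges(array $ranges, int $opNum): array
{
    $open = [];
    foreach ($ranges as $range) {
        if ($range['start'] <= $opNum && $opNum < $range['end']) {
            $open[] = $range['var'];
        }
    }
    return $open;
}

// LIVE RANGES from the first dump (before optimizer).
$ranges = [
    ['var' => 1, 'start' => 1, 'end' => 7],
    ['var' => 2, 'start' => 2, 'end' => 4],
];

// At opline 0004 (THROW V2) the result of the outer NEW (V1) is still
// open, so any cleanup emitted before the THROW would have to cover it.
print_r(openLiveRanges($ranges, 4));
```

The lookup itself is trivial once the table exists; the difficulty described above is that the compiler does not have this table at the point where it emits EXIT or THROW.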

iluuu1994 · 2020-04-08T09:44:29Z

That's interesting, I wasn't aware INIT_CALL checks if the function exists. I assumed that would be done in DO_CALL.

I checked your example. It does indeed leak memory.

$ary = [];
($ary + [1]) + throw new Exception();

php-src/Zend/zend_hash.c(2050) : Freeing 0x000000010c0034e0 (56 bytes)
php-src/Zend/zend_hash.c(131) : Freeing 0x000000010c05f3c0 (264 bytes)

iluuu1994 · 2020-04-10T17:38:41Z

I'll close this PR as this isn't the right approach. @nikic Do you have some solution in mind or should I investigate further?

TysonAndre · 2020-04-19T15:33:25Z

If this can't be fixed cleanly, would it be possible to use opline->extended_value on ZEND_EXIT and ZEND_THROW to indicate that the exit or throw is used as a (possible) expression instead of a statement?
Or, if extended_value is already used in a conflicting way, add brand new statement types such as ZEND_EXIT_EXPR and ZEND_THROW_EXPR.

Or is it possible to use the RETVAL_USED macros to help with that?

Then maybe it'd be possible for the optimizer to act as though ZEND_EXIT_EXPR/ZEND_THROW_EXPR weren't the end of a basic block, and avoid these memory leaks.

(I guess there'd still need to be the following ZEND_JMP for conditionals just so that control flow and ranges could be analyzed)

TysonAndre · 2020-04-19T15:43:49Z

Then again, my idea has the drawback that opcache wouldn't infer that the block containing the throw is unreachable, so it would merge the placeholder type of ZEND_THROW_EXPR in code with conditionals that can branch, probably resulting in overly broad types for the examples in the RFC (https://wiki.php.net/rfc/throw_expression), for example:

// $value is non-nullable.
$value = $nullableValue ?? throw new InvalidArgumentException();

TysonAndre · 2020-04-23T14:04:29Z

If I remember correctly, some phases of the optimizer were turned off when a function body contains catch statements.

Another possible workaround would be to disable the optimizer (or just the problematic phases of the optimizer) if there is a throw expression (not a statement) anywhere in the function scope (maybe with exceptions for ||, &&, and fn() => throw (implicit return throw)), although I do hope it gets solved cleanly. I'm unfamiliar with the live range variable tracking implementation, especially in how it deals with return/throw.

> The problem is that we currently only track loop vars on the loop/finally stack, on the premise that these are the only ones that can occur on a statement level. In this case we would need to know all the currently open live ranges, which are not so easy to compute (this would require part of the logic from zend_calc_live_ranges).

Commit 96f6b3b: Mark unterminated INIT_CALL and SEND_VAL instructions as NOP when removing unreachable basic blocks

iluuu1994 closed this Apr 10, 2020

iluuu1994 mentioned this pull request Apr 19, 2020: [RFC] Make throw statement an expression #5279 (Closed)

TysonAndre mentioned this pull request Apr 21, 2020: [RFC] Match expression #5371 (Closed)
