preserves HASH_FLAG_ALLOW_COW_VIOLATION in zend_hash_real_init_ex() #13013
ext/zend_test/test.c
{
    ZEND_PARSE_PARAMETERS_NONE();
    HashTable *ht = _zend_new_array_0();
    HT_ALLOW_COW_VIOLATION(ht);
If you really need this, can you not initialize the HT before calling HT_ALLOW_COW_VIOLATION? Hash tables are very hot, so changes need to be made with caution.
If you really need this, can you not initialize the HT before calling HT_ALLOW_COW_VIOLATION?
@iluuu1994 Yes, that's one of the possible workarounds.
But if the bug can be fixed and benchmarks show no regressions, then why not fix it? 😉
This change may add several instructions + some register pressure (keeping prev_flags alive) for little benefit (see https://github.com/php/php-src/actions/runs/7309927666?pr=13013#summary-19918042085). This function may be called many thousands of times during the handling of a request. If your function relies on HT_ALLOW_COW_VIOLATION, it seems better to make it responsible for setting the flag in a way that is compatible with the rest of the code.
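For readers following the thread, the ordering problem and the two options discussed above can be modelled in a few lines. This is a minimal sketch in Rust, not the engine's code: `MockTable`, its methods, and the bit values are hypothetical stand-ins, and the only behaviour taken from the discussion is that real initialization assigns the flags word instead of OR-ing into it.

```rust
// Simplified model only: `MockTable` and these bit values are illustrative
// stand-ins, not the real Zend structures or constants.
const FLAG_UNINITIALIZED: u32 = 1 << 3;
const FLAG_STATIC_KEYS: u32 = 1 << 4;
const FLAG_ALLOW_COW_VIOLATION: u32 = 1 << 6;

pub struct MockTable {
    flags: u32,
    slots: Vec<Option<i64>>,
}

impl MockTable {
    /// A freshly created table: no storage yet, only the "uninitialized" flag.
    pub fn new_uninitialized() -> Self {
        MockTable { flags: FLAG_UNINITIALIZED, slots: Vec::new() }
    }

    /// Models real initialization: the flags word is assigned, so any flag
    /// set earlier is lost.
    pub fn real_init(&mut self) {
        self.slots = vec![None; 8];
        self.flags = FLAG_STATIC_KEYS;
    }

    /// Models the kind of change debated in the review: remember the flag
    /// before initialization and restore it afterwards (the extra work on a
    /// hot path that the reviewers object to).
    pub fn real_init_preserving(&mut self) {
        let prev_flags = self.flags & FLAG_ALLOW_COW_VIOLATION;
        self.real_init();
        self.flags |= prev_flags;
    }

    pub fn allow_cow_violation(&mut self) {
        self.flags |= FLAG_ALLOW_COW_VIOLATION;
    }

    pub fn allows_cow_violation(&self) -> bool {
        self.flags & FLAG_ALLOW_COW_VIOLATION != 0
    }

    pub fn capacity(&self) -> usize {
        self.slots.len()
    }
}

/// Flag set before initialization: the flag does not survive.
pub fn flag_then_init() -> bool {
    let mut t = MockTable::new_uninitialized();
    t.allow_cow_violation();
    t.real_init();
    t.allows_cow_violation()
}

/// Workaround suggested in the review: initialize first, then set the flag.
pub fn init_then_flag() -> bool {
    let mut t = MockTable::new_uninitialized();
    t.real_init();
    t.allow_cow_violation();
    t.allows_cow_violation()
}

/// Flag set before a flag-preserving initialization: the flag survives.
pub fn flag_then_preserving_init() -> bool {
    let mut t = MockTable::new_uninitialized();
    t.allow_cow_violation();
    t.real_init_preserving();
    t.allows_cow_violation()
}
```

In this model the workaround costs nothing on the initialization path, whereas the preserving variant adds a load, a mask, and an OR to every initialization, which is the trade-off the reviewers are weighing.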
Indeed the benchmarks show a small but noticeable impact.
If your function relies on HT_ALLOW_COW_VIOLATION, it seems better to make it responsible for setting the flag in a way that is compatible with the rest of the code.
Yeah, that's what I'm already doing for my use case.
I was hoping to come up with a solution so that the next person needing this functionality wouldn't waste time finding the cause of the issue and reinventing a workaround.
Alas, unless I've missed something, there doesn't seem to be a zero-cost fix for this... 😞
Indeed the benchmarks show a small but noticeable impact.
The benchmark isn't always reliable, but I would consider the diff quite large.
I was hoping to come up with a solution so that the next person needing this functionality wouldn't waste time finding the cause of the issue and reinventing a workaround.
IMO, a comment works well for that. :)
Bob doesn't agree. Dmitry is the primary maintainer of hash maps, so it makes sense to wait for his comment.
116e03a to a6ce4c5
I don't think we should modify PHP code to support this hack better. It should be possible to solve the problem in "user-space" code initializing the HashTable and setting
I understand why you call this a hack, but in the Rust wrapper I'm writing for the Zend API, it is in fact essential for implementing the RC=1 separation pattern for (possibly in-place) modification of hashtables while guaranteeing memory safety. I can explain the details if that is of interest to anybody.
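As a rough illustration of what an "RC=1 separation pattern" means in general, here is a minimal sketch using only Rust's standard `Rc`. It is an analogy, not the wrapper's actual implementation: `push_separated` and the two demo functions are hypothetical names, and `Rc<Vec<i64>>` merely stands in for a reference-counted array.

```rust
use std::rc::Rc;

/// Appends `value` following a copy-on-write rule: mutate in place when we
/// hold the only reference, otherwise work on a private copy.
/// Returns true if the write happened in place (no separation was needed).
pub fn push_separated(arr: &mut Rc<Vec<i64>>, value: i64) -> bool {
    let in_place = Rc::strong_count(arr) == 1;
    // `Rc::make_mut` clones the inner value when other strong references
    // exist, then hands out a `&mut` to data that is uniquely owned.
    Rc::make_mut(arr).push(value);
    in_place
}

/// Single owner: the vector is modified in place.
pub fn demo_unique() -> (bool, Vec<i64>) {
    let mut a = Rc::new(vec![1, 2]);
    let in_place = push_separated(&mut a, 3);
    (in_place, (*a).clone())
}

/// Two owners: `a` is separated before the write, `b` keeps the old contents.
pub fn demo_shared() -> (bool, Vec<i64>, Vec<i64>) {
    let mut a = Rc::new(vec![1, 2]);
    let b = Rc::clone(&a);
    let in_place = push_separated(&mut a, 3);
    (in_place, (*a).clone(), (*b).clone())
}
```

In the shared case the second handle is left pointing at the unmodified data, which is the same "detached copy" effect that comes up later in the thread.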
That's fair, I understand the legitimate use cases are extremely rare and specific, and workarounds are indeed possible.
It seems I'm not the only one to have been bitten by this issue however, so as @iluuu1994 mentioned, the fact that the flag is cleared by
I can update this PR (or submit another one) to add said documentation if you want. Just let me know.
Yeah, please try to make a short explanation. (I'm not sure if I'll be able to review it in the next 2 weeks).
In case we have docs about
OK, trying to make it short is going to be tough but I'll try anyway... The main point is that a
If the Zend Engine were written in pure Rust, you'd probably have something resembling a Rc<RefCell<zend_array>> to enable mutable shared ownership through runtime borrow-checking. But
If you were allowed to take an exclusive reference to a
Now that we've seen that the only way to safely borrow shared pointers from a
Which could be implemented using a helper type like the
And I think that's as short as it can get for now...
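To make the `Rc<RefCell<...>>` remark concrete, the following is a small sketch of runtime borrow-checking with the standard library types. `MockArray` is a hypothetical stand-in (it does not mirror the real `zend_array` layout), and the function names are invented for illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for an engine array; not the real `zend_array`.
pub struct MockArray {
    pub values: Vec<i64>,
}

/// Tries to obtain an exclusive borrow through one handle while another
/// handle still holds a shared borrow. Returns whether it was granted.
pub fn write_while_reading() -> bool {
    let a = Rc::new(RefCell::new(MockArray { values: vec![1, 2] }));
    let b = Rc::clone(&a);

    let reader = a.borrow(); // shared borrow stays alive until dropped
    let granted = b.try_borrow_mut().is_ok(); // refused at runtime: a reader exists
    drop(reader);
    granted
}

/// Once the shared borrow has ended, the exclusive borrow succeeds.
pub fn write_after_reading() -> Vec<i64> {
    let a = Rc::new(RefCell::new(MockArray { values: vec![1, 2] }));
    let b = Rc::clone(&a);
    {
        let reader = a.borrow();
        assert_eq!(reader.values.len(), 2);
    } // shared borrow ends here
    b.borrow_mut().values.push(3); // no outstanding borrows, so this is allowed
    let out = a.borrow().values.clone();
    out
}
```

The point of the comparison is that `RefCell` tracks outstanding borrows at runtime, whereas a plain reference count only says how many owners exist, not whether any of them is currently reading or writing.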
I'm afraid there are none in the source code ATM. As for the book, well...
I think I more or less understood what you are trying to do. If you increment the reference counter of a HashTable, the corresponding array may be silently separated by PHP code, and your Rust/FFI will be left with a detached copy of the same array. Usage of
Maybe you should convert the captured arrays (or any zvals) to PHP references. This way you'll take a PHP reference with rc==2 and an array with rc==1. This approach should work. Maybe @nikic can give you better advice.
Unfortunately, that wouldn't work. The rules I mentioned in my previous comment apply to all reference-counted types, which include
Yes, that could theoretically happen in the examples I've linked to. In this case, the solution is simple: we just need to change our
At that point, the potential for invalid usage is greatly reduced, and we're not talking about memory safety issues anymore. As a side note, the Zend API itself does not completely prevent misuse either. For example, it is possible to directly modify an array's values / buckets in a
Yes, I get it. But again, it is to my knowledge the only way to actually implement the array separation rule while guaranteeing memory safety (and without refactoring the ZE itself of course). So for now my choice is between either carefully using this flag or exposing an unsafe-only API (which would defeat the whole purpose of writing a Rust wrapper). Pick your poison I guess... Of course I'd be glad to hear other suggestions, but I'm afraid we've already gone far off-topic from the original issue.
WRT the issue at hand, I can infer from the discussion that modifying the existing behaviour is not desired. From the following comment:
...and the absence of documentation about this flag, I can infer that documenting the issue is not needed/wanted. Therefore I'm closing this PR. Feel free to reopen if these assumptions were wrong. 😉
See #12986 for context.
Closes #12986