@alexandre-daubois
Member

No description provided.

@alexandre-daubois alexandre-daubois changed the base branch from master to PHP-8.3 September 23, 2025 08:16
@alexandre-daubois alexandre-daubois linked an issue Sep 23, 2025 that may be closed by this pull request
@alexandre-daubois alexandre-daubois changed the title Fix GH-19926: preserve COW violation flags when copying a hashtable while splicing Fix GH-19926: preserve COW violation flags when copying while splicing Sep 23, 2025
@ndossche
Member

The code sets the flag at the start, so you'll end up always setting the flag with the new code after splicing is finished. So this looks wrong.

@alexandre-daubois
Member Author

Alright, I pushed an alternative that only sets the flag temporarily, if necessary.

@alexandre-daubois alexandre-daubois marked this pull request as ready for review September 23, 2025 09:22
@ndossche
Member

ndossche commented Sep 23, 2025

You don't need to branch on refcount and can surround the flag clearing with a preprocessor ZEND_DEBUG check. Alternatively reset the iterator earlier.
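
The debug-only idea above can be sketched with a toy model. This is a minimal sketch, not php-src code: `toy_ht`, `TOY_DEBUG`, and the `TOY_*` macros are hypothetical stand-ins for `HashTable`, `ZEND_DEBUG`, and the COW violation flag, and the check is returned as a boolean rather than asserted so it can be exercised directly.

```c
#include <stdbool.h>

/* Hypothetical switch standing in for ZEND_DEBUG (assumption, not the real macro). */
#ifndef TOY_DEBUG
#define TOY_DEBUG 1
#endif

#define TOY_FLAG_ALLOW_COW_VIOLATION (1u << 0)

/* Toy table header: only the two fields the check looks at. */
typedef struct {
    unsigned refcount;
    unsigned flags;
} toy_ht;

#if TOY_DEBUG
/* Debug builds: the flag exists and can be set or cleared. */
# define TOY_ALLOW_COW_VIOLATION(ht) ((ht)->flags |= TOY_FLAG_ALLOW_COW_VIOLATION)
# define TOY_CLEAR_COW_VIOLATION(ht) ((ht)->flags &= ~TOY_FLAG_ALLOW_COW_VIOLATION)
#else
/* Release builds: both operations compile away, so no refcount branch is needed. */
# define TOY_ALLOW_COW_VIOLATION(ht) ((void)0)
# define TOY_CLEAR_COW_VIOLATION(ht) ((void)0)
#endif

/* Models the condition a debug build would assert before mutating a table:
 * the table is uniquely owned, or the caller has opted out of the check. */
static bool toy_write_allowed(const toy_ht *ht)
{
#if TOY_DEBUG
    return ht->refcount == 1
        || (ht->flags & TOY_FLAG_ALLOW_COW_VIOLATION) != 0;
#else
    (void)ht;
    return true;
#endif
}
```

With `TOY_DEBUG` enabled, a table with a refcount of 2 only passes `toy_write_allowed()` between the set and clear macros; with it disabled, the set and clear calls cost nothing.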

@alexandre-daubois
Member Author

Resetting the iterator earlier seems the most elegant solution to me. PR updated

@alexandre-daubois alexandre-daubois changed the title Fix GH-19926: preserve COW violation flags when copying while splicing Fix GH-19926: reset internal pointer earlier while splicing array while COW violation flag is still set Sep 23, 2025
Member

@ndossche ndossche left a comment

This is probably right, but I can't do a review right now.
The test can be simplified to show what's actually going on. Note that the id property is not declared, so a warning is emitted, which invokes the error handler, which throws. You can therefore remove all of that logic and throw directly in the destructor.

@alexandre-daubois
Member Author

Minimized the test case. Thanks for the review!

Member

@iluuu1994 iluuu1994 left a comment

This has a subtle break:

$a = [1, 2, 3];
unset($a[0]);
next($a);

array_splice($a, 0, 0, [1]);
var_dump(current($a));

Returns 1 before this patch, and 2 after. That's because zend_hash_internal_pointer_reset() advances the iterator until the next valid item, which only appears after copying the relevant fields from out_hash.
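
The ordering problem described above can be reproduced with a toy model. This is a minimal sketch under stated assumptions: `toy_table` and the `toy_*` functions are hypothetical and do not match the real `zend_array` layout; `used[i] == false` stands in for an `IS_UNDEF` slot left by `unset()`, and `pos` stands in for `nInternalPointer`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a packed array (NOT the real zend_array layout). */
typedef struct {
    int    values[4];
    bool   used[4];   /* false models a hole left by unset() */
    size_t num_used;  /* slots in use, holes included */
    size_t pos;       /* models nInternalPointer */
} toy_table;

/* Eager reset: scan forward to the first valid slot. */
static void toy_pointer_reset(toy_table *t)
{
    size_t i = 0;
    while (i < t->num_used && !t->used[i]) {
        i++;
    }
    t->pos = i;
}

/* Models the last step of the splice: adopt the new storage, keep pos as is. */
static void toy_adopt_storage(toy_table *t, const toy_table *out)
{
    size_t saved = t->pos;
    *t = *out;
    t->pos = saved;
}

/* Reset BEFORE the swap: the scan sees the old storage and skips its hole. */
static int toy_current_after_early_reset(void)
{
    toy_table in  = { {0, 2, 3, 0}, {false, true, true, false}, 3, 0 };
    toy_table out = { {1, 2, 3, 0}, {true,  true, true, false}, 3, 0 };
    toy_pointer_reset(&in);       /* pos becomes 1 because slot 0 is a hole */
    toy_adopt_storage(&in, &out); /* slot 0 is now valid, but pos stays at 1 */
    return in.values[in.pos];
}

/* Reset AFTER the swap: the scan sees the new storage, where slot 0 is valid. */
static int toy_current_after_late_reset(void)
{
    toy_table in  = { {0, 2, 3, 0}, {false, true, true, false}, 3, 0 };
    toy_table out = { {1, 2, 3, 0}, {true,  true, true, false}, 3, 0 };
    toy_adopt_storage(&in, &out);
    toy_pointer_reset(&in);
    return in.values[in.pos];
}
```

In this model the early reset yields 2 and the late reset yields 1, mirroring the values reported for the PHP snippet.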

@alexandre-daubois
Member Author

Hmmm... something like this seems to work but I'm not sure. It seems a pretty complex solution, but I couldn't come up with something better yet 🤔

diff --git a/ext/standard/array.c b/ext/standard/array.c
index 65d721f6bbe..dc0b97192d8 100644
--- a/ext/standard/array.c
+++ b/ext/standard/array.c
@@ -3217,6 +3217,14 @@ static void php_splice(HashTable *in_hash, zend_long offset, zend_long length, H
        GC_ADDREF(in_hash);
        HT_ALLOW_COW_VIOLATION(in_hash); /* Will be reset when setting the flags for in_hash */
 
+       /* Capture current internal pointer key for restoration after splice */
+       zend_string *pointer_key_str = NULL;
+       zend_ulong pointer_key_num = 0;
+       int pointer_key_type = HASH_KEY_NON_EXISTENT;
+       if (in_hash->nInternalPointer < in_hash->nNumUsed) {
+               pointer_key_type = zend_hash_get_current_key(in_hash, &pointer_key_str, &pointer_key_num);
+       }
+
        /* Get number of entries in the input hash */
        num_in = zend_hash_num_elements(in_hash);
 
@@ -3376,7 +3384,7 @@ static void php_splice(HashTable *in_hash, zend_long offset, zend_long length, H
        HT_SET_ITERATORS_COUNT(in_hash, 0);
        in_hash->pDestructor = NULL;
 
-       /* Reset internal pointer while COW violation flag is still set */
+       /* Reset internal pointer while COW violation flag is still set to avoid assertion */
        zend_hash_internal_pointer_reset(in_hash);
 
        if (UNEXPECTED(GC_DELREF(in_hash) == 0)) {
@@ -3397,6 +3405,53 @@ static void php_splice(HashTable *in_hash, zend_long offset, zend_long length, H
        in_hash->nNextFreeElement  = out_hash.nNextFreeElement;
        in_hash->arData            = out_hash.arData;
        in_hash->pDestructor       = out_hash.pDestructor;
+
+       /* Restore internal pointer to the original key position if possible */
+       if (pointer_key_type != HASH_KEY_NON_EXISTENT) {
+               uint32_t search_idx = 0;
+               bool found = false;
+
+               if (HT_IS_PACKED(in_hash)) {
+                       /* For packed arrays, search by numeric key */
+                       if (pointer_key_type == HASH_KEY_IS_LONG && pointer_key_num < in_hash->nNumUsed) {
+                               if (!Z_ISUNDEF(in_hash->arPacked[pointer_key_num])) {
+                                       in_hash->nInternalPointer = pointer_key_num;
+                                       found = true;
+                               }
+                       }
+               } else {
+                       /* For regular arrays, search through arData */
+                       Bucket *p = in_hash->arData;
+                       for (search_idx = 0; search_idx < in_hash->nNumUsed; search_idx++, p++) {
+                               if (Z_ISUNDEF(p->val)) continue;
+
+                               bool key_matches = false;
+                               if (pointer_key_type == HASH_KEY_IS_STRING && p->key && pointer_key_str) {
+                                       key_matches = zend_string_equals(p->key, pointer_key_str);
+                               } else if (pointer_key_type == HASH_KEY_IS_LONG && !p->key) {
+                                       key_matches = (p->h == pointer_key_num);
+                               }
+
+                               if (key_matches) {
+                                       in_hash->nInternalPointer = search_idx;
+                                       found = true;
+                                       break;
+                               }
+                       }
+               }
+
+               /* If key was not found, reset to beginning, but we need to set COW flag temporarily */
+               if (!found) {
+                       HT_ALLOW_COW_VIOLATION(in_hash);
+                       zend_hash_internal_pointer_reset(in_hash);
+                       HT_FLAGS(in_hash) &= ~HASH_FLAG_ALLOW_COW_VIOLATION;
+               }
+       } else {
+               /* No valid pointer was set originally, reset to beginning with COW flag */
+               HT_ALLOW_COW_VIOLATION(in_hash);
+               zend_hash_internal_pointer_reset(in_hash);
+               HT_FLAGS(in_hash) &= ~HASH_FLAG_ALLOW_COW_VIOLATION;
+       }
 }
 /* }}} */

@iluuu1994
Member

I think it should be sufficient to replace zend_hash_internal_pointer_reset(); with ht->nInternalPointer = 0;. The iterator will be advanced when actually accessed.

> Capture current internal pointer key for restoration after splice

That's the exact opposite of the existing behavior, no? The existing behavior always resets the position to 0.
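
The suggestion to assign `nInternalPointer = 0` and let the iterator advance on access can be sketched with a toy model. This is a minimal sketch, assuming that reads skip invalid slots lazily; `lazy_table` and the `lazy_*` functions are hypothetical names, not php-src APIs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a packed array (NOT the real zend_array layout). */
typedef struct {
    int    values[4];
    bool   used[4];   /* false models a hole left by unset() */
    size_t num_used;  /* slots in use, holes included */
    size_t pos;       /* models nInternalPointer */
} lazy_table;

/* Skip holes at access time; returns num_used when nothing valid remains. */
static size_t lazy_valid_pos(const lazy_table *t)
{
    size_t i = t->pos;
    while (i < t->num_used && !t->used[i]) {
        i++;
    }
    return i;
}

/* Stores the current value and returns true, or returns false past the end. */
static bool lazy_current(const lazy_table *t, int *value)
{
    size_t i = lazy_valid_pos(t);
    if (i >= t->num_used) {
        return false;
    }
    *value = t->values[i];
    return true;
}

/* Plain assignment before the storage swap, lookup after it. */
static bool lazy_splice_demo(int *value)
{
    lazy_table in  = { {0, 2, 3, 0}, {false, true, true, false}, 3, 2 };
    lazy_table out = { {1, 2, 3, 0}, {true,  true, true, false}, 3, 0 };
    size_t saved;

    in.pos = 0;      /* no scan, so the hole in the old storage is irrelevant */
    saved = in.pos;
    in = out;        /* adopt the new storage */
    in.pos = saved;
    return lazy_current(&in, value);
}
```

Because the assignment never inspects the old storage, the first read after the swap finds slot 0 of the new storage, and a table that still has a hole at slot 0 is handled by the lazy skip in `lazy_current()`.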

@alexandre-daubois
Member Author

I think I've got things mixed up a bit. But your suggestion worked, thanks!

@iluuu1994
Member

The unfortunate part is that destructors might still mess with the iterator position, but at least it shouldn't crash.

@alexandre-daubois
Member Author

This seems to be a very specific case, given that it has only been spotted now. I imagine this solution should be sufficient to limit the damage.

@iluuu1994
Member

Yeah, that's unfortunately how many of these fuzzing issues go. They complicate the code for little real-world benefit. Destructors, error handlers and __toString() in particular are endless sources of similar issues.

Member

@ndossche ndossche left a comment

IMHO this suffices

@iluuu1994
Member

IMO, we should consider declaring such destructor issues permanent wontfix, unless we can fix them globally with GH-20001 or similar. Same with error handlers and GH-12805. After that, only __toString() remains, which is also inherently problematic (GH-15938).

Sadly, the code base is already littered with attempts to fix such issues; it will be hard to get rid of all of them if/once we have a general solution.

@iluuu1994
Member

But also, I don't object to this fix. Maybe you can clarify why we're not using zend_hash_internal_pointer_reset(), i.e. we specifically want to delay advancing the iterator until we have switched to the new array, as we may find valid indexes in the new array earlier.

@alexandre-daubois
Member Author

If it's fine for you, I'll merge during the day 🙂

Member

@iluuu1994 iluuu1994 left a comment

Thanks for the adjustments!

@alexandre-daubois alexandre-daubois merged commit 64c1d43 into php:PHP-8.3 Oct 6, 2025
9 checks passed
alexandre-daubois added a commit that referenced this pull request Oct 6, 2025
* PHP-8.3:
  Fix GH-19926: reset internal pointer earlier while splicing array while COW violation flag is still set (#19929)
alexandre-daubois added a commit that referenced this pull request Oct 6, 2025
* PHP-8.4:
  Fix GH-19926: reset internal pointer earlier while splicing array while COW violation flag is still set (#19929)
alexandre-daubois added a commit that referenced this pull request Oct 6, 2025
* PHP-8.5:
  Fix GH-19926: reset internal pointer earlier while splicing array while COW violation flag is still set (#19929)

Development

Successfully merging this pull request may close these issues.

assertion failure zend_hash_internal_pointer_reset_ex

3 participants