- 
                Notifications
    You must be signed in to change notification settings 
- Fork 8k
Fix GH-19926: reset internal pointer earlier while splicing array while COW violation flag is still set #19929
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix GH-19926: reset internal pointer earlier while splicing array while COW violation flag is still set #19929
Conversation
| The code sets the flag at the start, so you'll end up always setting the flag with the new code after splicing is finished. So this looks wrong. | 
fdf5f9e    to
    c02f35a      
    Compare
  
    | Alright, I pushed an alternative that only set the flag temporarily if necessary | 
| You don't need to branch on refcount and can surround the flag clearing with a preprocessor ZEND_DEBUG check. Alternatively reset the iterator earlier. | 
c02f35a    to
    4985d41      
    Compare
  
    | Resetting the iterator earlier seems the most elegant solution to me. PR updated | 
4985d41    to
    b863656      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably right but I can't make a review now.
The test can be simplified to show what's actually going on: note that the id property is not declared so a warning will trigger which will trigger the error handler which will throw. So you can just remove all that logic and throw directly in the destructor.
b863656    to
    7eae214      
    Compare
  
    | Minimized the test case. Thanks for the review! | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has a subtle break:
$a = [1, 2, 3];
unset($a[0]);
next($a);
array_splice($a, 0, 0, [1]);
var_dump(current($a));Returns 1 before this patch, and 2 after. That's because zend_hash_internal_pointer_reset() advances the iterator until the next valid item, which only appears after copying the relevant fields from out_hash.
| Hmmm... something like this seems to work but I'm not sure. It seems a pretty complex solution, but I couldn't come up with something better yet 🤔 diff --git a/ext/standard/array.c b/ext/standard/array.c
index 65d721f6bbe..dc0b97192d8 100644
--- a/ext/standard/array.c
+++ b/ext/standard/array.c
@@ -3217,6 +3217,14 @@ static void php_splice(HashTable *in_hash, zend_long offset, zend_long length, H
        GC_ADDREF(in_hash);
        HT_ALLOW_COW_VIOLATION(in_hash); /* Will be reset when setting the flags for in_hash */
 
+       /* Capture current internal pointer key for restoration after splice */
+       zend_string *pointer_key_str = NULL;
+       zend_ulong pointer_key_num = 0;
+       int pointer_key_type = HASH_KEY_NON_EXISTENT;
+       if (in_hash->nInternalPointer < in_hash->nNumUsed) {
+               pointer_key_type = zend_hash_get_current_key(in_hash, &pointer_key_str, &pointer_key_num);
+       }
+
        /* Get number of entries in the input hash */
        num_in = zend_hash_num_elements(in_hash);
 
@@ -3376,7 +3384,7 @@ static void php_splice(HashTable *in_hash, zend_long offset, zend_long length, H
        HT_SET_ITERATORS_COUNT(in_hash, 0);
        in_hash->pDestructor = NULL;
 
-       /* Reset internal pointer while COW violation flag is still set */
+       /* Reset internal pointer while COW violation flag is still set to avoid assertion */
        zend_hash_internal_pointer_reset(in_hash);
 
        if (UNEXPECTED(GC_DELREF(in_hash) == 0)) {
@@ -3397,6 +3405,53 @@ static void php_splice(HashTable *in_hash, zend_long offset, zend_long length, H
        in_hash->nNextFreeElement  = out_hash.nNextFreeElement;
        in_hash->arData            = out_hash.arData;
        in_hash->pDestructor       = out_hash.pDestructor;
+
+       /* Restore internal pointer to the original key position if possible */
+       if (pointer_key_type != HASH_KEY_NON_EXISTENT) {
+               uint32_t search_idx = 0;
+               bool found = false;
+
+               if (HT_IS_PACKED(in_hash)) {
+                       /* For packed arrays, search by numeric key */
+                       if (pointer_key_type == HASH_KEY_IS_LONG && pointer_key_num < in_hash->nNumUsed) {
+                               if (!Z_ISUNDEF(in_hash->arPacked[pointer_key_num])) {
+                                       in_hash->nInternalPointer = pointer_key_num;
+                                       found = true;
+                               }
+                       }
+               } else {
+                       /* For regular arrays, search through arData */
+                       Bucket *p = in_hash->arData;
+                       for (search_idx = 0; search_idx < in_hash->nNumUsed; search_idx++, p++) {
+                               if (Z_ISUNDEF(p->val)) continue;
+
+                               bool key_matches = false;
+                               if (pointer_key_type == HASH_KEY_IS_STRING && p->key && pointer_key_str) {
+                                       key_matches = zend_string_equals(p->key, pointer_key_str);
+                               } else if (pointer_key_type == HASH_KEY_IS_LONG && !p->key) {
+                                       key_matches = (p->h == pointer_key_num);
+                               }
+
+                               if (key_matches) {
+                                       in_hash->nInternalPointer = search_idx;
+                                       found = true;
+                                       break;
+                               }
+                       }
+               }
+
+               /* If key was not found, reset to beginning, but we need to set COW flag temporarily */
+               if (!found) {
+                       HT_ALLOW_COW_VIOLATION(in_hash);
+                       zend_hash_internal_pointer_reset(in_hash);
+                       HT_FLAGS(in_hash) &= ~HASH_FLAG_ALLOW_COW_VIOLATION;
+               }
+       } else {
+               /* No valid pointer was set originally, reset to beginning with COW flag */
+               HT_ALLOW_COW_VIOLATION(in_hash);
+               zend_hash_internal_pointer_reset(in_hash);
+               HT_FLAGS(in_hash) &= ~HASH_FLAG_ALLOW_COW_VIOLATION;
+       }
 }
 /* }}} */ | 
7eae214    to
    68720d0      
    Compare
  
    | I think it should be sufficient to replace  
 That's exactly opposite of the existing behavior, no? The existing behavior always resets the position to 0. | 
68720d0    to
    de19dc2      
    Compare
  
    | I think I've got things mixed up a bit. But your suggestion worked, thanks! | 
| The unfortunate part is that destructors might still mess with the iterator position, but at least it shouldn't crash. | 
| It seems to be a very specific case for it to have only been spotted now. I imagine that this solution should be sufficient to limit the damage | 
| Yeah, that's unfortunately how many of these fuzzing issues go. They complicate the code for little real-world benefit. Destructors, error handlers and  | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO this suffices
| IMO, we should consider declaring such destructor issues permanent wontfix, unless we can fix them globally with GH-20001 or similar. Same with error handlers and GH-12805. After that, only  Sadly, the code base is already littered with attempts for fix such issues, it will be hard to get rid of all of them if/once we have a general solution. | 
| But also, I don't object to this fix. Maybe you can clarify why we're not using  | 
de19dc2    to
    e3f73e8      
    Compare
  
    | If it's fine for you, I'll merge during the day 🙂 | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the adjustments!
…while COW violation flag is still set
e3f73e8    to
    9b81e45      
    Compare
  
    
No description provided.