SimpleXML's unset can break DOM objects #17040

Closed
YuanchengJiang opened this issue Dec 4, 2024 · 1 comment

@YuanchengJiang

Description

The following code:

<?php
$dom = Dom\HTMLDocument::createEmpty();
$body = $dom->appendChild($dom->createElement("body"));
foreach (["style", "script", "xmp", "iframe", "noembed", "noframes", "plaintext", "noscript"] as $tag) {
    $tag = $body->appendChild($dom->createElementNS("some:ns", $tag));
}
$fusion = $tag;
$html = simplexml_import_dom($fusion);
$script1_dataflow = $html;
$array = ['foo'];
foreach ($array as $key => &$value) {
    unset($script1_dataflow[$key]);
}
var_dump(get_defined_vars());

Resulted in this output:

/php-src/ext/dom/token_list.c:206:25: runtime error: member access within null pointer of type 'php_libxml_ref_obj' (aka 'struct _php_libxml_ref_obj')

PHP Version

nightly

Operating System

ubuntu 22.04

@nielsdos
Member

nielsdos commented Dec 4, 2024

This isn't related to the new Dom\ classes. Here is a reproducer for the old DOM classes that also fails on 8.0:

<?php
$dom = new DOMDocument;
$tag = $dom->appendChild($dom->createElement("style"));
$html = simplexml_import_dom($tag);
unset($html[0]);
$tag->append("foo");

I wouldn't be surprised if you can find a reproducer for even older versions.

Basically, SimpleXML thinks it owns the node and uses the "free resource" approach to get rid of it when unsetting. This sets the node pointer in the internal object to NULL, which then causes a failure everywhere the code fetches the node on the assumption that it can't be NULL.
I'm not sure yet whether this is better fixed in simplexml or dom.
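
For anyone hitting this on a build without the fix, here is a minimal sketch of one way to sidestep the problem. It assumes the intent of the unset was simply to detach the element from the document, and it does the removal through the DOM API instead, so SimpleXML never frees a node that a DOM object still wraps. This is an illustration only, not the fix that was merged.

```php
<?php
// Workaround sketch (assumption: the goal is only to detach the element).
$dom = new DOMDocument;
$tag = $dom->appendChild($dom->createElement("style"));
$html = simplexml_import_dom($tag); // SimpleXML view of the same underlying node

// Instead of unset($html[0]), detach the node through DOM.
// DOMElement::remove() is available since PHP 8.0.
$tag->remove();

// $tag still wraps a live node, so further DOM calls are safe.
$tag->append("foo");
echo $tag->textContent, "\n"; // prints "foo"
```

On older PHP versions, `$tag->parentNode->removeChild($tag)` should serve the same purpose as `$tag->remove()`.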

@nielsdos changed the title from "Segmentation fault ext/dom/token_list.c:206" to "SimpleXML's unset can break DOM objects" on Dec 4, 2024
nielsdos added a commit to nielsdos/php-src that referenced this issue Dec 4, 2024
Don't free the underlying nodes if we still have objects pointing to
them, otherwise the objects are left with a NULL node pointer.
@nielsdos nielsdos linked a pull request Dec 4, 2024 that will close this issue
nielsdos added a commit that referenced this issue Dec 6, 2024
* PHP-8.3:
  Fix GH-17040: SimpleXML's unset can break DOM objects
nielsdos added a commit that referenced this issue Dec 6, 2024
* PHP-8.4:
  Fix GH-17040: SimpleXML's unset can break DOM objects
charmitro pushed a commit to wasix-org/php that referenced this issue Mar 13, 2025
Don't free the underlying nodes if we still have objects pointing to
them, otherwise the objects are left with a NULL node pointer.

Closes phpGH-17046.