libxml streams use wrong `content-type` header when requesting a redirected resource

Moderate
bukka published GHSA-p3x9-6h7p-cgfc Mar 13, 2025

Package

No package listed

Affected versions

< 8.1.32
< 8.2.28
< 8.3.19
< 8.4.5

Patched versions

8.1.32
8.2.28
8.3.19
8.4.5

Description

Summary

When requesting an HTTP resource using the DOM or SimpleXML extensions, the wrong content-type header is used to determine the charset if the requested resource performs a redirect.

Details

When the HTTP stream wrapper follows a redirect, it does not clear the list of captured headers before performing the subsequent requests. As a result, the returned array of response headers contains the headers of every request in the redirect chain, one after another, with the headers of the final request last.
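To illustrate the accumulation described above, here is a minimal sketch that splits a flat header list into one group per response. The `$headers` array is hypothetical sample data shaped like the list the HTTP wrapper exposes after one redirect, and `group_headers_by_response()` is a helper invented for this illustration, not part of PHP.

```php
<?php

// Hypothetical header list (illustrative values only), shaped like what the
// HTTP wrapper captures after following a single redirect.
$headers = [
    'HTTP/1.1 302 Found',
    'Content-Type: text/html;charset=utf-16',
    'Location: http://example.com',
    'HTTP/1.1 200 OK',
    'Content-Type: text/html; charset=UTF-8',
];

/**
 * Split a flat header list into one group per response.
 * A new group starts at every status line ("HTTP/...").
 * Hypothetical helper for illustration only.
 */
function group_headers_by_response(array $headers): array
{
    $groups = [];
    foreach ($headers as $line) {
        if (strncasecmp($line, 'HTTP/', 5) === 0) {
            $groups[] = []; // status line: start a new response group
        }
        if ($groups === []) {
            continue; // ignore anything before the first status line
        }
        $groups[array_key_last($groups)][] = $line;
    }
    return $groups;
}

$groups = group_headers_by_response($headers);
echo count($groups), " responses captured\n";            // 2 responses captured
echo "Final response status: ", end($groups)[0], "\n";   // HTTP/1.1 200 OK
```

Only the last group describes the body that is actually parsed. The earlier group belongs to the redirect response.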

The php_libxml_input_buffer_create_filename() / php_libxml_sniff_charset_from_stream() functions scan the header array from top to bottom and return after finding the first content-type header. This header does not necessarily belong to the response whose HTML body is being parsed; after a redirect, it may come from the redirect response instead.
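The difference between the two scanning strategies can be sketched in userland PHP. This is a simplified model, not the C code in PHP and not the patch that was applied. All function names and the sample `$headers` array are hypothetical. The first function mimics the top-to-bottom scan described above, and the second only considers headers that follow the last status line.

```php
<?php

// Hypothetical sample: redirect response first, final response last.
$headers = [
    'HTTP/1.1 302 Found',
    'Content-Type: text/html;charset=utf-16',
    'Location: http://example.com',
    'HTTP/1.1 200 OK',
    'Content-Type: text/html; charset=UTF-8',
];

/** Extract the charset parameter from a header line, or null if absent. */
function charset_of(string $line): ?string
{
    if (preg_match('/charset\s*=\s*"?([^";\s]+)/i', $line, $m)) {
        return strtolower($m[1]);
    }
    return null;
}

/** Model of the flawed scan: stop at the first content-type header. */
function first_content_type_charset(array $headers): ?string
{
    foreach ($headers as $line) {
        if (stripos($line, 'content-type:') === 0) {
            return charset_of($line);
        }
    }
    return null;
}

/** Alternative: only honor a content-type from the final response. */
function final_response_charset(array $headers): ?string
{
    $charset = null;
    foreach ($headers as $line) {
        if (strncasecmp($line, 'HTTP/', 5) === 0) {
            $charset = null; // new response: forget earlier headers
        } elseif (stripos($line, 'content-type:') === 0) {
            $charset = charset_of($line);
        }
    }
    return $charset;
}

echo first_content_type_charset($headers) ?? 'none', "\n"; // utf-16
echo final_response_charset($headers) ?? 'none', "\n";     // utf-8
```

With the sample data, the first strategy picks the charset announced by the redirect response, while the body being parsed was served with a different one.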

PoC

redirect.php

<?php

header('content-type: text/html;charset=utf-16');
header('location: http://example.com');

Run php -S localhost:8080 in the directory containing redirect.php, and then execute:

<?php

// Or using DOMDocument / SimpleXML
$document = \Dom\HTMLDocument::createFromFile("http://localhost:8080/redirect.php");

if (\str_contains($document->querySelector('body')->textContent, 'Example')) {
  throw new Exception('Refusing to store example content');
}

var_dump(\str_contains($document->saveHtml(), 'Example')); // bool(true)

Impact

This allows an attacker to cause a document to be parsed incorrectly (with the wrong charset), changing its meaning and possibly bypassing validation. When such a document is exported with ->saveHtml(), it is returned with the original charset.

Users that request documents via HTTP using the DOM or SimpleXML extensions are impacted.
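For code that cannot be upgraded immediately, one possible stopgap is to stop the HTTP wrapper from following redirects, so that only a single response's headers are captured. The sketch below is an assumption rather than an official recommendation, and it has not been verified against every affected version or loading API. Upgrading to a patched version remains the actual fix. It uses the documented `follow_location` HTTP context option and `libxml_set_streams_context()`.

```php
<?php

// Assumption: with redirects disabled, the header list holds one response
// only, so the first content-type header belongs to the parsed body.
$context = stream_context_create([
    'http' => [
        'follow_location' => 0, // do not follow Location headers
    ],
]);

// Set the stream context for the next libxml document load or write
// (for example DOMDocument::load() or simplexml_load_file()).
libxml_set_streams_context($context);

// Inspect the options that were set on the context.
$options = stream_context_get_options($context);
echo "follow_location = ", $options['http']['follow_location'], "\n";
```

A caller using this approach has to handle redirect responses explicitly, since the redirect response itself is returned instead of the target document.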

Severity

Moderate

CVE ID

CVE-2025-1219

Weaknesses

No CWEs

Credits