TimWolla
Member

There were two issues with the previous implementation of normalization:

  • php_raw_url_decode_ex() was used to modify a string with RC > 1.
  • The return value of php_raw_url_decode_ex() was not used, resulting in incorrect string lengths when percent-encoded characters are decoded.
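
Both issues can be illustrated with a small standalone sketch. It does not use the php-src API: `rc_string`, `rc_string_separate()`, `raw_decode_in_place()` and `decode_component()` are hypothetical stand-ins, assuming only that the decoder works in place and returns the decoded length. The string is separated before it is modified when it has more than one owner, and the returned length is stored back.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical refcounted string for illustration; not the zend_string layout.
 * Allocation failures are not handled to keep the sketch short. */
typedef struct {
    unsigned refcount;
    size_t len;
    char *val;
} rc_string;

static rc_string *rc_string_new(const char *src, size_t len) {
    rc_string *s = malloc(sizeof(*s));
    s->val = malloc(len + 1);
    memcpy(s->val, src, len);
    s->val[len] = '\0';
    s->len = len;
    s->refcount = 1;
    return s;
}

static void rc_string_release(rc_string *s) {
    if (--s->refcount == 0) {
        free(s->val);
        free(s);
    }
}

static int hex_value(unsigned char c) {
    if (c >= '0' && c <= '9') return c - '0';
    c = (unsigned char)tolower(c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decodes %XX sequences in place and returns the NEW length.
 * The result is never longer than the input, so the buffer can be reused. */
static size_t raw_decode_in_place(char *buf, size_t len) {
    char *dst = buf;
    const char *src = buf;
    const char *end = buf + len;
    while (src < end) {
        if (*src == '%' && end - src >= 3
                && hex_value((unsigned char)src[1]) >= 0
                && hex_value((unsigned char)src[2]) >= 0) {
            *dst++ = (char)(hex_value((unsigned char)src[1]) * 16
                          + hex_value((unsigned char)src[2]));
            src += 3;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
    return (size_t)(dst - buf);
}

/* Returns a string that is safe to modify: the same one if it has a single
 * owner, otherwise a private copy (the caller's reference to the shared
 * string is dropped). */
static rc_string *rc_string_separate(rc_string *s) {
    if (s->refcount == 1) return s;
    rc_string *copy = rc_string_new(s->val, s->len);
    s->refcount--;
    return copy;
}

/* Consumes the caller's reference to s and returns the decoded string. */
static rc_string *decode_component(rc_string *s) {
    s = rc_string_separate(s);            /* never mutate a shared string */
    assert(s->refcount == 1);
    s->len = raw_decode_in_place(s->val, s->len); /* keep the new length */
    return s;
}
```

If `decode_component()` skipped the separation step, every other owner of the string would observe the decoded bytes. If it discarded the return value, `len` would still describe the longer, percent-encoded input.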

Additionally, there was a bogus assertion that verified that strings returned from the read handlers have RC = 2, which was not the case for the parse_url-based parser when repeatedly retrieving a component, even without normalization happening. Remove that assertion, since its usefulness is questionable. Any obvious data type issues with read handlers should be detectable when testing during development.


This is a follow-up for the issue detected in #19587.

@TimWolla TimWolla requested a review from nielsdos August 26, 2025 20:18
@TimWolla TimWolla requested a review from kocsismate as a code owner August 26, 2025 20:18
@TimWolla TimWolla force-pushed the uri-parse_url-memory-management branch 2 times, most recently from 6c6197e to cd01b22 Compare August 26, 2025 20:27
@@ -724,6 +725,67 @@ static ZEND_FUNCTION(zend_test_crash)
php_printf("%s", invalid);
}

static ZEND_FUNCTION(zend_test_uri_parser)
Member

What I've done for DOM once is define a debugging function under #if ZEND_DEBUG. Doing that here too would keep the uri stuff in ext/uri.

Member Author

I'll leave the decision to Máté 😃

Member

I'm fine with adding this function to ext/zend_test. :) It doesn't make it more difficult for any tooling to parse stub files.

For example, any symbol added to stub files that is not meant for production should carry the @undocumentable phpdoc tag; otherwise the documentation generator tries to sync it with the manual. AFAIK PHPStan also parses these files. So 👍 overall!

@TimWolla TimWolla force-pushed the uri-parse_url-memory-management branch from cd01b22 to ad9e557 Compare August 26, 2025 20:55
@TimWolla TimWolla merged commit e99f1b4 into php:master Aug 26, 2025
9 checks passed
@TimWolla TimWolla deleted the uri-parse_url-memory-management branch August 26, 2025 21:44