TimWolla
Member

Looking at the Lexbor implementation, the lxb_url_parser_t.idna field is private and must not be touched from the outside. Lexbor expects to be able to manage it by itself when destroying a parser object.

Fix the issue by putting the lxb_unicode_idna_t into a thread-local variable that we own. This also avoids one level of dynamic allocation. The same is done for the mraw.
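
As a rough illustration of the ownership change described above, here is a minimal sketch that uses a hypothetical stand-in type (`demo_idna_t`) rather than the real `lxb_unicode_idna_t`. All `demo_*` names are assumptions made for this example. The sketch only mirrors the convention of passing `false` as the second argument to a `_destroy()` function so that internals are released while the caller-owned storage is left alone. Standard C11 `_Thread_local` is used here; php-src would normally spell this with its own TLS macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a library object such as lxb_unicode_idna_t. */
typedef struct {
    char  *buffer; /* internal allocation owned by the object */
    size_t size;
} demo_idna_t;

/* Storage owned by the caller: one instance per thread, and no separate
 * heap allocation for the structure itself. */
static _Thread_local demo_idna_t demo_idna;

/* Allocate the internals of an object whose storage the caller provides. */
static bool demo_idna_init(demo_idna_t *idna)
{
    idna->buffer = malloc(64);
    if (idna->buffer == NULL) {
        return false;
    }
    idna->size = 64;
    return true;
}

/* With self_destroy == false only the internals are released; the
 * structure itself is never passed to free(). */
static demo_idna_t *demo_idna_destroy(demo_idna_t *idna, bool self_destroy)
{
    if (idna == NULL) {
        return NULL;
    }
    free(idna->buffer);
    idna->buffer = NULL;
    idna->size = 0;
    if (self_destroy) {
        free(idna);
        return NULL;
    }
    return idna;
}

/* Per-request setup and teardown operating on the thread-local object. */
static bool demo_request_init(void)
{
    return demo_idna_init(&demo_idna);
}

static void demo_request_shutdown(void)
{
    demo_idna_destroy(&demo_idna, false);
}
```

Because the parser never receives a pointer to this object as something it owns, it cannot free it behind the caller's back when the parser itself is destroyed.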

}

PHP_RSHUTDOWN_FUNCTION(uri_parser_whatwg)
{
	lxb_url_parser_memory_destroy(&lexbor_parser);
	lxb_unicode_idna_destroy(&lexbor_idna, false);
Member

I suppose the destroy logic could be extracted into a separate function that is called from both RSHUTDOWN and the failure path in RINIT.
Anyway, this is only a minor remark.

Member Author

I considered that, but thought that being explicit here is useful, since the use-cases are a little different: The failure in RINIT might need to correctly handle partial initialization, whereas the cleanup in RSHUTDOWN can rely on proper initialization.
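
To make the distinction concrete, here is a minimal sketch with hypothetical stand-in resources (`demo_res_t`, `demo_rinit()`, `demo_rshutdown()`; none of these names come from the PR or from Lexbor). It assumes, as the code comment in the patch states for the real `_destroy()` functions, that destroying a zeroed structure is safe. The `fail_second` parameter exists only to simulate a failure halfway through initialization.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical resource standing in for the parser and IDNA objects. */
typedef struct {
    void *mem;
} demo_res_t;

static demo_res_t demo_a;
static demo_res_t demo_b;

/* Initialize one resource; `fail` simulates an initialization error. */
static bool demo_res_init(demo_res_t *res, bool fail)
{
    if (fail) {
        return false;
    }
    res->mem = malloc(16);
    return res->mem != NULL;
}

/* Safe on a zeroed structure: free(NULL) is a no-op. */
static void demo_res_destroy(demo_res_t *res)
{
    free(res->mem);
    res->mem = NULL;
}

/* Startup path: may stop halfway, so cleanup must tolerate a mix of
 * initialized and still-zeroed structures. */
static bool demo_rinit(bool fail_second)
{
    memset(&demo_a, 0, sizeof(demo_a));
    memset(&demo_b, 0, sizeof(demo_b));

    if (!demo_res_init(&demo_a, false)) {
        goto fail;
    }
    if (!demo_res_init(&demo_b, fail_second)) {
        goto fail;
    }
    return true;

fail:
    /* Partial initialization: destroy everything unconditionally, then
     * return to a known all-zero state. */
    demo_res_destroy(&demo_b);
    demo_res_destroy(&demo_a);
    memset(&demo_a, 0, sizeof(demo_a));
    memset(&demo_b, 0, sizeof(demo_b));
    return false;
}

/* Shutdown path: only reached after a successful demo_rinit(), so it
 * can rely on both resources being fully initialized. */
static void demo_rshutdown(void)
{
    demo_res_destroy(&demo_b);
    demo_res_destroy(&demo_a);
}
```

The two cleanup sequences look similar, but only the failure path has to reason about which resources were actually set up, which is one argument for keeping them spelled out separately.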

Member

Sure

/* Unconditionally calling the _destroy() functions is
* safe on a zeroed structure. */
lxb_unicode_idna_destroy(&lexbor_idna, false);
memset(&lexbor_idna, 0, sizeof(lexbor_idna));
Member

I'm not sure if it's strictly necessary to use memset as Lexbor seems to reset the fields. But extra safety is fine.

Member Author

lexbor_mraw_destroy() does not reset the ref_count. It's correctly reset in lexbor_mraw_init(), but I'd rather not take chances when the initialization fails halfway through or other dumb stuff like that.
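
The concern can be sketched with a hypothetical stand-in type (`demo_mraw_t`) whose destroy function, like the behavior described above for `lexbor_mraw_destroy()`, releases memory but leaves a counter field untouched. The `demo_*` functions are assumptions for illustration and do not reproduce Lexbor's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an allocator object with a reference count. */
typedef struct {
    void  *mem;
    size_t ref_count;
} demo_mraw_t;

/* Init sets every field, including the counter. */
static bool demo_mraw_init(demo_mraw_t *mraw)
{
    mraw->mem = malloc(32);
    if (mraw->mem == NULL) {
        return false;
    }
    mraw->ref_count = 0;
    return true;
}

/* Destroy releases the memory but deliberately leaves ref_count as it
 * was, modelling the behavior described in the comment above. */
static void demo_mraw_destroy(demo_mraw_t *mraw)
{
    free(mraw->mem);
    mraw->mem = NULL;
}

/* Defensive variant: destroy, then zero the whole structure so that no
 * stale field survives, whether or not a later init succeeds. */
static void demo_mraw_release(demo_mraw_t *mraw)
{
    demo_mraw_destroy(mraw);
    memset(mraw, 0, sizeof(*mraw));
}
```

With the extra `memset()`, the structure is in the same state as a freshly zeroed one, so a later unconditional destroy or a re-initialization that fails halfway does not have to account for leftover values.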

Member

Alright

@TimWolla TimWolla merged commit 90822f7 into php:master Aug 26, 2025
9 checks passed
@TimWolla TimWolla deleted the uri-whatwg-idna-private branch August 26, 2025 20:35
3 participants