Fix GH-10229 http_build_query() skips Stringable params #10235

Open
wants to merge 1 commit into
base: master
from

Conversation

Girgias
Member

@Girgias Girgias commented Jan 5, 2023

Did some refactoring to make the implementation more obvious as to what is going on and added some tests for some other weird types.

@Girgias Girgias requested a review from sgolemon January 5, 2023 14:59
@Girgias Girgias linked an issue Jan 5, 2023 that may be closed by this pull request
@cmb69
Member

cmb69 commented Jan 5, 2023

BC alert! :) See e.g. https://3v4l.org/nH479, which might actually be used by some people, and has behaved this way for ages. I don't think that it should behave that way, but we probably don't want to fix the behavior in stable versions (and maybe not even in a minor version). Maybe we should deprecate passing objects (or arrays containing objects) instead.

@nielsdos
Member

nielsdos commented Jan 5, 2023

Hey, I'm just gonna chime in quickly regarding deprecating passing objects.
This function uses a key-value relationship to build a query string, and associative arrays and objects also follow a key-value relationship.
That's why I don't like deprecating passing objects: it fits the description nicely. In APIs, I also like to use objects in a plain-data-object way; the advantage over arrays is the type checking you get from typed properties. Right now I can just pass such an object to this function without converting it to an array first, which would no longer be possible if this gets deprecated.

@cmb69
Member

cmb69 commented Jan 5, 2023

@nielsdos, thanks! Point taken. However, what should we do with Stringable objects?

@nielsdos
Member

nielsdos commented Jan 5, 2023

One possible mitigation that works for the code that was posted in the issue is the following:

  • If there are no public properties, and the object is Stringable, use __toString
  • If there are no public properties, and the object is not Stringable, it should yield the empty string, as is the case in current stable versions
  • If there are public properties, always use the behaviour of current stable versions

I believe this approach results in only a minor BC break.
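
The three rules above boil down to a small decision function. Here is a minimal sketch in C. The enum and function names are hypothetical and chosen for illustration only; they are not php-src identifiers, and the real patch works on zend_object properties rather than on two booleans.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not identifiers from php-src. */
typedef enum {
    ENCODE_AS_STRING,     /* no public properties, Stringable: use __toString() */
    ENCODE_AS_EMPTY,      /* no public properties, not Stringable: empty result */
    ENCODE_AS_PROPERTIES  /* public properties present: keep stable behaviour */
} encode_mode;

/* Picks how an object would be encoded under the proposed rules. */
static encode_mode choose_encode_mode(bool has_public_props, bool is_stringable)
{
    if (has_public_props) {
        /* Public properties always win, which is what limits the BC break. */
        return ENCODE_AS_PROPERTIES;
    }
    return is_stringable ? ENCODE_AS_STRING : ENCODE_AS_EMPTY;
}
```

Only the first enum value is new behaviour; the other two outcomes match what current stable versions already do, which is why the break is expected to be small.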

@Girgias
Member Author

Girgias commented Jan 7, 2023

Arf, we're back to the annoying case where, if PHP had proper structs, this would be less of an issue, as objects can be used in so many different ways.

I attempted to implement @nielsdos's idea but somehow messed up the implementation with regard to the recursive checks meant to avoid recomputing the visible properties (if this function is called within a method, it unmangles and adds protected/private properties).

@Girgias
Member Author

Girgias commented Jan 7, 2023

Well, I've tried something and spent way too long on something that is still buggy, and I have no idea why. So if someone has an idea as to what I'm doing wrong, feel free to let me know...

@nielsdos
Member

nielsdos commented Jan 7, 2023

@Girgias I think I got it working. It's less engineered than your attempted solution (i.e. it does not do any caching-related stuff), but I think it works: all tests pass and it implements my suggested behaviour. You can see my branch at https://github.com/nielsdos/php-src/commits/gh10229-attempt

I started from the commits of this PR, but dropped the last 3 commits (except that I squashed the tests added in the last commit into the "Add more tests with objects" commit). I also fixed a memleak which was in the PR. The last commit contains my code changes, which seem to work.

@Girgias
Member Author

Girgias commented Jan 8, 2023

> @Girgias I think I got it working. It's less engineered than your attempted solution (i.e. it does not do any caching-related stuff), but I think it works: all tests pass and it implements my suggested behaviour. You can see my branch at https://github.com/nielsdos/php-src/commits/gh10229-attempt
>
> I started from the commits of this PR, but dropped the last 3 commits (except that I squashed the tests added in the last commit into the "Add more tests with objects" commit). I also fixed a memleak which was in the PR. The last commit contains my code changes, which seem to work.

Yeah, my attempt to avoid doing multiple passes over objects was the issue, so I think at least for a backport this makes more sense, as there is a lot of stuff that can (and should) be optimized in that function anyway (which might make a caching solution easier). @cmb69 what do you think about @nielsdos's solution?

@nielsdos
Member

nielsdos commented Jan 8, 2023

Coincidentally I was just working on a small optimisation I just pushed: checking if __toString exists before attempting to determine if it has public properties.

@Girgias
Member Author

Girgias commented Jan 8, 2023

> Coincidentally I was just working on a small optimisation I just pushed: checking if __toString exists before attempting to determine if it has public properties.

I don't know if all the internal objects that can cast to string actually implement __toString() :-/

@nielsdos
Member

nielsdos commented Jan 8, 2023

> > Coincidentally I was just working on a small optimisation I just pushed: checking if __toString exists before attempting to determine if it has public properties.
>
> I don't know if all the internal objects that can cast to string actually implement __toString() :-/

Oh wow, that's such weird behaviour. You're right: https://3v4l.org/kD6HV
Okay, then we should drop that one optimisation commit, I guess...

@Girgias
Member Author

Girgias commented Jan 8, 2023

Maybe something to try to fix for PHP 8.3

@cmb69
Member

cmb69 commented Jan 8, 2023

Hmm, I still don't see the current behavior strictly as a bug. Unfortunately, the man page is underspecified, but it says:

> If data is an object, then only public properties will be incorporated into the result.

While that doesn't talk about nested objects, the "Using http_build_query() with an object" example clarifies this.

Given that @nielsdos' suggestion would prevent most BC breaks, we may not need to wait for a major version, or do some deprecation (or notices/warnings), but I'm still uncomfortable changing the behavior in PHP-8.1 (besides that, the diff is quite large). I would suggest targeting PHP-8.2, but the decision is up to the RMs. @ramsey, @patrickallaert, what do you think?

@nielsdos
Member

nielsdos commented Jan 8, 2023

> I don't know if all the internal objects that can cast to string actually implement __toString() :-/

Speaking about this again: I fixed the optimisation and added a test for it. The optimisation still holds if the zend_object uses zend_std_cast_object_tostring, so I added that condition.
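
One way to read the added condition: skipping the string check is only safe when the object uses the default cast handler, because only then does a missing __toString() guarantee the object cannot be cast to a string. The sketch below models that with mock types. Everything in it (mock_object, the handler functions, can_skip_string_check) is a hypothetical stand-in, not the real zend_object layout or the code in the branch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the real Zend structures. */
typedef struct mock_object mock_object;
typedef bool (*cast_to_string_fn)(const mock_object *obj);

struct mock_object {
    cast_to_string_fn cast_to_string; /* cast handler installed for the class */
    bool has_tostring_method;         /* whether __toString() is declared */
};

/* Stand-in for the default handler, which succeeds only via __toString(). */
static bool std_cast_to_string(const mock_object *obj)
{
    return obj->has_tostring_method;
}

/* Stand-in for an internal class that casts to string without __toString(). */
static bool custom_cast_to_string(const mock_object *obj)
{
    (void)obj;
    return true;
}

/* True when the object can be ruled out as a string candidate without
 * looking at its properties: default handler and no __toString(). */
static bool can_skip_string_check(const mock_object *obj)
{
    return obj->cast_to_string == std_cast_to_string
        && !obj->has_tostring_method;
}
```

With a custom handler the shortcut is not taken, so internal objects that are castable without __toString() still go through the full check.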

@Girgias
Member Author

Girgias commented Jan 8, 2023

> Hmm, I still don't see the current behavior strictly as a bug. Unfortunately, the man page is underspecified, but it says:
>
> > If data is an object, then only public properties will be incorporated into the result.
>
> While that doesn't talk about nested objects, the "Using http_build_query() with an object" example clarifies this.
>
> Given that @nielsdos' suggestion would prevent most BC breaks, we may not need to wait for a major version, or do some deprecation (or notices/warnings), but I'm still uncomfortable changing the behavior in PHP-8.1 (besides that, the diff is quite large). I would suggest targeting PHP-8.2, but the decision is up to the RMs. @ramsey, @patrickallaert, what do you think?

This makes sense, I wasn't expecting it to get that complicated. And I'm even leaning towards making this change only in master if we are doing it. But this is one of those edge cases that's hard to figure out what the "correct" behaviour even should be.

@nielsdos
Member

nielsdos commented Jan 8, 2023

Imo this is a feature change instead of a bugfix, because docs say:

> If data is an object, then only public properties will be incorporated into the result.

It says nothing about Stringable objects.

So imo it's best to get this in master.

@Girgias Girgias force-pushed the gh10229-http-build-query-81 branch from 9f35838 to 117564a Compare January 8, 2023 18:29
@Girgias Girgias changed the base branch from PHP-8.1 to master January 8, 2023 18:29
@Girgias
Member Author

Girgias commented Jan 8, 2023

> Well, I've tried something and spent way too long on something that is still buggy, and I have no idea why. So if someone has an idea as to what I'm doing wrong, feel free to let me know...

So I've figured out the issue while rebasing and refactoring the function.

I was shadowing the new_prefix variable in some interior scope meaning it wasn't modifying the variable that actually needed to be changed... (-Wshadow would have been useful here but that's a no go because of the amount of macros that rely on this).

Thus my cached implementation should now work
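
The kind of shadowing bug described above is easy to reproduce in isolation. The following is a minimal sketch, not the code from the PR: the function names are made up, and only the variable name new_prefix is taken from the comment. Compiling it with -Wshadow (supported by GCC and Clang) flags the inner declaration in the first function.

```c
/* Buggy variant: the inner declaration shadows the outer new_prefix, so the
 * value set inside the block never reaches the variable that is returned. */
static const char *pick_prefix_shadowed(int nested)
{
    const char *new_prefix = "outer";
    if (nested) {
        const char *new_prefix = "inner"; /* new variable, shadows the outer one */
        (void)new_prefix;                 /* silence the unused-variable warning */
    }
    return new_prefix; /* still "outer", regardless of nested */
}

/* Fixed variant: assign to the existing variable instead of redeclaring it. */
static const char *pick_prefix_fixed(int nested)
{
    const char *new_prefix = "outer";
    if (nested) {
        new_prefix = "inner";
    }
    return new_prefix;
}
```

Both functions compile cleanly without -Wshadow, which is why the mistake can go unnoticed for a long time.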

@nielsdos
Member

nielsdos commented Jan 8, 2023

The only thing I wonder now about your cached implementation vs my non-cached one is whether the cached one is actually faster as it does more memory allocations and hashtable additions I think. I wonder whether we can easily benchmark that somehow?

@Girgias Girgias force-pushed the gh10229-http-build-query-81 branch from 117564a to 75ba014 Compare January 13, 2023 12:36
@Girgias
Member Author

Girgias commented Jan 13, 2023

> The only thing I wonder now about your cached implementation vs my non-cached one is whether the cached one is actually faster as it does more memory allocations and hashtable additions I think. I wonder whether we can easily benchmark that somehow?

Ideally I would not do a complete copy of the properties array, but I don't understand why, when I try to just create a new array and add key/value pairs, I get use-after-frees within the shutdown sequence, as it seems to want to destroy the properties HashTable twice...

The other main reason I'm returning a new array is that I expect other extensions may want to get all the visible/public properties, so having this as a ZEND_API may make sense.

The reason for this is that objects are treated like key:value pairs, like arrays, and are processed recursively.
Therefore, if the object is Stringable and does not have any visible properties, we handle it like a scalar.
Otherwise, we keep the existing key:value behaviour of recursing through the object properties.
@Girgias Girgias force-pushed the gh10229-http-build-query-81 branch from 5426f9d to 330b5e4 Compare January 15, 2023 16:14
@Girgias Girgias requested a review from bukka as a code owner October 7, 2023 13:51
Successfully merging this pull request may close these issues.

http_build_query() skips Stringable params
3 participants