[RFC] Add json_encode indent parameter #7093

Closed
wants to merge 24 commits into from

Conversation

tdgroot
@tdgroot tdgroot commented Jun 3, 2021

@@ -2,7 +2,7 @@

/** @generate-class-entries */

-function json_encode(mixed $value, int $flags = 0, int $depth = 512): string|false {}
+function json_encode(mixed $value, int $flags = 0, int $depth = 512, string|int $indent = 4): string|false {}
Contributor

@mvorisek mvorisek Jun 4, 2021

I think accepting only a string here is enough: passing '    ' is as easy as passing 4, and a simple type is preferable.

It does make it more difficult to see that $indent='    ' specifies 4 spaces and that this indentation is actually consistent with some other call site that uses $indent=4.

Also, I prefer parameters having pure types rather than composite. Since we have named parameters now it might make sense to split into two parameters:

json_encode($value, indent: 4, indent_char: ' ');

I agree with @dtakken about pure types rather than composite. As I was scanning the code, I kept trying to figure out why string|int was being used instead of only int, until I finally realized that if a string is passed as the argument, the string literally is the indent value.
However, splitting this into two parameters does remove or complicate the feature of multi-character indents:

json_encode($value, indent: 4, indent_chars: '🚀🌙🌠🌎');
// What to do if indent = 2 and indent_chars has more than 2 chars?
json_encode($value, indent: 2, indent_chars: '🚀🌙🌠'); // error? ignore the 3rd char?
// What to do if indent = 3 and indent_chars has less than 3 chars?
json_encode($value, indent: 3, indent_chars: '🚀🌙'); // repeat '🚀🌙🚀'? error? something else?

@krakjoe krakjoe added the RFC label Jun 4, 2021
@tdgroot tdgroot requested a review from nikic June 4, 2021 13:38
ext/json/json.c Outdated
bool default_indent_used = 0;

if (indent_str == 0) {
indent_str = zend_string_init("    ", strlen("    "), 0);
Author

Would it be better to allocate indent_str on the stack instead of the heap?


if (options & PHP_JSON_PRETTY_PRINT) {
if (encoder->indent_str) {
indent_length = ZSTR_LEN(encoder->indent_str);
Member

Why does this care about indent_length? Can't we just append indent_str unconditionally? I don't think we need to optimize for the indent=0 case.

Author

Because indent_length = 0 is basically a no-op for the indent function. I have updated the function to be clearer about that scenario.

Let me know if you still think it'd be better to not care about the indent_length!

@mallardduck
Contributor

Just wondering, are there any cases where json_encode produces invalid JSON today? If not, then I would advocate that this new indent parameter throw an exception for invalid characters. Granted, triggering it does require passing an incorrect value for indent, but intentionally allowing invalid JSON to be produced seems off.

I could very well be wrong about the current state of json_encode, and it may already be unpredictable in ways I don't know. However, consider the following: today you can pretty safely encode and decode if you wanted to:

var_dump(
    json_decode(json_encode($someData))
);

However after this change doing the same thing, but with this example:

var_dump(
    json_decode(json_encode(['unicode' => "supported"], JSON_PRETTY_PRINT, 512, '🚀🚀'))
);

You get the error in the decode, but not in the encode. This example may not seem practical, but other JSON decoders can be expected to fail as well, while your app is none the wiser that it is producing invalid JSON for your APIs.

Wouldn't it be better (i.e. clearer) if the error occurred closer to the source of the issue? It could instead throw an exception when indent doesn't conform to RFC 4627:

   Insignificant whitespace is allowed before or after any of the six
   structural characters.

      ws = *(
                %x20 /              ; Space
                %x09 /              ; Horizontal tab
                %x0A /              ; Line feed or New line
                %x0D                ; Carriage return
            )
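
For illustration, the whitespace rule quoted above can be checked in userland before anything is encoded. This is only a minimal sketch: `is_json_whitespace` is a hypothetical helper name, not something PHP or this PR provides, and it makes no claim about how the C implementation would validate the argument.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical userland helper (not part of PHP or this PR): returns true
 * when $indent consists only of the four JSON whitespace characters
 * (space, horizontal tab, line feed, carriage return).
 */
function is_json_whitespace(string $indent): bool
{
    // strspn() counts the leading bytes that belong to the mask, so the
    // string is pure whitespace when that count equals its full length.
    return strspn($indent, " \t\n\r") === strlen($indent);
}

var_dump(is_json_whitespace('    ')); // four spaces: bool(true)
var_dump(is_json_whitespace("\t"));   // one tab: bool(true)
var_dump(is_json_whitespace('🚀🚀')); // emoji: bool(false)
```

A check like this, run before encoding, would surface the problem at the call that introduces it rather than in whichever decoder later rejects the output.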

Co-authored-by: Tyson Andre <[email protected]>
@iluuu1994
Member

@tdgroot There has been no change here for a while. Are you still planning on pursuing this RFC?

@tdgroot
Author

tdgroot commented May 13, 2022

@iluuu1994 yes, I intend to finish this PR; I had quite a busy year after I dived into this.

tdgroot added 3 commits May 13, 2022 08:28
Removed the possibility to enter a string as indentation, as it raises more
questions than answers. So for now, it's only possible to change the indentation
by passing a number to the `indent` parameter.

Perhaps in the future, we could introduce a fifth parameter like `indent_char`,
with which a character can be passed to specify which character needs to be used
for the indentation.
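
Until something like an integer `indent` parameter exists, a similar effect can be approximated in userland by post-processing the pretty-printed output. The sketch below is an assumption-laden illustration, not the PR's implementation: `json_encode_indented` and its `$width` parameter are hypothetical names, and it relies on `JSON_PRETTY_PRINT` emitting four spaces per nesting level.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical userland helper (not part of PHP or this PR): pretty-prints
 * $value and swaps the built-in four-space indentation for $width spaces
 * per nesting level.
 */
function json_encode_indented(mixed $value, int $width, int $flags = 0): string
{
    if ($width < 0) {
        throw new ValueError('$width must be greater than or equal to 0');
    }

    $json = json_encode($value, $flags | JSON_PRETTY_PRINT | JSON_THROW_ON_ERROR);

    // Newlines inside JSON strings are escaped as \n, so spaces at the start
    // of an output line are always indentation, never string content.
    return preg_replace_callback(
        '/^(?: {4})+/m',
        static fn (array $m): string => str_repeat(' ', intdiv(strlen($m[0]), 4) * $width),
        $json
    );
}

echo json_encode_indented(['a' => [1, 2]], 2), "\n";
```

Because only leading whitespace changes, the result decodes to the same value as the regular `JSON_PRETTY_PRINT` output.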
@Furgas
Contributor

Furgas commented May 16, 2022

Instead of a future indent_char parameter, you may consider adding a flag JSON_TAB_AS_INDENT. Personally, I would like this flag to be added with this PR, because I'm on team "tabs are for indentation, spaces for alignment".

@bukka
Member

bukka commented Jun 26, 2022

@tdgroot It might make sense for you to start voting on the RFC so we can get it into 8.2...

@timint

timint commented Jul 17, 2022

Hi, Tim here from issue #8864. I just wanted to say don't forget Tabs. 🙂

I would add two new pretty print variation flags:
JSON_PRETTY_PRINT
JSON_PRETTY_PRINT_TABS
JSON_PRETTY_PRINT_DOUBLE_SPACES

Tabs would shrink each indentation level to 1 byte instead of 4 bytes, which could have a significant impact on an entire serialized JSON object. Tabs are suited for both byte optimization and readability.
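
The size difference can be measured with a rough userland sketch like the one below. The sample data is made up purely for illustration, and the conversion is a post-processing workaround on today's `JSON_PRETTY_PRINT` output, not the proposed flags.

```php
<?php
declare(strict_types=1);

// Illustrative data only; the shape and size are invented for this sketch.
$data = ['users' => array_fill(0, 100, ['id' => 1, 'tags' => ['a', 'b']])];

$spaces = json_encode($data, JSON_PRETTY_PRINT);

// Turn every leading run of four-space groups into the same number of tabs.
// Newlines inside JSON strings are escaped, so leading spaces on a line are
// always indentation.
$tabs = preg_replace_callback(
    '/^(?: {4})+/m',
    static fn (array $m): string => str_repeat("\t", intdiv(strlen($m[0]), 4)),
    $spaces
);

printf("four spaces: %d bytes\n", strlen($spaces));
printf("tabs:        %d bytes\n", strlen($tabs));
```

Each indentation level saves three bytes per line, so the gap grows with both the nesting depth and the number of lines.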

@Girgias
Member

Girgias commented Jul 21, 2022

RFC has been declined

@Girgias Girgias closed this Jul 21, 2022
@maximal

maximal commented Dec 5, 2022

+ for having a customizable indent string, or at least the possibility of tabs.

@marcing

marcing commented May 16, 2023

+1 because of lack of compatibility with current MySQL JSON format
