support doc comments for internal classes and functions #13130

Closed
ju1ius opened this issue Jan 12, 2024 · 7 comments · Fixed by #13266

Comments

@ju1ius
Contributor

ju1ius commented Jan 12, 2024

Description

Hi,

I recently built a small tool that generates IDE stubs for an extension module using the runtime Reflection API.

However, since the zend_internal_function and zend_class_entry.info.internal structs don't currently hold a zend_string* doc_comment field like their userland counterparts, ReflectionFunctionAbstract::getDocComment() and ReflectionClass::getDocComment() never return a doc comment (they return false) for internal functions/methods and classes.
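
To make the limitation concrete, here is a minimal sketch that compares what runtime reflection reports for a userland function and for an internal one. The helper name doc_comment_of and the function userland_example are made up for this illustration; only ReflectionFunction and getDocComment() come from PHP itself.

```php
<?php
// Illustrative helper (hypothetical name): returns the doc comment that
// runtime reflection reports for a function, or null when there is none.
function doc_comment_of(string $function): ?string
{
    $doc = (new ReflectionFunction($function))->getDocComment();
    // Treat both false and an empty string as "no doc comment available".
    return ($doc === false || $doc === '') ? null : $doc;
}

/**
 * A userland function with a doc comment.
 *
 * @return string[]
 */
function userland_example(): array
{
    return [];
}

// Userland function: the doc comment is available at runtime.
var_dump(doc_comment_of('userland_example') !== null);
// Internal function: no doc comment is reported.
var_dump(doc_comment_of('strlen'));
```

A stub generator built on reflection hits the second case for every function and class an extension defines, which is why the generated stubs cannot carry any documentation.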

It is thus currently impossible for this tool to include doc comments for internal classes and functions/methods in the generated stubs (which, in addition to plain-text documentation, is also useful for e.g. specifying @throws annotations, generic type overrides like @return string[], etc.).

A zend_internal_function is currently much smaller than a zend_op_array, so there's definitely room in there to stash a zend_string* pointer. The size of zend_class_entry.info.user is currently 24 bytes on 64-bit platforms, and zend_class_entry.info.internal is 16, so there's just enough room for an additional pointer.

Note that this feature would benefit more than just the aforementioned tool. For example, when encountering an unknown extension, an LSP server could fetch all the information it needs using runtime reflection.
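
As a rough illustration of the reflection-only approach, the sketch below renders bare function stubs for a loaded extension. It is not the tool mentioned above: render_function_stubs is a hypothetical helper, it ignores default values, by-reference parameters, classes and constants, and it assumes the json extension is loaded.

```php
<?php
// Hypothetical helper: renders a very rough stub for every function an
// extension exposes, using only the runtime Reflection API.
function render_function_stubs(string $extension): string
{
    $out = '';
    foreach ((new ReflectionExtension($extension))->getFunctions() as $fn) {
        // Emit the doc comment only when reflection actually reports one.
        $doc = $fn->getDocComment();
        if (is_string($doc) && $doc !== '') {
            $out .= $doc . "\n";
        }
        $params = [];
        foreach ($fn->getParameters() as $param) {
            $type = $param->hasType() ? (string) $param->getType() . ' ' : '';
            $params[] = $type . ($param->isVariadic() ? '...' : '') . '$' . $param->getName();
        }
        $return = $fn->hasReturnType() ? ': ' . (string) $fn->getReturnType() : '';
        // Default values are deliberately omitted to keep the sketch short.
        $out .= sprintf("function %s(%s)%s {}\n", $fn->getName(), implode(', ', $params), $return);
    }
    return $out;
}

echo render_function_stubs('json');
```

Signatures can be recovered this way, but the doc comment branch only produces output if the engine stores doc comments for internal functions, which is what this issue asks for.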

Is this something that could be considered?

@kocsismate
Member

kocsismate commented Jan 21, 2024

Thanks for the report, it's an interesting problem. First of all, I think it's a good idea to store PHPDoc comments for internal functions/methods too.

Currently, we also use a couple of them: most notably, type-related stuff as you mentioned, but there are also a few custom comments related to documentation handling (e.g. @undocumentable), related to internal behavior (@implementation-alias, @refcount, @compile-time-eval, @prefer-ref), related to stub verification (@no-verify), and finally other kinds of metadata which are useful for userland (@alias, @deprecated). Additionally, we have a variety of doc comments for classes, constants, and properties.

There are two important things to know in connection with our doc comment usage:

  • They are added because we make use of them directly, and we explicitly decided not to add comments that we don't use ourselves. So, for example, we won't add @throws annotations as long as we don't use them.
  • The types added in a PHPDoc comment sometimes cannot be trusted 100%, since they are mostly for display purposes in the manual.

That's why we don't plan to expose any of our PHPDoc. But that's just us; let's think about 3rd-party extensions.

I see that being able to expose PHPDoc comments can be useful for them. However, we should find a way to indicate that a PHPDoc comment is supposed to be exposed. I'm against automatically exposing all comments by default because of the reasons mentioned above. Do you have any idea about such a syntax? If we could come up with a way, then I think we would have made a big step towards implementing this feature.

@ju1ius
Contributor Author

ju1ius commented Jan 21, 2024

I'm against automatically exposing all comments by default because of the reasons mentioned above.

Indeed, the metadata used by gen_stub.php to generate the C headers is compile-time only and should not be exposed by default in a doc comment.

Do you have any idea about such a syntax?

I don't know all the possible ways gen_stub.php uses doc tags, but off the top of my head, here are three possible options (which could be used in combination):

Option 1: switch to attributes

One solution for gen_stub.php would be to use attributes instead of doc tags, so that doc comments can be passed verbatim, e.g.:

/**
 * Exchanges all keys with their associated values in an array.
 *
 * @see https://www.php.net/manual/function.array-flip
 * @since 4.0
 * @return array<int|string, int|string>
 */
#[GenStubs\SupportsCompileTimeEvaluation]
#[GenStubs\ReturnInfo(refcount: 1)]
function array_flip(array $array): array {}

That would be the most semantically correct solution, since metadata about language items is what attributes are for.

Unfortunately, that wouldn't work for constants because they cannot be attributed (which is a shame IMO). 😓
It would also require a significant refactoring effort.

Option 2: use a docblock parser

Another solution would be to parse the docblocks into an AST (using something like phpDocumentor/ReflectionDocBlock), so that the codegen-related doc tags can easily be removed from the comment (it might also be a good idea to use a common prefix for those tags).

For example, this:

/**
 * Exchanges all keys with their associated values in an array.
 *
 * @see https://www.php.net/manual/function.array-flip
 * @since 4.0
 * @return array<int|string, int|string>
 * 
 * @genstubs-compile-time-eval
 * @genstubs-refcount 1
 */
function array_flip(array $array): array {}

would generate the following doc comment:

/**
 * Exchanges all keys with their associated values in an array.
 *
 * @see https://www.php.net/manual/function.array-flip
 * @since 4.0
 * @return array<int|string, int|string>
 */
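
The transformation described in option 2 could be sketched without a full docblock parser, using a line-based filter. This is only an assumption about how such stripping might look, not how gen_stub.php works: strip_prefixed_tags is a hypothetical helper, the @genstubs- prefix is the one proposed above, and multi-line tag descriptions are not handled.

```php
<?php
// Hypothetical helper: drops every docblock line whose tag starts with the
// given prefix and returns the remaining comment.
function strip_prefixed_tags(string $docComment, string $prefix = '@genstubs-'): string
{
    $kept = [];
    foreach (explode("\n", $docComment) as $line) {
        // A tag line looks like " * @tag ..." once leading whitespace is ignored.
        if (preg_match('/^\s*\*\s*' . preg_quote($prefix, '/') . '/', $line) === 1) {
            continue;
        }
        $kept[] = $line;
    }
    // Remove blank " *" lines left dangling right before the closing "*/".
    while (count($kept) >= 2 && preg_match('/^\s*\*\s*$/', $kept[count($kept) - 2]) === 1) {
        array_splice($kept, count($kept) - 2, 1);
    }
    return implode("\n", $kept);
}

$input = <<<'DOC'
/**
 * Exchanges all keys with their associated values in an array.
 *
 * @return array<int|string, int|string>
 *
 * @genstubs-compile-time-eval
 * @genstubs-refcount 1
 */
DOC;

echo strip_prefixed_tags($input), "\n";
```

A real docblock parser would be more robust (for instance with tags that span several lines), at the cost of an extra dependency for the stub generator.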

Option 3: use a special start-of-comment marker

Since nikic/php-parser associates all the comment nodes that appear before an item with said item, another option would be to use special start-of-comment markers to disambiguate between codegen-related docblocks and actual docblocks.

For example, it could be decided that only comments starting with the line /** @genstubs-expose-comment are to be included in the final doc comment (after stripping the special start-of-comment marker).

/** @genstubs-expose-comment
 * Exchanges all keys with their associated values in an array.
 *
 * @see https://www.php.net/manual/function.array-flip
 * @since 4.0
 * @return array<int|string, int|string>
 */
/**
 * @compile-time-eval
 * @refcount 1
 */
function array_flip(array $array): array {}
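
The selection rule of option 3 could look roughly like the sketch below. It assumes the comment texts preceding an item are already available as plain strings (with nikic/php-parser they could come from the node's comments); exposed_doc_comment is a hypothetical helper and the marker is the one proposed above.

```php
<?php
// Hypothetical helper: given the texts of all comments preceding an item,
// returns the one flagged for exposure with the marker removed, or null.
function exposed_doc_comment(array $comments, string $marker = '@genstubs-expose-comment'): ?string
{
    $start = '/** ' . $marker;
    foreach ($comments as $text) {
        // Only the first line is inspected: it must be exactly "/** <marker>".
        [$first, $rest] = array_pad(explode("\n", $text, 2), 2, '');
        if (rtrim($first) === $start) {
            return "/**\n" . $rest;
        }
    }
    return null;
}

$exposed = <<<'DOC'
/** @genstubs-expose-comment
 * Exchanges all keys with their associated values in an array.
 *
 * @return array<int|string, int|string>
 */
DOC;

$internal = <<<'DOC'
/**
 * @compile-time-eval
 * @refcount 1
 */
DOC;

echo exposed_doc_comment([$exposed, $internal]), "\n";
```

Comments without the marker are never returned, so existing stubs keep their current behavior and nothing is exposed unless an author opts in.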

@kocsismate WDYT?

@kocsismate
Member

@ju1ius Thanks for the ideas, they are great options to think about.

For me, the 2nd option with some modifications looks the most promising. I think we don't necessarily have to use a parser, but we can store the doc comments for each symbol and then expose the ones we want. That's why we should support two versions of each tag: one for internal-only purposes (starting with, e.g., a @genstubs- prefix), and one which is exposed (the ones we currently have without a prefix). The caveat of this solution is that we'd have to preg_replace thousands of tags :/

Apart from this, I also like the 3rd option. Its syntax is a bit unconventional, but at least it doesn't cause any additional work for people who want to keep things internal.

@Girgias do you maybe have a preference?

@Girgias
Member

Girgias commented Jan 22, 2024

I think I prefer option 3; the amount of work touching up all our stubs for option 2 makes it equivalent in burden to option 1 IMHO.

Option 3 is rather simple to handle and clear for external extensions (and we should add a test case in zend_test for it).

@kocsismate
Member

I've just submitted #13266. It required much more effort to implement these features than I initially thought (complexities in gen_stub.php), but here we are, it kinda works.

P.S. I'll document all the features of gen_stub.php next month so that extension authors have an easier way to learn about all the possibilities they have when using this tool.

kocsismate added a commit to kocsismate/php-src that referenced this issue Jan 29, 2024
kocsismate added a commit to kocsismate/php-src that referenced this issue Feb 2, 2024
@kocsismate
Member

The PR is ready for review :)

@ju1ius
Contributor Author

ju1ius commented Feb 25, 2024

Thanks @kocsismate !
