Add a separate mode to parse footnotes the same way GitHub does #654

Merged 5 commits on Jun 6, 2023

notriddle
Collaborator

@notriddle notriddle commented Jun 1, 2023

Resolves #20

Resolves #530

Resolves #623

This change is similar to, but a more limited change than, #544. It changes the syntax, but does not touch the generated HTML or event API.

Motivation

This commit is written with usage in mdBook, rustdoc, and docs.rs in mind.

  • Having a standard to follow, or at least a public test suite in cmark-gfm [1], makes it easier to distinguish bugs from features.
  • It makes sense to commit to following GitHub's behavior specifically, because mdBook chapters and docs.rs README files are often viewed in GitHub preview windows, so any divergence will be very annoying.
  • If mdBook and docs.rs are going to use this syntax, then rustdoc should, too.
  • Having both footnote syntaxes use the same API and rendering makes it more feasible for rustdoc to change the syntax over an edition. To introduce a syntax change in a new edition of Rust, we must make rustdoc warn anyone who writes code that will have its meaning change. To do it, run the parser twice in lockstep (with ENABLE_FOOTNOTES on one parser, and ENABLE_GFM_FOOTNOTES on the other), and warn if they diverge.
    • Alternatively, run a Crater build with this same code to check if this actually causes widespread breakage.
  • In particular, using tree rewriting to push the footnotes to the end is not as useful as it sounds, since that's not enough to exactly copy the way GitHub renders footnotes. To do that, you also need to sort the footnotes by the order in which they are referenced, not the order in which they are defined. This type of tree rewriting is also a waste of time if you want "margin note" rendering instead of putting them all at the end.

Footnotes

  1. cmark-gfm is under the MIT license, so incorporating parts of its test suite into pulldown-cmark should be fine.
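
The lockstep comparison described in the motivation could be sketched roughly as follows. This is a minimal illustration and not code from this PR: `first_divergence` is a hypothetical helper, written generically over any comparable item type so that it stands alone. In rustdoc, the two inputs would presumably be the event streams of two pulldown-cmark parsers constructed with the two different footnote options.

```rust
/// Walks two streams in lockstep and returns the index of the first
/// position where they disagree, or `None` if they are identical.
/// A length mismatch counts as a divergence at the end of the shorter stream.
/// (Hypothetical helper for illustration; not part of pulldown-cmark.)
fn first_divergence<T, A, B>(a: A, b: B) -> Option<usize>
where
    T: PartialEq,
    A: IntoIterator<Item = T>,
    B: IntoIterator<Item = T>,
{
    let mut a = a.into_iter();
    let mut b = b.into_iter();
    let mut index = 0;
    loop {
        match (a.next(), b.next()) {
            // Both streams ended together: no divergence.
            (None, None) => return None,
            // Same item on both sides: keep walking.
            (Some(x), Some(y)) if x == y => index += 1,
            // Different items, or one stream ended early.
            _ => return Some(index),
        }
    }
}
```

A `Some(index)` result marks the point at which a lint could warn that the two footnote syntaxes parse the same text differently.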

@notriddle notriddle force-pushed the notriddle/gfm-footnotes branch from 33f63c1 to 0f84d7e on June 2, 2023 23:28
Resolves pulldown-cmark#20

Resolves pulldown-cmark#530

Resolves pulldown-cmark#623

This change is similar to, but a more limited change than,
 <pulldown-cmark#544>. It changes
the syntax, but does not touch the generated HTML or event API.

Motivation
----------

This commit is written with usage in mdBook, rustdoc, and docs.rs
in mind.

* Having a standard to follow, or at least a public test suite in
  [cmark-gfm] [^c], makes it easier to distinguish bugs from features.
* It makes sense to commit to following GitHub's behavior specifically,
  because mdBook chapters and docs.rs README files are often viewed in
  GitHub preview windows, so any divergence will be very annoying.
* If mdBook and docs.rs are going to use this syntax, then rustdoc
  should, too.
* Having both footnote syntaxes use the same API and rendering makes it
  more feasible for rustdoc to change the syntax over an [edition].
  To introduce a syntax change in a new edition of Rust, we must make
  rustdoc warn anyone who writes code that will have its meaning change.
  To do it, run the parser twice in lockstep (with `ENABLE_FOOTNOTES`
  on one parser, and `ENABLE_GFM_FOOTNOTES` on the other), and warn if
  they diverge.
  * Alternatively, run a Crater build with this same code to check if
    this actually causes widespread breakage.
* In particular, using tree rewriting to push the footnotes to the end
  is not as useful as it sounds, since that's not enough to exactly
  copy the way GitHub renders footnotes. To do that, you also need to
  sort the footnotes by the order in which they are *referenced*, not
  the order in which they are defined. This type of tree rewriting is
  also a waste of time if you want "margin note" rendering instead of
  putting them all at the end.

[cmark-gfm]: https://github.com/github/cmark-gfm/blob/1e230827a584ebc9938c3eadc5059c55ef3c9abf/test/extensions.txt#L702
[edition]: https://doc.rust-lang.org/edition-guide/editions/index.html

[^c]: cmark-gfm is under the MIT license, so incorporating parts of its
    test suite into pulldown-cmark should be fine.
@notriddle notriddle force-pushed the notriddle/gfm-footnotes branch from 0f84d7e to 3d184fb on June 2, 2023 23:28
@Martin1887
Collaborator

Martin1887 commented Jun 3, 2023

Thanks for your contribution. I have to carefully review this and the other related pull requests before merging, and also check the possible syntaxes.

It could be included in version 0.10, but that is not certain.

Collaborator

This spec should be placed in third_party/GitHub

Collaborator Author

@notriddle notriddle Jun 5, 2023

The spec didn't actually come from GitHub themselves, though. It was mostly generated by throwing stuff at GitHub to see what it does (plus looking at some of their test cases, though they weren't really exhaustive).

Collaborator

@Martin1887 Martin1887 Jun 5, 2023

OK, I thought it came from cmark-gfm, given this note in the PR description:

"cmark-gfm is under the MIT license, so incorporating parts of its test suite into pulldown-cmark should be fine."

Am I missing anything, or does something prevent using that test suite?

Thanks for the changes.

Collaborator Author

"Am I missing anything, or does something prevent using that test suite?"

  1. We can't use it verbatim, because the generated HTML format is a little different.
  2. The cmark-gfm test suite includes about five test cases for footnotes. I incorporated them into our spec, but also wrote a few dozen more tests for cases that theirs didn't cover.

Collaborator

OK, thanks for the clarification.

build.rs Outdated
Collaborator

Why use the spec filename here instead of a new type of example, as with other extensions (example_metadata_blocks, for instance)?

Doing everything the same way improves consistency and maintainability.

Collaborator

Examples should be marked as example_gfm_footnotes as commented in build.rs.

Collaborator

Taking into account the comments on lib.rs, these should be the normal example examples, and old-style footnotes should be example_footnotes_old_style or something similar.

tests/lib.rs Outdated
@@ -14,14 +14,18 @@ use tendril::stream::TendrilSink;
 mod suite;

 #[inline(never)]
-pub fn test_markdown_html(input: &str, output: &str, smart_punct: bool, metadata_blocks: bool) {
+pub fn test_markdown_html(input: &str, output: &str, smart_punct: bool, metadata_blocks: bool, is_gfm: bool) {
Collaborator

There are several features related to GFM, so I would name this gfm_footnotes.

src/lib.rs Outdated
@@ -358,5 +359,13 @@ bitflags::bitflags! {
 /// - `+++` line at start
 /// - `+++` line at end
 const ENABLE_PLUSES_DELIMITED_METADATA_BLOCKS = 1 << 8;
+/// GitHub-compatible footnote syntax. Mutually-exclusive with `ENABLE_FOOTNOTES`.
+const ENABLE_GFM_FOOTNOTES = 1 << 9;
Collaborator

@Martin1887 Martin1887 Jun 4, 2023

I think nothing here prevents enabling both options at once, which would be incorrect. This is a special case because there are two different options for the same thing, with small differences.

pulldown-cmark adheres to CommonMark and the GFM extensions, so I think the best solution here is to keep the ENABLE_FOOTNOTES flag for the new behaviour and add a new flag, FOOTNOTES_OLD_STYLE or something similar, for the old behaviour that is not fully compatible with GFM footnotes (note that both flags would have to be enabled to get old-style footnotes).

This is a breaking change, but the 0.10 release already contains some, so here I would prioritize GFM adherence over avoiding breaking changes.

Collaborator

What do you think, @raphlinus?

Collaborator

To actually solve #530, the footnotes documentation should be linked from here.

I think the best solution is to put a link in the docstring and add a file describing the specification of old-style footnotes and how they differ from the new, fully compatible GFM footnotes, also indicating the differences from GitHub's behaviour, if any (missing footnote backlinks, footnote order, ...).

Collaborator

@Martin1887 Martin1887 left a comment

main.rs should also be changed to use the new footnotes option.

notriddle added 3 commits June 4, 2023 16:46
  * `ENABLE_GFM_FOOTNOTES` is now `ENABLE_FOOTNOTES`, and the old mode
    is now `ENABLE_OLD_FOOTNOTES`.
  * The spec test case mode tweaks are now done using example tags,
    just like every other test case.
  * Instead of declaring it not-allowed to enable both modes, the
    `ENABLE_OLD_FOOTNOTES` mode now *implies* the regular
    `ENABLE_FOOTNOTES`.
  * `main.rs` now has an option to `--enable-old-footnotes`.
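
One way to read "implies" in the commit message above is that the old-mode flag carries the regular footnote bit as well, so any check for `ENABLE_FOOTNOTES` also succeeds when `ENABLE_OLD_FOOTNOTES` is set. The sketch below illustrates that idea with plain integer constants. The bit positions, the `contains` helper, and the `FootnoteMode` enum are assumptions made for illustration and are not taken from the crate's source.

```rust
// Hypothetical stand-ins for the crate's option flags; the bit positions
// are illustrative assumptions, not copied from pulldown-cmark.
const ENABLE_FOOTNOTES: u32 = 1 << 2;
const OLD_FOOTNOTES_BIT: u32 = 1 << 9;
/// The old mode includes the regular footnote bit, so it implies it.
const ENABLE_OLD_FOOTNOTES: u32 = OLD_FOOTNOTES_BIT | ENABLE_FOOTNOTES;

/// True if every bit of `flag` is set in `options`.
fn contains(options: u32, flag: u32) -> bool {
    options & flag == flag
}

#[derive(Debug, PartialEq)]
enum FootnoteMode {
    Disabled,
    Gfm,
    OldStyle,
}

/// Picks the footnote syntax to parse, checking the more specific flag first.
fn footnote_mode(options: u32) -> FootnoteMode {
    if contains(options, ENABLE_OLD_FOOTNOTES) {
        FootnoteMode::OldStyle
    } else if contains(options, ENABLE_FOOTNOTES) {
        FootnoteMode::Gfm
    } else {
        FootnoteMode::Disabled
    }
}
```

With this layout there is no invalid combination to reject: setting both flags is the same as setting the old-style flag alone.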
@Martin1887
Collaborator

Martin1887 commented Jun 5, 2023

Everything seems right now. I will do some checks tomorrow, and then it will be merged. Thanks for your work and fast responses!

@Martin1887 Martin1887 merged commit 250799d into pulldown-cmark:master Jun 6, 2023
@notriddle notriddle deleted the notriddle/gfm-footnotes branch June 6, 2023 15:12
@chriskrycho
chriskrycho commented Jun 6, 2023

For whatever it’s worth, I was revisiting my own incomplete implementation (and also mucking with markdown-rs) and concluded the same about the tree rewriting: you should not rewrite the source, but should make sure you capture enough info to drive a correct output from the event sequence (whether it’s emitting an AST or going straight to HTML), and preserving source order rather than rewriting allows whatever mix of styles of note you want. 👍
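
As a rough sketch of the renderer-side approach discussed here, the code below leaves the event sequence in source order and computes the reference-ordered footnote list separately. `NoteEvent` is a hypothetical, simplified stand-in for a real parser's event type. Dropping unreferenced definitions and keeping the first of any duplicate definitions are assumptions of this sketch, not documented behaviour of GitHub or pulldown-cmark.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a parser event stream.
#[derive(Debug, Clone, PartialEq)]
enum NoteEvent {
    Text(String),
    FootnoteReference(String),
    FootnoteDefinition { label: String, body: String },
}

/// Returns (label, body) pairs ordered by first reference, not by definition.
/// The input slice is only read, so source order stays available to the caller.
fn definitions_in_reference_order(events: &[NoteEvent]) -> Vec<(String, String)> {
    let mut bodies: HashMap<&str, &str> = HashMap::new();
    let mut order: Vec<&str> = Vec::new();
    for event in events {
        match event {
            NoteEvent::FootnoteReference(label) => {
                // Record each label once, at its first reference.
                if !order.contains(&label.as_str()) {
                    order.push(label.as_str());
                }
            }
            NoteEvent::FootnoteDefinition { label, body } => {
                // Assumption: the first definition of a label wins.
                bodies.entry(label.as_str()).or_insert(body.as_str());
            }
            NoteEvent::Text(_) => {}
        }
    }
    // Assumption: references without a definition are skipped, and
    // definitions that are never referenced are not emitted.
    order
        .into_iter()
        .filter_map(|label| {
            bodies
                .get(label)
                .map(|body| (label.to_string(), body.to_string()))
        })
        .collect()
}
```

A renderer that wants margin notes can ignore this function entirely and emit each definition where it appears in the stream.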


Successfully merging this pull request may close these issues.

  • Indented footnote definitions confuse the parser
  • Document footnote syntax
  • Footnote definition does not end
3 participants