Use markdown image title text when generating list of figures #7915

LunkRat · 2022-02-12T13:29:02Z

Currently if pandoc is called with --lof or configured with a format.yml containing:

---
lof: yes

when converting a markdown source, each figure is listed in the LoF using its markdown alt text. As others have pointed out in a separate issue, this is a problem because the alt text is also used for image captions, and the text desired for an image caption is very often longer than the short identifying text needed in a list of figures.

Fortunately markdown offers a title text attribute for images which affords a practical and natural solution. The proposal here is to use the markdown image title when generating the list of figures. Usage would look like this:

![My image **caption** can be long with _formatting_.](figures/my_figure.jpg "Figure title for LoF entry")

The title attribute is simple text only (no formatting or math, etc.) which makes it perfect to use in an LoF entry where you don't want that anyway.

This solution allows for a semantic separation between image caption (markdown alt text) and LoF entry text (markdown title text).

jgm · 2022-02-13T01:40:20Z

I guess my main question is whether it's too much of a limitation if this is confined to being plain string content? (no formatting or math, unless the math is plain text unicode) If not, then this does seem a nice solution, until we get fancier figure support worked out.

LunkRat · 2022-02-13T03:53:33Z

My understanding is that a list of figures in a document should contain only plain text strings to identify each figure. In my opinion being restricted to plain text is a virtue in this case.

jgm · 2022-02-13T05:39:10Z

I can imagine wanting short figure titles like:

Alliances in the Iliad
Frequency of copain vs pote
Graph of $y=x^2$

LunkRat · 2022-02-13T13:21:02Z

@jgm You make a fair point. Does the need for italics or math in LoF titles outweigh the need for figure captions/alt text to be independent of LoF titles, which is what this issue is attempting to solve? I argue that giving up formatting in LoF titles is a small price to pay in exchange for the ability to write figure captions that are multiple sentences long without having them essentially break the LoF by rendering it unrecognizable.

One solution could be to have a conditional that uses the plain text image title attribute if present, otherwise fall back to using the alt text. I don't like this from a usability perspective, but it would allow those folks who want to use formatting in LoF titles to continue to do so using the alt text (would also make this change more backwards compatible).
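The fallback described above could be sketched roughly as follows. This is only an illustration in plain Lua, not pandoc's implementation: `lof_entry` is a hypothetical helper, and `img` is a plain table standing in for an image, with assumed string fields `title` and `alt`.

```lua
-- Hypothetical helper: choose the text for a list-of-figures entry.
-- `img` is a plain stand-in table, not a real pandoc Image element.
local function lof_entry(img)
  local title = img.title or ''
  -- pandoc 2.x marks implicit figures with a "fig:" title prefix; drop it.
  title = title:gsub('^fig:', '')
  -- Prefer the plain-text title; otherwise fall back to the alt text.
  if title ~= '' then
    return title
  end
  return img.alt
end

print(lof_entry{ title = 'Figure title for LoF entry', alt = 'A long caption.' })
print(lof_entry{ alt = 'A long caption.' })
```

With this ordering, documents that never set a title keep their current LoF entries, which is the backwards-compatibility point made above.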

Thoughts?

tarleb · 2022-02-13T14:57:16Z

Not a full solution, but for the time being you can use the "short-captions" Lua filter at https://github.com/pandoc/lua-filters/tree/master/short-captions. It only works when going from Markdown to LaTeX and has a few other limitations, but it allows for math in the short caption.

See also #3177, which may have to be completed first.

LunkRat · 2022-02-13T22:57:49Z

I tried the Lua filter and it works; however, when I combine it with pandoc-fignos from the pandoc-xnos package, the syntax for the Lua short-captions breaks the @fig:[name] in-text figure reference:

![My long alt text figure caption](figures/psychophysics_stimuli.png){#fig:stimuli width=75% short-caption="Psychophysics stimuli"}

The above results in an error from pandoc-fignos when I reference the figure by name with @fig:stimuli:

pandoc-fignos: Bad reference: @fig:stimuli

So while the Lua filter does give the desired behavior, it breaks other desired functionality.

tarleb · 2022-02-14T17:43:35Z

Make sure that pandoc-xnos runs before the Lua filter. Filters are run in the order in which they appear on the command line.

I wrote an updated, shorter filter that uses new pandoc features and might give better results in some cases:

if FORMAT ~= "latex" then return end

function Para (para)
  if #para.content ~= 1 then return end
  local img = para.content[1]
  if not img or img.t ~= 'Image' or #img.caption == 0
     or img.title:sub(1,4) ~= 'fig:'
     or not img.attributes['short-caption'] then
    return nil
  end

  local short_caption = pandoc.write(
    pandoc.read(img.attributes['short-caption']), FORMAT
  ):gsub('^%s*', ''):gsub('%s*$', '')  -- trim, removing surrounding whitespace

  local figure = pandoc.write(pandoc.Pandoc{para}, FORMAT)
  return pandoc.RawBlock(
    'latex',
    figure:gsub('\n\\caption', '\n\\caption[' .. short_caption .. ']')
  )
end
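The string substitution at the end of this filter can be tried without pandoc. The following is a minimal sketch in plain Lua: the LaTeX snippet is a hand-written stand-in for writer output, and `trim` and `add_short_caption` are hypothetical helper names that are not part of the filter above. It uses a function as the `gsub` replacement so that a `%` in the short caption (for example from `\%`) is not interpreted as a replacement escape.

```lua
-- Strip leading and trailing whitespace; the parentheses drop gsub's count.
local function trim(s)
  return (s:gsub('^%s*', ''):gsub('%s*$', ''))
end

-- Insert "[short]" after every "\caption" that starts a line.
-- A function replacement avoids '%' handling in replacement strings.
local function add_short_caption(latex, short)
  return (latex:gsub('\n\\caption', function(m)
    return m .. '[' .. short .. ']'
  end))
end

-- Hand-written stand-in for the LaTeX writer's figure output (an assumption).
local latex = '\\begin{figure}\n\\centering\n\\includegraphics{a.png}\n'
  .. '\\caption{Long caption.}\n\\end{figure}'

print(add_short_caption(latex, trim('  Short 50\\% title\n')))
```

Whether the real writer output always puts `\caption` at the start of a line is not something this sketch checks; the filter above relies on the same assumption.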

LunkRat · 2022-02-14T18:32:24Z

Thank you for the suggestion, @tarleb. I mistakenly thought I had tried both orders, but I tried again and got your Lua filter to work with pandoc-xnos by ordering my command so that the pandoc-xnos filter runs before the Lua filter. I am using the newer, shorter Lua filter you posted on this issue and it works beautifully. So this does indeed solve my immediate need.

I still think it is worth implementing the original issue idea into Pandoc, for two reasons:

- It would not require any extra filter to be called.
- It would not require extra syntax above standard markdown. Using the Lua filter approach, documents will get littered with short-caption="[...]" which has no meaning or use outside of the specific Lua filter.

So I'm happy to be all fixed up but I still argue that this issue should be implemented as specified. Thank you @tarleb and @jgm for your attention and help!

tarleb · 2022-02-14T19:28:09Z

Thanks for the feedback, happy to hear that it works. I think we all agree that this should be implemented and become a part of pandoc. It will be easy once support for figures has been improved, and I hope to do that soon.

jgm · 2022-02-15T01:01:40Z

The reason I'm hesitant to implement this suggestion now is that the plain string limitation seems like a problem.
(And it wouldn't be right to parse it as markdown in the writer, because we don't know that the source was markdown.)

LunkRat · 2022-03-17T12:30:16Z

@tarleb is there a comparable technique available for something like short-caption that could work for Table captions? I'm hoping to clean up my LoT but I see the statement about lack of support for table captions in the Limitations section of the short-caption lua filter README. If you know of any workarounds for this problem please let me know.

tarleb · 2022-03-24T15:20:06Z

I'm not aware of anything. You could try with commonmark_x instead of the classic Markdown parser and use the attributes extension.

LunkRat · 2022-09-27T18:40:45Z

I'm using https://github.com/pandoc/lua-filters/tree/master/short-captions for images, works great. However, I still don't have a solution for markdown tables.

I am able to get a short caption for LoT if I use a raw latex table with this syntax:

\caption[My short LoT caption]{My longer caption which appears in the body table caption but not in the LoT.}

Would be great to find a solution for markdown tables, even if it is a workaround/hack and ugly.

jpcirrus · 2023-02-04T06:21:14Z

@LunkRat have you had a look at the table-short-captions Lua filter? I've not used it so not sure if it will do what you're looking for.

jpcirrus · 2023-02-04T22:59:30Z

@tarleb until pandoc 3.0+ I have been successfully using your figure short caption filter (thank you), but since upgrading, short captions are ignored. I assume this is due to the support for "complex figures" added in pandoc 3.0. If so, is it still possible to use figure short captions for LaTeX output by merely amending your filter?

tarleb · 2023-02-05T11:22:38Z

There are two issues here: the first is that the filter needs updating. It could now be as short as

function Figure (fig)
  local short = fig.attributes['short-caption']
  if short and not fig.caption.short then
    -- pandoc.read returns a whole document; pass its blocks on
    fig.caption.short = pandoc.utils.blocks_to_inlines(
      pandoc.read(short, 'markdown').blocks
    )
  end
  return fig
end

However, this won't work yet as there is a second problem: the LaTeX writer currently ignores short captions. This must be fixed, too. I'll see to it.

jpcirrus · 2023-02-05T20:25:06Z

Thank you @tarleb . Appreciated.

jpcirrus · 2023-02-10T22:03:49Z

I have just upgraded to pandoc 3.1 and tried compiling to latex using this updated filter but the short caption is still not being inserted in the \caption command, so is obviously neglected in the list of figures. When going to the json format I can see short-caption in the output but don't know enough to work out what the issue could be.

jpcirrus · 2023-02-12T19:30:21Z

After reading the Lua filters manual and many thanks to @wlupton's logging module I have got the filter working after amending it to:

PANDOC_VERSION:must_be_at_least '3.1'

if FORMAT:match 'latex' then
  function Figure(f)
    local short = f.content[1].content[1].attributes['short-caption']
    if short and not f.caption.short then
      f.caption.short = pandoc.Inlines(short)
    end
    return f
  end
end
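The lookup `f.content[1].content[1]` assumes that the figure's first block is a paragraph-like element whose first inline is the image. A more defensive variant of that lookup might look like the sketch below. `short_caption_of` is a hypothetical helper name, and the example runs on plain tables that mimic the `content`, `t`, and `attributes` fields; its behaviour on real pandoc elements has not been verified here.

```lua
-- Hypothetical helper: return the short-caption attribute of the image
-- nested at content[1].content[1], or nil if the shape is different.
local function short_caption_of(fig)
  local block = fig.content and fig.content[1]
  local inline = block and block.content and block.content[1]
  if inline and inline.t == 'Image' and inline.attributes then
    return inline.attributes['short-caption']
  end
  return nil
end

-- Plain-table stand-in for a Figure holding a single image (an assumption).
local fig = {
  content = {
    { t = 'Plain', content = {
      { t = 'Image', attributes = { ['short-caption'] = 'Psychophysics stimuli' } },
    } },
  },
}

print(short_caption_of(fig))
```

Returning nil instead of raising an error lets a filter leave figures with an unexpected layout untouched.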

The title of an implicit figure, if set, is used as the short caption of a figure. The short caption of a figure replaces the full caption in the list of figures. Closes: jgm#7915

prakaa · 2023-04-27T23:39:46Z

After reading the Lua filters manual and many thanks to @wlupton's logging module I have got the filter working after amending it to:

PANDOC_VERSION:must_be_at_least '3.1'

if FORMAT:match 'latex' then
  function Figure(f)
    local short = f.content[1].content[1].attributes['short-caption']
    if short and not f.caption.short then
      f.caption.short = pandoc.Inlines(short)
    end
    return f
  end
end

Thanks @jpcirrus and @tarleb , I updated the short-captions filter myself but then came across this issue. This is much more succinct, thanks for sharing!

Just confirming that using this code as a filter, together with table-short-captions, means that with pandoc 3.1+ I can use short captions in the list of figures and list of tables.

I might reference this issue in a few repos where others may be looking for a similar fix

jpcirrus · 2023-04-28T19:54:38Z

@prakaa I can confirm that the above code used as a filter outputs figure short captions, but I have no requirement for table short captions so I don't know about that. Why don't you give it a go and let us know?

prakaa · 2023-04-29T05:11:13Z

@jpcirrus clarifying what I meant above:

- Using the code you provided (let's call it figure-short-captions.lua), I can get short captions for figures in the list of figures (by using the flag --lua-filter=/path/to/figure-short-captions.lua).
- Using the separate Lua filter table-short-captions.lua, I can get short captions for tables in the list of tables (by using the flag --lua-filter=/path/to/table-short-captions.lua).
- Simply attempting to replicate the Lua code that works for figures does not work for tables (I don't know much about how pandoc parses tables). However, the linked filter for tables still works for pandoc 3.1.2 (and presumably 3.0+), though it requires a particular syntax to work (see README in linked repo).

leowill01 · 2023-12-03T23:48:24Z

After reading the Lua filters manual and many thanks to @wlupton's logging module I have got the filter working after amending it to:

PANDOC_VERSION:must_be_at_least '3.1'

if FORMAT:match 'latex' then
  function Figure(f)
    local short = f.content[1].content[1].attributes['short-caption']
    if short and not f.caption.short then
      f.caption.short = pandoc.Inlines(short)
    end
    return f
  end
end

you just saved my ability to render my dissertation revisions. MANY THANKS

EDIT: this ended up not being able to render markdown or latex expressions in the short captions in the LOF, so after some tinkering with gpt, here is a modified version that supports those as well:

PANDOC_VERSION:must_be_at_least '3.1'

if FORMAT:match 'latex' then
  function Figure(f)
    local short = f.content[1].content[1].attributes['short-caption']
    if short and not f.caption.short then
      -- Parse the short caption as Markdown so formatting and math survive;
      -- the LaTeX writer renders the resulting inlines.
      f.caption.short = pandoc.utils.blocks_to_inlines(
        pandoc.read(short, 'markdown').blocks
      )
    end
    return f
  end
end

LunkRat added the enhancement label Feb 12, 2022

LunkRat mentioned this issue Feb 12, 2022

Short caption in Markdown figures for \listoffigures #2417

Closed

mb21 added format:LaTeX writer labels Feb 12, 2022

tarleb linked a pull request Feb 13, 2023 that will close this issue

Markdown reader: use title of implicit figure as short caption. #8617

Open

This was referenced Apr 27, 2023

Short captions for figures for pandoc 3.0+ pandoc/lua-filters#267

Open

Fix: Short captions for figures and tables tompollard/phd_thesis_markdown#116

Closed

leowill01 mentioned this issue Dec 3, 2023

short-captions Lua filter not working in list of figures pandoc/lua-filters#273

Open
