author | date | title |
---|---|---|
 | November 21, 2021 | Creating Custom Pandoc Writers in Lua |
If you need to render a format not already handled by pandoc, or you want to change how pandoc renders a format, you can create a custom writer using the Lua language. Pandoc has a built-in Lua interpreter, so you needn't install any additional software to do this.
A custom writer is a Lua file that defines how to render the document. Writers must define just a single function, named either `Writer` or `ByteStringWriter`, which is passed the document and writer options and handles the conversion of the document, rendering it into a string. This interface was introduced in pandoc 2.17.2, with ByteString writers becoming available in pandoc 3.0.
Pandoc also supports "classic" custom writers, where a Lua function must be defined for each AST element type. Classic style writers are deprecated and should be replaced with new-style writers if possible.
Custom writers using the new style must contain a global function named `Writer` or `ByteStringWriter`. Pandoc calls this function with the document and writer options as arguments, and expects the function to return a UTF-8 encoded string.
function Writer (doc, opts)
  -- ...
end
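As a minimal sketch of this contract, the writer below ignores the options and returns one line per top-level block, naming the block's element type. The output format is made up for illustration; the only pandoc feature it relies on is the `tag` field of block elements.

```lua
-- Illustrative only: lists the type of every top-level block.
function Writer (doc, opts)
  local lines = {}
  for i, block in ipairs(doc.blocks) do
    lines[i] = block.tag
  end
  -- The returned value must be a (UTF-8 encoded) string.
  return table.concat(lines, '\n') .. '\n'
end
```

Saved under a hypothetical file name such as `blocktypes.lua`, it could be tried with `pandoc -t blocktypes.lua input.md`.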
Writers that return binary data rather than text should define a function named `ByteStringWriter` instead. The function must still return a string, but it does not have to be UTF-8 encoded and can contain arbitrary binary data.
If both `Writer` and `ByteStringWriter` functions are defined, then only the `Writer` function will be used.
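The following sketch shows the shape of a binary writer. The output format, a four-byte big-endian block count followed by a NUL byte, is invented purely for illustration, and the code assumes a Lua version that provides `string.pack` (Lua 5.3 or later).

```lua
-- Hypothetical binary format: block count as a 4-byte big-endian
-- unsigned integer, terminated by a NUL byte.
function ByteStringWriter (doc, opts)
  local count = #doc.blocks
  -- The result is a Lua string, but it need not be valid UTF-8.
  return string.pack('>I4', count) .. '\0'
end
```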
Writers can be customized through format extensions, such as `smart`, `citations`, or `hard_line_breaks`. The global `Extensions` table lists each supported extension as a key. Extensions enabled by default are assigned a true value, while those that are supported but disabled are assigned a false value.
Example: A writer with the following global table supports the extensions `smart`, `citations`, and `foobar`, with `smart` enabled and the others disabled by default:
Extensions = {
  smart = true,
  citations = false,
  foobar = false
}
Users control extensions as usual, e.g., `pandoc -t my-writer.lua+citations`. The extensions are accessible through the writer options' `extensions` field, e.g.:
function Writer (doc, opts)
  print(
    'The citations extension is',
    opts.extensions:includes 'citations' and 'enabled' or 'disabled'
  )
  -- ...
end
The default template of a custom writer is defined by the return value of the global function `Template`. Pandoc uses the default template for rendering when the user has not specified a template but has invoked pandoc with the `-s`/`--standalone` flag.
The `Template` global can be left undefined, in which case pandoc will throw an error when it would otherwise use the default template.
Writers have access to all modules described in the Lua filters documentation. This includes `pandoc.write`, which can be used to render a document in a format already supported by pandoc. The document can be modified before this conversion, as demonstrated in the following short example. It renders a document as GitHub Flavored Markdown, but always uses fenced code blocks, never indented code.
function Writer (doc, opts)
  local filter = {
    CodeBlock = function (cb)
      -- only modify if code block has no attributes
      if cb.attr == pandoc.Attr() then
        local delimited = '```\n' .. cb.text .. '\n```'
        return pandoc.RawBlock('markdown', delimited)
      end
    end
  }
  return pandoc.write(doc:walk(filter), 'gfm', opts)
end
function Template ()
  return pandoc.template.default 'gfm'
end
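One caveat with the example above: a code block whose text itself contains a run of three backticks would end the fence early. A possible refinement, sketched here with a hypothetical helper named `fence_for`, is to pick a fence that is longer than any run of backticks in the code.

```lua
-- Hypothetical helper: returns a backtick fence that is longer than any
-- run of backticks in `text`, and at least three characters long.
local function fence_for (text)
  local longest = 0
  for run in text:gmatch('`+') do
    longest = math.max(longest, #run)
  end
  return string.rep('`', math.max(3, longest + 1))
end
```

Inside the `CodeBlock` function, the delimited text could then be built as `fence .. '\n' .. cb.text .. '\n' .. fence`, with `fence` set to `fence_for(cb.text)`.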
The `pandoc.scaffolding.Writer` structure is a custom writer scaffold that serves to avoid common boilerplate code when defining a custom writer. The object can be used as a function and makes it possible to skip details like metadata and template handling, requiring only the render functions for each AST element type.
The value of `pandoc.scaffolding.Writer` is a function that should usually be assigned to the global `Writer`:
Writer = pandoc.scaffolding.Writer
The render functions for Block and Inline values can then be added to `Writer.Block` and `Writer.Inline`, respectively. The functions are passed the element and the WriterOptions.
-- shorthands for the layout primitives used below
local cr, space = pandoc.layout.cr, pandoc.layout.space
Writer.Inline.Str = function (str)
  return str.text
end
Writer.Inline.SoftBreak = function (_, opts)
  return opts.wrap_text == "wrap-preserve"
    and cr
    or space
end
Writer.Inline.LineBreak = cr
Writer.Block.Para = function (para)
  return {Writer.Inlines(para.content), pandoc.layout.blankline}
end
The render functions must return a string, a `pandoc.layout` Doc element, or a list of such elements. In the last case, the values are concatenated as if they were passed to `pandoc.layout.concat`. If the value does not depend on the input, a constant can be used as well.
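To make the kinds of return value concrete, here is a small sketch. The markup it produces is an arbitrary choice for illustration, and the snippet only runs inside pandoc, since it relies on `pandoc.scaffolding.Writer` and `pandoc.layout`.

```lua
Writer = pandoc.scaffolding.Writer

-- A plain string.
Writer.Inline.Code = function (code)
  return '`' .. code.text .. '`'
end

-- A list of strings and Doc values; the items are concatenated.
Writer.Inline.Emph = function (emph)
  return {'*', Writer.Inlines(emph.content), '*'}
end

-- A constant Doc: a space does not depend on the input element.
Writer.Inline.Space = pandoc.layout.space
```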
The tables `Writer.Block` and `Writer.Inline` can be used as functions; they apply the right render function for an element of the respective type. E.g., `Writer.Block(pandoc.Para 'x')` will delegate to the `Writer.Block.Para` render function and will return the result of that call.
Similarly, the functions `Writer.Blocks` and `Writer.Inlines` can be used to render lists of elements, and `Writer.Pandoc` renders the document's blocks. The function `Writer.Blocks` can take a separator as an optional second argument, e.g., `Writer.Blocks(blks, pandoc.layout.cr)`; the default block separator is `pandoc.layout.blankline`.
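As a sketch of the optional separator, a block quote could render its nested blocks with single newlines between them and prefix every line with `> `. The quoting style is an assumption made for illustration, and the snippet only runs inside pandoc; `prefixed` is a function of the `pandoc.layout` module.

```lua
Writer = pandoc.scaffolding.Writer
local layout = pandoc.layout

Writer.Block.BlockQuote = function (bq)
  -- Pass a single newline as separator instead of the default blank line.
  local inner = Writer.Blocks(bq.content, layout.cr)
  return {layout.prefixed(inner, '> '), layout.blankline}
end
```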
All predefined functions can be overwritten when needed.
The resulting Writer uses the render functions to handle metadata values and converts them to template variables. The template is applied automatically if one is given.
A writer using the classic style defines rendering functions for each element of the pandoc AST. Note that this style is deprecated and may be removed in later versions.
For example,
function Para(s)
  return "<paragraph>" .. s .. "</paragraph>"
end
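A slightly larger sketch of the classic style is shown below. In this style, each function receives the already-rendered contents of the element as a string. The tag names are arbitrary, and a real writer would also need to escape special characters in `Str` and to define a function for every element type that can occur in the input.

```lua
-- Inline elements: arguments are the already-rendered contents.
function Str (s)
  return s
end

function Space ()
  return ' '
end

function Emph (s)
  return '<em>' .. s .. '</em>'
end

-- Block elements.
function Para (s)
  return '<paragraph>' .. s .. '</paragraph>'
end

function Header (level, s, attr)
  return '<h' .. level .. '>' .. s .. '</h' .. level .. '>'
end

-- Separator placed between rendered blocks.
function Blocksep ()
  return '\n'
end
```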
New template variables can be added, or existing ones modified, by returning a second value from function `Doc`.
For example, the following will add the current date in variable `date`, unless `date` is already defined as either a metadata value or a variable:
function Doc (body, meta, vars)
  vars.date = vars.date or meta.date or os.date '%B %e, %Y'
  return body, vars
end
Custom writers were reworked in pandoc 3.0. For technical reasons, the global variables `PANDOC_DOCUMENT` and `PANDOC_WRITER_OPTIONS` are set to the empty document and default values, respectively. The old behavior can be restored by adding the following snippet, which turns a classic writer into a new-style writer.
function Writer (doc, opts)
  PANDOC_DOCUMENT = doc
  PANDOC_WRITER_OPTIONS = opts
  loadfile(PANDOC_SCRIPT_FILE)()
  return pandoc.write_classic(doc, opts)
end