29 releases (9 breaking)
Version | Released |
---|---|
0.10.0 (new) | Feb 16, 2025 |
0.8.1 | Jan 1, 2025 |
0.8.0 | Dec 31, 2024 |
0.7.3 | Oct 3, 2024 |
0.1.1 | Nov 23, 2023 |
mdbook-pandoc

A pandoc-powered mdbook backend.
By relying on pandoc, many output formats are supported, although this project was mainly developed with LaTeX in mind.
See Rendered Books for samples of rendered books.
Installation
- Install mdbook-pandoc:
  To install the latest release published to crates.io:
    cargo install mdbook-pandoc --locked
  To install the latest version committed to GitHub:
    cargo install mdbook-pandoc --git https://github.com/max-heller/mdbook-pandoc.git --locked
- Note: mdbook-pandoc works best with Pandoc 2.10.1 (released July 2020) or newer -- ideally, the newest version you have access to. Older versions (as old as 2.8, released Nov 2019) are partially supported, but will result in degraded output. If you have an old version of Pandoc installed (in particular, Ubuntu releases before 23.04 have older-than-recommended Pandoc versions in their package repositories), consider downloading a newer version from Pandoc's installation page.
Getting Started
Instruct mdbook to use mdbook-pandoc by updating your book.toml file.
The following example configures mdbook-pandoc to generate a PDF version of the book with LaTeX (which must be installed).
To generate other output formats, see Configuration.
[book]
title = "My First Book"
+ [output.pandoc.profile.pdf]
+ output-file = "output.pdf"
+ to = "latex"
Running mdbook build will write the rendered book to pdf/output.pdf in mdbook-pandoc's build directory (book/pandoc if multiple renderers are configured; book otherwise).
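To illustrate how the build directory changes, here is a sketch of the same book with mdBook's built-in HTML renderer enabled next to mdbook-pandoc. The paths in the comments are inferred from the rule stated above and have not been checked against a real build.

```toml
[book]
title = "My First Book"

# An empty table is enough to enable mdBook's built-in HTML renderer.
# With it present, two renderers are configured.
[output.html]

[output.pandoc.profile.pdf]
output-file = "output.pdf"
to = "latex"

# Expected layout after `mdbook build` (assumption based on the rule above):
#   book/html/...                 HTML renderer output
#   book/pandoc/pdf/output.pdf    mdbook-pandoc output for profile "pdf"
# Without [output.html], the PDF would instead land in book/pdf/output.pdf.
```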
Configuration
Since mdbook-pandoc supports many different output formats through pandoc, it must be configured to render to one or more formats through the [output.pandoc] table in a book's book.toml file.
Configuration is centered around output profiles, named packages of options that mdbook-pandoc passes to pandoc as a defaults file to render a book in a particular format.
The output for each profile is written to a subdirectory with the same name as the profile under mdbook-pandoc's top-level build directory (book/pandoc if multiple renderers are configured; book otherwise).
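As a sketch of how profiles map to subdirectories, the fragment below defines two profiles. The profile names (pdf, epub) and file names are illustrative choices, and EPUB is used only as an example of a second Pandoc output format, not one this project claims to have tested.

```toml
# Profile "pdf": output is written under <build directory>/pdf/
[output.pandoc.profile.pdf]
output-file = "book.pdf"
to = "latex"

# Profile "epub": output is written under <build directory>/epub/
[output.pandoc.profile.epub]
output-file = "book.epub"
to = "epub"
```

Each profile is handed to pandoc as its own defaults file, so options set in one profile do not carry over to the other.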
A subset of the available options are described below:
Note: Pandoc is run from the book's root directory (the directory containing book.toml). Therefore, relative paths in the configuration (e.g. values for include-in-header, reference-doc) should be written relative to the book's root directory.
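For instance, assuming a hypothetical layout in which book.toml sits next to a latex/ and a templates/ directory, the two options named in the note might be written as follows. The file names are placeholders, not files shipped with mdbook-pandoc.

```toml
[output.pandoc.profile.pdf]
output-file = "output.pdf"
to = "latex"
# Resolved against the directory containing book.toml, not src/ or the build directory.
# Pandoc's defaults files expect a list of paths here.
include-in-header = ["latex/preamble.tex"]

[output.pandoc.profile.docx]
output-file = "output.docx"
to = "docx"
# Also relative to the book's root directory.
reference-doc = "templates/reference.docx"
```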
[output.pandoc]
hosted-html = "https://doc.rust-lang.org/book" # URL of an HTML version of the book
[output.pandoc.markdown.extensions] # enable additional Markdown extensions
gfm = false # enable pulldown-cmark's GitHub Flavored Markdown extensions
math = false # parse inline ($a^b$) and display ($$a^b$$) math
definition-lists = false # parse definition lists
superscript = false # parse superscripted text (^this is superscripted^)
subscript = false # parse subscripted text (~this is subscripted~)
[output.pandoc.code]
# Display hidden lines in code blocks (e.g., lines in Rust blocks prefixed by '#').
# See https://rust-lang.github.io/mdBook/format/mdbook.html?highlight=hidden#hiding-code-lines
show-hidden-lines = false
[output.pandoc.profile.<name>] # options to pass to Pandoc (see https://pandoc.org/MANUAL.html#defaults-files)
output-file = "output.pdf" # output file (within the profile's build directory)
to = "latex" # output format
# PDF-specific settings
pdf-engine = "pdflatex" # engine to use to produce PDF output
# `mdbook-pandoc` overrides Pandoc's defaults for the following options to better support mdBooks
file-scope = true # parse each file individually before combining
number-sections = true # number section headings
standalone = true # produce output with an appropriate header and footer
table-of-contents = true # include an automatically generated table of contents
# Arbitrary other Pandoc options can be specified as they would be in a Pandoc defaults file
# (see https://pandoc.org/MANUAL.html#defaults-files) but written in TOML instead of YAML...
# For example, to pass variables (https://pandoc.org/MANUAL.html#variables):
[output.pandoc.profile.<name>.variables]
# Set the pandoc variable named 'variable-name' to 'value'
variable-name = "value"
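As a more concrete sketch of the variables table, the fragment below sets a few of Pandoc's documented LaTeX variables for a profile named pdf. The specific values are arbitrary examples; which variables take effect depends on the Pandoc template in use.

```toml
[output.pandoc.profile.pdf]
output-file = "output.pdf"
to = "latex"
pdf-engine = "lualatex" # any PDF engine supported by your Pandoc installation

[output.pandoc.profile.pdf.variables]
documentclass = "report" # LaTeX document class
papersize = "a4"         # paper size
fontsize = "11pt"        # base font size
```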
Features
- Markdown extensions supported by mdBook
  - Strikethrough (e.g. ~~crossed out~~)
  - Footnotes
  - Tables
  - Task Lists (e.g. - [x] Complete task)
  - Heading Attributes (e.g. # Heading { #custom-heading })
- Markdown extensions not yet supported by mdBook
  These extensions are disabled by default for consistency with mdBook and must be explicitly enabled.
  - Blockquote tags (Enabled by output.pandoc.markdown.extensions.gfm)
  - Math (Enabled by output.pandoc.markdown.extensions.math)
  - Definition Lists (Enabled by output.pandoc.markdown.extensions.definition-lists)
  - Superscript (Enabled by output.pandoc.markdown.extensions.superscript)
  - Subscript (Enabled by output.pandoc.markdown.extensions.subscript)
- Raw HTML (best effort, almost always lossy)
  - Linking to HTML elements by id
  - Strikethrough (<s>), superscript (<sup>), subscript (<sub>)
  - Definition lists (<dl>, <dt>, <dd>)
  - Images (<img>) with width and height attributes
    - Class-based CSS styling (width/height)
  - <span>s and <div>s
  - Anchors (<a>)
- Table of contents
- Redirects ([output.html.redirect])
- Font Awesome 4 icons (e.g. <i class="fa fa-github"></i>) (LaTeX output formats only)
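Because the second group of extensions is off by default, a book that relies on them has to opt in. A minimal sketch that enables all five keys listed above (enable only the ones your book actually uses):

```toml
[output.pandoc.markdown.extensions]
gfm = true              # blockquote tags and other GitHub Flavored Markdown extensions
math = true             # inline ($a^b$) and display ($$a^b$$) math
definition-lists = true # definition lists
superscript = true      # ^superscripted^ text
subscript = true        # ~subscripted~ text
```

Note that mdBook's own HTML renderer may not understand syntax enabled here, so the HTML and Pandoc outputs of the same book can diverge.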
Rendering Pipeline
To render a book, mdbook-pandoc parses the book's source (Parsing), transforms it into Pandoc's native representation (Preprocessing), then runs pandoc to render the book in the desired output format.
Parsing
HTML
mdbook-pandoc does its best to support raw HTML embedded in Markdown documents, transforming it into relevant Pandoc AST elements where possible.
Each chapter is parsed into a hybrid Markdown+HTML tree using pulldown-cmark and the browser-grade html5ever HTML parser.
This approach captures the full structure of the document -- including implicitly closed elements and other HTML quirks -- and makes it possible to accurately render HTML elements containing Markdown elements containing HTML elements...
This approach should also make mdbook-pandoc better able to handle malformed HTML, since html5ever performs the same HTML sanitization magic that browsers do.
However, the standard principle applies: garbage in, garbage out; for best results, write simple and obviously correct HTML.
Preprocessing
Structural Changes
- In order to make section numbers and the generated table of contents, if applicable, mirror the chapter hierarchy defined in SUMMARY.md:
  - Headings in nested chapters are shrunk one level per level of nesting
  - All headings except for H1s are marked as unnumbered and unlisted
- Relative links within chapters are "rebased" to be relative to the source directory so a chapter src/foo/foo.md can link to src/foo/bar.md with [bar](bar.md)
Known Issues
- Linking to a chapter does not work unless the chapter contains a heading with a non-empty identifier (either auto-generated or explicitly specified). See: https://github.com/max-heller/mdbook-pandoc/pull/100
Comparison to alternatives
Rendered books
The following table links to sample books rendered with mdbook-pandoc.
PDFs are rendered with LaTeX (LuaTeX).
Book | Rendered |
---|---|
Cargo Book | |
mdBook Guide | |
Rustonomicon | |
Rust Book | |
Rust by Example | |
Rust Edition Guide | |
Embedded Rust Book | |
Rust Reference | |
Rust Compiler Development Guide | |
Rendering to PDF
- When mdbook-pandoc was initially written, existing mdbook LaTeX backends (mdbook-latex, mdbook-tectonic) were not mature enough to render much besides the simplest books due to hand-rolling the markdown->LaTeX conversion step. mdbook-pandoc, on the other hand, delegates this difficult step to pandoc, inheriting its maturity and configurability.
- "Print to PDF"-based backends like mdbook-pdf are more mature, but produce less aesthetically-pleasing PDFs. Additionally, mdbook-pdf does not support intra-document links or generating a table of contents without using a forked version of mdbook.
Rendering to other formats
- By delegating most of the difficult rendering work to pandoc, mdbook-pandoc supports numerous output formats. Most of these have not been tested, so feedback on how it performs on non-PDF formats is very welcome!