pandoc-citeproc

Background

LaTeX is all well and good, but I do find I spend too much time typing out commands, mostly for emphasis. More importantly, there is no complete conversion solution for when I inevitably have to turn over copy in docx format. Since I’d been drafting all manner of notes and teaching materials in markdown for years, I’ve recently decided to take the leap and draft upcoming books and articles in pandoc markdown. Once I’m happy with the text, I use pandoc to convert it to docx. I then cease work on the markdown file and continue in LibreOffice instead, where I adjust pandoc’s mostly default formatting to meet the publisher’s requirements. Only once they get back to me with tracked changes do I renew my Office 365 subscription (or use an office machine), fire up Windows, and finish the job in Word.

In pandoc markdown, I use pandoc-citeproc to generate my citations and bibliography. pandoc-citeproc relies on CSL style definitions of the kind Zotero uses, so I download the desired csl stylesheet file from the Zotero style repository and identify it in a YAML header. In a book draft, I’ll relegate my header to a file of its own, header.md, which might look somewhat like the following:

% Old Testament Narratives in Anglo-Saxon England
% P. S. Langeslag
% Draft of \today

---
documentclass: book
mainfont: Junicode
lang: en-GB
toc: true
indent: true
parskip: 0
figPrefix:
    - "Figure"
    - "Figures"
tblPrefix:
    - "Table"
    - "Tables"
csl: modern-humanities-research-association.csl
bibliography: ot.bib
reference-section-title: "Bibliography"
---
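
To make the conversion step concrete, here is a minimal sketch of how the pandoc call for such a header might be assembled. The file names chapter1.md and draft.docx and the helper build_pandoc_command are hypothetical. The optional pandoc-crossref filter is likewise an assumption, inferred from the figPrefix and tblPrefix keys in the header. The script only builds and prints the command; it does not run pandoc.

```python
def build_pandoc_command(sources, output, use_crossref=False):
    """Return the argument list for a pandoc run with citation processing.

    Hypothetical helper: sources are markdown files in reading order,
    output is the target file (pandoc infers the format from its extension).
    """
    command = ["pandoc", *sources, "-o", output]
    if use_crossref:
        # Assumption: pandoc-crossref is installed. It has to run before
        # pandoc-citeproc so that cross-references are not read as citations.
        command += ["--filter", "pandoc-crossref"]
    command += ["--filter", "pandoc-citeproc"]
    return command


# header.md goes first so that its title block and YAML metadata open the document.
command = build_pandoc_command(["header.md", "chapter1.md"], "draft.docx")
print(" ".join(command))
```

Passing the resulting list to subprocess.run(command, check=True) would perform the actual conversion, provided pandoc and the filters are on the PATH.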

Issues

I have never really used Zotero on its own, but this new workflow is really driving home the shortcomings of this well-known citation manager relative to a system like biblatex. Some of the issues I have encountered must arise from pandoc-citeproc itself, however. Some first observations follow.

Since I deal with a lot of foreign-language publications, I am glad to see that pandoc-citeproc chooses its form of title case based on the langid field in my biblatex bibliography. Complications arise when a title in one language contains a phrase in another. In biblatex, one has the power to specify formatting in these cases by enclosing the material in question in curly braces, but pandoc-citeproc appears to ignore such cues:

subtitle = {Genre, Rhetoric and the Origins of the {\mkbibemph{ars dictaminis}}}

In one case where a citation comes at the end of a paragraph, vertical space is inserted between it and the next paragraph although my header specifies none.

A number of biblatex fields are not yet supported in pandoc-citeproc. An example I’ve run into repeatedly is editora and editoratype, meaning that whenever a publication has a revising editor I have to make a note to add this to the docx citation manually:

@book{web07bib,
   editor = {Robert Weber},
   editora = {Roger Gryson},
   editoratype = {reviser},
   title = {Biblia sacra iuxta vulgatam versionem},
   shorttitle = {Biblia sacra},
   edition = {5},
   location = {Stuttgart},
   publisher = {Deutsche Bibelgesellschaft},
   year = {2007},
   langid = {latin},
   keywords = {primary}
}
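
Rather than keeping such notes by hand, one could let a small script list the affected entries. The following is a rough sketch only: the set of fields to flag is an assumption to be extended as needed, the function name is hypothetical, and the regular expressions expect one field per line, as in the entry above. It is not a full BibTeX parser.

```python
import re

# Fields assumed to need manual follow-up in the docx; extend to taste.
UNSUPPORTED_FIELDS = {"editora", "editoratype"}

# "@book{key," at the start of an entry, and "name =" at the start of a line.
ENTRY_START = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
FIELD_NAME = re.compile(r"^\s*(\w+)\s*=", re.MULTILINE)


def entries_needing_attention(bib_text, fields=UNSUPPORTED_FIELDS):
    """Map each citation key to the sorted list of flagged fields it uses."""
    flagged = {}
    starts = list(ENTRY_START.finditer(bib_text))
    for i, match in enumerate(starts):
        # An entry body runs from its key to the start of the next entry.
        end = starts[i + 1].start() if i + 1 < len(starts) else len(bib_text)
        body = bib_text[match.end():end]
        used = {name.lower() for name in FIELD_NAME.findall(body)} & set(fields)
        if used:
            flagged[match.group(2)] = sorted(used)
    return flagged


sample = """
@book{web07bib,
   editor = {Robert Weber},
   editora = {Roger Gryson},
   editoratype = {reviser},
   title = {Biblia sacra iuxta vulgatam versionem},
   year = {2007}
}
"""
print(entries_needing_attention(sample))
```

Run against the full bibliography file, this yields a checklist of citation keys to revisit at the docx stage.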

Titles are sometimes not reproduced on subsequent occurrence, yielding two commas separated by a space. It appears not to matter whether the citations processor calls title, shorttitle, label, or shorthand in these cases, and playing with the citation command itself seems to have no effect.
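
Because the symptom is so regular, it lends itself to an automated search. The sketch below scans plain text for a comma followed by whitespace and another comma. The function name is hypothetical and the sample notes are made up for illustration; they are not real output from any citation style.

```python
import re

# A comma, some whitespace, then another comma: the gap left by a missing title.
EMPTY_SLOT = re.compile(r",\s+,")


def find_empty_slots(text, context=30):
    """Return (line number, snippet) pairs for each suspicious gap."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in EMPTY_SLOT.finditer(line):
            start = max(match.start() - context, 0)
            snippet = line[start:match.end() + context].strip()
            hits.append((number, snippet))
    return hits


# Made-up footnote text, for illustration only.
sample = "First note: Weber, Biblia sacra, p. 12.\nLater note: Weber, , p. 14."
for number, snippet in find_empty_slots(sample):
    print(number, snippet)
```

The input could be a plain-text export of the converted document, for instance one produced with pandoc draft.docx -t plain -o draft.txt, assuming the footnotes survive that round trip.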

The upshot of all this is that I note down any issues I spot while reading back what I have written in the PDF I generate for this purpose, but above all I have to go through my citations with a fine-toothed comb once I have moved to the docx stage. A notable downside is that this makes it even more prohibitive to return to the markdown file once I have made the switch to docx. It’s not all bad, though: when working in LaTeX, I would often get lost in the minutiae of formatting and citations when I was supposed to be drafting. Now I can draft freely, stopping only to add items to my bibliography and take occasional notes on citations gone wrong, and I am far less tempted to bugfix during the drafting stage.

posted by paul on 7 oct mmxvii at 16:20 EST