20.20 Paper template

See markdown kitchen sink

The paper development process is based on markdown, git, Docker, pandoc, and CSL. The goal is to support collaborative writing effectively by building on powerful open source tools and to produce documents for submission efficiently.

Start your manuscript development by using the Markdown template. This template provides a structured format for your initial drafts, helping to organize content efficiently.

Setup

This section explains how to install the environment and build the paper. Git, Docker and make are required. The following steps describe how to

  1. Set up the repository (create a new one or contribute to an existing one),
  2. Set up the paper production pipeline, and
  3. Build the paper

The manuscript template creates the following files (additional files for the analysis can be added):

├── README.md     <- Readme containing summaries of the paper's revisions (completed
│                    and ongoing). Revision sheets are created from this document.
├── paper.md      <- Contains the paper
├── Makefile      <- Contains instructions to build the paper (as pdf or docx),
│                    and run the analyses (if a Makefile exists in the analysis
│                    directory)
├── Dockerfile    <- Describes the pandoc container for building the paper
│                    (called by the Makefile)  
├── figures       <- Directory contains all conceptual figures
├── output        <- Directory contains all output, such as data plots and
│                    tables, which is generated by the scripts (do not store
│                    anything else that is not generated by the scripts in this
│                    directory)
└── .gitignore    <- Contains a list of files that are excluded from git
                     versioning (e.g., paper.pdf, which would create conflicts
                     when merging changes)

Templates (*.docx, *.tex) and styles (*.csl) are versioned in the paper repository. To set this up, use the commented-out code at the beginning of the Makefile.

1: Set up the repository

Option 1: Create a new repository

  • Select a short and pronounceable title for the paper
  • Do not refer to papers by their target journal (e.g., “the Journal of Information Technology paper”)
  • Use labot to set up the paper:
labot paper --init
Manual setup
git clone git@github.com:digital-work-lab/paper-template.git
# MANUALLY rename the folder using a short project title, then change into it
# (all following commands must run inside the project folder)
# remove the .git directory containing older versions
rm -rf .git
# repo setup:
git init
# MANUALLY create paper (update titles etc.)
# -p avoids an error for directories that already exist
mkdir -p analysis data figures output
pre-commit install
cd .git/hooks
cp ../../post-xxx-sample.txt post-checkout
cp post-checkout post-merge
cp post-checkout post-commit
rm ../../post-xxx-sample.txt
cd ../..
git add .
git commit -m 'initial commit'
make pdf
# connect to git remote
# MANUALLY update url in the following line
git remote add origin https://github.com/....
git branch -M main
git push -u origin main
# MANUALLY invite coauthors/provide access to the remote repository
# git clone template-repository
# git clone https://github.com/citation-style-language/styles
# MANUALLY symlink the templates and styles repos
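
The symlink step in the last comment can be sketched as follows. This is a minimal sketch, assuming the templates and styles repositories were cloned next to the paper repository; the variable names `TEMPLATES_DIR` and `STYLES_DIR` and the paths `../templates` and `../styles` are hypothetical and should be adapted to the actual locations.

```shell
# Hypothetical layout: shared clones live next to the paper repository.
# TEMPLATES_DIR and STYLES_DIR are assumed names, not part of the template.
TEMPLATES_DIR="${TEMPLATES_DIR:-../templates}"
STYLES_DIR="${STYLES_DIR:-../styles}"

# ln -s creates a symbolic link; -f replaces an existing link, and -n avoids
# following an existing link that points to a directory.
ln -sfn "$TEMPLATES_DIR" templates
ln -sfn "$STYLES_DIR" styles

# Show the links themselves (not the directories they point to).
ls -ld templates styles
```

Run the commands from the root of the paper repository so that the relative link targets resolve correctly.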

Option 2: Contribute to an existing repository

# clone the repository
git clone https://...
# change into the cloned folder before installing the pre-commit hooks
pre-commit install

2: Set up the paper production pipeline

# Build the Docker container
docker build -t pandoc_dockerfile .
# MANUALLY select a citation style by updating the link at the beginning of the Makefile
# MANUALLY select templates for docx/tex by updating the link at the beginning of the Makefile
  • Citation styles can be selected here
  • Templates are available here

3: Build the paper

make pdf
make docx
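
To give a rough idea of what these targets wrap, the following sketch prints (but does not run) a pandoc call inside the container. It is an illustration under assumptions, not the template's actual Makefile: the file names `references.bib`, `style.csl`, and `template.docx`, the `/data` mount point, and the helper `build_cmd` are hypothetical, and whether `pandoc` must be named explicitly depends on the image's entrypoint.

```shell
# Sketch only: prints the kind of command a target such as "make docx" might
# wrap. All file names below are hypothetical placeholders.
build_cmd() {
  format="$1"   # pdf or docx

  # Run pandoc in the container built in step 2, with the repository mounted.
  printf '%s ' docker run --rm -v "$PWD:/data" -w /data pandoc_dockerfile \
    pandoc paper.md --citeproc \
    --bibliography=references.bib --csl=style.csl

  # A docx reference document only applies to docx output.
  if [ "$format" = docx ]; then
    printf '%s ' --reference-doc=template.docx
  fi

  printf '%s\n' "--output=paper.$format"
}

build_cmd docx
```

Comparing the printed command with the rules in the Makefile is a quick way to see which citation style and template the build currently uses.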

Edit the paper

The paper is written in markdown. Useful guides on markdown are available online (1, 2). Several editors are available for markdown documents, for example:

The bibliography is stored as a bib file and versioned by git. It can be edited using tools like JabRef. It is recommended to configure save actions that store titles in sentence case, normalize person names in author fields, and normalize page numbers. Check changes to the bib file before committing, and add {} to protect capitalization where necessary (e.g., {J}ab{R}ef).

  • Begin each sentence on a new line to make the history more readable and to make merging easier (see semantic line breaks).
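
As a starting point for converting existing paragraphs, a naive helper can put each sentence on its own line. This is a minimal sketch: the function name `split_sentences` is hypothetical, and the rule (break after `.`, `?`, or `!` followed by spaces) also breaks after abbreviations such as "et al. ", so the result should be reviewed before committing.

```shell
# Naive sentence splitter: inserts a line break after ".", "?" or "!" when
# followed by one or more spaces. Reads files given as arguments, or stdin.
split_sentences() {
  awk '{
    gsub(/\. +/, ".\n")
    gsub(/\? +/, "?\n")
    gsub(/! +/, "!\n")
    print
  }' "$@"
}

printf 'First sentence. Second one? Third.\n' | split_sentences
# prints:
# First sentence.
# Second one?
# Third.
```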

Formatting the bibliography (*.bib file)

  • Title field: use sentence case (e.g., “On the origin of species”, not “On the Origin of Species”). CSL styles that require title case will use an automatic title-case conversion (1).

  • Journal field: use the long version (e.g., “Journal of Information Technology”, not “JIT”). For CSL styles that require abbreviations, temporarily replace the journal field in the bib file, since CSL does not handle style-specific journal abbreviations (1). Alternatively, JSON files containing journal abbreviations can be passed to pandoc using the citation-abbreviations metadata field.

  • Inproceedings: when using crossrefs (e.g., crossref = {icis2010}) for conference papers, do not set the year and month fields. Only the original @Proceedings entry should include the date, formatted as follows: date = {2015-12-13/2015-12-16}.
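
The conventions above can be illustrated with two small sample files. This is a sketch with made-up data: the file names `example.bib` and `abbreviations.json`, the keys `Doe2015` and `icis2015`, the author, and the paper title are all hypothetical, and the abbreviation shown is only an example value.

```shell
# Sample bib file: sentence-case title, braces protecting capitals, and a
# crossref to a proceedings entry that carries the date.
cat > example.bib <<'EOF'
@InProceedings{Doe2015,
  author   = {Doe, Jane},
  title    = {A hypothetical study of {J}ab{R}ef save actions},
  crossref = {icis2015},
}

@Proceedings{icis2015,
  title = {Proceedings of the {I}nternational {C}onference on {I}nformation {S}ystems},
  date  = {2015-12-13/2015-12-16},
}
EOF

# Sample abbreviation list in the JSON layout that pandoc expects for
# citation-abbreviations (long journal name mapped to its abbreviation).
cat > abbreviations.json <<'EOF'
{
  "default": {
    "container-title": {
      "Journal of Information Technology": "J. Inf. Technol."
    }
  }
}
EOF
```

The JSON file can then be referenced from the document metadata (`citation-abbreviations: abbreviations.json`), so the journal field in the bib file keeps the long name.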

Principles

This section summarizes underlying architectural principles and considerations.

  • Minimal functionality: the most frequent use cases (writing, citing, creating submission files in docx/latex format with different templates and citation styles) must be supported in a highly efficient way. More complex functionality (such as tables with merged cells and detailed formatting) is used less frequently, so it costs little overall effort even when it is not supported efficiently. Leaving it out may also create fewer problems (e.g., when merging complex tables, as with table comparisons in Word).

Advantages of the markdown/pandoc workflow:

  • Separation of content and format
  • Collaboration and versioning of text with git
  • Relying on active open source projects such as the CSL standard
  • Flexibility regarding the bibliography type (pandoc supports .bib, .enl, …)
  • Combined writing and comments in the same document
  • Reproducibility across platforms
  • Generating docx/tex/pdf as output

Since the underlying open source projects (especially CSL, pandoc, pandoc-citeproc) evolve constantly, it is recommended to work with the latest versions. In this respect, the popular manubot project and pandoc scholar are limited: they work with old versions of pandoc, which may be hard to update.

Possible extensions (lua-filters): bibexport, abstract-to-meta, spellcheck, wordcount.

Resources