How to ditch Word

December 4, 2012
by Karthik Ram.

I spent an hour this morning polishing up a proposal. This mostly involved running spell-checks, cleaning up tables, and making sure I had added all the right references. That’s when I realized that I haven’t used Microsoft Word to write anything in over six months. How fantastic!

Like everyone else, I’ve been complaining about MS Word since the last ice age but never had a better alternative. When Markdown came around I was smitten, but several things were missing: tables were hard to format, and I still had to get the final text back into Word to insert citations. I’m happy to report that I’ve found great solutions to both of these issues. So here’s a quick how-to (following up on my earlier post) for switching your writing workflow away from Word. There is a small learning curve, but the payoff is wonderful.

Software you’ll need

Pandoc – It’s like the Swiss Army knife of document conversion. Although it’s command-line only, pandoc is easy to use and quickly converts documents between many markup formats, and it can also write PDF and Word files.

Mendeley – A free reference manager. Mendeley is great for two reasons. First, it allows you to collaborate via shared libraries (especially useful when writing with multiple authors). Second, Mendeley can automatically export those libraries to a BibTeX (.bib) file any time you make changes to them. (Update: you are free to use any reference manager. Just export a BibTeX file to the folder containing your writing.)

A markdown editor (optional) – Technically you don’t need any special software to write in markdown. Any text editor will do. However, there are several tools and helpers that make the process easier and more fun. Marked, for example, renders a live preview in one of several built-in styles (or custom ones). If you’re on a Mac, here is a complete roundup of Markdown editors. My favorites are iA Writer and Sublime Text with the SmartMarkdown package. Mou is great for beginners.

knitr (optional) – If you plan to insert data tables, either from a spreadsheet or from summary statistics you compute, knitr will run the code for you in R and insert the output in a pandoc-friendly format (with the help of the pander package). This step isn’t necessary if you don’t require tables. I’ll describe this process in more detail in my next post.

That’s it as far as setup goes.

Writing your document

The markdown syntax is super easy: it takes all of five minutes to learn, and the documents are easily readable even when unparsed (unlike LaTeX). Here’s a quick guide to markdown syntax. Here’s what a simple markdown document looks like:

# Title
some text.
some more text.
## a sub-heading
More text. A [link](http://google.com/).
A figure
![Figure 1: caption](figure.png)

Adding in citations

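Now if you need to cite anything, first add documents to your Mendeley folder or group and have it automatically export a bib file into the same folder as your document (see Mendeley Desktop’s settings). To cite any document, look at the details pane for its citation key. Pandoc’s citation syntax is the key, prefixed with @, inside square brackets (for example `[@smith2010]`, where the key is a made-up one). Then convert the document with pandoc, pointing it at the bib file.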
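As a rough sketch of this step, the Python snippet below shows what a citation looks like in the markdown source and assembles a pandoc call that resolves it against the exported bib file. The file names (`paper.md`, `library.bib`), the citation key `smith2010`, and the helper `pandoc_command` are made up for illustration. The `--citeproc` flag assumes pandoc 2.11 or newer; older releases handled citations differently.

```python
# A sentence in the markdown source citing one entry of the bib file.
# "smith2010" is a hypothetical citation key.
SAMPLE_SENTENCE = "Mixed models are widely used in ecology [@smith2010]."


def pandoc_command(source, output, bibliography):
    """Return the pandoc argument list; nothing is executed here."""
    return [
        "pandoc", source,
        "--standalone",                  # produce a complete document
        "--citeproc",                    # resolve [@key] citations (pandoc >= 2.11)
        "--bibliography", bibliography,  # bib file exported by the reference manager
        "--output", output,              # output format is inferred from the extension
    ]


# Print the command as it would be typed in a terminal.
print(" ".join(pandoc_command("paper.md", "paper.pdf", "library.bib")))
```

Writing a PDF this way also assumes a LaTeX installation that pandoc can find.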

You can create a Makefile for each project and run that instead of typing the pandoc call into your terminal (although the call is super easy to remember once you have used it a few times). That’s really it. You can do a lot more, like adding in results, tables, figures, and equations using MathJax, but I’ll save the more advanced stuff for a future post.

Workflow

When starting any new writing project, I create a new folder with two files (my markdown document and a small script). If this folder doesn’t already sit inside a git repository, I initialize one so my writing is version controlled (to avoid this) from the very beginning. Version control makes it really easy to return the document to any stage, remotely back it up on GitHub (and/or other locations), and edit asynchronously with multiple coauthors (all of which are impossible with Word). When I need the formatted version, I run the script, which:
* Copies in the most current version of the bib file from Mendeley
* Parses my markdown with pandoc using the settings I need (citations, equations, margins) and outputs a PDF (for viewing) and a Word document (for collaborators who still prefer this format).
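The two steps in this list could be scripted roughly as follows. This is a sketch in Python, not the author’s actual script: the function `build`, the file names (`paper.md`, `library.bib`), the one-inch margin, and the `run` parameter (there so the pandoc call can be swapped out when trying the code) are all assumptions. The `--citeproc` flag assumes pandoc 2.11 or newer.

```python
import shutil
import subprocess
from pathlib import Path


def build(project_dir, bib_source, name="paper", run=subprocess.run):
    """Refresh the bib file, then render <name>.md to PDF and Word."""
    project = Path(project_dir)
    bib = project / "library.bib"

    # Step 1: copy in the most current bib file from the reference manager.
    shutil.copy2(bib_source, bib)

    # Step 2: call pandoc once per output format.
    commands = []
    for ext in ("pdf", "docx"):
        cmd = [
            "pandoc", str(project / f"{name}.md"),
            "--standalone",
            "--citeproc",                # pandoc >= 2.11
            "--bibliography", str(bib),
            "--output", str(project / f"{name}.{ext}"),
        ]
        if ext == "pdf":
            # Page margins for LaTeX-based PDF output; the value is an example.
            cmd += ["-V", "geometry:margin=1in"]
        run(cmd, check=True)             # raises if pandoc exits with an error
        commands.append(cmd)
    return commands
```

A call such as `build(".", "/path/to/exported/library.bib")` (the path is a placeholder) would then stand in for the small script described above.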

I’m very interested in the question of how we get people to use tools that allow us to publish online faster and that reduce the processing burden. I like this approach, and I mostly use plain text (Notational Velocity is an amazing plain-text everything bucket).

Getting researchers who are tied into the Wintel model to shift is where the biggest movement of the needle can happen. With that in mind, I think we need to provide them with visually appealing tools that mimic the experience of working with a text editor, and that produce a better writing experience and a better output format.

I don’t know what that is going to look like yet, but I feel that something built on top of WebKit might be the way to go. I’m keeping my eyes open, and if you see anything, shout out.

Just wanted to give a hint, since you mentioned the `pander` package: you can also transform markdown text with embedded R code to pdf/docx/odt etc. with a single function, which handles the evaluation of the R code, renders images to png and tables, linear models etc. to markdown, passes the results to Pandoc, and optionally opens the generated pdf/docx etc. on your computer.

Currently, passing the bibliography file to Pandoc with `Pandoc.brew` is not possible without a quick workaround, but if you would find this useful, I would be very happy to add this feature in the next few days.

Very nice post, and now I’m inspired to make the switch. But I wonder if you have any idea how difficult this workflow might be to implement if you’re working with someone who uses Word *and* track changes?

Hey Matt,
Thanks for the feedback. This is an unfortunate problem that one cannot always avoid. In some of my projects I work with a senior academic or two who prefer Word and track changes over plain text and git. In that case I do two things.

First, I also have pandoc produce a Word document. It’s easy to do by changing the output extension to docx. Next, I have this copied into a shared Dropbox folder, so with each update a Word doc is available to collaborators who prefer it. When I receive a marked-up document, I just open it alongside my markdown file and manually incorporate (or reject) each change. In the associated commit message, I make a note to indicate who suggested the change. It’s a bit cumbersome but not that much more work than accepting the changes directly in Word. This way there is a recorded history of who suggested changes, with a waypoint to revert to later if need be. With track changes alone, once a change is accepted there is no history or record of it anywhere in the document (or associated metadata).
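The docx step described here might look something like the sketch below, which writes the Word version straight into a shared folder. The helper name `export_docx`, the folder and file names, and the `run` parameter are assumptions made for illustration; the only pandoc behaviour relied on is that the output format follows the output file’s extension.

```python
import subprocess
from pathlib import Path


def export_docx(markdown_file, shared_dir, run=subprocess.run):
    """Convert a markdown file to .docx inside a shared folder."""
    source = Path(markdown_file)
    # Same base name, different extension: pandoc picks the writer from it.
    target = Path(shared_dir) / source.with_suffix(".docx").name
    run(["pandoc", str(source), "--standalone", "--output", str(target)],
        check=True)
    return target
```

For instance, `export_docx("paper.md", "/path/to/Dropbox/shared")` (a placeholder path) would leave `paper.docx` in the shared folder for collaborators.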


Thanks ever so much for this useful guide! After hearing about markdown and finding your page, I managed to get it to work for me. I did spend/waste a few days getting it all to run smoothly, but the results were quite rewarding and I think it was still better than writing in LaTeX the entire time. Just handed in my thesis last week.
Thanks also to Sean, who commented above, for the xelatex tip, as I had plenty of symbols in my thesis.