Docx file version control

As a big fan of version control systems, I thought it would be fun to put my writing under version control as well. The files themselves go into a VCS just fine, but a .docx file is a zipped binary package, so the differences in content are not easy to read. To get readable diffs I had to come at the problem a little from left field: I had to provide the same text in a format that a VCS diff could discern and display.

The answer to my problem (well, the first half at least) was Pandoc. Pandoc is a piece of software that converts files from one format to another. Want to convert from HTML to docx? Pandoc does that. Vice versa? Pandoc can handle that too. I personally decided to use Markdown as the format that would be displayed in my version control diffs.

So how to make this magic happen? The first step, of course, is to install Pandoc. Most Linux distributions probably already have it available. As for Windows… sorry, not covering that.

After pandoc has been installed the command to convert is very simple:
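A minimal sketch of that conversion, assuming the Markdown file should land next to the .docx it came from. The function name `docx_to_md` and the file name in the usage line are placeholders of mine; only the `pandoc` options (`--from`, `--to`, `--output`) are the tool's own.

```shell
# Convert one Word document to Markdown, writing the result next to it.
# docx_to_md is a hypothetical helper name used for illustration.
docx_to_md() {
    # ${1%.docx}.md swaps the .docx suffix for .md
    pandoc --from=docx --to=markdown --output="${1%.docx}.md" "$1"
}

# Usage (placeholder file name):
#   docx_to_md story.docx    # writes story.md
```

Spelling out `--from` and `--to` is optional when the file extensions are unambiguous, but it makes the intent obvious when reading the command later.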

My second problem was how to make the conversion continuous so I wouldn’t have to run this command every time. The answer to that is another package called inotify-tools, and more precisely its inotifywait command. inotify-tools provides the tools needed to watch and react to file system changes (like creating new files or editing existing ones). By combining these two tools I can create an automated system that watches my files for changes and converts them automatically. The full script ended up being as follows:
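A sketch of how such a watcher could look, not necessarily the exact script described here. It assumes the script takes a source folder and a target folder as its two arguments, mirrors the source folder layout under the target, and reacts to both `close_write` and `moved_to` events (some editors save by writing a temporary file and renaming it). The function names `target_path`, `convert`, and `watch` are made up for this example.

```shell
#!/usr/bin/env bash
# Watch a source folder and convert every changed .docx file to Markdown.
# Usage: ./watch-docx.sh SOURCE_DIR TARGET_DIR   (script name is hypothetical)

# Map a source .docx path to the matching .md path under the target folder.
target_path() {
    local file="$1" src="${2%/}" dst="${3%/}"
    local rel="${file#"$src"/}"          # path relative to the source folder
    printf '%s/%s.md\n' "$dst" "${rel%.docx}"
}

# Convert one file, creating the target subfolder first if needed.
convert() {
    local out
    out=$(target_path "$1" "$2" "$3")
    mkdir -p "$(dirname "$out")"
    pandoc --from=docx --to=markdown --output="$out" "$1"
}

# Block forever, converting each .docx as soon as it has been written.
watch() {
    local src="${1%/}" dst="${2%/}" file
    inotifywait --monitor --recursive --quiet \
        --event close_write --event moved_to \
        --format '%w%f' "$src" |
    while IFS= read -r file; do
        case "$file" in
            *.docx) convert "$file" "$src" "$dst" ;;
        esac
    done
}

# Start watching only when both folders are given on the command line.
if [ "$#" -eq 2 ]; then
    watch "$1" "$2"
fi
```

The `--format '%w%f'` option makes inotifywait print the full path of each changed file on its own line, which keeps the read loop simple. A failed conversion (for example on an editor's temporary lock file) only prints an error; the loop keeps running.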

All I have to do is run this script with the proper source and target folders, and everything within (recursively) is converted automatically. Afterwards I just commit both files to version control.