Information technology, applied.

On Documentation

I have been learning how to use open source applications such as Ansible, git, iptables, FreeSwitch, and many others. With few exceptions they take hours and often days to learn properly. The reason they take so long, even for experienced technology professionals, is the almost universally poor quality of their documentation. This is a known problem with engineers and I see it at work as well. As an engineer myself, I have to pay attention to avoid this pitfall.

I do not know exactly why engineers are so bad at writing documentation. I think it is caused by a few things: a) they don’t know how to write good documentation, b) they prefer to write code instead, and c) they are in too much of a hurry. What they fail to see is that documentation is just as much a part of developing good software as coding. The best and most clever code in the world is useless unless documented. You won’t even be able to use your own code six months from now unless you document it!

Here are some examples of poor documentation and some guidelines for how to do better. I picked a few of these because I have experience with them and because I think the software itself is quite good.

FreeSwitch

I really like FreeSwitch but not its documentation. Projects that rely on wikis usually have poor documentation. Group collaboration is a good thing, but it always, always, needs editing. The FreeSwitch wiki documentation does not clearly define its terms, it does not provide step-by-step instructions for common tasks, it does not provide the level of precision and specificity required for input parameters, it does not provide up-to-date explanations of features, and it does not provide useful and up-to-date examples.

Here is a quote that explains what the term “domain” means in FreeSwitch:

The domains inside the xml registry are completely different from the domains on the internet and again completely different from domains in sip packets. The profiles are again entirely different from any of the above. Its up to you to align them if you so choose. […] When you want to detach from this behavior, you are probably on a venture to do some kind of multi-home setup.

Got that? In fairness to the author, who is the founder of FreeSwitch, he is obviously a talented programmer and has contributed something of real value to the world. My intent is not to criticize but to point out that poor documentation is like a boat anchor; it holds a project back and irritates users.

libvirt

This project provides an API to help administrators and developers manage virtual machines. The problem with its documentation is that it jumps immediately into details with no context. The first thing below the (empty) documentation start page is instructions for compiling. Why would I want to compile it if I don’t even know if this is the product I need? This documentation lacks a task-centric framework, just like the FreeSwitch docs.

The libvirt documentation makes the all too common mistake of a deep dive into the “Architecture” along with a lot of theory. It does this again under “Internals”. As someone who simply wants to use this library to accomplish a few tasks, I really do not want to learn any of this. None. This might be the best architected library in all of open source but I really don’t care. I have 12 different libraries to learn and use, and simply don’t have the time to wade through all of this superfluity.

And this is the real problem. Documentation like this is not user friendly; it does not approach things from the perspective of the user. Poor documentation is always from the perspective of the developer or the expert. I have many, many examples of the same problems but I will leave it at these two.

Here are a few ways to make better documentation:

Good documentation is just as important as well-written code. Only amateurs think otherwise.

Put yourself in the shoes of your user. Your user is busy, probably overwhelmed and stressed, and just wants to accomplish a few critical tasks using your software. Don’t make this person wade through pages of outdated wiki, pointless detail, and theory to find what they need.

Organize your documentation like this:

What will this software allow you to do and why should anyone use it?

What features does it provide?

For each feature, provide a step-by-step guide for accomplishing some task within that feature. Arrange these guides from the general to the specific. Use hyperlinks to connect them.

Add a reference section that defines all parameters in exact, precise, and exhaustive detail. Don’t say “enter an address”. Say, “Enter an IPv4 address in CIDR notation.” Include a BNF grammar for all commands. Include an XML schema for XML documents. Remember that we are dealing with computers! They don’t like guesses.
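To show how much the precise wording buys you, here is a minimal sketch of “an IPv4 address in CIDR notation” written as an executable check. It uses Python’s standard `ipaddress` module; the function name `parse_ipv4_cidr` is my own invention for illustration, not part of any of the projects discussed here.

```python
import ipaddress


def parse_ipv4_cidr(text):
    """Parse input such as '192.168.1.10/24' and return an IPv4Interface.

    Raises ValueError when the input is not an IPv4 address followed by
    a slash and a numeric prefix length.
    """
    address, slash, prefix = text.partition("/")
    if not slash or not prefix.isdigit():
        # ipaddress on its own would accept a bare address (assuming /32)
        # or a dotted netmask; the reference text says CIDR, so be strict.
        raise ValueError(
            f"expected ADDRESS/PREFIX, for example 192.168.1.10/24, got {text!r}"
        )
    # Raises a ValueError subclass for a bad address or a prefix above 32.
    return ipaddress.IPv4Interface(text)
```

With “enter an address” none of these rejections could be written down, because nobody would know which inputs are supposed to fail.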

Provide a table of contents that lets users quickly find the task they want to accomplish. Do not expect someone to read the documentation from cover to cover. In fact, expect them to read the absolute minimum they possibly can. Help them do this.

Use standard computer science terms. Do not overload terms like FreeSwitch does with “domain”. Do not invent new labels for standard things. Avoid cute terms such as “recipes”, which OpsCode uses in Chef; “rules” or “scripts” would have been clearer.

Always use plain language and avoid jargon where possible. This is good for everyone and greatly helps non-native speakers.

As an example, the libvirt documentation could be organized like this. (I almost certainly have the technical details wrong, but it’s the structure that counts.) As a user, I want to accomplish the task “Install a VM on a XenServer host.” This is my starting point in the documentation. I read in the introduction that libvirt has this feature, so now I go directly to the section of the documentation for my task.

This task might list these steps:

Ensure that libvirt is installed on the client and on a supported target server. (This is a hyperlink to the task of installation; “supported” can link to a list of supported hypervisors.)

Test that the libvirt client has SSH access to the target server. (hyperlink to steps for enabling and testing this)

Test that libvirtd is running on the target server. (link to steps)

Create an XML file with the parameters of the VM.

Edit the file. Save it with any name and the extension “.xml”. (link to example and reference)

Validate the XML file (link to steps how to do this)

Pass this file to the libvirt command shell. (link to steps, example, etc.)

On error, troubleshoot. (link to errors list and how to correct them)

Test that the VM was properly created (link to how to do this)
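The “Validate the XML file” step above is exactly the kind of step that deserves a concrete example in the docs. Here is a rough sketch using Python’s standard `xml.etree.ElementTree`. It checks only that the file is well-formed and that a few required elements exist. The function name `check_vm_definition` and the element names `name` and `memory` are assumptions I made for illustration; a real guide would take them from the reference section or, better, validate against the published schema.

```python
import xml.etree.ElementTree as ET


def check_vm_definition(xml_text, required=("name", "memory")):
    """Return a list of problems found in a VM definition.

    An empty list means the text is well-formed XML and every element
    named in `required` is present directly under the root element.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]

    problems = []
    for tag in required:
        # Compare with None: an element without children is falsy.
        if root.find(tag) is None:
            problems.append(f"missing required element <{tag}>")
    return problems
```

Returning a list of problems rather than raising on the first one lets the guide point each message at the matching entry in the troubleshooting section.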

The basic pattern is preconditions, steps, and post-conditions. Explain only as much theory and terminology as needed to perform the task. Hyperlink terms to their definitions. Provide lots of examples. Offer ways to test whether each step succeeded. Offer troubleshooting steps for errors that could occur.

Now that I think about it, these steps should look like an Ansible playbook or Chef recipe. Hopefully, one day we will use scripts instead of manual steps. The scripts could become the documentation!
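Here is a small sketch of what “the script is the documentation” could look like, using the preconditions, steps, and post-conditions pattern. It is plain Python rather than an Ansible playbook or Chef recipe, and everything in it (the `Task` class, its fields, the placeholder actions) is hypothetical. The point is only that one definition can be both printed as a guide and executed.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# An item is a human-readable description plus a callable that returns
# True on success.
Item = Tuple[str, Callable[[], bool]]


@dataclass
class Task:
    title: str
    preconditions: List[Item] = field(default_factory=list)
    steps: List[Item] = field(default_factory=list)
    postconditions: List[Item] = field(default_factory=list)

    def phases(self):
        """Return the three phases in the order they are read and run."""
        return [
            ("Preconditions", self.preconditions),
            ("Steps", self.steps),
            ("Post-conditions", self.postconditions),
        ]

    def render(self) -> str:
        """Return the task as plain-text documentation."""
        lines = [self.title]
        for heading, items in self.phases():
            lines.append(f"{heading}:")
            for number, (description, _) in enumerate(items, start=1):
                lines.append(f"  {number}. {description}")
        return "\n".join(lines)

    def run(self) -> Tuple[bool, str]:
        """Run every item in order and stop at the first one that fails."""
        for heading, items in self.phases():
            for description, action in items:
                if not action():
                    return False, f"{heading}: {description}"
        return True, "all checks passed"


# Placeholder actions: real ones would call out to the actual tools.
task = Task(
    title="Install a VM on a host",
    preconditions=[("Client has SSH access to the target server", lambda: True)],
    steps=[("Pass the XML file to the command shell", lambda: True)],
    postconditions=[("The VM was created", lambda: True)],
)
print(task.render())
ok, detail = task.run()
```

When a run fails, `detail` names the phase and the item that failed, which is the natural place to link to the troubleshooting section.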