The collected works of Lana Brindley: writer, speaker, blogger

Tag Archives: documentation

This Summit was a homecoming of sorts. OpenStack started in Austin with 750 people, and returned six years and twelve conferences later with 7500 people. Even the baristas in the downtown coffee shops noticed us the second time around.

For documentation, this conference was bigger than usual as well. We had a total of eight sessions, in addition to the contributor meetup on the last day, which is more docs sessions than we have ever had before.

And we had a lot to talk about! The biggest thing on our minds was the future of the OpenStack Installation Guide. The Big Tent has changed the way that projects go about joining the OpenStack ecosystem, and with the Foundation having an increased focus on ensuring new projects have sufficient documentation, we needed to change our approach to documenting the installation of an OpenStack cloud. There is no ‘right’ way to install a cloud any more, and there is certainly no ‘right’ set of components you should be installing when you do it. But with a small documentation team, and a seemingly endless parade of new components requiring documentation, we were faced with a big technical challenge, where everyone had some kind of skin in the game.

Despite some differences of opinion, the session itself was extremely productive, and we came away with a solid set of deliverables for Newton. First of all, we’re going to create the infrastructure to allow projects to write their own installation docs in their repos, and then publish them seamlessly to the docs.openstack.org front page. This means that projects have responsibility for their own docs, but the docs team will provide assistance in the form of templates and infrastructure support to ensure that all projects are treated as first-class citizens. Secondly, the existing Installation Guide will change focus to be more about an installation tutorial, giving people a highly opinionated and completely manual installation method to learn the ropes, but not to install a production cloud. Thanks to the OpenStack User Survey, we can safely say that most production clouds are installed using some kind of automated tool, so having manual installation instructions is useful as a training tool, but not in a real-world scenario.

With the big question more or less settled, we got on to the fairly long laundry list of other things that needed to be done, which all ended up focusing mostly on streamlining some of our processes, being clearer about the way we operate, consolidating guides that had (for obscure historical reasons) been in their own repos into the main one again, and general editing and tidying up. A full list of the goals can be seen here: Newton Docs Deliverables. And, for historical interest, here’s the whiteboard from the Summit session:

During the Mitaka release, docs had a focus on Manageability, aiming to work more effectively and efficiently, with a focus on collaboration. For Newton, while manageability themes are still very much present, the focus is more on Scalability, and making our documentation efforts scale out to represent a much greater proportion of products, contributors, operators, and users. From empowering projects to write their own documentation with our support, to making our processes simpler to find and understand, to ensuring our documentation is as accurate, up-to-date, and effective as possible, it’s going to be an exciting cycle for docs!

I leave you with one of my favourite Texan big things: a bathtub margarita!

Those of you reading this post, with your laptops, and mobile phones, and iPads, and vanity email accounts, and your single sourced, content-reuse, DITA-compatible DocBook XML toolchains, with all your fancy JavaScript elements and mind-boggling CSS overlays: you are just the latest in a long line of human beings who have been doing the same thing for millennia. Albeit with different tools.

The original owners of the land we are standing on today are the Wurundjeri people. Australian indigenous art is the oldest unbroken tradition of art in the world. This wasn’t just the pre-history version of hanging a Monet print on your loungeroom wall. Indigenous art exists on all manner of things: paintings on leaves, wood and rock carvings, sculptures, and of course cave drawings. This art gave early Australians a way to record the things that mattered most to them in their lives: the works often involve scenes of hunts or special ceremonies. In the case of Australian art, many include megafauna and other extinct species, and even the arrival of European ships. More than a record of events, though, they were probably also a method of teaching. Each indigenous tribe had its own mythology (collectively known in English as ‘the Dreaming’), which used stories to convey morals or other educational information. Most children who grew up in Australia would be familiar with the Dreamtime story about Tiddalik the Frog, a fable about greed and about finding humour in bad situations. Indigenous art and the stories that lie behind it are really just an early technical manual for life itself, especially in a world where living for any length of time could be quite difficult.

Who here remembers the story of Archimedes and his bath? It’s a demonstration of how Archimedes used water displacement to measure the density of an object (in this case, the king’s crown). Of course, the bit we all remember of the story, though, is that Archimedes, having made his discovery in the bath, went running naked through the streets of Syracuse, crying “Eureka! I have found it!”. This story comes to us from one of the oldest surviving technical manuals in existence, the “De architectura” by Marcus Vitruvius Pollio, which was published in around 15BC. Of course, the Ancient Greeks & Romans were well known for their literature, their scholars, their philosophers, and perhaps above all, their library. The Royal Library of Alexandria in Egypt was the largest repository of knowledge in the world between the 3rd century BC and 30BC. The famous fire that destroyed it was probably set by Julius Caesar himself in 48BC, but the library continued in some capacity until the Roman Emperor Aurelian destroyed what remained in about 270AD. This was of course a massive blow to literature, but it was also an incredible loss of technical data. Thankfully, the Ancients managed to keep going even after the library was destroyed, and we now have surviving copies of wonderful pieces like Pliny’s Naturalis Historia, which is essentially the world’s very first Natural History encyclopaedia, and which set the stage for many more technical manuals to come.

Jumping over to Europe, Gutenberg did his thing with the printing press in the mid 1400s, but printed books were still a terrifically rare and expensive thing until well into the 1500s and 1600s. Up until that period, if you were a fairly ordinary person in a fairly ordinary European town, you were probably aware of the existence of almost exactly one book: the Bible that your local clergy had sitting on a plinth in your church. You probably couldn’t read yourself, or if you could, probably not well enough to be able to read and understand a book written predominantly in a particularly stuffy version of Latin, and even if you could read that well, you wouldn’t be allowed to touch it. No, the Bible was the word of God, and as such could only be read and interpreted by men of the cloth. They didn’t really want people going off and reading the Bible on their own and drawing their own conclusions about things. Of course, this got really interesting once the Reformation really started to get underway in the mid 1500s, and people started to read the Bible for themselves. In fact, for a little while there in England, Henry VIII decided that ordinary folk (and all women) were banned from reading the Bible. All this running around reading things and learning by everyday people was just a little too much for him to bear, especially when they started disagreeing with him.

Still in Europe, with better access to mass printing, publishing written versions of early verbal history became the thing to do. We all know the Brothers Grimm were writing fairy tales in German in the early 1800s, but they certainly weren’t the first to try and document the oral history of early Europeans. Charles Perrault is considered the original author of many of the Disney favourites, including Cinderella, Little Red Riding Hood, and Sleeping Beauty, and he was writing in French over a century before the Grimms, in the late 1600s. But even he was just writing down stories he’d heard from others. My favourite version of Cinderella comes from Giambattista Basile, published in Neapolitan in 1634, some years after he died. These stories, gruesome as they were before Disney got a hold of them, were intended in many cases to be fables for children, with a moral story, but were also used as cautionary tales for adults. In Basile’s version of Cinderella, a husband is warned of the horrors of not being picky enough about his second or third wife, and there is a general warning to the household about choosing your housekeeping staff carefully, a warning to parents about treating children fairly, and a warning to young women about being proud. And that’s before we get to the bit Disney likes: “if you’re a good person, good things will happen to you”. Some versions of the story also slam home the opposing moral: “if you’re a bad person, bad things will happen to you”, with both the step-sisters either mutilating their own feet to fit the slipper, having their eyes pecked out by birds at Cinderella’s wedding, or some equally terrible combination. As for other horrifying fairy tales, anyone who has read anything by Hans Christian Andersen will know that they often got worse before they got better. There’s a reason Disney never took on “The Little Match Girl”.

For a long time, what we now know as fairy tales were the easiest and most entertaining way for a largely illiterate population to record and share moral stories and warnings.

A ribbon that runs through all of these is the idea of the master and apprentice. These types of relationships began in Europe in the 1300s, and were a way for a tradesperson to get cheap labour, while a young apprentice got a bed to sleep on, food to eat, and the hope of a trade later on. This system was used throughout England and Europe for all skilled trades: from seamstresses and blacksmiths, to Knights with their squires. However, the general principles of apprenticeships exist throughout the world, with one of the earliest examples being the idea of a Maiko, or a trainee Geisha. Geisha have existed in Japan since around 700, and still take in Maiko to this day. While this isn’t written knowledge, it is an important footnote when we’re discussing the history of content, as this was the main way that specialised technical knowledge was handed down.

Of course, a young apprentice, wishing to remember all the things they had learned, might be inclined to write them down. By the time the Industrial Revolution was in full swing, paper and books had become affordable, schooling was more available to children throughout Europe, and literacy was becoming much more widespread, especially among those bright young apprentices who left home to seek their fortunes. And while young people have written home to their families since ancient times, letter writing really hit its stride around the turn of the century when it became not just a way to record their days and connect with their families, but also a way to explore political and religious matters, and to explore emotions: poison pens, love letters, and obituaries are all well represented in letters. Another form of writing, more like the manuals we know today, is of course the recipe book. Many household cooks would enshrine their recipes in writing, to be handed down to the next generation. I regularly bake a family choc chip biscuit recipe that has been handed down mother to daughter for at least five generations, and possibly quite a few more than that.

But enough history. The older writers in the audience will probably remember most of these more recent forms of technical communication. Some of the more unfortunate among you may still be working with some of them. In that case, I’m sorry.

Printed books are pretty much all of our yesterdays. In some ways, it still feels as though you’re not a REAL writer until you’ve got your name on the outside of an actual book, made out of dead tree, and sent from some printer. I chose a picture of O’Reilly books on purpose, as OpenStack released yet another of our manuals as an O’Reilly dead tree version last year, although we have no immediate plans to repeat that. Personally, I’m part of the problem here. I love having dead tree reference books, especially for things like Style Guides, which are somehow easier to have sitting on my desk as I write, rather than relying on an internet search (which can, for me at least, be very distracting. Hello, Twitter!). As for writing them, though? No, I love the idea of being able to catch and fix errors even after publication. Nevertheless, printed books, especially technical manuals, are our history, our present and, to some extent at least, probably also our future.


A close cousin of the printed manual, whitepapers are caught somewhere between marketing material and technical documentation. In digital form, they are probably not going to go away any time soon, but the printed whitepaper has almost certainly been consigned to the recycling bin these days. My very first piece of technical writing was a whitepaper. I had a Marketing undergraduate degree and half an MBA, so it was a fairly logical piece of work for me to be doing at the time. I enjoyed it immensely, and immediately set out to become the whitepaper expert, intending to build a career around it. Thank goodness I discovered technical manuals in the meantime, and was saved from a life of writing whitepapers!

And, finally in the ‘recent’ category, I have a screenshot from my very own project. This is, for all intents and purposes, an online version of a printed ‘book’. It has a table of contents down the side, divided into chapters and sections, and it’s designed to be read from beginning to end: simple concepts at the beginning, more complicated procedures as you move through, with reference information (tables of data, contact details, and a glossary) at the end.

These have all been great methods of getting information out there, but they are all destined to become as archaic as the fairy tales and the cave paintings we discussed earlier. Let’s take a look at those things we’re doing a little differently today, that will drive the way we revolutionise and improve content management in the future.

First of all, I want to briefly touch on MOOCs. These are the future of face-to-face training courses. MOOCs not only allow people all around the world to study when and where they choose, but they also allow institutions to create online tools that mimic real world scenarios, and allow students to learn real skills in a safe environment. This is great especially for the tech industry, where students can work on realistic IT setups that they might not be able to recreate in their own environments, but it also works well for teaching other knowledge work skills such as customer service and financial skills.

The main thing that, I think, changed the way we looked at the information we were creating, was DITA. Of course, DITA isn’t new. It was named in 2001, and formalised in 2005, but varying groups have been working on data mapping and the like since the 60s and 70s, and it became especially popular in the 90s, with the publication of JoAnn Hackos’s book ‘Managing Your Documentation Projects’ (and later ‘Information Development’), a book probably most of us have on our shelves, and to which I (at least) still refer regularly. DITA was really the first formal, open standard that let us consistently and accurately categorise data into formal types. And it was simple enough that we could all use it, remember it, and above all teach it to others easily. Even if you’re not using a specific DITA tool, the general principles of DITA (splitting content into one of only three topic types: concept, task, and reference) could be used to underpin any tooling system.

Of course, the main driving principle behind DITA (besides the categorisation) is about content reuse and single sourcing. This is another key component of how we’re changing the way we look at content. It’s not about a beginning and an end any more. With this idea, we walked away from the age old idea of delivering a story, and moved towards this critical period of considering what information is required where, and when. This was important mostly because we were actually starting to consider how people consume information, and learned difficult concepts. We no longer assumed that information we gave to people in the beginning of a book stuck with them as they moved through the rest of the content. Sometimes, learners needed to go over information again and again before they actually learned it and could apply that information to later, more complicated, tasks. And, being the inherently lazy writers that we are, we didn’t want to retype that every time. So single sourcing and content reuse were naturally very easy for us to adopt.
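To make the idea of single sourcing a little more concrete, here is a minimal sketch in Python. It is not how any particular DITA tool works: the `Topic` class, the `build_document` function, and all of the sample content are invented purely for illustration. The point is simply that each topic is written once, tagged with one of the three types, and then pulled into as many documents as need it.

```python
from dataclasses import dataclass

# The three DITA-style topic types (the names used here are illustrative).
TOPIC_KINDS = {"concept", "task", "reference"}


@dataclass(frozen=True)
class Topic:
    """A single reusable chunk of content (hypothetical structure)."""
    topic_id: str
    kind: str
    body: str

    def __post_init__(self):
        # Reject anything that is not one of the three topic types.
        if self.kind not in TOPIC_KINDS:
            raise ValueError(f"unknown topic kind: {self.kind}")


def build_document(title, topic_ids, library):
    """Assemble a document by looking up shared topics by id."""
    parts = [title]
    for topic_id in topic_ids:
        topic = library[topic_id]
        parts.append(f"[{topic.kind}] {topic.body}")
    return "\n".join(parts)


# Each topic is written exactly once (sample content is made up).
library = {
    topic.topic_id: topic
    for topic in [
        Topic("what-is-a-package", "concept",
              "A package bundles software for installation."),
        Topic("install-package", "task",
              "Download the package, then run the installer."),
        Topic("package-settings", "reference",
              "A table of the available settings."),
    ]
}

# Two different guides reuse the same concept topic without retyping it.
install_guide = build_document(
    "Installation Guide", ["what-is-a-package", "install-package"], library)
admin_guide = build_document(
    "Administration Guide", ["what-is-a-package", "package-settings"], library)

print(install_guide)
print(admin_guide)
```

If the concept topic needs correcting, it is fixed in one place and both guides pick up the change the next time they are built, which is exactly the laziness we were after.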

And that leads me to perhaps my favourite topic right now: every page is page one. This is a model designed by Mark Baker, and while his model is certainly not the only one out there, it’s certainly one of the best developed. The general idea behind this is that no piece of content is more or less important than any other. It’s not quite DITA, in that a ‘page’ in EPPO terms is much bigger than a ‘topic’ in DITA terms. The best example comes from Baker himself, where he refers to a recipe. A recipe contains, in DITA terms, a concept (some information about the recipe, that describes what you’re actually creating, and maybe some background, where the recipe has come from, and the types of ingredients that you need), followed by a procedure (the actual steps of the recipe), and finished with reference information (serving suggestions, maybe information on converting measurements, or ingredient substitutions). In EPPO, the entire recipe is the ‘page’: it contains everything you need to be able to perform the task, including all that concept and reference info. One of the best ways to think about EPPO is in terms of a Wikipedia page, there are links to further information if you need it (and I’m sure all of us here have gotten sidetracked by clicking those links in a Wikipedia article!), but that page contains all the specific information about a particular topic. There is no beginning to Wikipedia, and there is most certainly no end.

So this leads me to the big question: what does the future hold for content? I think there are a few main themes we can tease out of our little journey through documentation:
The internet is making things possible that never were before.
Control over content is shifting from those producing it, to those consuming it.
Consumers are used to being able to search vast resources for content, and filtering those results themselves. They don’t want us to tell them what they need to know.

Since well before the birth of Christ, in one form or another, we’ve been writing stories. Now the internet allows people to create their own stories, not just have one told to them. In many ways, this shows a maturation in human development: we’re no longer willing to receive whatever is fed to us, we want to create our own realities, and we have the tools to be able to do that.

But that is a massive challenge–and (I would argue) an opportunity–for technical writers. We get to break new ground, and thankfully we’ve been working on the building blocks of this type of communication for a few decades now. The challenge now is to start delivering documentation in a completely new way, without leaving our organisations, our management, or our more stubborn clients behind. Nobody said breaking new ground would not require effort, or determination. As we shed old ideas, old processes, old technologies, and old systems, there will be people who decry change, and impede our progress. But even if you only manage to implement a small piece of your grand vision, even if all you ever get to do is plant a seed of an idea in someone’s head that maybe–just maybe–there’s a different way to do things, then you have succeeded. After all, every one of the pieces of content I have mentioned here had its detractors, from everyday ‘concerned citizens’, right up to royalty, and the literati.

I mentioned Archimedes earlier, but now I would like to pick a different quote of his: give me a lever and a firm place to stand, and I shall move the world.

Right now, it seems to me that where we could go next is almost infinite. People have always created and consumed content. As long as we continue to put the information out there, and give people the tools to find it, they will continue to do so. We are not at the end of a journey, nor at the beginning of one. We are merely at a step along a very long road. Let’s find out where it leads us.

Day 1 is drawing to a close at linux.conf.au 2015 and we’ve just wrapped the documentation miniconf. There was an interesting mix of talks today, and as the first documentation miniconf at an LCA, it’s given me some great ideas for growing the miniconf in future years.

As for me, after doing the Agile Documentation Lego talk at LCA in Perth in 2014, I felt I needed to give a good follow up show, this time focusing on Every Page is Page One. To do this, I devised a game based on the children’s book “We’re Going on a Bear Hunt”, and using Play-Doh to make it a little more hands on.

Like most people I learn best by doing, so one of the first tasks I set myself when I started working on OpenStack documentation was to get a docs patch in as quickly as possible. In the end, this turned out to be a copyedit on the introduction to the High Availability guide, and it happened on day two, right before I did my induction in Sydney on day three. I had no idea this was a big deal until I got to the Sydney office to find myself being lauded, which was somewhat amusing, and slightly embarrassing.

So the main upside of this is that I am now officially an OpenStack ATC (active technical contributor). The next step from here is to keep making patches of course, along with code reviews and the like and hopefully to continue being useful to the community in this way.

One of the more important things about coming to a new project is working out the workflows. All the formal stuff is documented, of course, but sometimes it’s the common knowledge things that are hardest to pick up; the bits that ‘everyone knows’. I’m still feeling a little daunted by code reviews (what exactly constitutes an acceptable patch? To what extent should I be testing the proposed content?), but after discussions with the core docs contributors I’m starting to get a handle on those. This week has been spent largely as a sponge: just trying to soak everything up. After a day or two off over the weekend, hopefully my brain has made sense of most of the things I’ve learned so far, and next week will be a week of action!

Possibly my favourite conference, linux.conf.au is coming to sunny Perth in January 2014. I’ll be returning to the Haecksen miniconf driver’s seat (check out haecksen.net for more info and the Call for Proposals), and also will be giving a talk myself, called There and Back Again: An Unexpected Journey in Agile Documentation. This is a talk I’ve given a few times already, including at OSDC 2013, so I’m really looking forward to sharing it with the linux.conf.au audience. That, and I’ve never been to Perth before, so yay!

Writing procedures can be much more difficult than you’d think. We see procedures everywhere, so it’s natural to think that we should be able to write one without too much trouble. For that reason, I wanted to take you through some terrible real-life procedures. This is at least partly so we can all have a chuckle at other people’s mistakes, and feel a little bit better about ourselves. But it’s also because it’s a lot easier to find examples of bad procedures than good ones.

With that end in mind, I went through my junk drawer, and pulled out one or two manuals that I had lying around, and I’m going to use them as examples of what not to do as we go along.

The first thing you need to look at is whether you’re documenting a process or a procedure. It’s easy to use these terms interchangeably, but they actually mean different things. The main thing to remember is that a process can contain many procedures. A process gives an overview of tasks: you might need to install the package, configure the package, and then use the package. Overall, that’s a process. Each of those things, though, is a procedure. Procedures are instructions for doing something.
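One way to picture the difference is as a simple data structure. The following is only a sketch in Python, with a made-up `render_process` function and made-up sample steps: a process is a list of procedures, and each procedure has its own steps, whose numbering restarts at 1.

```python
def render_process(process_title, procedures):
    """Render a process: each procedure gets a heading and its own numbered steps."""
    lines = [process_title]
    for heading, steps in procedures:
        lines.append("")
        lines.append(heading)
        # Number the steps within this procedure only, starting again from 1.
        for number, step in enumerate(steps, start=1):
            lines.append(f"{number}. {step}")
    return "\n".join(lines)


# Sample content invented for illustration.
procedures = [
    ("Installing the package",
     ["Download the package.", "Run the installer."]),
    ("Configuring the package",
     ["Open the configuration file.", "Set the options you need."]),
    ("Using the package",
     ["Start the program.", "Open a file."]),
]

output = render_process("Setting up the package", procedures)
print(output)
```

The numbers belong to the steps inside each procedure, not to the procedures themselves.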

Here’s an example of a certain hand-held computer game. As you can see, the instructions for using the stylus are … step 5? Every procedure in this book is numbered. What’s happened here is each procedure in a process has been numbered, rather than each step in a procedure.

So the next thing to worry about is whether you should be using bullets or numbers. This one is a really simple test: is the order important? If the order is important, use numbers. If it’s not, use bullets. Oddly, though, we get this one wrong all the time …

These ones should all be bullets. You don’t need to operate the product from a power source before you remove the unit from the packaging.

Let’s try this one together: Most of these ones should be numbered, the text even tells us that. The ones on the left under “Cutting Tips” are bullets, the order isn’t important, it’s a list of tips. What about at the top under “Starting and Stopping the Trimmer”? This one probably doesn’t matter, I’d be inclined to use numbers, though, mostly because you can’t stop the trimmer unless you’ve already started it.

And just another one, because it’s so easy: The bullets in red are fine, but then we go to numbers in the purple, and then for a little variety we throw in some upper-case letters in green. Bullets would have been fine for all of these.

So the next thing to worry about is whether you’re describing a concept or a task. A concept is a description, it answers the question “What do I need to know?”. A task is an action, it answers the question “What do I need to do?”. As writers, it’s much easier for us to think about things rather than tasks. Users think about tasks, though, not things. Remember the old adage about not needing a drill, but a hole? That’s the essence of this point.

This one just has so much wrong with it that it’s hard to know where to start. Considering we’re talking about concepts and tasks though, let’s start with pulling those out. I’ve marked the concepts in blue, and the tasks in purple. To add insult to injury, we also have numbers where we should have bullets (in red), because this really is such a hodge-podge of information that there’s no way the order is important. Just to round things off, we also have a typo, and a vaguely insulting term about our children (in yellow).

But looking at that brings me nicely to the next point, which is about the level of detail. Make sure you don’t suddenly change depth in the middle of your procedure. If you find yourself doing this, you might actually need to do more than one procedure, or consider whether you’re actually writing a process. This one is best explained by example:

This certainly isn’t the worst example I could have picked, but it’s interesting all the same: a few of the steps here go into detail about some extra function that your product may or may not have (in yellow), while others are as simple as “open the velcro strap” (in blue). We also have process/procedure issues here, with procedures being numbered in order, and steps getting lowercase letters (in red). This is further confused by the photo references typed in red, and both angle brackets *and* square brackets being used. We also have a few stray bullets in one step. And having said all that, I’ll remind you that this is for a pair of boots. Admittedly, slightly more complicated boots than you’re wearing today, probably, but they’re just boots in the end. Also, I’m more than a little disturbed about the idea of “closure and locking of the foot” (in green).

Everyone knows what anthropomorphism is, right? Someone like to explain it? Yep, it’s applying human qualities to non-human things or animals. We do this a lot, especially to animals, but we also tend to do it to computers a lot.

I went online to find these ones, since I didn’t have any good examples in my stack of manuals. It seems to be something we do almost exclusively to computers rather than appliances, but we *really* do it a lot.

I’ll give you a pro tip: computers don’t actually *think*. They might display things, they might take a while to process commands, but they definitely do not think.

Have to say, though, that going through manuals looking for anthropomorphism does make this one sound slightly creepier than the author intended …

Which brings me to one of my favourite words, and it should be one of your favourites too: parallelism. When you’re writing fiction, you don’t want every paragraph or sentence to start with “Then”. When you’re writing procedures, though, it’s a good thing to have each step start with “click” or “type” or something like that. When you mix it up, it might sound more interesting, but it just becomes confusing. When faced with two statements that seem to be saying different things, users often think you want them to be doing something different. Every step should start with an action, and the same action should use the same verb. Use “click” for a mouse click, “type” for typing on the keyboard, “press” for a hardware button, etc.
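If you wanted to check this mechanically, a very rough sketch might look like the following Python. The `check_steps` function and the list of approved verbs are assumptions made up for this example (the verbs are just the three mentioned above), and a real style checker would need to be much smarter about language than looking at the first word.

```python
# Verbs each step is allowed to start with (illustrative list only).
APPROVED_VERBS = {"click", "type", "press"}


def first_word(step):
    """Return the first word of a step, lowercased, without punctuation."""
    words = step.split()
    return words[0].strip(".,:;").lower() if words else ""


def check_steps(steps, approved=APPROVED_VERBS):
    """Return (step number, first word) for every step that breaks the pattern."""
    problems = []
    for number, step in enumerate(steps, start=1):
        word = first_word(step)
        if word not in approved:
            problems.append((number, word))
    return problems


# Sample steps invented for illustration.
steps = [
    "Click File.",
    "Type a name for the document.",
    "The Save button should then be pressed.",
]

print(check_steps(steps))
```

Here the third step is flagged because it starts with “The” rather than an action. Rewriting it as “Press Save.” brings it back into line with the others.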

This manual almost gets it completely right. Three procedures here all need to start with the same three steps. But in one procedure, they write it using different terms. Is “tilting the motor head back” a different action to “raising the motor head”?

So, finally some takeaways:

The main elements of a procedure are:

Main heading (‘ing’ verb)

Concept

Before you begin

Warnings

Procedure sub-heading (infinitive ‘to’ verb)

Numbered steps

Reference info

Related topics

And the things you really need to remember when writing:

Mouse or keyboard, GUI or CLI? Stick to it!

Verb (or location) first

Active voice

Give instructions, not suggestions

Complete sentences

Plain English

I’ve also created a handout with these for you to print and hang up somewhere, which you can download here.

This article was originally given as a public tech talk at Red Hat Brisbane, in September 2012.

The writing industry has a schism. It’s not always obvious. We like to play it down. Some deny its very existence. But one day, you’ll be happily writing away in your new job, safe in the knowledge that you have a good grasp of spelling, comma placement, the use of industry terms and jargon, and can even confidently place a semi-colon in a position of value, when it hits you: are you a prescriptivist or a descriptivist? Suddenly your bubble collapses. The ship you were happily sailing on just moments ago collapses beneath you, and you’re cast away on an ocean of meta-questions. It’s all well and good to understand the basic tenets of grammar, but why do you understand those things, and in what way do you apply them? Are you putting a comma there because it makes historical and logical sense to put it there, or are you doing it just because that’s the way it’s always been done?

Most technical writers, those who have forged their career path through a combination of traditional university-level education in the field and into-the-deep-end, on-the-ground experience, will give you a quick answer: Because the style guide says so.

Others, though, have had a somewhat more bumpy journey. They might have come from other fields such as journalism or creative writing, or they just might be the sort who overthinks this type of thing. Their answer will be more considered. They will describe the history of the word you have chosen, how its spelling and usage has changed over time, how the impact of technology has shaped its use, and how it fits into global trends.

Neither of these answers is wrong, although the two groups have argued (and probably will continue to argue) the point ad infinitum.

Prescriptivism is an easy answer, and the one that allows the writer to get on with things. They think “how do I handle this”, and there will be some guide – Chicago, AP, the Australian Government Style Guide, an internal document – that they can consult. They get their answer, they correct it in their work, they carry on. I have been this person.

Descriptivism is a more time-consuming method. In the absence of a font of all historical grammatical knowledge, it involves discovering historical usage, charting current usage, and predicting future usage. The answer, in many cases, is probably more ‘correct’ (or at least more considered), and the writer will certainly have a very long list of very good reasons for choosing the answer they do. It is the territory of the writer who needs to understand, not just do. I have, also, been this person.

In a recent conversation it was remarked to me that the speaker could not tell if I was in favour of neologisms or not. I argued that I am in favour of new words (and usage) entering the lexicon, but that I also feel it’s important to maintain an historical accuracy within our writing and grammar. It’s a somewhat contrary position to take, I agree. After all, new words and usage will not ever enter the lexicon unless they are used, surely? Dictionaries won’t include words just because they might be used next year. Fair point.

But, just as movies are not all science fiction, writing is not all technical, language is not all written, and audiences are not all university-educated, technically adept, native English speakers between 25 and 35. Writing needs to suit the audience, and different audiences have different expectations. This gives us a startling ability to watch language shift. Words can be used in a completely informal and slang way throughout most fiction and magazines; newspapers have historically been held to a higher standard of formality (although that is changing as the golden age of print takes hold); high-brow magazines and professional journals are expected to be quite formal, along with most technical documentation; and then academic papers hit the highest highs of formal language, with their stiff tone and impenetrable prose. But before new words are taken in at the beginning of that literary river to begin their long journey towards the sea of obscurity, they must first be coined in the spring of neologism, and that occurs in the spoken word, and rarely in print.

Take, as a simple example, the word “hello”. Originally used as an exclamation (and often historically cited as being used in hunts) or a way of gaining someone’s attention, it turned into a more formal greeting after Alexander Graham Bell’s initial plan of having people answer the telephone with ‘Ahoy’ fell through. ‘Hello’ eventually became the accepted standard, leading to call operators being referred to as ‘hello girls’ for decades, and the term itself becoming almost so banal as to not require a definition. It is interesting to note that, although ‘hello’ moved easily from almost-exclusive telephone use into a general greeting, the equivalent term in Japanese, ‘moshi moshi’ (もしもし), has not, remaining a term used on the telephone only. Language is a living thing, and will change constantly, even in the face of criticism, denial, and plain old refusal to change.

Simply put: words begin in spoken slang, are gradually normalised through various print mediums, until time and usage turn them into stiff and ‘correct’ terms to be used until they fade into obscurity. Some fade into obscurity sooner than others, some have an amazing longevity with histories that fade into the fog of time. Technology has a tendency to speed the process up significantly, with terms such as ‘facetime’, ‘diskette’, and ‘webinar’ having had their brief golden age and now (some would say thankfully) dying out again. It is also technology that is to blame for the re-purposing of words such as ‘login’, ‘instantiate’, and ‘friend’. And it is technology, I fear, that has spawned this entire debate between prescriptivism and descriptivism. It might seem strange to us now, where we are encouraged to be one or the other, but I believe that in times past, we have all been a little bit of both.

As a professional word-wrangler it is my job to understand my tools. I would expect a carpenter to fully understand saws: different blades, the angle of the teeth, the size and weight all make a difference to the final product (I imagine. I believe I sawed a piece of wood once, many years ago. These days I value my fingers too much!). I make it a point to understand my tools – words, and the grammar that binds them together – completely. As such, I enjoy taking the time to research the history of words and usage, and work out exactly why it is that I should (or shouldn’t) use them in the way that I do. To me, that’s essential knowledge that I require in order to do my job well.

On the other hand, I have a job to do. I need to get words on paper. The words I set to paper need to be accurate, they need to convey the right message, and they need to be able to be understood by my audience. They also need to be given to my readers when they need them, which means hitting deadlines. And that means that I don’t always have the time to indulge my scholarly side and look up the history of every comma use, or fully analyse whether I should be using “shut down”, “shut-down”, or “shutdown” in this particular instance. So I have to make a call: I spend time researching the important ones, the contentious ones, and the ones that will hopefully lead me to a greater understanding of other words. For the rest, I have my Chicago Manual of Style, our internal Word Usage Guide, and my dictionary. I lay my faith at the feet of the prescriptivists, but make sure I pay my tithe to the descriptivists, because who knows where all this is going to lead?

At Red Hat, we have a content services department that is about sixty people strong. Even though the department is pretty big these days, back when I started with the company, we were still trying to work out the best way to run a successful enterprise-level documentation team. What that means is that I have been involved in some of the big discussions that we have had over time about what processes we needed to get in place in order to allow us to produce the massive amounts of documentation we required as our product offerings grew. As a department, we grew very big, very fast, and our processes needed to be flexible enough to accommodate the large number of new hires we had, and still have, coming in, but robust enough to be valuable and reliable. They also needed to fit in well with the engineering practices in place in the company, and the tools that our development teams use and are familiar with. Of course the other really important factor was that we had to be open. We wanted to use completely open tools to produce our docs, but we also needed to be able to work with community teams, such as the Fedora group.

Like many documentation groups, at Red Hat we use a five-phase waterfall model to produce documentation. It’s based on the ever-popular JoAnn Hackos method: starting with planning, the content specification, then writing and editing, translation and production, and then a retrospective review. At Red Hat at the moment we’re at a place where our development teams are increasingly using Agile-style development models to produce software, and that means the pressure has been on us to develop in a less rigid way than the old waterfall model has been allowing us to do. Also, it’s no secret that the online world is changing, and people now expect to be able to interact with information at a much deeper level than ever before. They don’t want to be presented with static, hard-copy books any more. They want dynamic, interactive, usable, and above all useful documentation.

In order to be able to work out what kind of model we needed to use, we needed to go back to basics. All technology is about solving problems. Back when we were sitting around in caves, we had a problem: there was all this food running around outside, but we didn’t have a way to get it to stop running around, so we invented a club and solved the problem. Since then, we’ve used technology to solve all sorts of problems: horses were sometimes problematic to control, and they didn’t go very fast, so we invented cars. The hard wheels used on early cars weren’t very comfortable, and when they broke they really broke, so we invented pneumatic tyres. We also had problems being able to see in the dark so we invented electric light, being able to go to the toilet when it was raining or cold so we invented indoor plumbing, being able to send messages to people on the other side of the country so we invented the telephone, or on the other side of the world so we invented email.

Even these really technological things that we find ourselves documenting now are all solutions to problems. One of the first things you need to be aware of when you’re writing documentation is what problem your users have. If you can’t describe the problem in one or two sentences, then you don’t understand it well enough, and you need to keep researching. Because if you keep writing without that understanding, all you’re going to end up with is hollow marketing spin. That’s how we end up with documentation that talks about “leveraging synergies”: words that sound great, but have no meaning.

So at Red Hat we came up with a fairly simple model, and that is that documentation needs to be able to be boiled down to three things:

Describing the problem

Solving the problem

Giving any additional information

Anyone who has done any work with DITA would understand that what I’m really talking about here is:

Concept

Task

Reference

So we’ve more or less said that DITA is where we need to go next. But we didn’t want to completely restructure the tools we were using. We have a fairly large people investment in our tools. The main tool we use is Publican, which was developed by an engineer in our Brisbane office. It uses Docbook XML and gives us a command line interface that we can use to create new blank books and apply corporate formatting. It also integrates into our internal packaging system so we can create all these different formats for our books: HTML, PDF, and ePUB on the website, as well as RPM packages and man pages to package in with software. In short, we combine Publican with SVN to give us a complete CMS.

We looked at DITA and DITA-OT, the DITA Open Toolkit. We realised two things: first of all, it would take a significant amount of work for us to bring an open DITA toolchain to the level of maturity and system integration of our existing Docbook toolchain. Secondly, we wouldn’t get the really significant benefits of topic-based authoring without a Component Content Management System – a CMS that manages content at a very granular level. Putting those two things together made it clear that if we changed to DITA all in one hit, it would take us significant time and energy just to get back to where we already were with a mature, complete, open source toolchain. So we decided to take an evolutionary, rather than revolutionary, approach. It’s a much more open source approach: to re-purpose something that you already have, add a script here, a small command-line tool there, release early, release often, and let the user community guide the development, rather than trying to design and implement some grand system in a distant (and expensive) future.

What we needed was something that worked in a similar manner to DITA, gave us content re-use and all that good stuff, but that would work with our existing Docbook XML and Publican tools. The first thing we did was to start creating topics in Docbook, using Docbook syntax, and a command line tool that we called the “Topic Tool”. This was a really simple command line tool that allowed us to write XML snippets (or ‘topics’), and save them in SVN. We used an extensible template model, where the topic tool retrieves a Docbook template from a central repository to match the topic type you specify. That way we can create new topic types, and even modify the Docbook syntax of existing topic types, without changing the tool on users’ machines. That was an important decision, and a major part of the evolutionary “Release, Review, Refine” approach we wanted to use. Over time we did change the Docbook syntax of the basic topic types and create new topic types, validating the open source maxim “plan to throw the first one away”.

The basic workflow with the Topic Tool is like this: you tell the tool which topic type you want, and it will then download the template and prefill some information for you. You can then edit the topic in a text editor, and import it back into the repository. It’s then possible to view your topic from the repo directly, which means anyone can now see it and use it. We then include those snippets in any book we want using an xi:include, build the book as normal with Publican, and voilà, we have a book with content reuse. So that was pretty awesome, and if you read any of our Virtualisation documentation you’ll probably not know it, but that’s all based on topics and maintained using the topic tool.
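To make the xi:include step a little more concrete, here is a minimal sketch using Python's standard library. It is not the Topic Tool or Publican: the topic file names, the snippets, and the in-memory TOPICS “repository” are all invented for illustration, and a real book would pull its topics from files in version control.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical topic repository: file name -> Docbook-style snippet.
TOPICS = {
    "concept-virt.xml": "<section><title>About virtualisation</title><para>What it is.</para></section>",
    "task-create-guest.xml": "<section><title>Creating a guest</title><para>How to do it.</para></section>",
}

# A book that contains no content of its own, only references to topics.
BOOK = """\
<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>Example Guide</title>
  <xi:include href="concept-virt.xml"/>
  <xi:include href="task-create-guest.xml"/>
</book>
"""


def load_topic(href, parse, encoding=None):
    """Loader for ElementInclude: look the topic up in TOPICS instead of on disk."""
    if parse != "xml":
        raise ValueError("this sketch only handles parse='xml'")
    return ET.fromstring(TOPICS[href])


def assemble(book_xml):
    """Parse the book and replace every xi:include with the topic it points to."""
    book = ET.fromstring(book_xml)
    ElementInclude.include(book, loader=load_topic)
    return book


book = assemble(BOOK)
for title in book.iter("title"):
    print(title.text)
```

The same two topics could be included in any number of other books without being copied, which is where the content reuse comes from.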

Of course, once we got to about 300 topics in the topic tool, we started to notice that we had another problem: we were having trouble locating topics within the repository. This made us realise that what we needed was a better way to organise it all, so we wrapped a neat interface around Topic Tool using OpenGrok. OpenGrok is designed for software engineers to search source code repositories, so it worked well for what we were trying to do. This is where the open source ecosystem came into its own all over again – there are a million off-the-shelf components and projects that you can choose from to build your own system. In the end we had a web-based search tool that was pretty basic, but did the job.

Content reuse is an obvious application of topic-based authoring, but by this stage, we’d started to realise something even more exciting. Our definition of a topic is a unit of information with a single subject – that means that it talks about one thing, and one thing only – and that has a single information role: that is, it’s a concept, a task, or a reference. If we gave three topics – a concept, a task, and a reference – to a robot, along with a rule describing the “explain, answer, extra info” pattern, and some kind of graphical template, that robot can assemble those topics into meaningful and useful output for an end user. What we wanted to do was to automate this process.

When humans assemble content into a book, they are making decisions. What aspects of the information are their decisions based on, and what rules are they consciously or unconsciously using? That was what we wanted to create: A system that would allow us to store metadata about a topic, and use rules to automate assembly on a scale that we just couldn’t do by hand-coding.
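As a rough illustration of what “metadata plus rules” can mean, the sketch below stores a subject and an information role against each topic, and uses one rule (concept first, then task, then reference) to assemble them. The Topic class, the field names, and the sample topics are all hypothetical; this is not how our own system is implemented.

```python
from dataclasses import dataclass
from typing import List

# The "explain, answer, extra info" pattern expressed as an ordering rule.
ROLE_ORDER = {"concept": 0, "task": 1, "reference": 2}


@dataclass
class Topic:
    title: str
    subject: str  # the single subject the topic covers
    role: str     # "concept", "task", or "reference"


def assemble(topics: List[Topic], subject: str) -> List[Topic]:
    """Select the topics for one subject and order them by information role."""
    selected = [t for t in topics if t.subject == subject]
    return sorted(selected, key=lambda t: ROLE_ORDER[t.role])


topics = [
    Topic("Guest configuration options", "guests", "reference"),
    Topic("Creating a guest", "guests", "task"),
    Topic("About storage pools", "storage", "concept"),
    Topic("About guests", "guests", "concept"),
]
for topic in assemble(topics, "guests"):
    print(f"{topic.role}: {topic.title}")
```

The point is that once the decision a human would make is written down as a rule, it can be applied to thousands of topics as easily as to three.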

So we developed a system that we call Skynet, which allows us to dynamically sort and locate topics. Select the topics you want, and Skynet will download the code that presents those topics in a consumable way. Of course, we started dreaming big after all this. We’ve started thinking about moving away from the documentation-as-a-book paradigm, and started considering “Documentation 2.0”. Why not include comment fields on our documentation that will allow our reviewers – quality engineers, subject matter experts, editors, and the like – to make comments directly in the book rather than creating a separate list? And why not offer that functionality to our users as well? What if we had the equivalent of a Facebook ‘like’ button? Users could ‘like’ sections that they found useful, or leave comments saying “when I tried to follow these instructions, X happened” or “this seems to be missing a step” or the like. If we break away from the book model, we start to be able to think about documentation as something that our users can interact with. We could have popular topics bubble up to the top of a list, or divide books into audiences, and present the information for each audience differently, giving them a tab to click to see the information in various ways. We could implement something similar to the Amazon “customers who bought this also bought this” and present similar topics to our readers. Using single-sourced content, and content reuse, through a system like Skynet, is going to allow us to move into these more innovative delivery methods.

The team working on the Skynet project have 110% discoverability as one of their goals. To quote the team leader: “the documentation finds you”. In other words, when you’re working on something, and you get stuck, the documentation is there at a click or a glance, ready for you to interact with it. Of course, I’m sure some of you are saying “Help” right now, and yes, I agree with you. That is something else we’re talking about, and something that Skynet will enable us to do. Skynet pushes out XML now, and of course there’s plenty we can do with that as it is, but we can also extend it to push out all manner of things, including Mallard for GNOME Help.

So let’s take this conversation back to processes. All this dreaming is fantastic, but at some point we still have to actually do the hard work. Without a solid process, and a great set of standards, we’re not going to be able to get there. We’re doing a lot of internal testing, and we’re dipping our toes in the water with the topic tool and with Skynet. So far, we’ve been able to slip these into our existing standards, but that’s not going to last for long. With a paradigm shift as big as this, everything is going to have to change, and that includes the way we go about producing our documentation. We need to be organised, we need to make sure what we do is repeatable, and we need to maintain our high standards of quality and accuracy in our documentation. Most of all, though, we need to maintain and even increase our focus on the customer. These changes come about not because we got bored with doing things the old way, but because we believe it’s a better way to serve our audience. Never, ever forget who you’re writing for: it’s those poor sods out there with their problems that they’re trying to solve. Our goal is to give them the tools they need to solve them.

So, to recap:

One of the main things that we have learned is that process is king. If you don’t have a solid process for producing documentation, then you’re going to find yourself floundering at every point along the way. You’re going to end up with documentation that doesn’t cover what it needs to cover, isn’t accurate or well-written, and doesn’t get out on time. Without a plan for how you’re going to tackle the project from end to end, you’re not going to succeed. It’s that simple.

The second thing is about tools. You need to decide ahead of time what tools you are going to need during the project, and make sure you have them ready and up and running before you start. It’s horrible to get halfway through writing and find out that one of your writers doesn’t understand how to use a semi-colon. It’s even worse if you get halfway through and realise that one of your writers doesn’t understand Docbook XML, or whatever authoring tool you’re using.

While we’re talking about tools, it’s important to keep it open everywhere you can. This can seem counter-intuitive to those of you who have worked in big companies, but being open doesn’t mean giving away business secrets, or exposing your competitive advantage. I think Red Hat of all companies really proves that openness can co-exist with secure business practices.

Part of keeping it open is about keeping it real. The people behind your processes, the people doing the actual work day in and day out: they’re real people, with real lives, and real families. You need to be able to work with people, and ensure that the loss of one person isn’t going to make the whole project tumble. The other thing you need to remember is that your readers are real people as well. You need to make sure that you’re giving your readers something useful, and something that they will get value out of.

And finally, I want to remind you about reviews. We all understand the importance of reviewing our writing for correctness, and reviewing our projects to make sure we can learn from our mistakes. You need to extend reviews to the documentation process itself, as well. Never be afraid to change things around. Just because it worked last time doesn’t mean it’s going to work next time. And just because it’s worked in the past, doesn’t mean it’s the best way to do it in the future.

This post was originally a talk given at the Open Help Conference in Cincinnati, Ohio, on 5 June 2011.

Also this year, for the first time, they’re asking for miniconf proposals too. I would love to do a whole miniconf on open source documentation, but I’m not sure I have that kind of stamina. Of course, if you’re interested in helping me out, let me know!

I spoke at OSDC last year, when it was in Melbourne, and the footage is on my videos page. I thoroughly enjoyed the experience, so it will be interesting to see what kind of event Canberra can put on this year.

It looks as though I’ll be winging my way back to the United States again shortly. I’ve been asked to speak at the Open Help Conference in Cincinnati (and one day soon I’ll learn how to spell it!), Ohio on 3-5 June.

If you’re half as dedicated as I am to American candy, get your orders in now. There will only be so much room in the suitcase …

Who wrote all this stuff?

All writing on this blog is the work of Lana Brindley and does not necessarily reflect the view of Rackspace, OpenStack, or any other group with which I am affiliated, except where I have used direct quotes (which are attributed to the original authors where possible). All images were either taken by me, shared under their individual licencing requirements, or obtained as stock photography.