APIs & Documentation – Who’s the Tail and Who’s the Dog?

API Docs: The Traditionalist View

I’ve written a lot of APIs in my time (think RESTful-type APIs, not library APIs). Oftentimes, clients don’t have access to the code or don’t care to pore through it, so documentation is important. I’ve generally held the following to be something of a maturity model for API documentation:

No docs – call me and I’ll show you how to use it!

Go to the wiki

The docs are on a special page of the web application hosting the service. This is nice because the docs
are versioned with the app; in fact, they are part of the app.

The docs are auto-generated from the comments, annotations, attributes, or source code

Recent experiences have made me question this maturity model. Indeed, I’m beginning to wonder if deriving the documentation from the code is exactly backwards from the natural order of things. In short, I’ve begun to wonder: Is it better to generate the docs from the code, or test the code through the docs? I’m starting to think that the way we write API documentation based on the code may be a bit of the tail wagging the dog.

Experience 1 – Cracks in the Traditionalist View

On my current project, we’re auto-generating docs using Swagger plugins for Java and using Swagger UI to present them. Swagger – a wire format for exposing docs – is nice. The Java plugins are nice, reading many built-in annotations of the framework we’re using for exposing services. The UI is nice. None of it fits our needs perfectly, but that’s the nature of borrowing code, and, at first, the win of having documentation generated from our code seemed completely worth it.

The belated observation is that, in our noble effort to keep the documentation complete, we’ve made some subtle and insidious design compromises to satisfy the metadata Swagger requires. This isn’t a knock on the Swagger community – the plugins really are very nice. Our Java Swagger plugins use reflection to, for example, describe the structure of the return type of the resource (controller) method representing the response body. Any such technology would have to do the same.

There are a few examples of the design smells we’re noticing, but the big one is that we have an anemic domain model, since our service is essentially just a pass-through to a downstream system that doesn’t have the common courtesy to expose itself over HTTP. Using lists and maps would actually be quite a bit simpler for our scenario. We originally created the domain objects only for serialization and deserialization, but that’s a relatively simple problem to solve with lists and maps. However, we piggy-backed Swagger on top of those domain objects, and now we’re stuck with them.

If we suddenly switched to lists and maps, any use of reflection would fail to adequately describe the shape of the response object, since simple reflection depends on data known at compile time. To dynamically generate documentation through a tool, we would actually have to call our service at runtime and inspect the lists and maps being returned for the keys and values. This isn’t feasible, so we’re stuck (for now) with an anemic domain model.
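To make the problem concrete, here’s a hypothetical sketch (not our actual code) of what that runtime inspection would look like: walking a real response value and inferring a Swagger-style schema from it, since nothing about a plain map’s shape is known statically. All names here are invented for illustration.

```javascript
// Infer a rough Swagger-style schema by inspecting a live value at runtime.
// With typed domain objects, reflection can do this statically; with plain
// lists and maps, you only learn the shape by looking at an actual response.
function inferSchema(value) {
  if (Array.isArray(value)) {
    // Naively describe arrays by the shape of their first element
    return { type: 'array', items: value.length ? inferSchema(value[0]) : {} };
  }
  if (value !== null && typeof value === 'object') {
    const properties = {};
    for (const key of Object.keys(value)) {
      properties[key] = inferSchema(value[key]);
    }
    return { type: 'object', properties };
  }
  return { type: typeof value }; // 'string', 'number', 'boolean'
}

const response = { id: 42, tags: ['a', 'b'], active: true };
console.log(JSON.stringify(inferSchema(response), null, 2));
```

Note the catch: you need a representative response in hand, which means calling the service, which is exactly the “not feasible” part.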

Experience 2 – An Alternative Presents Itself

When I was writing mountebank (the tool we’re using for stubbing TCP and HTTP dependencies), I did what I imagine most open source developers do when they think they’re close to the first release. The sequence goes like this:

Write a bunch of crappy documentation by hand (following the third option on the maturity model – inline with the app)

Take one last look at the code to “polish” it

In a fit of passion, write a bit more code, including complete rewrites of core functionality and substantial enhancements

(Roughly one month later) notice how out-of-date the docs are. Write more code to ease the pain.

Eventually I realized I could ignore the docs no more, but in a desperate effort to continue writing code, I rationalized that I should test the docs to keep them from going stale again. So I added markup to the HTML to give me hooks to test, wrote the code to interpret that markup and assert on the entire JSON, post-processed out the ephemeral bits like timestamps, and verified my docs. Simple enough. Then I wrote the next bit of documentation and re-ran the test. That was when something magical happened.
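The mechanics looked roughly like the following sketch (this is not mountebank’s actual test code; the field names and helpers are illustrative assumptions): replace the ephemeral fields with stable placeholders, then compare the documented JSON against the live response as whole text.

```javascript
// Replace fields that change on every call with a stable placeholder,
// so they don't cause spurious failures when comparing documented
// examples against live responses.
function normalize(json, ephemeralFields) {
  const copy = JSON.parse(JSON.stringify(json));
  for (const field of ephemeralFields) {
    if (field in copy) copy[field] = '<volatile>';
  }
  return copy;
}

// Compare the entire documented response to what the service returned,
// as serialized text rather than some internal representation
function docsMatch(documented, actual) {
  const ephemeral = ['timestamp', 'requestId'];
  return JSON.stringify(normalize(documented, ephemeral)) ===
         JSON.stringify(normalize(actual, ephemeral));
}

const documented = { imposters: [], timestamp: '2014-01-01T00:00:00Z' };
const actual = { imposters: [], timestamp: new Date().toISOString() };
console.log(docsMatch(documented, actual)); // true
```

Comparing the whole serialized structure, rather than cherry-picked fields, is what gives this approach its bug-catching power.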

It failed.

Turns out, I had had this bug in the code for quite some time and never noticed it, despite a fair bit of automated functional testing that examined substantial bits of the (parsed) JSON structure. I fixed it and carried on writing the docs. In the process, I discovered several more bugs, as well as a few “UX” fixes (e.g. reordering fields to make developer comprehension easier when testing over curl). Up to this point, I had not faithfully tested against the full JSON structure, nor had I tested the JSON structure as text rather than as some internal representation. I caught more bugs in the app while writing the documentation than in all the previous tests I had written combined.

I realize some would call this BDD, but it’s a constrained version of BDD, where the docs really are my production API docs, and my experience makes me wonder if this isn’t a better way of writing docs. The docs were, without question, much harder to write, as I painstakingly wrote out example narratives by hand. However, they now serve as the best tests I have in my entire codebase.

I’ve been coding for a long time, and I can’t remember the last time I experienced such a welcome surprise. What I intended to be a routine regression test turned out to completely reshape the way I think about the documentation, at least in mountebank. Based on just one experience, I’m not confident enough to suggest that the traditionalist view of API documentation is wrong. I am, however, hedging my bets.

How to test your docs, or how your docs test your API

The documentation on JavaScript injection describes the most complex scenario in the mountebank docs. The test hook markup is in each <code> tag. The docs in my example show a series of requests and responses surrounded by narrative. In some cases, I leave out the response; in others I use CSS to hide a request or response but keep it in the HTML to make the test possible.
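As an illustration only (the real mountebank markup differs), imagine each example tagged with a data-test-id attribute on its <code> element; a test harness could then scrape every tagged block out of the page and parse its JSON. A hidden CSS class keeps a response in the HTML for testing without showing it in the narrative.

```javascript
// Illustrative markup, not mountebank's actual docs: a request/response
// pair where the response is kept in the page (for the test) but hidden
// from readers via CSS.
const html = `
<p>Create an imposter:</p>
<code data-test-id="create-request">{"protocol":"http","port":4545}</code>
<code data-test-id="create-response" class="hidden">{"protocol":"http","port":4545}</code>
`;

// Collect every tagged code block into a map of test id -> parsed JSON
function extractExamples(doc) {
  const pattern = /<code data-test-id="([^"]+)"[^>]*>([\s\S]*?)<\/code>/g;
  const examples = {};
  let match;
  while ((match = pattern.exec(doc)) !== null) {
    examples[match[1]] = JSON.parse(match[2]);
  }
  return examples;
}

const examples = extractExamples(html);
console.log(examples['create-request'].port); // 4545
```

From here, the harness would replay each request against the running service and compare the full response to the documented one.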