Get more out of your service with machine-readable API specs

Bring up WSDL at your favorite API meetup and you'll hear awful stories from someone who was on the front lines back when it was the most popular way to expose services. I'm talking ten-thousand-lines-of-unreadable-generated-code awful.

So it shouldn't be a surprise that the people who didn't go insane or change professions decided to step back and work on simpler APIs. Long XML files and generated code are not made for humans; it was time to optimize for fellow developers: simpler primitives, easier access, documentation we can understand!

And while that certainly made things more pleasant, it was still far from perfect. Suddenly we had to deal with problems that just didn't exist in the WSDL days:

API documentation getting out of sync

Developers spending a lot of time writing API clients, all prone to errors

Inconsistent APIs making it hard to get up to speed and reuse clients

In this article we'll see how machine-readable API specs can help you address these problems[1] while making you and your consumers more productive. And all without forgetting about the human factor.

What's a machine-readable API spec?

To understand, let's take a simplified, pseudo Twitter API as an example:

Tweet

A tweet is a 140-character status update that is displayed to your followers.

It's composed of:

id (uuid): unique tweet identifier

message (string): status update with up to 140 characters

To create a new tweet:

Issue a POST to /tweets.

Pass the message as a param.

This should look very familiar to anyone who has ever written code to consume an API.

Behind the scenes it could have been typed like that as raw HTML or Markdown, or it could have been generated from a more structured document:
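As a rough illustration, such a structured document could look like the Python dictionary below, with a small function that renders it back into prose similar to the reference above. The key names (`name`, `attributes`, `actions`) and the `render_docs` helper are invented for this sketch and don't follow any particular spec format.

```python
# A hypothetical structured description of the Tweet resource.
# The keys used here are made up for illustration, not a real spec format.
TWEET_SPEC = {
    "name": "Tweet",
    "description": "A tweet is a 140-character status update that is "
                   "displayed to your followers.",
    "attributes": [
        {"name": "id", "type": "uuid",
         "description": "unique tweet identifier"},
        {"name": "message", "type": "string",
         "description": "status update with up to 140 characters"},
    ],
    "actions": [
        {"title": "create a new tweet", "method": "POST",
         "path": "/tweets", "params": ["message"]},
    ],
}


def render_docs(spec):
    """Render the structured spec as plain-text reference docs."""
    lines = [spec["name"], "", spec["description"], "", "It's composed of:"]
    for attr in spec["attributes"]:
        lines.append(f"{attr['name']} ({attr['type']}): {attr['description']}")
    for action in spec["actions"]:
        lines.append("")
        lines.append(f"To {action['title']}:")
        lines.append(f"Issue a {action['method']} to {action['path']}.")
        for param in action["params"]:
            lines.append(f"Pass the {param} as a param.")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_docs(TWEET_SPEC))
```

Because the prose is derived from the data, changing the layout of the docs only means changing `render_docs`; the API definition itself stays untouched.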

Benefits

When we first started looking for a machine-readable API spec at Heroku, our focus was on documentation. Maintaining our API reference in Markdown was cumbersome, prone to errors, and really inflexible. The spec allows us to change how our documentation looks without changing the actual API definitions.

With that we started looking at another property that was cumbersome, prone to errors and inflexible: our API client. With the machine-readable spec we can now generate API clients, as we'll see in more detail later.

Another side effect of describing all APIs in a structured format is that it makes it really hard to provide an inconsistent interface. Every machine-readable spec has constraints that work in your favor when you're designing APIs.

The irony here is that while developers hate WSDL, we love working with some of the ideas behind it.

Now let's look at how machine-readable API specs typically deliver these benefits.

Standards

Most API specifications will cover at least three things:

What endpoints are available in your API

What parameters they take

What they render back

Note that this nomenclature might change. For instance, "endpoints" are called "methods" in Google's API spec, and "apis" in Swagger.

One way to avoid confusion here is to rely on some previously established format, like JSON Schema: a JSON document that describes arbitrary JSON data. It's really simpler than it sounds – take this description of a Tweet for example:
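A minimal sketch of what such a description might look like, written as a Python dictionary and serialized with the standard `json` module. The exact properties are an assumption based on the Tweet definition earlier in this article; `format: uuid` is used here as a custom format hint.

```python
import json

# JSON Schema (draft-04 style) description of the simplified Tweet resource.
# The property set mirrors the pseudo Twitter API above and is illustrative.
TWEET_SCHEMA = {
    "$schema": "http://json-schema.org/draft-04/schema#",
    "title": "Tweet",
    "description": "A 140-character status update displayed to your followers.",
    "type": "object",
    "properties": {
        "id": {
            "description": "unique tweet identifier",
            "type": "string",
            "format": "uuid",
        },
        "message": {
            "description": "status update with up to 140 characters",
            "type": "string",
            "maxLength": 140,
        },
    },
    "required": ["id", "message"],
}


def to_json(schema):
    """Serialize the schema to the JSON document you would publish."""
    return json.dumps(schema, indent=2)


if __name__ == "__main__":
    print(to_json(TWEET_SCHEMA))
```

The schema is itself plain JSON, so any tool that can read JSON can consume it.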

Creating and maintaining a spec

Nothing will make you appreciate YAML more than fiddling with a long JSON file.

That's why we ultimately decided to edit our schema as a YAML file. Since YAML is a superset of JSON, we can automatically generate our JSON Schema the same way we generate our docs. To further facilitate maintenance we also decided to keep resources in their own files, which are merged together by a command-line tool called Prmd.
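The idea of merging per-resource definitions into one schema can be sketched roughly as follows. This is not how Prmd works internally; the `combine_resources` function and the layout of the combined document are assumptions made for illustration, and the resources are inlined here where a real tool would read each one from its own file.

```python
import json


def combine_resources(resources, title):
    """Merge per-resource schemas into a single top-level schema.

    `resources` maps a resource name to its schema. Each resource is stored
    under `definitions` and referenced from `properties`.
    """
    combined = {
        "$schema": "http://json-schema.org/draft-04/schema#",
        "title": title,
        "type": "object",
        "definitions": {},
        "properties": {},
    }
    for name in sorted(resources):
        combined["definitions"][name] = resources[name]
        combined["properties"][name] = {"$ref": f"#/definitions/{name}"}
    return combined


# Hypothetical resources; in practice each would live in its own file.
RESOURCES = {
    "tweet": {
        "title": "Tweet",
        "type": "object",
        "properties": {"message": {"type": "string", "maxLength": 140}},
    },
    "user": {
        "title": "User",
        "type": "object",
        "properties": {"name": {"type": "string"}},
    },
}

if __name__ == "__main__":
    schema = combine_resources(RESOURCES, "Pseudo Twitter API")
    print(json.dumps(schema, indent=2))
```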

Those editing their API spec by hand should also consider Blueprint: being a subset of Markdown, it's easy for anyone to edit, including non-developers. It also provides a Sublime plugin for live validation and syntax highlighting.

But my favorite approach is the one Swagger took: embedding the API spec in the code that implements it. This allows you to write the spec in the same language you write your server in, be it Python or Ruby. Removing the gap between server and spec keeps the two in sync. auto-crud took a similar path, generating endpoints for Node.js apps using MongoHQ as a backend.
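To give a feel for the embedded approach, here is a minimal sketch in plain Python. The `endpoint` decorator and the `SPEC` registry are hypothetical names invented for this example; they are not Swagger's or auto-crud's actual API.

```python
import uuid

# Registry filled in as handlers are defined; this is the machine-readable spec.
SPEC = {"endpoints": []}


def endpoint(method, path, params=None):
    """Hypothetical decorator that records an endpoint's spec next to its code."""
    def decorator(handler):
        SPEC["endpoints"].append({
            "method": method,
            "path": path,
            "params": params or {},
            # Reuse the docstring so the description is written only once.
            "description": (handler.__doc__ or "").strip(),
            "handler": handler.__name__,
        })
        return handler
    return decorator


@endpoint("POST", "/tweets",
          params={"message": {"type": "string", "required": True}})
def create_tweet(message):
    """Create a new tweet."""
    return {"id": str(uuid.uuid4()), "message": message}


if __name__ == "__main__":
    for entry in SPEC["endpoints"]:
        print(entry["method"], entry["path"], "-", entry["description"])
```

Since the spec entry is created by the same statement that defines the handler, adding or renaming an endpoint updates both at once.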

Generating documentation

Avoid the WSDL fiasco by making human-readable docs your first priority.

For other formats you'll need tooling to convert your serialized spec into HTML. There are quite a few options available for JSON Schema: you can generate interactive docs with I/O Docs in Node.js or JSON Schema Browser in Ruby.

At Heroku we wanted our API docs to fit with other developer resources we already have, so we made Prmd capable of generating static HTML for docs.

Generating clients

How is this not WSDL again? Ah, yes, developers first!

And it was a developer tired of writing custom API wrappers that created Unio, a Node.js REST API client that can be customized to work with different APIs via a machine-readable spec. Here's how you can configure it to consume the simplified Twitter API described above:

Unio is probably a bad choice for specifying APIs (it carries much less detail than the other formats), but it still shows that developers are willing to use some of the concepts behind WSDL if that means they will be more productive and avoid errors.
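The general idea of a client driven by a spec can be sketched in Python as below. This is not Unio's spec format or API: the `API_SPEC` layout, the `SpecClient` class and the `api.example.com` host are all placeholders made up for illustration. The sketch only builds the request and never sends it.

```python
import json
import urllib.request

# Hypothetical spec for the simplified Twitter API; the host is a placeholder.
API_SPEC = {
    "api_root": "https://api.example.com",
    "resources": {
        "create_tweet": {
            "method": "POST",
            "path": "/tweets",
            "required": ["message"],
        },
    },
}


class SpecClient:
    """Generic client whose behavior comes entirely from the spec it is given."""

    def __init__(self, spec):
        self.spec = spec

    def build_request(self, resource, **params):
        definition = self.spec["resources"][resource]
        # Catch missing params on the client, before any network round trip.
        missing = [p for p in definition["required"] if p not in params]
        if missing:
            raise ValueError(f"missing required params: {', '.join(missing)}")
        return urllib.request.Request(
            url=self.spec["api_root"] + definition["path"],
            data=json.dumps(params).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method=definition["method"],
        )


if __name__ == "__main__":
    request = SpecClient(API_SPEC).build_request("create_tweet", message="hello")
    print(request.get_method(), request.full_url)
```

Pointing the same client at a different API is then a matter of swapping the spec, with no new wrapper code.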

Additional benefits

Swagger and auto-crud don't just allow you to write the spec in a programming language, they also use it for validating requests. We can get a similar effect based on a JSON Schema with Committee, a set of Ruby middleware to validate request params and serialized responses.

This is the point where API specs come full circle – specify a mandatory parameter once, and have that indicated in the schema, clarified by your docs, verified within clients and enforced by your server.
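Server-side enforcement can be sketched with a deliberately tiny validator. It is not Committee and not a full JSON Schema implementation: `validate_params` is a hypothetical helper that only understands `required`, the `string` type and `maxLength`.

```python
# Schema for the params accepted when creating a tweet (illustrative).
CREATE_TWEET_SCHEMA = {
    "type": "object",
    "properties": {
        "message": {"type": "string", "maxLength": 140},
    },
    "required": ["message"],
}


def validate_params(params, schema):
    """Return a list of error messages; an empty list means the params are valid.

    Only `required`, `type: string` and `maxLength` are checked here.
    """
    errors = []
    for name in schema.get("required", []):
        if name not in params:
            errors.append(f"{name} is required")
    for name, rules in schema.get("properties", {}).items():
        if name not in params:
            continue
        value = params[name]
        if rules.get("type") == "string" and not isinstance(value, str):
            errors.append(f"{name} must be a string")
            continue
        max_length = rules.get("maxLength")
        if isinstance(value, str) and max_length is not None and len(value) > max_length:
            errors.append(f"{name} must have at most {max_length} characters")
    return errors


if __name__ == "__main__":
    print(validate_params({}, CREATE_TWEET_SCHEMA))
    print(validate_params({"message": "hello"}, CREATE_TWEET_SCHEMA))
```

The same `required` entry that rejects the request here is the one a docs generator or client would read, so the rule is stated once and applied everywhere.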

Finally, it's worth noting that machine-readable data can be transformed deterministically; similar to how people convert from RSS to Atom, it shouldn't be hard to translate a spec across different formats. I first saw this happen when Blueprint announced support for JSON Schema; nothing prevents you from enjoying both JSON Schema-based docs and Swagger-based clients, for example.
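A translation of this kind can be sketched as a plain function from one structure to another. The source format below is a made-up one (a list of typed attributes), not Blueprint's or Swagger's; `TYPE_MAP` and `to_json_schema` are hypothetical names for this example.

```python
# How each type in the made-up source format maps onto JSON Schema keywords.
TYPE_MAP = {
    "uuid": {"type": "string", "format": "uuid"},
    "string": {"type": "string"},
}

# Resource described in the made-up source format.
TWEET = {
    "name": "Tweet",
    "attributes": [
        {"name": "id", "type": "uuid",
         "description": "unique tweet identifier"},
        {"name": "message", "type": "string",
         "description": "status update with up to 140 characters"},
    ],
}


def to_json_schema(resource):
    """Translate a resource from the source format into a JSON Schema object."""
    properties = {}
    for attr in resource["attributes"]:
        prop = dict(TYPE_MAP[attr["type"]])  # copy so TYPE_MAP is not mutated
        prop["description"] = attr["description"]
        properties[attr["name"]] = prop
    return {
        "title": resource["name"],
        "type": "object",
        "properties": properties,
    }


if __name__ == "__main__":
    import json
    print(json.dumps(to_json_schema(TWEET), indent=2))
```

The conversion is a pure function of its input, so running it twice on the same spec always yields the same schema.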

Conclusion

Machine-readable API specs help you design consistent APIs that offer a lot of leverage to your team and clients. We've seen a lot of these benefits by applying them to APIs at Heroku, and I consider them worth the investment required in terms of picking a format and maintaining a spec.

[1] HATEOAS also addresses every single issue described above. It's an interesting alternative to say the least, but I just couldn't get it to fit my needs yet. More on that in another post.