Month: January 2013

Alert.Us launched yesterday. It is a child monitoring application we at Wildfuse helped to develop. I was responsible for designing and building the RESTful API and the whole server-side of the product. It was a great learning experience. Here are the main points I took away:

Distributed systems are hard. This was the first distributed system I did. Although in comparison to what other people are building it’s almost a toy, still, it was hard. Even though I knew about fallacies of distributed computing and the CAP theorem, it’s still difficult to get it right the first time. This recent article sums it up nicely and I have to confest I felt a bit stupid after reading it.

When building a distributed system you’ll need all the help you get. We used AWS extensively for the whole project and I believe it was a major success factor. I have to point out especially DynamoDB and SQS. These components just work. As my ops-chops are not as great, I welcomed the fact that I could just enable them and forget about them. Isn’t this the holy grail of cloud computing?

Distributed systems are fun. They didn’t teach distributed computing at the university when I was attending and I doubt they do now. You have to learn it yourself, make mistakes and fix them, read white-papers and other people’s experiences. Learning this and working with new technologies is the fun part, but getting it finally right even more so. Unfortunately I don’t know of a good “centralized” source of information on this topic. Most of it is randomly scattered all over the Internet. For now, I’m using Prismatic to follow the news, but if you know of a good source, please share`.

A central element of the app is an activity feed. Here are a couple of tips that might be useful if you’re building it into your product too:

Make the data structure representing a feed event as flexible and extensible as possible. In our case, the project requirements changed during development, as usual, and some of it also affected the feed. Be prepared.

As the server and client development is independent, the client apps should have a default way (a fallback) to present new, from their point of view essentially unknown, event types.

There should be a “beginning of time” event in every feed. It can be the date of birth of a user or the date when a user signed up. It can be used to display a welcome message. It’s also a good way to indicate that there’s nothing more in the past so the clients can stop paginating.

Having a feed means having events ordered by time, but time is a bitch. Consumers of this API are mobile apps. Mobile means the delivery of an HTTP request my be somewhat late, both ways. Furthermore, in our case, the apps have a “freeze” feature — if there is no network connection at a given moment, they’ll won’t fail but will wait with the sending of events until the connection is restored. All these factors raise a lot of questions. Do you trust the event timestamp from the mobile clients? What if the event is too far in the past? Do you process it the “normal” way, process without “side effects” (such as sending a push notification) or do you discard it altogether? And how far in the past is too far — 10 minutes, 1 hour? What if the event timestamp is in the future? Maybe your system will be different in that you can always attach a timestamp on the server — if so, good for you. I came to the conclusion that a good enough solution is to trust mobile clients, but if they report an event with a timestamp in the future, replace it with current time. This handles most of the cases as intended.

One thing I learnt about only after the feed was done was Activity Streams. I’ll leave it here for reference. It’s definitely something worth checking out.

Speaking of time, represent time in RFC 3339 format and keep it in UTC. Always, everywhere. This will save you a lot of headaches. RFC 3339 is a subset of ISO 8601 for use on the Internet; hence it is friendlier. Follow the “Be strict in what you send, but generous in what you receive” rule — accept time with any timezone offset but send out time in UTC.

Log like crazy. The bare minimum you should log is errors (d’oh), requests, responses, response times, all 3rd party integration points interactions (especially response codes and response bodies!) system state transitions, but feel free to add to this list. You can never log too much. Log in a machine readable format — JSON works pretty well. Ideally, all logs should be publicly accessible and searchable. If you don’t do any kind of analysis on your logs, you can safely delete them after ~7 days. This timeframe is enough to answer even late questions about what went wrong.

Always have an up-to-date documentation of the API. In my opinion, a feature isn’t done until the documentation for it is written. I followed this rule closely. We even had an agreement that if there’s an implemented but undocumented feature of the API or a mistake in the docs, I owe a drink to the dev who found it. I never had to buy one. I got into the habit that the first thing I did after a commit was to update the docs. Accept it as part of the work. It pays its dividends — I saved a lot of time answering questions from the client developers just by pointing them to the docs. We kept them on the GitHub wiki and it worked quite well, but services like Apiary.io might work even better.

Even if you have the best documentation, there will still be information that will get lost. For example, one developer may come up with an enhancement other platforms may benefit from too or a change request comes in. A central messages board of some kind with this info aggregated would help. We didn’t have one (hence some information got “lost” or didn’t reach everyone involved in time), but it seems to me the Stripe way of having every email CCed to email groups would work well for us here. Have you encountered this problem during your carreer? I would love to hear how others are approaching it.

Help client developers as much as possible. Sometimes I feel frontend devs have it even harder. They have to face crazy demands of multiple parties and ever-changing feature request. I know, I’ve been there too. The last thing they want to face is a half-assed API. Ease the pain of development of your fellow comrades, they’ll love you for it.

Utilize the capabilities of HTTP to its fullest. I’m still amazed how well HTTP works for “modern day” use cases. Sure it’s not perfect, but it goes a long way. Be pragmatic rather than dogmatic about using HTTP. Tip #1: use the User-Agent header to identify the device type, operating system and client version/build number (e.g. “iOS 5.0; App v0.82 (1554)”). It helps when debugging. Tip #2: use Accept-Language to determine the locale of the resource. Works like a charm.

If you want to learn more about the design of HTTP APIs, I strongly recommend following the api-craft group and people like @johnsheehan, @kinlane, @mamund or @mikekelly85 on Twitter; surely, you’ll find others. If you know about “distributed systems” people on Twitter I can follow, please let me know, either in comments or tweet me.

Errors are the third class citizen of any interface; nobody wants to deal with them. Not enough care goes to crafting proper errors, but this is a mistake. It’s when things break your API consumers need most help. Proper error handling distinguishes good APIs from great. If you follow the recommendations outlined in this blog-post, you’ll be one important step closer to having a great API.

The basics

Since we’re dealing with HTTP here, be sure to take advantage of it. Always respond with a 4xx or 5xx status code when an error occurs. If you’re sending 200 OK, you’re doing it wrong. Make sure the payload is of the same content type as the rest of your API, whether you’re using JSON, XML, HTML or any other format, otherwise you’ll create a headache for API consumers. Be sure you send the appropriate Content-Type header (optionally respecting the Accept header). Most of this is covered in this great training video from Layer7.

Error payload

A title, description and an error code is the bare minimum you should send. The title and description are components of the error message which I write about in the next section.

As mentioned earlier, HTTP already provides us with a set of error codes and for simpler APIs, these work great and are sufficient. If you need more granular reporting, add another field to the payload which holds an application-specific error code. Have a document online with a detailed description of every application error code and add a URL pointing to it to the error payload (h/t to John Sheehan for bringing this out in the comments) This greatly helps when debugging less obvious error states.

Anatomy of a good error message

A good error message should be as specific as possible. Don’t use general terms as “Something went wrong” or “Error occurred”. That’s obvious. An error message should help the user recover from it. Provide hints to what went wrong, if something is missing, not well formed or conflicting. You’ll get bonus points for giving instructions on how to actually fix the error.

Also, keep in mind who’s the intended audience. Most API errors will be seen by developers, so you can “talk” to them in more technical terms. Some, however, may bubble up to regular app users, keep that in mind. In this case, send the title and description localized, according to the Accept-Language header.

Finally, don’t be afraid to write the error message in a human way, it will appear less frightening. UPDATE: Check out this great answer on StackExchange.

Logging

It goes without saying that all errors should be logged. What I found useful is to log the full HTTP request and the application stack trace. Have this log available online and add a new field to the error payload containing a link to the encountered error. The API consumers can then paste this link in their bug report. It makes debugging and cooperation between developers much easier.

Conclusion

Error states are an important part of every interface. Treat them as such, don’t ignore them. Users of your API will praise you.