Considering our approach to API iteration

One of the key aspects of how we work is constant iteration. As Government as a Platform progresses into beta with transactional services driven by application programming interfaces (APIs), rather than pure data access, it’s time to explain how we think about iterating without disrupting clients.

We approach the topic with some caution, as there are strongly held and divergent views on it across the industry.

We don’t want to cause unnecessary work for clients consuming our APIs. We recognise any change for clients is likely work for them, and if it’s not for new or improved functionality, then it’s unnecessary cost. We should minimise this cost to clients.

Some aspects of APIs are unlikely to change and these don’t need to be considered as part of our versioning approach. For example, we’re using JSON for data structure and Unicode Transformation Format (UTF-8) for encoding.

Our recommendations

Other aspects do change, and we recommend supporting agile methods of developing APIs while minimising disruption to clients. This can be achieved if services:

avoid backwards incompatible changes where possible

use a version number as part of the URL when making backwards incompatible changes

make a new endpoint available for significant changes

provide notices for deprecated endpoints

Making backwards compatible changes

To maximise compatibility, we need to avoid breaking changes. We can keep changes backwards compatible with techniques such as specifying that parsers ignore properties they don’t expect or understand. This allows us to add fields for new functionality without requiring changes to the client. The rule goes beyond syntax: a change also has to preserve the meaning of what clients already rely on.

For example, our Notify team could change how they respond to a request by adding the ‘delivered’ time to their response message. The client should not have to make any updates due to this change.
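A client that follows the "ignore what you don't understand" rule might look like the minimal sketch below. The field names (`id`, `status`, `delivered_at`) and the `NotificationStatus` type are hypothetical and are not taken from the real Notify API; the point is only that the parser keeps the properties it knows and drops the rest.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class NotificationStatus:
    # Hypothetical fields for illustration; not the real Notify schema.
    id: str
    status: str


def parse_status(body: str) -> NotificationStatus:
    """Parse a response, ignoring any properties this client does not know."""
    data = json.loads(body)
    known = {f.name for f in fields(NotificationStatus)}
    return NotificationStatus(**{k: v for k, v in data.items() if k in known})


# The same client code handles the response before and after a field is added.
old_body = '{"id": "abc", "status": "delivered"}'
new_body = '{"id": "abc", "status": "delivered", "delivered_at": "2016-01-01T12:00:00Z"}'
before = parse_status(old_body)
after = parse_status(new_body)
```

A strict parser that rejected unknown properties would turn the same additive change into a breaking one.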

Another area to consider is the ordering of entries in JSON arrays. As JSON doesn’t have a data structure to represent a set, it’s common to use an array, even when the API intends no order.

Assuming that such an array is ordered can cause misunderstanding between systems. If the way the array is generated changes, an assumption about its ordering can break the consumer of the message. At the least, documentation should state whether the order is meaningful. It may make sense to implement (and document) arbitrary ordering to make things easier for client developers.
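On the client side, one way to avoid depending on an undocumented order is to convert the array to a set as soon as it is parsed. The sketch below assumes a hypothetical `permissions` property holding an array of strings.

```python
import json


def permissions(body: str) -> frozenset:
    """Read the hypothetical 'permissions' array as an unordered set."""
    return frozenset(json.loads(body)["permissions"])


# Two responses carrying the same members in a different order.
first = '{"permissions": ["read", "write", "admin"]}'
second = '{"permissions": ["admin", "read", "write"]}'

lists_equal = json.loads(first)["permissions"] == json.loads(second)["permissions"]
sets_equal = permissions(first) == permissions(second)
```

Comparing the raw lists treats the reordering as a difference, while comparing the sets does not. This only works when duplicates carry no meaning, which is another detail the documentation would need to state.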

When you need to make a backwards incompatible change

Sometimes though, it’s not possible to make a syntactically and semantically backwards compatible change. In this case, we take the approach of incrementing a version number in the URL. We start with /v1/, and increment with whole numbers when we have to make breaking changes. While this is perhaps not the most beautiful approach, it’s simple, robust, and familiar to developers. Other solutions such as content negotiation suffer from the problem of being difficult for humans to debug and view easily with common tools.
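As a rough illustration of URL versioning, the sketch below routes requests on a version prefix so that /v1/ keeps its original response shape while /v2/ introduces a breaking change. The handlers, the response fields and the `route` function are all hypothetical, and a real service would normally rely on its web framework's router instead.

```python
import re


def get_user_v1(user_id: str) -> dict:
    # Original shape: name as a single string (hypothetical).
    return {"id": user_id, "name": "Ada Lovelace"}


def get_user_v2(user_id: str) -> dict:
    # Breaking change: name becomes an object, so it lives under /v2/.
    return {"id": user_id, "name": {"given": "Ada", "family": "Lovelace"}}


HANDLERS = {
    ("v1", "users"): get_user_v1,
    ("v2", "users"): get_user_v2,
}

# Whole-number versions only: /v1/, /v2/, ...
PATH = re.compile(r"/(v[1-9]\d*)/([a-z-]+)/(\w+)")


def route(path: str):
    """Return (status, body) for a path such as /v1/users/123."""
    match = PATH.fullmatch(path)
    if match is None:
        return 404, None
    version, resource, item_id = match.groups()
    handler = HANDLERS.get((version, resource))
    if handler is None:
        return 404, None
    return 200, handler(item_id)
```

Because the version is visible in the path, a developer can compare `/v1/users/123` and `/v2/users/123` directly in a browser or with a command-line HTTP client.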

If we want to make a larger change, like simplifying a complex object structure by folding data from multiple objects together, one option we consider is making a new object available at a new endpoint. Concretely, we may combine data about users and accounts from /v1/users/123 and /v1/accounts/123 and produce /v1/consolidated-account/123. While there are a couple of downsides to this approach (lack of RESTfulness, and maintenance of the old endpoints), it has the benefit to users of simplicity, backwards compatibility and incremental adoption.
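A consolidated endpoint of this kind could be built by folding the two existing objects into one, as in the sketch below. All field names are invented for illustration; the existing user and account objects are left untouched so the old endpoints keep working.

```python
def consolidated_account(user: dict, account: dict) -> dict:
    """Fold a user object and an account object into one flat object.

    Field names here are hypothetical; the inputs are not modified.
    """
    return {
        "id": user["id"],
        "name": user["name"],
        "email": user["email"],
        "account_type": account["type"],
        "balance": account["balance"],
    }


# Stand-ins for the bodies of /v1/users/123 and /v1/accounts/123.
user = {"id": "123", "name": "Ada Lovelace", "email": "ada@example.com"}
account = {"id": "123", "type": "standard", "balance": 0}
combined = consolidated_account(user, account)
```

Clients that want the simpler object can move to the new endpoint when it suits them, and the rest carry on as before.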

These versioning approaches mean clients don’t have to upgrade immediately. We don’t want to have to support all old clients forever, so we set clear deprecation policies: in particular, how long clients have to upgrade, and how we’ll notify them of these deadlines. While we’ll normally contact developers directly, we’ll also announce deprecation in HTTP responses using a ‘Warning’ header.
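The sketch below shows one possible way to announce and detect deprecation with the ‘Warning’ header. It assumes warn-code 299 (miscellaneous persistent warning) with "-" as the warn-agent; the helper names and the message text are hypothetical, and the wording a real service uses would come from its own deprecation policy.

```python
import re

# warn-code, warn-agent, then a quoted warn-text.
WARNING = re.compile(r'(\d{3}) (\S+) "((?:[^"\\]|\\.)*)"')


def deprecation_headers(message: str) -> dict:
    """Server side: build a Warning header for a deprecated endpoint."""
    escaped = message.replace("\\", "\\\\").replace('"', '\\"')
    return {"Warning": f'299 - "{escaped}"'}


def read_warning(headers: dict):
    """Client side: return (code, text) from a Warning header, or None."""
    for name, value in headers.items():
        if name.lower() == "warning":  # header names are case-insensitive
            match = WARNING.match(value)
            if match:
                text = re.sub(r"\\(.)", r"\1", match.group(3))
                return int(match.group(1)), text
    return None


headers = deprecation_headers("/v1/users is deprecated; use /v2/users")
notice = read_warning(headers)
```

A client could log `notice` whenever it is not `None`, so that a deprecation shows up in monitoring even if the direct notification was missed.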

In summary

APIs will change over time as we discover new user needs and expose different data and operations.

A traditional approach may result in many versions of the API and the object descriptions, but we can maintain backward compatibility in the majority of cases using the patterns described above. This approach doesn’t force changes on our downstream users and allows them to take advantage of new features as soon as they are available. The only time we need to perform full versioning of the API is when significant changes occur.

Are version numbers incremented endpoint by endpoint, so the latest release of the API might include /v1/users and /v3/accounts? If so, have you done user/developer testing on that to see if that's confusing?

I'm pretty used to thinking about APIs I use as having a single version number. But that would seem to require a monolithic release/version bump of all endpoints, preventing small iterations.