Context, context, context

16 June 2015

At Yoyo wallet, we are always handling a lot of RESTful API requests to our platform.

We built that platform from the ground up based around a microservice architecture. All communication between layers and services is routed via message queues. This pattern gives us separation of responsibility and the ability to scale where it is needed.

The API is stateless by design; however, an API request carries additional metadata that needs to be passed through the relevant microservices. This metadata holds information like the identity and roles of the user making the request. We wanted to do this in a way that didn't transform the payload and was consistent across all microservices.

To satisfy this requirement we hijacked the pattern used in Google's golang context package. This package is designed for sharing context across multiple parallel processes in a single application. We have taken the concept and expanded it to multiple parallel microservices.

This allows us to easily implement features in our microservices such as:

authentication (who are they?)

authorisation (are they allowed to do it?)

auditing (remember they did it)

monitoring (how many times have they done it)

Service Context

The simplest form of context we care about is when one microservice calls another microservice.
In this scenario the only context information we pass are details of Microservice A as it calls Microservice B.

ServiceContext information includes:

service name (which service?)

service version (which code?)

service instance (which running instance?)

Service version and service instance information is essential when things go awry.
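As a rough illustration, the three fields above could be carried in a small struct. This is a minimal sketch only: the type name matches the term used in this post, but the field names, the String format and the example values are assumptions, not our actual code.

```go
package main

import "fmt"

// ServiceContext describes the calling service.
// Field names are illustrative assumptions.
type ServiceContext struct {
	Name     string // which service?
	Version  string // which code?
	Instance string // which running instance?
}

// String renders the context in a compact form suitable for log lines.
func (s ServiceContext) String() string {
	return fmt.Sprintf("%s@%s (%s)", s.Name, s.Version, s.Instance)
}

func main() {
	// Hypothetical caller details.
	caller := ServiceContext{Name: "payments", Version: "1.4.2", Instance: "payments-7f3a"}
	fmt.Println("request received from", caller)
}
```

Logging the version and instance alongside every request is what makes it possible to trace a failure back to a specific deployment.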

Service to Service Authorisation

This allows us to implement Service to Service Authorisation. We may wish to restrict which services have permission to call a particular service.
As you can see in the diagram, Microservice 1 can call Microservice 3; however, Microservice 2 is denied.
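One simple way to express such a restriction is an allow-list keyed by the target service. The sketch below is an assumption about how this might look, with hypothetical service names and a hypothetical CanCall helper; it is not our production authorisation code.

```go
package main

import "fmt"

// allowedCallers maps a target service to the callers permitted to invoke it.
// The service names are hypothetical.
var allowedCallers = map[string]map[string]bool{
	"microservice-3": {"microservice-1": true},
}

// CanCall reports whether caller may invoke target.
// Unknown targets and unknown callers are denied by default.
func CanCall(caller, target string) bool {
	return allowedCallers[target][caller]
}

func main() {
	fmt.Println(CanCall("microservice-1", "microservice-3")) // allowed
	fmt.Println(CanCall("microservice-2", "microservice-3")) // denied
}
```

Because the caller's name arrives in the ServiceContext rather than in the payload, the check can run before the handler ever sees the request.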

User Context

Passing details of the user who initiates an API call is handled in the same way as a ServiceContext. (Behind the scenes we pass ServiceContext details too.)

UserContext information includes:

userID (who they are?)

sessionID (which session?)

roles (what are they allowed to do?)

The UserContext is derived in our edge-api when the user is authenticated. The UserContext then travels with the request from service to service.
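Within a single Go process, a user can be attached to a context with context.WithValue. The following is a minimal sketch using the standard library context package (which lived at golang.org/x/net/context when this was written); the field names and the NewEdgeContext, UserFrom and HasRole helpers are assumptions for illustration.

```go
package main

import (
	"context"
	"fmt"
)

// UserContext mirrors the fields listed above. Names are illustrative.
type UserContext struct {
	UserID    string
	SessionID string
	Roles     []string
}

// HasRole reports whether the user holds the given role.
func (u UserContext) HasRole(role string) bool {
	for _, r := range u.Roles {
		if r == role {
			return true
		}
	}
	return false
}

// userKey is an unexported key type, so other packages cannot collide with it.
type userKey struct{}

// NewEdgeContext stands in for the edge-api: it attaches the authenticated
// user to a fresh root context.
func NewEdgeContext(u UserContext) context.Context {
	return context.WithValue(context.Background(), userKey{}, u)
}

// UserFrom extracts the user, if any, from ctx.
func UserFrom(ctx context.Context) (UserContext, bool) {
	u, ok := ctx.Value(userKey{}).(UserContext)
	return u, ok
}

func main() {
	ctx := NewEdgeContext(UserContext{UserID: "u-42", SessionID: "s-7", Roles: []string{"customer"}})
	if u, ok := UserFrom(ctx); ok {
		fmt.Println(u.UserID, "is admin:", u.HasRole("admin"))
	}
}
```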

Tick tock

With all these service calls going on it is often hard to decide how long we should wait for requests to complete. "Deadlines" are another feature hijacked from the Google context package.

The concept is simple: when you initiate a request you specify a hard deadline. This is a fixed point in time until which the initial request will wait. If the deadline is exceeded, the service returns a timeout error.

This deadline is passed from service to service as a hard deadline for the overall request to complete. This prevents downstream services from continuing to run when there is no-one waiting to receive their response.
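Inside one process, the same idea can be sketched with context.WithDeadline from the standard library. The handle and callService functions below are hypothetical stand-ins for an initial request and a downstream service; the durations are arbitrary.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// callService simulates a downstream call that needs `work` to finish,
// but gives up as soon as the request deadline passes.
func callService(ctx context.Context, work time.Duration) error {
	select {
	case <-time.After(work):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// handle sets one hard deadline for the whole request and passes it downstream.
func handle(timeout, work time.Duration) error {
	ctx, cancel := context.WithDeadline(context.Background(), time.Now().Add(timeout))
	defer cancel() // release the timer even when the call finishes early
	return callService(ctx, work)
}

func main() {
	// Work finishes well inside the deadline: no error.
	fmt.Println(handle(100*time.Millisecond, 10*time.Millisecond))

	// Work overruns the deadline: the call is abandoned with a timeout error.
	err := handle(10*time.Millisecond, 100*time.Millisecond)
	fmt.Println(errors.Is(err, context.DeadlineExceeded))
}
```

The downstream function never computes its own timeout; it only observes the deadline it was handed, which is what stops work continuing after the caller has gone away.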

The diagram below shows an example of a successful deadline where all the associated service calls complete within the time limit.

Deadline Success

The diagram below shows when a deadline has been exceeded. In this example Microservice 3 was taking an excessive amount of time, therefore the request was cancelled and returned a timeout error.

Deadline Exceeded

Deadlines naturally rely on accurate clock synchronisation between machines, or on deadlines which are “fairly” relaxed.

Making life easy

Enough about context, how about some code? Being developers we like an easy life, so we have developed a client package that we use to interact with all our microservices.

There are two ways to call our microservices: with a context and without.

Inter-service calls are made using messages on a bus. We use RabbitMQ as our messaging layer. The client package handles the marshalling and unmarshalling of context information as message headers.
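To give a feel for what marshalling into headers involves, here is a minimal sketch that round-trips a user context through a plain map, which is the shape an AMQP header table takes in Go clients. The header names, the comma-separated roles encoding and the Marshal and Unmarshal functions are all assumptions for illustration, not the real client package.

```go
package main

import (
	"fmt"
	"strings"
)

// UserContext is the information carried alongside the message payload.
type UserContext struct {
	UserID    string
	SessionID string
	Roles     []string
}

// Header names are hypothetical.
const (
	hdrUserID    = "x-user-id"
	hdrSessionID = "x-session-id"
	hdrRoles     = "x-roles"
)

// Marshal converts the context into message headers, leaving the payload untouched.
func Marshal(u UserContext) map[string]interface{} {
	return map[string]interface{}{
		hdrUserID:    u.UserID,
		hdrSessionID: u.SessionID,
		hdrRoles:     strings.Join(u.Roles, ","),
	}
}

// Unmarshal rebuilds the context on the receiving side.
func Unmarshal(h map[string]interface{}) (UserContext, error) {
	id, ok := h[hdrUserID].(string)
	if !ok {
		return UserContext{}, fmt.Errorf("missing header %q", hdrUserID)
	}
	session, _ := h[hdrSessionID].(string)
	roles, _ := h[hdrRoles].(string)

	u := UserContext{UserID: id, SessionID: session}
	if roles != "" {
		u.Roles = strings.Split(roles, ",")
	}
	return u, nil
}

func main() {
	headers := Marshal(UserContext{UserID: "u-42", SessionID: "s-7", Roles: []string{"customer", "beta"}})
	u, err := Unmarshal(headers)
	fmt.Println(u, err)
}
```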

Endpoint handlers

Each endpoint handler can receive a context and therefore pass it into any child services it invokes.
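A handler in this style might look like the sketch below. The Handler signature, the Serve function (standing in for the message consumer that builds a context from incoming headers) and the parent and child handlers are hypothetical; the point is only that the same ctx is handed on unchanged.

```go
package main

import (
	"context"
	"fmt"
)

// Handler is a hypothetical endpoint signature: context first, payload second.
type Handler func(ctx context.Context, payload string) (string, error)

// callerKey is an unexported key type for the value stored in the context.
type callerKey struct{}

// child stands in for a downstream service. It reads the original caller
// from the context it was given.
func child(ctx context.Context, payload string) (string, error) {
	if err := ctx.Err(); err != nil {
		return "", err // deadline passed or request cancelled
	}
	caller, _ := ctx.Value(callerKey{}).(string)
	return fmt.Sprintf("%s (on behalf of %s)", payload, caller), nil
}

// parent forwards the ctx it received to the service it invokes.
func parent(ctx context.Context, payload string) (string, error) {
	return child(ctx, "child:"+payload)
}

// Serve stands in for the consumer that builds a context from message headers.
func Serve(h Handler, caller, payload string) (string, error) {
	ctx := context.WithValue(context.Background(), callerKey{}, caller)
	return h(ctx, payload)
}

func main() {
	out, err := Serve(parent, "user-42", "ping")
	fmt.Println(out, err)
}
```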

AMQP Gotcha

We implemented this code and 90% of our requests were working fine, but we would get occasional timeout failures on a handful of requests. Upon investigation we realised that we were using an AMQP Timestamp as a header for the deadline. Unfortunately the precision of AMQP timestamps is in seconds, so the sub-second part of the deadline was dropped. Therefore, if a request with a one-second timeout was made at 12:59:59.999999, its deadline of 13:00:00.999999 would be truncated to 13:00:00. This would leave us only 1 microsecond to complete the request, when we expected a full second.

The code has been changed now to use an int64 header and persist the deadline as a value in nanoseconds.
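The difference between the two encodings can be sketched as follows. The EncodeDeadline and DecodeDeadline functions are hypothetical names for the int64 approach, and truncateToSeconds only mimics what a seconds-precision timestamp header does to a deadline; the date used is arbitrary.

```go
package main

import (
	"fmt"
	"time"
)

// EncodeDeadline stores the deadline as nanoseconds since the Unix epoch,
// which fits in an int64 header.
func EncodeDeadline(d time.Time) int64 {
	return d.UnixNano()
}

// DecodeDeadline restores the deadline on the receiving side.
func DecodeDeadline(n int64) time.Time {
	return time.Unix(0, n)
}

// truncateToSeconds mimics a seconds-precision timestamp header:
// everything after the whole second is lost.
func truncateToSeconds(d time.Time) time.Time {
	return time.Unix(d.Unix(), 0)
}

func main() {
	start := time.Date(2015, 6, 16, 12, 59, 59, 999999000, time.UTC)
	deadline := start.Add(time.Second)

	// Seconds precision: almost the entire budget disappears.
	fmt.Println("seconds header leaves:", truncateToSeconds(deadline).Sub(start))

	// Nanosecond int64: the full second survives the round trip.
	fmt.Println("int64 header leaves:  ", DecodeDeadline(EncodeDeadline(deadline)).Sub(start))
}
```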

Benefits

We have found this to be a very helpful pattern to implement throughout our stack, as you are always going to want context information at some point in your code. By passing it in explicitly, every time and in the same way, it is always there when you need it.

It can be used for role-based authorisation, database update auditing, business logic rules, API rate limiting etc.

Your experience

We're always interested in hearing about other people's experiences with developing microservices. Have any of you used similar tools or had similar experiences around the propagation of context between services?

Let us know what you did, what challenges you faced etc. using the comments below, or tweet us @yoyoengineering