Introducing Sockii: HTTP and WebSocket aggregator

Developing the upcoming Server Density v2 over the past year has forced us to tackle some very interesting and difficult problems. This post introduces Sockii, a Node.js daemon which aggregates multiple WebSocket connections and proxies them to a single socket to the user’s browser.

Shiny SOA

Server Density v2 was redesigned from the ground up as a distributed set of HTTP services, in much the same vein as Netflix’s or Amazon’s Service Oriented Architecture approach. That means each distinct moving part of the project was separated into its own service (when it made sense to do so), e.g. “users”, “devices”, “cloud” etc. The result is full separation of concerns: the users service doesn’t need to care what the sharding policy for devices is, as long as the RESTful HTTP API it provides conforms to a known specification.

That’s all fine and dandy, I hear you say, but if the brave new world of SOA were so cut and dried you wouldn’t be writing this blog post! Well, true: the issues that arise are very different from those you face when developing everything as “one app”, and here are some of the harder ones:

Authentication and sessions – do all services talk to the same session store?

If we want to do “real time” in the browser, e.g. using WebSockets, does that mean all of our backend services need to be WebSocket aware, or even socket.io aware (if we use that library)?

Will we need to use rewrites to mitigate cross domain request issues in the browser?

Does the UI need to be aware of all the possible hosts/ports/paths to services on the backend?

With multiple backend services possibly providing push updates to the frontend, do we need multiple socket connections from the browser?

So how do we go about tackling these? Step this way, I’ll try not to bore you…

Authentication, authorisation and sessions

As we wanted to avoid duplicating methods across multiple services, we decided to build a thin layer service that sits between the user’s browser and the backend services. Any request performed by a user must first go via this thin layer before being routed to the backend. By doing this, we get to perform authentication and authorisation in one central place rather than having every service check if the user is currently logged in and if they should be allowed to perform an action.

We created two separate services to handle authentication:

The auth app handles performing user account lookups with the backend users service before creating a session in MongoDB and forwarding the user to the main UI. This allows us to scale auth separately and also allows for single sign-on capabilities if we want to go down that road. [1]

Our main thin layer service uses the same MongoDB session store to check if a user is authenticated before appending their account details to the request and then acting as an HTTP router to the requested service. Any request that doesn’t have a corresponding authenticated session is immediately stopped with an appropriate HTTP error status code. [2]

[1] In order to access the session storage in the same way, the auth app and UI share a core library with session middleware.
[2] Backend services are not externally routed. Our Nginx load balancers only allow access via the thin layer, and most services do not even have externally accessible network addresses.

By accepting account details appended to the request query string, an individual service doesn’t have to perform any extra user detail lookups and for the most part just has to use the provided user ID and account ID to restrict database queries. This will change slightly as we introduce more fine grained access controls to some areas of some services, but for the most part this covers our needs.
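To make the flow above concrete, here is a minimal sketch of the check-then-append step. It is not Sockii’s actual code: the in-memory map stands in for the MongoDB session store, and the session shape, the `userId`/`accountId` parameter names and the `authorise` function are all assumptions made for illustration.

```typescript
// Hypothetical session record; the real schema is not described in this post.
interface Session {
  userId: string;
  accountId: string;
}

interface RouteResult {
  status: number;
  forwardUrl?: string;
}

// In-memory stand-in for the shared MongoDB session store.
const sessions = new Map<string, Session>([
  ["abc123", { userId: "u1", accountId: "a9" }],
]);

// Append the account details to the query string of the backend URL.
function appendAccountDetails(url: string, session: Session): string {
  const separator = url.indexOf("?") === -1 ? "?" : "&";
  return (
    url +
    separator +
    "userId=" + encodeURIComponent(session.userId) +
    "&accountId=" + encodeURIComponent(session.accountId)
  );
}

// Stop unauthenticated requests; otherwise return the URL to forward to.
function authorise(sessionId: string | undefined, backendUrl: string): RouteResult {
  const session = sessionId === undefined ? undefined : sessions.get(sessionId);
  if (session === undefined) {
    return { status: 401 };
  }
  return { status: 200, forwardUrl: appendAccountDetails(backendUrl, session) };
}
```

The backend service then only has to read the two IDs from the query string and use them to restrict its database queries.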

WebSockets, socket.io and services

One decision that was made early on was that we didn’t want to be performing inefficient polling for updates like we do in v1. As reliable as this is in terms of being proven technology (nothing can be simpler than issuing a GET every minute), there are now several alternative solutions for “[more] real time” updates, under the umbrella of the HTML5 specifications:

Server Sent Events – a stream of events sent as a special part of the HTTP response body during a request that can be listened for using a JavaScript API.

WebSockets – a two way persistent socket connection created by issuing an HTTP upgrade header from a regular request that can be opened, listened to and have messages sent over using a JavaScript API.

Take note that both have advantages and disadvantages – while WebSockets provide efficient two way messaging, they’re effectively a different protocol once the connection has been upgraded (and are sometimes served on separate ports), so some users may have issues connecting through firewalls or proxies. There are also other networking issues to consider, such as whether your load balancer supports them.

Server Sent Events have the upper hand in that they are just an extension of regular ol’ HTTP and can be used over a long-lived, longpoll style connection. Any messages you might need to send upstream can be sent as extra regular concurrent HTTP requests (the open request and event API is asynchronous like any regular XMLHttpRequest).

For our needs, WebSockets seemed like the best fit, and most concerns over compatibility (both networking and browser support) can be circumvented by using a library such as socket.io or SockJS. They provide a convenient wrapper around various socket or socket-like browser messaging solutions, e.g. WebSockets, Flash sockets, HTTP longpolling etc., with any failure in creating a messaging channel falling back to a more reliable method (in most cases longpolling).
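The fallback idea can be sketched in a few lines. This is only an illustration of the principle, not how socket.io or SockJS implement it: the `Transport` interface and `connectWithFallback` function are hypothetical, and real connection attempts are asynchronous rather than the synchronous calls used here for brevity.

```typescript
// A transport is any way of opening a message channel to the server.
interface Transport {
  name: string;
  // Returns true if the channel could be established.
  connect: () => boolean;
}

// Try each transport in order of preference and return the first that connects.
function connectWithFallback(transports: Transport[]): string | null {
  for (const transport of transports) {
    try {
      if (transport.connect()) {
        return transport.name;
      }
    } catch {
      // A failed attempt just means we move on to the next transport.
    }
  }
  return null;
}

// Example: WebSockets are blocked, Flash is unavailable, long polling works.
const chosen = connectWithFallback([
  { name: "websocket", connect: () => { throw new Error("blocked by firewall"); } },
  { name: "flashsocket", connect: () => false },
  { name: "xhr-polling", connect: () => true },
]);
```

The application code on top only ever sees “a channel”, whichever transport ended up being used.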

We eventually chose socket.io due to its stability and well established support across platforms, but what about backend support for sockets? Or even libraries like socket.io? There is fairly good support in our core platform (Python) and even some support penetrating our secondary platform (PHP) for socket.io, but we didn’t want to have to support our wrapper library across every service. Instead we decided to keep to standard web technology like WebSockets on the backend and leave compatibility between the frontend abstraction and backend transport up to the thin layer service.

This made our choice of platform for the thin layer quite natural given the following constraints:

N.B. We could also have chosen to write it in Python, because we already use both Gevent and Tornado, which both have third-party socket.io compatibility libraries available and provide fast event based networking capabilities.

Despite this, we chose Node.js because it is the de facto target platform for socket.io. This allows us to keep up with the latest version of the library and API. We also chose to write the thin layer in CoffeeScript because our frontend is written in CS and we have lots of experience from other parts of the system. Since the thin layer is a long running daemon process, we can deal with a few cycles spent at startup to compile the CS to JS.

Routing frontend requests to services

Due to cross domain restrictions (CORS alleviates many of these issues), session cookie and SSL certificate requirements, we require the ability to proxy requests to separate services via a single thin layer address, using the URL path to seamlessly route to a backend address.

As we’re already using Node.js we were able to take advantage of the excellent node-http-proxy library, which pretty much does exactly this. We add a few extra goodies onto the plain routing in http-proxy, such as configurable HTTP headers for cross domain requests and CSRF protection.
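As a rough sketch of path-based routing, independent of node-http-proxy itself, the idea looks something like this. The route table, the internal addresses and the `resolveBackend` function are invented for illustration, and stripping the prefix before forwarding is an assumption rather than documented Sockii behaviour.

```typescript
// Hypothetical route table: URL path prefix -> internal backend address.
const routes: Array<[string, string]> = [
  ["/users", "http://10.0.0.11:8001"],
  ["/devices", "http://10.0.0.12:8002"],
];

// Resolve a public request path to the backend URL it should be proxied to.
function resolveBackend(path: string): string | null {
  for (const [prefix, target] of routes) {
    // Match the prefix exactly or as a whole path segment, so that
    // "/usersx" is not mistaken for the "/users" service.
    if (path === prefix || path.startsWith(prefix + "/")) {
      // Strip the prefix so the service sees paths relative to its own root.
      return target + (path.slice(prefix.length) || "/");
    }
  }
  return null;
}
```

Because the browser only ever talks to the single thin layer address, the UI needs to know the path prefixes but none of the backend hosts or ports.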

Our two main requirements are authentication handling and address mapping, both for HTTP requests and socket connections. To enable this we added a Connect style plugin system that allows addresses to be remapped easily for different purposes beyond the mapping built into http-proxy or for data to be injected into requests before forwarding, like adding account details to query strings after querying MongoDB.

How simple is the plugin API? Just add one or several methods like get, post, put, delete, map or socket to your plugin class and the routing handlers will do the rest.
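To give a feel for the shape of such a plugin, here is a hedged sketch. The method names `get`, `post` and `map` come from the description above, but the signatures, the `ProxyRequest` type, the `AccountPlugin` class and the `applyPlugins` dispatcher are assumptions made for this example and may not match Sockii’s real API.

```typescript
// Hypothetical request shape; the real plugin signatures are not shown here.
interface ProxyRequest {
  method: string;
  url: string;
}

// A plugin implements only the hooks it cares about.
interface Plugin {
  get?(req: ProxyRequest): ProxyRequest;
  post?(req: ProxyRequest): ProxyRequest;
  map?(req: ProxyRequest): ProxyRequest;
}

class AccountPlugin implements Plugin {
  // Inject an (assumed) account ID into every GET before it is forwarded.
  get(req: ProxyRequest): ProxyRequest {
    const separator = req.url.indexOf("?") === -1 ? "?" : "&";
    return { method: req.method, url: req.url + separator + "accountId=a9" };
  }
}

// Run each plugin's hook for the request's HTTP verb, in registration order.
function applyPlugins(plugins: Plugin[], req: ProxyRequest): ProxyRequest {
  const hook = req.method.toLowerCase() as keyof Plugin;
  return plugins.reduce((current, plugin) => {
    const handler = plugin[hook];
    // Plugins without a hook for this verb leave the request untouched.
    return handler ? handler.call(plugin, current) : current;
  }, req);
}
```

A GET for `/devices` passed through `applyPlugins([new AccountPlugin()], ...)` comes out as `/devices?accountId=a9`, while a POST is forwarded unchanged because the plugin defines no `post` method.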

One of http-proxy’s features is WebSocket routing; however, because we use socket.io this doesn’t get used. Instead…

Aggregating multiple backend connections over a single frontend socket

With the socket.io server side handling built into the thin layer we are able to take advantage of having full control over the stack between thin layer and backend service, by terminating the abstracted socket at the Node.js instance and then creating one or more real WebSocket connections to the backend services.

Instead of creating new connections from the frontend to services, we just append a different ID to our messages and the thin layer routes the messages to the appropriate service, creating a new WebSocket if it needs to. On receipt of messages from this connection, it appends the same ID to them before sending them back to the frontend.

This socket aggregation also means connections are lazy, only created as needed by the frontend, but once created they remain and will continue to push “real time” updates back to the UI.
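The aggregation logic described above can be sketched as follows. This is a simplified model and not Sockii’s source: the `Envelope` shape, the `Aggregator` class and the injected `open`/`toFrontend` functions are all hypothetical, and a real version would wrap actual WebSocket and socket.io objects and handle disconnects.

```typescript
// Minimal interface for a backend connection; a real one would wrap a WebSocket.
interface BackendSocket {
  send(payload: string): void;
}

// Envelope used on the single browser-facing socket (shape is an assumption).
interface Envelope {
  service: string;
  payload: string;
}

type OpenBackend = (service: string, onMessage: (payload: string) => void) => BackendSocket;

class Aggregator {
  private backends = new Map<string, BackendSocket>();
  private open: OpenBackend;
  private toFrontend: (message: Envelope) => void;

  constructor(open: OpenBackend, toFrontend: (message: Envelope) => void) {
    this.open = open;
    this.toFrontend = toFrontend;
  }

  // Route a browser message to its service, connecting lazily on first use.
  handleFrontendMessage(message: Envelope): void {
    let backend = this.backends.get(message.service);
    if (backend === undefined) {
      backend = this.open(message.service, (payload) => {
        // Tag backend messages with the same ID before passing them back.
        this.toFrontend({ service: message.service, payload: payload });
      });
      this.backends.set(message.service, backend);
    }
    backend.send(message.payload);
  }

  // Number of backend connections currently held open.
  connectionCount(): number {
    return this.backends.size;
  }
}
```

Once a connection has been opened for a service ID it stays in the map, so later pushes from that service keep flowing back to the browser over the same frontend socket.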

Another extension of this method is the ability to push other messages over a single connection. Once you have a connection to a service over a WebSocket you can do anything over that socket you might have done with individual requests; this was part of the reason for choosing sockets. However, we are also able to tell the thin layer to send HTTP requests with a socket message and to tag the response with an ID before sending back as a socket message.
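Tagging is what lets many HTTP requests share one socket, since responses may come back in any order. A minimal sketch under stated assumptions: the message shapes, the `HttpClient` callback type and `handleSocketHttpRequest` are hypothetical names for this example, and the HTTP client is injected so no particular library is implied.

```typescript
// Socket message asking the thin layer to make an HTTP request (assumed shape).
interface SocketHttpRequest {
  id: string;
  method: string;
  url: string;
}

// Socket message carrying the HTTP response back, tagged with the same ID.
interface SocketHttpResponse {
  id: string;
  status: number;
  body: string;
}

// Injected HTTP client; calls `done` when the backend has responded.
type HttpClient = (
  method: string,
  url: string,
  done: (status: number, body: string) => void,
) => void;

// Perform the request on the browser's behalf and tag the response.
function handleSocketHttpRequest(
  request: SocketHttpRequest,
  http: HttpClient,
  emit: (response: SocketHttpResponse) => void,
): void {
  http(request.method, request.url, (status, body) => {
    // The ID lets the frontend match this response to its original request,
    // even if other responses arrive first.
    emit({ id: request.id, status: status, body: body });
  });
}
```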

This doesn’t save us much in terms of requests on the backend but this part of our infrastructure is relatively cheap to scale due to various tricks we can employ such as SSL termination. Instead it saves the user’s browser from sending and receiving lots of expensive HTTP requests.

Sockii to the rescue!

Sockii is our internal name for the thin layer we created – it can do everything described above and some more besides, all from a simple JSON config.

We will continuously improve Sockii and its currently sparse documentation, and as of today we’ve released it under a BSD license, available for everyone to fork and deploy!

So when you’re using Server Density to see real time updates of your cloud instance build status or the response time of your websites, you know a little about what’s connecting our backend services to your browser client.