Jan 12, 2016

A. Introduction

So many systems use Protocol Buffers to define APIs and DTOs that it’s easy to forget there is also Thrift. In contrast to the PBs, Thrift has always lacked proper documentation. It is enough to say that one of the most popular guides for it was written by an outsider.

On the other hand, Thrift comes with an actual RPC library. The PBs world will get one only once gRPC is released. For those of us who do not base our backend services on something like Finagle, libthrift could be a reasonable choice, at least initially.

B. Documentation and examples

When considering a new RPC framework, one of the first questions to ask is how good its support for asynchronous communication is. Strangely enough, in the case of Thrift the official documentation is pretty silent. As a matter of fact, I was not able to find an example of an asynchronous Thrift-based service or any guidelines for using Thrift-generated classes with “Async” in their name.

What I found was all about synchronous calls. To my utter surprise, even the one libthrift alternative I found had nothing to say about asynchronous calls.

So I had to dig a little deeper to understand how to do it. While exploring it I created a small project that I will use as a complete example. The idea was to prototype a service with two different service interfaces. Each one takes one request object and returns a response object.

C. Thrift-generated code walk-through

If you look at the file generated by the Thrift compiler for your service you’ll see a few classes with the prefix “Async”:

AsyncIface: the asynchronous version of your service interface, with an additional AsyncMethodCallback argument on every method

AsyncClient: client-side view of your RPC service interface; it requires some configuration to actually make calls

AsyncProcessor: server-side intermediary between your actual request handler and the TServer you will use to wrap everything into a running process

D. Wiring up libthrift infrastructure

The generated classes need to be plugged into the libthrift infrastructure. On the server side, you will:

create an instance of your request handler

wrap it into an instance of Thrift-generated AsyncProcessor

in the case of multiple service interfaces, create a multiplexed processor and register each async processor with it under a unique name

create a server configuration with typical parameters such as a TCP port and a java.util.concurrent executor to handle requests. There are a few TServer implementations to choose from

start the server

On the client side, you need to:

create two reusable instances: a client manager and a protocol factory

with those two instantiate (for every service interface) a client factory

make the factory create a client class instance

call the client

Each client takes a TCP host/port pair, so in real life it would take a discovery service of some kind to find those. Note that AWS ELB supports TCP load balancing, so in that case a single endpoint could front any number of servers.

E. The curious case of multiplexed asynchronous processor

When I mentioned a multiplexed processor I lied to you. It turns out there is no such thing currently in libthrift-0.9.3.jar. So if you use a multiplexed protocol factory on the client side there will be no peer on the server to demultiplex it. As an immediate workaround, I implemented one. As a real solution we have got a pending pull request for THRIFT-2427.
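To illustrate what the missing server-side piece has to do, here is a minimal sketch of the demultiplexing idea. It relies on one fact about the client side: TMultiplexedProtocol prefixes each message name with the service name and a ":" separator. Everything else is an assumption made to keep the example self-contained: NamedProcessor and DemultiplexingProcessor are hypothetical stand-ins that work on plain strings, not libthrift classes, and this is neither my workaround nor the THRIFT-2427 patch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a per-service processor; real ones read from a TProtocol. */
interface NamedProcessor {
    String process(String methodName, String payload);
}

/** Routes "serviceName:methodName" messages to the processor registered under serviceName. */
class DemultiplexingProcessor {
    // Same separator the multiplexed protocol uses on the client side
    static final String SEPARATOR = ":";

    private final Map<String, NamedProcessor> processors = new ConcurrentHashMap<>();

    void register(String serviceName, NamedProcessor processor) {
        processors.put(serviceName, processor);
    }

    String process(String messageName, String payload) {
        int index = messageName.indexOf(SEPARATOR);
        if (index < 0) {
            throw new IllegalArgumentException("No service name in message: " + messageName);
        }
        String serviceName = messageName.substring(0, index);
        NamedProcessor processor = processors.get(serviceName);
        if (processor == null) {
            throw new IllegalArgumentException("Unknown service: " + serviceName);
        }
        // The target processor sees the bare method name, as if no multiplexing took place
        return processor.process(messageName.substring(index + 1), payload);
    }

    public static void main(String[] args) {
        DemultiplexingProcessor demux = new DemultiplexingProcessor();
        demux.register("Echo", (method, payload) -> method + "(" + payload + ")");
        System.out.println(demux.process("Echo:say", "hi"));
    }
}
```

The unique names used at registration time must match the service names the client passes to its multiplexed protocol, otherwise the lookup fails.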

At first glance, a glaring hole of this size in a mature library from a big company makes me think that Thrift is an obsolete dead end. That would explain why they never bothered to produce reasonable documentation. I have some hope for gRPC in this respect.

F. Service client and request handler ideas

Aside from wiring up the auto-generated classes with the infrastructure from the libthrift library you will also need to implement an actual request processor and to call the client class somewhere. In my project you can find both pieces in the unit test. As a working end-to-end example it mostly follows this description even though it makes some effort to represent and configure RPC service definitions in a more generic way.

On the server side it is convenient to process a request by

supplying the corresponding job to some executor (different from the one used for networking)

calling the Thrift RPC callback from a CompletableFuture listener
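The two steps above can be sketched roughly as follows. To keep the example runnable without libthrift or generated code, ResultCallback is a stand-in that mirrors the onComplete/onError shape of Thrift's AsyncMethodCallback, and GreetingHandler with its greet method is a hypothetical handler, not something taken from my project.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Stand-in mirroring the shape of Thrift's AsyncMethodCallback (assumption for this sketch). */
interface ResultCallback<T> {
    void onComplete(T response);
    void onError(Exception exception);
}

/** Hypothetical handler shaped like an implementation of a generated AsyncIface. */
class GreetingHandler {
    private final ExecutorService workers;

    GreetingHandler(ExecutorService workers) {
        this.workers = workers;
    }

    /** Returns immediately; the job runs on the worker pool, not on the networking thread. */
    void greet(String name, ResultCallback<String> resultHandler) {
        CompletableFuture
            .supplyAsync(() -> buildGreeting(name), workers)
            .whenComplete((response, failure) -> {
                if (failure == null) {
                    resultHandler.onComplete(response);
                    return;
                }
                // Failures of the supplier arrive wrapped in CompletionException
                Throwable cause = failure instanceof CompletionException && failure.getCause() != null
                        ? failure.getCause()
                        : failure;
                resultHandler.onError(
                        cause instanceof Exception ? (Exception) cause : new RuntimeException(cause));
            });
    }

    private String buildGreeting(String name) {
        if (name.isEmpty()) {
            throw new IllegalArgumentException("empty name");
        }
        return "Hello, " + name;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService workers = Executors.newFixedThreadPool(2);
        CompletableFuture<String> seen = new CompletableFuture<>();
        new GreetingHandler(workers).greet("Thrift", new ResultCallback<String>() {
            public void onComplete(String response) { seen.complete(response); }
            public void onError(Exception exception) { seen.completeExceptionally(exception); }
        });
        System.out.println(seen.get());
        workers.shutdown();
    }
}
```

With real generated code the handler would implement the service's AsyncIface and receive the Thrift callback type instead of the stand-in; the hand-off to a separate executor stays the same.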

In a similar way, on the client side a CompletableFuture lets you receive a response either synchronously with get or asynchronously in a CompletableFuture listener. Notice how the actual response instance is wrapped in a Thrift-generated class representing an RPC method call.
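A minimal sketch of such a client-side adapter is shown here. It assumes the callback receives a method-call object and that the response has to be pulled out of it with getResult(), which may throw. ClientCallback, MethodCall and FutureCallback are hypothetical stand-ins so the example compiles without libthrift; with real generated code they would correspond to AsyncMethodCallback and the generated call class of the method.

```java
import java.util.concurrent.CompletableFuture;

/** Stand-in mirroring the shape of Thrift's AsyncMethodCallback (assumption for this sketch). */
interface ClientCallback<T> {
    void onComplete(T response);
    void onError(Exception exception);
}

/** Stand-in for a generated method-call class that wraps the actual response. */
interface MethodCall<R> {
    R getResult() throws Exception;
}

/** Completes a CompletableFuture from the RPC callback. */
class FutureCallback<R> implements ClientCallback<MethodCall<R>> {
    final CompletableFuture<R> future = new CompletableFuture<>();

    @Override
    public void onComplete(MethodCall<R> call) {
        try {
            // Unwrap the response; a server-side failure surfaces here as an exception
            future.complete(call.getResult());
        } catch (Exception e) {
            future.completeExceptionally(e);
        }
    }

    @Override
    public void onError(Exception exception) {
        future.completeExceptionally(exception);
    }

    public static void main(String[] args) throws Exception {
        FutureCallback<String> callback = new FutureCallback<>();
        // Asynchronous style: register a listener before the response arrives
        callback.future.thenAccept(response -> System.out.println("listener got " + response));
        // Simulates the client invoking the callback when the response is ready
        callback.onComplete(() -> "pong");
        // Synchronous style: block until the response is available
        System.out.println("get returned " + callback.future.get());
    }
}
```

The caller passes the adapter as the callback argument of the client method and then works only with the future, choosing between get and a listener per call site.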