Protocol Buffers with Spring Integration

Today most HTTP APIs use JSON as their data format, but it is not the only format available to a modern application. Depending on the use case, a different format can be a better fit. For example, if an application needs to send binary data (e.g. sound, pictures, ...), a binary format could be more suitable than text-based JSON. The same is true when network bandwidth is constrained (e.g. mobile data with monthly data caps) and messages have to be as small as possible.

In this post we take a look at Protocol Buffers, a binary data format developed by Google.

A JSON message like {tx:18912,rx:15000,temp:-9.3,humidity:89} has a size of 41 bytes (counted with unquoted keys; strictly valid JSON with quoted keys would be 49 bytes). The same data encoded with Protocol Buffers results in a message of only 14 bytes. To be fair, the difference would not be that big if the message contained strings. JSON can also be made much smaller by using shorter keys ({t:18912,r:15000,e:-9.3,h:89}), and it compresses very efficiently when the message gets bigger and contains arrays of similar objects. But in general, Protocol Buffers messages are smaller than JSON messages.
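To see where the 14 bytes come from, here is a minimal sketch that encodes the four values by hand following the Protocol Buffers wire format. It assumes field numbers 1 to 4, with tx, rx and humidity as varint integers and temp as a 32-bit float. These assignments and the class name WireSizeSketch are assumptions made for illustration. A real application would use the classes generated by protoc instead.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hand-rolled illustration of the protobuf wire format (hypothetical helper,
// not part of the protobuf library).
class WireSizeSketch {

    // Writes a non-negative value as a base-128 varint: 7 payload bits per
    // byte, with the high bit set on every byte except the last one.
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    // A field key is (field number << 3) | wire type, written as a varint.
    static void writeKey(ByteArrayOutputStream out, int fieldNumber, int wireType) {
        writeVarint(out, ((long) fieldNumber << 3) | wireType);
    }

    // Assumed layout: 1 = tx (varint), 2 = rx (varint),
    // 3 = temp (32-bit float), 4 = humidity (varint).
    static byte[] encode(int tx, int rx, float temp, int humidity) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeKey(out, 1, 0);
        writeVarint(out, tx);        // 18912 needs 3 varint bytes
        writeKey(out, 2, 0);
        writeVarint(out, rx);        // 15000 needs 2 varint bytes
        writeKey(out, 3, 5);         // wire type 5 = fixed 32 bits, little endian
        byte[] floatBytes = ByteBuffer.allocate(4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putFloat(temp)
                .array();
        out.write(floatBytes, 0, floatBytes.length);
        writeKey(out, 4, 0);
        writeVarint(out, humidity);  // 89 fits into 1 varint byte
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] message = encode(18912, 15000, -9.3f, 89);
        System.out.println(message.length + " bytes"); // prints "14 bytes"
    }
}
```

Each field costs one key byte plus its value: 1 + 3 for tx, 1 + 2 for rx, 1 + 4 for temp and 1 + 1 for humidity, which adds up to 14 bytes.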

For the following example we create a sender and a receiver program in Java, but Protocol Buffers is not limited to Java. The officially supported languages are C++, C#, Dart, Go, Java, JavaScript, Objective-C, Ruby and Python. Third-party libraries add support for even more programming languages.

Before applications can send and receive Protocol Buffers messages, the messages need to be defined in a text file and then run through a compiler. The compiler creates source code in the target language that serializes and deserializes objects to and from the binary wire format. Usually the definition file has the suffix .proto. For this example we name our file SensorMessage.proto.

The definition is written in version 3 of the Protocol Buffers syntax, which is declared with the line syntax = "proto3"; at the top of the file. If you omit the syntax keyword, the compiler uses version 2, which is a bit different. The message keyword encapsulates a message. One .proto file may contain more than one message definition. Implicitly every field is optional (there are no mandatory fields in proto3). Every field has a data type assigned to it. Protocol Buffers supports different data types like integers, floats, booleans, strings, binary data (bytes) and enums. You can find the list of all supported types on the official documentation page.

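After the name of the field follows the tag (e.g. = 1). The tag must be a number that is unique within the message, and it should never be changed once the message is in use. The tag is what Protocol Buffers sends over the wire to identify a field, so the field name itself is never transmitted. Tags from 1 to 15 take one byte in the binary message, tags from 16 to 2047 take two bytes, and so on. It therefore pays to reserve the tags 1 to 15 for the most frequently used fields.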
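The byte counts for the tags follow from how the wire format stores them: the tag is shifted left by three bits to make room for the wire type, and the result is written as a varint with 7 payload bits per byte. The following sketch computes the size from that rule. The class and method names are hypothetical.

```java
// Computes how many bytes a field key occupies on the wire
// (hypothetical helper, not part of the protobuf library).
class TagSizeSketch {

    static int keySize(int fieldNumber) {
        // The low 3 bits hold the wire type. They never change the
        // number of bytes, so they are left at zero here.
        int key = fieldNumber << 3;
        int size = 1;
        // Every additional 7 bits of the key require one more byte.
        while ((key >>>= 7) != 0) {
            size++;
        }
        return size;
    }

    public static void main(String[] args) {
        for (int tag : new int[] {1, 15, 16, 2047, 2048}) {
            System.out.println("tag " + tag + " -> " + keySize(tag) + " byte(s)");
        }
    }
}
```

With three bits taken by the wire type, one byte leaves 4 bits for the tag (up to 15) and two bytes leave 11 bits (up to 2047).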

After defining the message we have to compile it. For this purpose we need the Protocol Buffers compiler (protoc), which can be downloaded from the release page. The download links that start with 'protoc' contain the compiler.

But because we write our example application in Java and use Maven as the build system we can take advantage of a Maven plugin that compiles our proto file.

After adding the plugin to the pom.xml file, copy the .proto file to the <project_home>/src/main/protobuf folder and start the compiler with mvn generate-sources. The plugin automatically downloads and runs the protobuf compiler. Protoc creates one Java file for each .proto file and puts it into the <project_home>/target/generated-sources/ folder. The generated Java source code depends on the protobuf Java library, so we also have to add this dependency to our pom.xml.

Now we have everything set up and can start coding our application. First we create the sender application. For this example we want to reduce the bandwidth even more and send the messages over UDP. A UDP packet over IPv4 carries only 28 bytes of header overhead (20 bytes for the IP header and 8 bytes for the UDP header), but UDP is unreliable and packets can get lost. This is not really a problem for our demo application because we send and receive the messages through the loopback device (127.0.0.1).

The sender first creates an instance of the message. SensorMessage is the class that protoc generated. The method toByteArray() serializes the object to a byte array that contains the Protocol Buffers message in binary form. Then the program sends the data with Java's built-in UDP support to port 9992 on localhost.
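The sending step can be sketched with the java.net classes alone. This is a minimal sketch and not necessarily how the sender in this post is written: the class name UdpSenderSketch is hypothetical, and the payload is an arbitrary placeholder standing in for the byte array returned by toByteArray().

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;

// Sends one UDP datagram to the loopback device (illustrative sketch).
class UdpSenderSketch {

    static void send(byte[] payload, int port) throws IOException {
        // An unbound DatagramSocket picks a free local port for sending.
        try (DatagramSocket socket = new DatagramSocket()) {
            InetAddress loopback = InetAddress.getLoopbackAddress();
            socket.send(new DatagramPacket(payload, payload.length, loopback, port));
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder bytes; the real sender would pass the result of
        // toByteArray() on the generated message object here.
        byte[] payload = {1, 2, 3};
        send(payload, 9992);
    }
}
```

Because UDP is connectionless, send() succeeds even when no receiver is listening on the port. The packet is simply dropped.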

Next we create the receiver application. It sets up a simple Spring Integration flow that starts with a UDP inbound adapter listening on port 9992 for incoming packets. Each incoming packet contains binary data and flows into a transformer that deserializes the data into a SensorMessage instance with the static method parseFrom(). After the transformation, the message flows to a handler that prints the object to System.out.