Thursday, 8 August 2013

Hazelcast is an in-memory data grid that provides a scalable data store with a rich set of features. Over the past year I've been using it in other applications and I've been impressed with its capabilities. A new feature in the recently released Hazelcast 3.0 is the ability to plug in your own serialization code. Because Argot can now be used both in the in-memory data grid and as the transport for Internet of Things applications, the same code developed for Argot can be reused in Big-Data applications. The combination of Hazelcast and Argot means less code, less development time and more scale.

Recently the Hazelcast team made available a small benchmark that provides examples of how various serialization libraries can be embedded into Hazelcast and shows a rudimentary performance comparison of the different methods. The code is made available here. I've forked the code and included Argot as an additional example.

The rest of this post covers the details of how I integrated with Hazelcast. It also includes a discussion of how Argot's performance compares to the other serialization techniques used in the benchmark.

Argot Hazelcast Integration

In this example the Argot definition for the 'sample' object is defined as follows. The default Argot common dictionary doesn't include the generic long_array or double_array types, so I've also created them here.
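As context for those definitions, the benchmark's sample object carries a mix of simple fields plus the long and double arrays that the long_array and double_array types describe. The sketch below is a hypothetical stand-in; the field names are assumptions of mine, and the real SampleObject lives in the forked benchmark code:

```java
// Hypothetical stand-in for the benchmark's SampleObject; the real class
// (and its exact fields) is in the forked Hazelcast benchmark repository.
public class SampleObject implements java.io.Serializable {
    public int intValue;         // simple fields dominate the first test
    public String stringValue;
    public long[] longArray;     // holds 3000 values in the second test
    public double[] doubleArray; // described by the double_array type
}
```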

These elements would be required for any Argot-based implementation and are not a special requirement of the Hazelcast integration. I've included them here for completeness of the example.

The bridge between Hazelcast and Argot is defined by a StreamSerializer implementation. The ArgotSerializer provides a generic implementation that can be used by any object; its read and write methods marshal the object from and to the stream.
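The shape of that bridge class can be sketched as follows. The StreamSerializer contract (write, read, getTypeId, destroy) is Hazelcast 3.x's real API; the Argot plumbing inside the methods is elided here because it depends on the Argot library set-up above, so treat this as a skeleton rather than the actual ArgotSerializer from the fork:

```java
import java.io.IOException;

import com.hazelcast.nio.ObjectDataInput;
import com.hazelcast.nio.ObjectDataOutput;
import com.hazelcast.nio.serialization.StreamSerializer;

// Skeleton of a generic Argot-backed serializer for Hazelcast 3.x.
// The Argot marshalling calls themselves are elided (assumption: the
// real ArgotSerializer wires in the Argot type library and bound class).
public class ArgotSerializer<T> implements StreamSerializer<T> {

    private final int typeId;

    public ArgotSerializer(int typeId) {
        this.typeId = typeId;
    }

    @Override
    public int getTypeId() {
        return typeId; // must be unique; Hazelcast reserves its own ids
    }

    @Override
    public void write(ObjectDataOutput out, T object) throws IOException {
        // Hand the raw stream to Argot's writer for the bound type.
        // (Argot call elided.)
    }

    @Override
    public T read(ObjectDataInput in) throws IOException {
        // Argot reads the bound type back off the stream.
        // (Argot call elided.)
        return null;
    }

    @Override
    public void destroy() {
        // No resources to release in this sketch.
    }
}
```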

The Hazelcast API makes registering and using custom serialization reasonably simple to configure: a class is bound to a serializer. In this example I'm binding the SampleObject class to the ArgotSerializer.
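The registration itself uses Hazelcast's SerializerConfig; the snippet below is a minimal sketch of that binding, assuming an ArgotSerializer constructor that takes a type id (the real constructor arguments are in the forked benchmark):

```java
import com.hazelcast.config.Config;
import com.hazelcast.config.SerializerConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class ArgotConfigExample {
    public static void main(String[] args) {
        // Bind SampleObject to the ArgotSerializer before starting the node.
        Config config = new Config();
        config.getSerializationConfig().addSerializerConfig(
                new SerializerConfig()
                        .setTypeClass(SampleObject.class)
                        .setImplementation(new ArgotSerializer<SampleObject>(1)));
        HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
    }
}
```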

That's it: the SampleObject can now be serialized for the Internet of Things and can also be stored in a Big-Data environment.

Argot Performance

Having not looked at Argot performance for some time, I was very interested to see how Argot would perform in the benchmark. While performance is not necessarily the number one criterion when choosing a serialization protocol, it is reasonably important.

The first test I conducted didn't include either the array of long values or the array of double values. The results were:

Serializer                    Size         Time
Argot Serialization           4172 bytes   348 ms
Java Serialization            4336 bytes   411 ms
DataSerializable              4237 bytes   263 ms
IdentifiedDataSerializable    4186 bytes   170 ms
Portable                      4205 bytes   228 ms
Kryo                          4336 bytes   277 ms
Kryo-unsafe                   4336 bytes   256 ms

While Argot didn't come first in the first performance test, it wasn't last either, which isn't bad, especially considering Argot hasn't had a lot of performance tuning. Argot did win at using the least amount of data. It achieves this by using metadata to describe the structure of the binary format, which results in less data being stored at runtime.

Being reasonably happy with Argot's performance, the next step was to include the array of long values in the sample object. The benchmark stores 3000 long values in the array. The initial results were very bad for Argot and made me realise that the implementation of int64 in Argot was based on some old code that wrote individual bytes to the stream. After fixing this up, the following results were achieved:

Serializer                    Size          Time
Argot Serialization           28174 bytes   5587 ms
Java Serialization            28374 bytes   862 ms
DataSerializable              28241 bytes   918 ms
IdentifiedDataSerializable    28190 bytes   851 ms
Portable                      28213 bytes   904 ms
Kryo                          28374 bytes   786 ms
Kryo-unsafe                   28374 bytes   727 ms

Ouch! Argot's performance writing an array of 3000 int64s was over six times worse than the next-closest performer. This test showed me two things: first, this benchmark is heavily biased towards arrays; second, Argot's default array marshaller is terrible at dealing with large arrays of simple types.

After some further investigation I discovered that array performance is all gained or lost in the number of calls made to read from or write to the stream; if you can minimise stream writes you can improve performance greatly. Not one to give up, I used Argot's support for custom marshalling to implement a fast int64 array marshaller.
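The core buffering trick behind the fast writer can be sketched in plain Java. The class and method names below are mine, not Argot's marshaller interface: the whole array is packed into a single byte buffer so the stream sees one write instead of thousands of per-byte writes.

```java
import java.io.IOException;
import java.io.OutputStream;

// Sketch of the buffering technique behind a fast long-array writer
// (names are illustrative, not Argot's actual API).
public class LongArrayPacker {

    // Encode the array big-endian into a single buffer.
    public static byte[] pack(long[] values) {
        byte[] buffer = new byte[values.length * 8];
        int offset = 0;
        for (long v : values) {
            for (int shift = 56; shift >= 0; shift -= 8) {
                buffer[offset++] = (byte) (v >>> shift);
            }
        }
        return buffer;
    }

    // The writer then makes exactly one call on the stream.
    public static void write(OutputStream out, long[] values) throws IOException {
        out.write(pack(values));
    }

    // The matching reader slurps the buffer and decodes it in memory.
    public static long[] unpack(byte[] data) {
        long[] values = new long[data.length / 8];
        for (int i = 0; i < values.length; i++) {
            long v = 0;
            for (int b = 0; b < 8; b++) {
                v = (v << 8) | (data[i * 8 + b] & 0xFFL);
            }
            values[i] = v;
        }
        return values;
    }
}
```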

This implementation minimises writes to the stream by allocating the full byte buffer and preparing the data before writing it out in a single call. A similar reader was written, which reads the full array in before parsing it. Connecting the new marshaller to the long array was a simple matter of changing the binding:

library.bind( library.getTypeId("long_array", "1.0"),
        new LongArrayReader(), new LongArrayWriter(), long[].class );

After running the benchmark again, I got the following results:

Serializer                    Size          Time
Argot Serialization           28174 bytes   768 ms
Java Serialization            28374 bytes   832 ms
DataSerializable              28241 bytes   899 ms
IdentifiedDataSerializable    28190 bytes   816 ms
Portable                      28213 bytes   887 ms
Kryo                          28374 bytes   713 ms
Kryo-unsafe                   28374 bytes   703 ms

With the optimized array marshaller Argot comes third in the benchmark! Not a bad result.

In conclusion, it's been very interesting to see how Argot compares to other serialization techniques in both implementation and performance. For objects with many simple fields Argot performs reasonably, and additional buffering would likely bring it in line with some of the faster serialization implementations; I will put that on the task list for Argot 1.4. In addition, the serialization of large arrays is best done by specialised implementations. Where performance is important, it's good to know that Argot is flexible enough to allow this to be configured.