A default application of serialization to these classes will generate
525 bytes for BigMoney and 599 bytes for Money.
This is a lot of data to be sending for objects that seem quite simple.

Where does the size go?

Well, each serialized class has to write a header to state what the class is.
For something like Money, it has to write a header for itself,
BigMoney, CurrencyUnit, BigDecimal and
BigInteger.
The header also includes the serialization version number and the names of each field.

Of course, serialization is designed to handle complex cases where the versions of the
class file differ on two JVMs. Data is populated into the right fields using the field name.
But for simple classes like Money, the data isn't going to change over time.

One interesting fact is that the class header is only sent once per stream for a class.
As a result, the size is reduced for each subsequent object of that class after the first.
For default serialization of a subsequent BigMoney the size is 59 bytes
and for Money it is 65 bytes. Clearly, the header is a major overhead.
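
To see the header effect for yourself, here is a minimal sketch that measures how many bytes each object adds to a single stream. SimpleAmount and SerializedSizeDemo are hypothetical stand-ins invented for this example, not the real BigMoney or Money, so the byte counts will differ from the figures quoted above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.math.BigDecimal;

// Hypothetical stand-in for a simple value class; not the real BigMoney.
final class SimpleAmount implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String currencyCode;
    private final BigDecimal amount;

    SimpleAmount(String currencyCode, BigDecimal amount) {
        this.currencyCode = currencyCode;
        this.amount = amount;
    }
}

final class SerializedSizeDemo {
    // result[i] is the total stream length once objects 0..i have been written.
    static int[] cumulativeSizes(Object... objects) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            int[] sizes = new int[objects.length];
            for (int i = 0; i < objects.length; i++) {
                out.writeObject(objects[i]);
                out.flush(); // push buffered data through before measuring
                sizes[i] = bytes.size();
            }
            out.close();
            return sizes;
        } catch (IOException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        int[] sizes = cumulativeSizes(
                new SimpleAmount("GBP", new BigDecimal("12.34")),
                new SimpleAmount("GBP", new BigDecimal("56.78")));
        System.out.println("first object:  " + sizes[0] + " bytes");
        System.out.println("second object: " + (sizes[1] - sizes[0]) + " bytes");
    }
}
```

The first figure includes the class headers for SimpleAmount, BigDecimal and BigInteger (plus the small stream header); the second object only pays for its data and references to headers already sent.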

Making the data smaller

The key to this is using a serialization delegate class.
The delegate is a class that is written into the output stream in place of the original class.
This approach is required because the fields are final, which prevents a sensible data
format from being written/read by the class itself.

The delegate class uses the low level writeObject and readObject to control
the data in the stream. The readResolve method then returns the correct object back
for the serialization mechanism to put in the object structure.
The class is static to ensure a stable serialized form.
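
The delegate pattern described above can be sketched roughly as follows. Amount, its nested Ser class and DelegateDemo are hypothetical names for illustration, and the byte layout (currency code, then unscaled value and scale) is an assumption rather than the format used by the real classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical value class standing in for BigMoney; all fields are final.
final class Amount implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String currencyCode;
    private final BigDecimal amount;

    Amount(String currencyCode, BigDecimal amount) {
        this.currencyCode = currencyCode;
        this.amount = amount;
    }

    String getCurrencyCode() { return currencyCode; }

    BigDecimal getAmount() { return amount; }

    // Serialization writes the delegate into the stream in place of this object.
    private Object writeReplace() {
        return new Ser(this);
    }

    // Static nested delegate that controls the stream format.
    private static final class Ser implements Serializable {
        private static final long serialVersionUID = 1L;
        private transient Amount value; // rebuilt in readObject, so not final

        Ser(Amount value) {
            this.value = value;
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.writeUTF(value.currencyCode);
            // Write the BigDecimal as raw unscaled bytes plus a scale.
            byte[] unscaled = value.amount.unscaledValue().toByteArray();
            out.writeInt(unscaled.length);
            out.write(unscaled);
            out.writeInt(value.amount.scale());
        }

        private void readObject(ObjectInputStream in) throws IOException {
            String code = in.readUTF();
            byte[] unscaled = new byte[in.readInt()];
            in.readFully(unscaled);
            int scale = in.readInt();
            value = new Amount(code, new BigDecimal(new BigInteger(unscaled), scale));
        }

        // Hand the real object back to the serialization mechanism.
        private Object readResolve() {
            return value;
        }
    }
}

final class DelegateDemo {
    // Serializes to memory and reads the object straight back.
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(obj);
            out.close();
            return new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray())).readObject();
        } catch (IOException | ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```

Calling DelegateDemo.roundTrip(new Amount("GBP", new BigDecimal("12.34"))) should return an Amount, never a Ser, because readResolve swaps the delegate back out.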

Simply taking control of the stream in this way will greatly reduce the overall size.
The biggest gain is in writing out the BigDecimal in an efficient manner.

Even better?

My investigation has shown a technique to make the stream even smaller.

Firstly, rather than using a static nested class, use a top-level package-scoped class.
This will have a shorter fully qualified class name, thus a shorter header.

Secondly, look at the other classes in the package.
If there are more classes that need the same treatment, why not use a single delegate class for all of them?

So, both classes are sharing the same serialization delegate, using a single byte type to distinguish them.
Since the header is written once per class per stream, there is now only one header written
whether your stream contains BigMoney, Money or both.

I've also switched to using Externalizable rather than Serializable.
Although Externalizable requires public methods, they cannot be called via the general API because the delegate is a package-scoped class.
This change doesn't affect the stream size, but should perform faster (untested!) as there is less reflection involved.
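
A minimal sketch of a shared Externalizable delegate might look like this. BigAmount, RoundedAmount, SharedSer and SharedSerDemo are hypothetical stand-ins (for BigMoney, Money and the delegate), and the type byte values and stream layout are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.StreamCorruptedException;
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical stand-in for BigMoney.
final class BigAmount implements Serializable {
    private static final long serialVersionUID = 1L;
    final String currencyCode;
    final BigDecimal amount;

    BigAmount(String currencyCode, BigDecimal amount) {
        this.currencyCode = currencyCode;
        this.amount = amount;
    }

    private Object writeReplace() { return new SharedSer(SharedSer.BIG_AMOUNT, this); }
}

// Hypothetical stand-in for Money, which wraps a BigMoney.
final class RoundedAmount implements Serializable {
    private static final long serialVersionUID = 1L;
    final BigAmount value;

    RoundedAmount(BigAmount value) { this.value = value; }

    private Object writeReplace() { return new SharedSer(SharedSer.ROUNDED_AMOUNT, this); }
}

// Top-level, package-scoped delegate shared by both classes.
final class SharedSer implements Externalizable {
    static final byte BIG_AMOUNT = 'B';
    static final byte ROUNDED_AMOUNT = 'R';
    private byte type;
    private Object object;

    public SharedSer() { } // Externalizable needs a public no-argument constructor

    SharedSer(byte type, Object object) {
        this.type = type;
        this.object = object;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeByte(type); // a single byte distinguishes the two classes
        BigAmount big = (type == BIG_AMOUNT)
                ? (BigAmount) object : ((RoundedAmount) object).value;
        out.writeUTF(big.currencyCode);
        byte[] unscaled = big.amount.unscaledValue().toByteArray();
        out.writeInt(unscaled.length);
        out.write(unscaled);
        out.writeInt(big.amount.scale());
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        type = in.readByte();
        if (type != BIG_AMOUNT && type != ROUNDED_AMOUNT) {
            throw new StreamCorruptedException("Unknown type: " + type);
        }
        String code = in.readUTF();
        byte[] unscaled = new byte[in.readInt()];
        in.readFully(unscaled);
        BigAmount big = new BigAmount(code, new BigDecimal(new BigInteger(unscaled), in.readInt()));
        object = (type == BIG_AMOUNT) ? big : new RoundedAmount(big);
    }

    private Object readResolve() { return object; }
}

final class SharedSerDemo {
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(obj);
            out.close();
            return new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray())).readObject();
        } catch (IOException | ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```

In this sketch only SharedSer ever appears in the stream, so a stream containing both kinds of object carries a single class header.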

With these changes, the stream size for sending one BigMoney or Money drops
to 58 bytes from 525/599 bytes.
Sending a subsequent object of the same type drops to 24 bytes, whereas the default would be 59/65 bytes.

The single shared delegate approach also results in a smaller jar file, as there is a large jar file
size overhead for each separate class. (We've replaced two delegates by one, so the jar is smaller).

One downside with this approach is that serialization is no longer encapsulated within the class
being serialized. This may result in a constructor becoming package scoped rather than private.

The approach is also only recommended where the class and serialized format are stable, as you
are fully responsible for evolution over time of the data format.

A final downside is that the object identity of objects might not be preserved.
For example, if the data of the BigDecimal is written out rather than a reference to the
object then a new BigDecimal object will be created for each BigMoney deserialized.
The extent to which this is a problem is dependent on the memory structure being serialized.

The same problem applies to multiple Money objects backed by the same BigMoney.
The default serialized size for the second would be just 10 bytes, whereas the basic shared delegate approach
would be 24 bytes.

As a result, I recommend only writing the base class, BigMoney in this case, directly
using its contents. Other classes that contain the base class, Money in this case,
should write out a reference to the BigMoney from the shared delegate.
This approach means that the second Money takes 14 bytes when the BigMoney is shared
and 34 bytes when it isn't.
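
The reference-writing variant can be sketched as below, under some loudly stated assumptions: BaseAmount, WrappedAmount, RefSer and RefSerDemo are hypothetical stand-ins (for BigMoney, Money and the shared delegate), and the amount is simplified to a long of minor units purely to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.StreamCorruptedException;

// Hypothetical stand-in for BigMoney (the base class); amount simplified to a long.
final class BaseAmount implements Serializable {
    private static final long serialVersionUID = 1L;
    final String currencyCode;
    final long minorUnits;

    BaseAmount(String currencyCode, long minorUnits) {
        this.currencyCode = currencyCode;
        this.minorUnits = minorUnits;
    }

    private Object writeReplace() { return new RefSer(RefSer.BASE, this); }
}

// Hypothetical stand-in for Money, which contains the base class.
final class WrappedAmount implements Serializable {
    private static final long serialVersionUID = 1L;
    final BaseAmount base;

    WrappedAmount(BaseAmount base) { this.base = base; }

    private Object writeReplace() { return new RefSer(RefSer.WRAPPED, this); }
}

// Shared delegate: contents for the base class, a reference for the wrapper.
final class RefSer implements Externalizable {
    static final byte BASE = 'B';
    static final byte WRAPPED = 'W';
    private byte type;
    private Object object;

    public RefSer() { } // Externalizable needs a public no-argument constructor

    RefSer(byte type, Object object) {
        this.type = type;
        this.object = object;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeByte(type);
        if (type == BASE) {
            BaseAmount base = (BaseAmount) object; // base class: write contents directly
            out.writeUTF(base.currencyCode);
            out.writeLong(base.minorUnits);
        } else {
            // Wrapper: write a reference, so a shared base is only sent once.
            out.writeObject(((WrappedAmount) object).base);
        }
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        type = in.readByte();
        if (type == BASE) {
            object = new BaseAmount(in.readUTF(), in.readLong());
        } else if (type == WRAPPED) {
            object = new WrappedAmount((BaseAmount) in.readObject());
        } else {
            throw new StreamCorruptedException("Unknown type: " + type);
        }
    }

    private Object readResolve() { return object; }
}

final class RefSerDemo {
    // Writes all objects to one stream, then reads them back in order.
    static Object[] roundTrip(Object... objects) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            for (Object obj : objects) {
                out.writeObject(obj);
            }
            out.close();
            ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            Object[] copies = new Object[objects.length];
            for (int i = 0; i < copies.length; i++) {
                copies[i] = in.readObject();
            }
            return copies;
        } catch (IOException | ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```

If two WrappedAmount objects built on the same BaseAmount are passed to RefSerDemo.roundTrip, the copies should again share one BaseAmount instance, because the second wrapper is written as a back-reference rather than as fresh contents.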

Using this final approach, the figures (stream size in bytes) are as follows:

                            Default serialization      Shared delegate
Object                      First sent   Subsequent    First sent   Subsequent
BigMoney                    525          59            58           24
Money                       599          65            68           34
Money with shared BigMoney  599          10            68           14

Summary

The shared delegate technique offers one route to the smallest stream size for serialization.
The data size for the first object was a tenth of the original, and halved for subsequent objects.
However, I would recommend this as a specialist technique for low level value objects rather than general beans.

Monday, 8 February 2010

This is a quick blog to outline my upcoming job change and how it
affects JSR-310.

For many years I've worked for SITA, global leader in air transport
communications and IT solutions.
But the time has come to move on, so from the 1st of March I'm
starting a new job at a London startup, OpenGamma.

So, what can I tell you about OpenGamma? Well, not too much just yet,
as it's only just coming out of stealth mode.
I can say they're led by Kirk Wylie, they're building
technology for the financial industry, and I'm excited about their big
idea!
Oh, and they're hiring
(London only).

And how does this affect JSR-310?

Well, OpenGamma will be actively supporting my work on JSR-310 in work time!
Clearly this will have a big impact on development pace, and we may
yet make JDK 7 (but of course that's up to the SunOracle).

In the meantime, watch out for the Early Draft Review of JSR-310 where
I'll need maximum feedback!