Comparing Externalizable to Serializable

Of course, this efficiency comes at a price.
Serializable can frequently be implemented by doing two
things: declaring that a class implements the
Serializable interface and adding a zero-argument
constructor to the class. Furthermore, as an application evolves, the
serialization mechanism automatically adapts. Because the metadata is
automatically extracted from the class definitions, application programmers
often don't have to do anything except recompile the program.
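
As a minimal sketch of those two steps, the class below is made serializable and then copied through an in-memory stream. SimplePrice and SerializableSketch are hypothetical names invented for this illustration; they are not classes from the text.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class for illustration; it is not the Money class from the text.
class SimplePrice implements Serializable {
    private int cents;

    // Zero-argument constructor, as suggested in the text.
    SimplePrice() { }

    SimplePrice(int cents) { this.cents = cents; }

    int getCents() { return cents; }
}

class SerializableSketch {
    // Writes the object to an in-memory buffer and reads a copy back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        SimplePrice copy = (SimplePrice) roundTrip(new SimplePrice(1999));
        System.out.println(copy.getCents()); // prints 1999
    }
}
```

Note that no marshalling code was written by hand; the field is discovered from the class definition.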

On the other hand, Externalizable isn't particularly easy to do, isn't
very flexible, and requires you to rewrite your marshalling and
demarshalling code whenever you change your class
definitions. However, because it eliminates almost all the reflective calls
used by the serialization mechanism and gives you complete control over the
marshalling and demarshalling algorithms, it can result in dramatic
performance improvements.
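
To make the trade-off concrete, here is a hedged sketch of what hand-written marshalling looks like. ExternalPrice and its two fields are assumptions made for illustration, not the field layout of any class in the text.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical class for illustration; the fields are assumptions.
class ExternalPrice implements Externalizable {
    private int cents;
    private String currency;

    // Externalizable classes need a public zero-argument constructor;
    // deserialization calls it before readExternal().
    public ExternalPrice() { }

    ExternalPrice(int cents, String currency) {
        this.cents = cents;
        this.currency = currency;
    }

    int getCents() { return cents; }

    String getCurrency() { return currency; }

    // Hand-written marshalling: fields are written in a fixed order.
    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(cents);
        out.writeUTF(currency);
    }

    // Hand-written demarshalling: fields must be read in the same order.
    @Override
    public void readExternal(ObjectInput in) throws IOException {
        cents = in.readInt();
        currency = in.readUTF();
    }
}

class ExternalizableSketch {
    // Writes the object to an in-memory buffer and reads a copy back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }
}
```

Adding a third field to ExternalPrice would mean editing both writeExternal( ) and readExternal( ), which is the maintenance cost described above.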

To demonstrate this, I have defined the EfficientMoney class. It has the
same fields and functionality as Money but implements Externalizable
instead of Serializable.

On my home machine, averaging over 10 trial runs for both Money and
EfficientMoney, I get the results shown in Table 10-1. (We need to
average because the elapsed time can vary, depending on what else the
computer is doing. The size of the file is, of course, constant.)
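
A measurement of this kind could be sketched roughly as follows. This is not the harness used to produce the table; TimedSample and SerializationTimer are hypothetical names, and the sketch writes to an in-memory buffer rather than a file to keep it self-contained.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical payload class; stands in for whatever object is being measured.
class TimedSample implements Serializable {
    private final int value;

    TimedSample(int value) { this.value = value; }
}

class SerializationTimer {
    // Serializes the object into memory and returns the number of bytes produced.
    static int serializedSize(Object target) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(target);
            }
            return buffer.size();
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    // Average elapsed time per serialization, in milliseconds, over the given trials.
    static double averageMillis(Object target, int trials) {
        long totalNanos = 0;
        for (int i = 0; i < trials; i++) {
            long start = System.nanoTime();
            serializedSize(target);
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / (trials * 1_000_000.0);
    }

    public static void main(String[] args) {
        TimedSample sample = new TimedSample(42);
        System.out.println("bytes: " + serializedSize(sample));
        System.out.println("avg ms over 10 trials: " + averageMillis(sample, 10));
    }
}
```

The byte count is the same on every run, while the averaged time will differ from run to run and from machine to machine.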

These results are fairly impressive. By simply converting a leaf
class in our hierarchy to use externalization, I save 67 bytes and 10
milliseconds when serializing a single instance. In addition, as I pass larger
data sets over the wire, I save more and more bandwidth--on average, 18 bytes
per instance.

TIP: Which numbers should we pay attention
to? The single-instance costs or the 10,000-instance costs? For most
applications, the single-instance cost is the most important one. A typical
remote method call involves sending three or four arguments (usually of
different types) and getting back a single return value. Since RMI clears
the serialization mechanism's state between calls, a typical remote method
call looks a lot more like serializing three or four single instances than
serializing 10,000 instances of the same class.
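
The difference between the two cases can be observed with plain object streams. In the sketch below, Parcel and StreamStateSketch are hypothetical names, and calling ObjectOutputStream.reset( ) is only my way of imitating a stream that starts from a clean state for each object; it is not a description of how RMI works internally.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical payload class used only for the size comparison.
class Parcel implements Serializable {
    private final int id;

    Parcel(int id) { this.id = id; }
}

class StreamStateSketch {
    // Writes `count` distinct Parcel instances to one stream and returns the
    // total byte count. When resetBetweenWrites is true, the stream discards
    // its record of already-written classes and objects before each write.
    static int bytesFor(int count, boolean resetBetweenWrites) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                for (int i = 0; i < count; i++) {
                    if (resetBetweenWrites) {
                        out.reset();
                    }
                    out.writeObject(new Parcel(i));
                }
            }
            return buffer.size();
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("1 instance:               " + bytesFor(1, false));
        System.out.println("1000 instances, shared:   " + bytesFor(1000, false));
        System.out.println("1000 instances, reset:    " + bytesFor(1000, true));
    }
}
```

Without the reset, the class description is written once and later instances refer back to it, so the per-instance cost drops. With the reset, every instance pays the full single-instance cost again.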

If I need more efficiency, I can go further and remove ValueObject from
the hierarchy entirely. The ReallyEfficientMoney class directly extends
Object and implements Externalizable.

ReallyEfficientMoney has much better performance than either Money or
EfficientMoney when a single instance is serialized but is almost
identical to EfficientMoney for large data sets. Again, averaging over 10
iterations, I record the numbers in Table 10-2.

Compared to Money, this is quite impressive; I've shaved almost 200 bytes
of bandwidth and saved 40 milliseconds for the typical remote method call.
The downside is that I've had to abandon my object hierarchy completely to
do so; a significant percentage of the savings resulted from not including
ValueObject in the inheritance chain. Removing superclasses makes code
harder to maintain and forces programmers to implement the same method
many times (ReallyEfficientMoney can't use ValueObject's implementation of
equals( ) and hashCode( ) anymore). But it does lead to significant
performance improvements.
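
The effect of a serializable superclass on stream size can be checked with a small experiment. All of the classes below are hypothetical: BaseValue merely plays the role of a superclass such as ValueObject, and DeepMoney and FlatMoney are stand-ins with an assumed single field, not the classes measured in the tables.

```java
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical serializable superclass, standing in for something like ValueObject.
abstract class BaseValue implements Serializable {
}

// Externalizable leaf class that keeps the superclass in its hierarchy.
class DeepMoney extends BaseValue implements Externalizable {
    private int cents;

    public DeepMoney() { }

    DeepMoney(int cents) { this.cents = cents; }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(cents);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        cents = in.readInt();
    }
}

// Same data and same marshalling code, but extending Object directly.
class FlatMoney implements Externalizable {
    private int cents;

    public FlatMoney() { }

    FlatMoney(int cents) { this.cents = cents; }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(cents);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        cents = in.readInt();
    }
}

class HierarchyCostSketch {
    // Serializes the object into memory and returns the number of bytes produced.
    static int serializedSize(Object target) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(target);
            }
            return buffer.size();
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("with superclass:    " + serializedSize(new DeepMoney(5)));
        System.out.println("without superclass: " + serializedSize(new FlatMoney(5)));
    }
}
```

The two class names are deliberately the same length, so the extra bytes in the first figure come from the description of the serializable superclass that the stream also has to carry.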

One Final Point

An important point is that you can decide whether to implement
Externalizable or Serializable on a class-by-class basis. Within the same
application, some of your classes can be Serializable, and some can be
Externalizable. This makes it easy to evolve your application in response
to actual performance data and shifting requirements. The following
two-part strategy is often quite nice:

1. Make all your classes implement Serializable.

2. After that, make some of them, the ones you send often and for which
   serialization is dramatically inefficient, implement Externalizable
   instead.

This gets you most of the convenience of serialization and lets you use
Externalizable to optimize when appropriate.
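
The two kinds of classes can sit in the same object graph. In the sketch below, CustomerNote relies on default serialization while the object it holds, HotAmount, does its own marshalling. Both names, and the decision about which one is "sent often", are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class that is sent often: hand-written marshalling.
class HotAmount implements Externalizable {
    private long cents;

    public HotAmount() { }

    HotAmount(long cents) { this.cents = cents; }

    long getCents() { return cents; }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeLong(cents);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        cents = in.readLong();
    }
}

// Hypothetical class that is sent rarely: default serialization is good enough.
class CustomerNote implements Serializable {
    private final String text;
    private final HotAmount amount;

    CustomerNote(String text, HotAmount amount) {
        this.text = text;
        this.amount = amount;
    }

    String getText() { return text; }

    HotAmount getAmount() { return amount; }
}

class MixedGraphSketch {
    // Writes the object graph to an in-memory buffer and reads a copy back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }
}
```

Converting CustomerNote later would only change that one class; code that writes or reads the graph stays the same.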

Experience has shown that, over time, more and more objects will
gradually come to directly extend Object and implement Externalizable.
But that's fine. It simply means that the code was incrementally improved
in response to performance problems when the application was deployed.
