Saturday, 7 July 2012

Item 77: For instance control, prefer enum types to readResolve

Item 3 describes the Singleton pattern and gives the following example of a singleton class. This class restricts access to its constructor to ensure that only a single instance is ever created:

public class Elvis {
    public static final Elvis INSTANCE = new Elvis();

    private Elvis() { ... }

    public void leaveTheBuilding() { ... }
}

As noted in Item 3, this class would no longer be a singleton if the words “implements Serializable” were added to its declaration. It doesn’t matter whether the class uses the default serialized form or a custom serialized form (Item 75), nor does it matter whether the class provides an explicit readObject method (Item 76). Any readObject method, whether explicit or default, returns a newly created instance, which will not be the same instance that was created at class initialization time.
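The effect is easy to observe with a serialization round trip. The following is a minimal sketch, not code from the item itself: NaiveElvis and NaiveElvisDemo are hypothetical names, and the roundTrip helper is an assumption introduced only for illustration. It serializes the singleton to a byte array, reads it back, and compares the two references.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical variant of Elvis: Serializable, but with no readResolve method.
class NaiveElvis implements Serializable {
    private static final long serialVersionUID = 1L;

    static final NaiveElvis INSTANCE = new NaiveElvis();

    private NaiveElvis() { }
}

class NaiveElvisDemo {
    // Serializes an object to a byte array and reads it straight back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object copy = roundTrip(NaiveElvis.INSTANCE);
        // The copy is a distinct object, so the singleton property is broken.
        System.out.println(copy == NaiveElvis.INSTANCE); // prints "false"
    }
}
```

The private constructor does not help here, because deserialization creates the new instance without calling it.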

The readResolve feature allows you to substitute another instance for the one created by readObject [Serialization, 3.7]. If the class of an object being deserialized defines a readResolve method with the proper declaration, this method is invoked on the newly created object after it is deserialized. The object reference returned by this method is then returned in place of the newly created object. In most uses of this feature, no reference to the newly created object is retained, so it immediately becomes eligible for garbage collection.

If the Elvis class is made to implement Serializable, the following readResolve method suffices to guarantee the singleton property:

// readResolve for instance control - you can do better!
private Object readResolve() {
    // Return the one true Elvis and let the garbage collector
    // take care of the Elvis impersonator.
    return INSTANCE;
}

This method ignores the deserialized object, returning the distinguished Elvis instance that was created when the class was initialized. Therefore, the serialized form of an Elvis instance need not contain any real data; all instance fields should be declared transient. In fact, if you depend on readResolve for instance control, all instance fields with object reference types must be declared transient. Otherwise, it is possible for a determined attacker to secure a reference to the deserialized object before its readResolve method is run, using a technique that is vaguely similar to the MutablePeriod attack in Item 76.
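Leaving the attack aside for a moment, the basic readResolve behavior can be checked with a serialization round trip. This is a minimal sketch under stated assumptions: ResolvingElvis, ResolvingElvisDemo, and the roundTrip helper are hypothetical names introduced for illustration, and the class has no instance fields, so there is nothing to declare transient.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a serializable Elvis with a readResolve method.
class ResolvingElvis implements Serializable {
    private static final long serialVersionUID = 1L;

    static final ResolvingElvis INSTANCE = new ResolvingElvis();

    private ResolvingElvis() { }

    // Invoked by the serialization machinery on the freshly deserialized copy;
    // the returned reference replaces that copy.
    private Object readResolve() {
        return INSTANCE;
    }
}

class ResolvingElvisDemo {
    // Serializes an object to a byte array and reads it straight back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object copy = roundTrip(ResolvingElvis.INSTANCE);
        // readResolve swapped the deserialized copy for the original instance.
        System.out.println(copy == ResolvingElvis.INSTANCE); // prints "true"
    }
}
```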

The attack is a bit complicated, but the underlying idea is simple. If a singleton contains a nontransient object reference field, the contents of this field will be deserialized before the singleton’s readResolve method is run. This allows a carefully crafted stream to “steal” a reference to the originally deserialized singleton at the time the contents of the object reference field are deserialized.

To make this concrete, consider the following broken singleton:

// Broken singleton - has nontransient object reference field!
import java.io.ObjectStreamException;
import java.io.Serializable;
import java.util.Arrays;

public class Elvis implements Serializable {
    public static final Elvis INSTANCE = new Elvis();

    private Elvis() { }

    // Nontransient object reference field: this is what the attack exploits.
    private String[] favoriteSongs =
        { "Hound Dog", "Heartbreak Hotel" };

    public void printFavorites() {
        System.out.println(Arrays.toString(favoriteSongs));
    }

    private Object readResolve() throws ObjectStreamException {
        return INSTANCE;
    }
}

You could fix the problem by declaring the favoriteSongs field transient, but you’re better off fixing it by making Elvis a single-element enum type (Item 3). Historically, the readResolve method was used for all serializable instance-controlled classes. As of release 1.5, this is no longer the best way to maintain instance control in a serializable class. As the reference-stealing attack described above demonstrates, this technique is fragile and demands great care.

If instead you write your serializable instance-controlled class as an enum, you get an ironclad guarantee that there can be no instances besides the declared constants. The JVM makes this guarantee, and you can depend on it. It requires no special care on your part. Here’s how our Elvis example looks as an enum:

// Enum singleton - the preferred approach
import java.util.Arrays;

public enum Elvis {
    INSTANCE;

    private String[] favoriteSongs =
        { "Hound Dog", "Heartbreak Hotel" };

    public void printFavorites() {
        System.out.println(Arrays.toString(favoriteSongs));
    }
}
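The guarantee can be checked with the same kind of round trip. This is a small sketch, assuming a hypothetical EnumElvisDemo class and roundTrip helper that are not part of the item. Enum constants are serialized by name, so deserialization resolves back to the existing constant and no readResolve method is needed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Arrays;

// The enum singleton, repeated here so the example is self-contained.
enum Elvis {
    INSTANCE;

    private String[] favoriteSongs =
        { "Hound Dog", "Heartbreak Hotel" };

    public void printFavorites() {
        System.out.println(Arrays.toString(favoriteSongs));
    }
}

// Hypothetical demo class, introduced only for illustration.
class EnumElvisDemo {
    // Serializes an object to a byte array and reads it straight back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object copy = roundTrip(Elvis.INSTANCE);
        // Deserialization yields the existing constant, not a new object.
        System.out.println(copy == Elvis.INSTANCE); // prints "true"
    }
}
```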

The use of readResolve for instance control is not obsolete. If you have to write a serializable instance-controlled class whose instances are not known at compile time, you will not be able to represent the class as an enum type.

The accessibility of readResolve is significant. If you place a readResolve method on a final class, it should be private. If you place a readResolve method on a nonfinal class, you must carefully consider its accessibility. If it is private, it will not apply to any subclasses. If it is package-private, it will apply only to subclasses in the same package. If it is protected or public, it will apply to all subclasses that do not override it. If a readResolve method is protected or public and a subclass does not override it, deserializing a serialized subclass instance will produce a superclass instance, which is likely to cause a ClassCastException.
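The last case can be sketched as follows. Star, TributeAct, and AccessibilityDemo are hypothetical names chosen for illustration, and the roundTrip helper is an assumption, not part of the item. Star declares a protected readResolve, TributeAct inherits it without overriding it, and deserializing a TributeAct therefore yields the Star instance.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical nonfinal instance-controlled class with a protected readResolve.
class Star implements Serializable {
    private static final long serialVersionUID = 1L;

    static final Star INSTANCE = new Star();

    Star() { }

    // Protected, so it also applies to subclasses that do not override it.
    protected Object readResolve() {
        return INSTANCE;
    }
}

// Hypothetical subclass that inherits readResolve unchanged.
class TributeAct extends Star {
    private static final long serialVersionUID = 1L;
}

class AccessibilityDemo {
    // Serializes an object to a byte array and reads it straight back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object result = roundTrip(new TributeAct());
        // The inherited readResolve replaced the subclass instance.
        System.out.println(result.getClass().getSimpleName()); // prints "Star"
        try {
            TributeAct act = (TributeAct) result;
            System.out.println("cast succeeded: " + act);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException");
        }
    }
}
```

Had readResolve been private in Star, it would not have applied to TributeAct, and the round trip would have returned a new TributeAct instead.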

To summarize, you should use enum types to enforce instance control invariants wherever possible. If this is not possible and you need a class to be both serializable and instance-controlled, you must provide a readResolve method and ensure that all of the class’s instance fields are either primitive or transient.