Look-ahead Java deserialization

How to secure deserialization from untrusted input without using
encryption or sealing

When Java™ serialization is used to exchange information
between a client and a server, attackers can try to replace the legitimate
serialized stream with malicious data. This article explains the nature of
this threat and describes a simple way to protect against it. Find out how to
stop the deserialization process as soon as an unexpected Java class is found
in the stream.

Pierre Ernst is a senior member of the IBM Business Analytics Security
Competency Group at the Ottawa Lab in Canada. A former software developer
turned penetration tester, he's responsible for finding security
vulnerabilities in IBM applications before they are released. Using a
combination of manual testing and secure code review, his work complements
automated vulnerability scanners. Pierre is also responsible for giving
guidance to developers on how to mitigate and fix security issues.

Java serialization enables developers to save a Java object to a binary
format so that it can be persisted to a file or transmitted over a
network. Remote Method Invocation (RMI) uses serialization as a
communication medium between a client and a server. Several security
problems can arise when a service accepts binary data from a client and
deserializes the input to construct a Java instance. This article focuses
on one of them: An attacker could serialize an instance of another class
and send it to the service. The service would then deserialize the
malicious object and most probably cast it to the legitimate class the
service is expecting, causing an exception to be thrown. However, that
exception might come too late to ensure that the data is secure. This
article explains why and shows how to implement a secure alternative. (See
the Other deserialization pitfalls sidebar for a
brief overview of other security issues relating to Java
deserialization.)

Other deserialization pitfalls

Deserialization is subject to three additional threats:

An attacker could eavesdrop on the communication and obtain
potentially sensitive data. Transport Layer Security (TLS) can be
used to prevent this type of attack.

A malicious user could tamper with data that was legitimately
serialized by the client application and change values to subvert
the service's business logic. As with other types of services,
input validation must be applied at the server, even if the same
validation has already taken place at the client. Object sealing
can also be an effective countermeasure in this scenario.

An attacker can set private members of
the object, which might not be the behavior that the developers
intended. The attacker might be able to change the object's
internal state using that technique. Marking such members
transient can be part of the
solution.

Further discussion of these issues and countermeasures is outside the scope of this article.

Vulnerable classes

Your service shouldn't deserialize objects of arbitrary classes. Why not? The
short answer is: because you likely have vulnerable classes in the
server's classpath that an attacker can leverage. These classes contain
code that lets the attacker cause a denial-of-service condition or,
in extreme cases, inject arbitrary code.

You might believe that this kind of attack is impossible, but consider how
many classes can be found in the classpath of a typical server. They
include not only your own code, but also the Java Class Library,
third-party libraries, and any middleware or framework libraries.
Additionally, the classpath might change over an application's lifetime or
be modified in response to environmental changes to the system that extend
beyond a single application. When trying to leverage such a weakness, an
attacker can combine several operations by sending multiple serialized
objects.

I should emphasize that the service will deserialize a malicious object
only if:

The malicious object's class exists in the server's classpath. The
attacker cannot simply send a serialized object of any class, because
the service will be unable to load the class.

The malicious object's class is either serializable or
externalizable. (That is, the class on the server must implement
either the java.io.Serializable interface
or the java.io.Externalizable
interface.)

Also, the deserialization process populates the object tree by copying data
from the serialized stream without calling the constructor. So an attacker
can't execute Java code residing inside the constructor of the
serializable object class.
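
The constructor behavior described above can be checked with a small experiment. The following is a minimal sketch, not code from the article: the `ConstructorSkipDemo` and `Widget` names and the call counter are hypothetical and exist only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical demo class; the names here are assumptions for illustration.
class ConstructorSkipDemo {

    // Counts how many times the Widget constructor has run.
    static int constructorCalls = 0;

    static class Widget implements Serializable {
        private static final long serialVersionUID = 1L;
        final String label;

        Widget(String label) {
            constructorCalls++;
            this.label = label;
        }
    }

    // Serializes a Widget to memory, reads it back, and returns the copy.
    static Widget roundTrip(Widget original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return (Widget) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Widget original = new Widget("demo"); // the only constructor call
        Widget copy = roundTrip(original);
        System.out.println("label = " + copy.label);
        System.out.println("constructor calls = " + constructorCalls);
    }
}
```

The copy carries the original's field value even though the counter is not incremented a second time, which is consistent with the fields being populated from the stream rather than by the constructor.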

But the attacker has other ways of executing some code on the server.
Whenever the JVM deserializes an object of a class that implements one of
the following three methods, it calls the method and executes the code
inside it:

The readObject() method, typically used by
developers when standard serialization cannot be used, such as when a
transient member needs to be set.

The readResolve() method, typically used to
preserve singleton instances during deserialization.

The readExternal() method, used for
externalizable objects.

So if you have classes in your classpath that use any of these methods, you
must be aware that an attacker can call the methods remotely. This kind of
attack has been used in the past to break out of the Applet sandbox (see
Resources); the same technique can also be
applied against a server.
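
To make the timing problem concrete, here is a minimal sketch of a naive service that deserializes first and casts afterward. Everything in it is an assumption for illustration: `Expected` stands in for the class the service wants, `Gadget` stands in for a vulnerable class on the classpath, and the `sideEffectRan` flag stands in for whatever harmful code such a class might contain.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical demo; none of these names come from the article's download.
class LateCastDemo {

    // Set by Gadget.readObject() to show that its code was executed.
    static boolean sideEffectRan = false;

    // The class the service expects to receive.
    static class Expected implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    // Stand-in for a vulnerable class that happens to be on the classpath.
    static class Gadget implements Serializable {
        private static final long serialVersionUID = 1L;

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            sideEffectRan = true; // placeholder for dangerous behavior
        }
    }

    static byte[] serialize(Object object) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(object);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Deserializes, then casts. Returns true if the cast rejected the payload.
    static boolean naiveServiceRejects(byte[] payload) {
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(payload))) {
            Expected ignored = (Expected) in.readObject();
            return false;
        } catch (ClassCastException castFailure) {
            return true;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean rejected = naiveServiceRejects(serialize(new Gadget()));
        System.out.println("cast rejected payload: " + rejected);
        System.out.println("readObject() already ran: " + sideEffectRan);
    }
}
```

In this sketch the cast does fail, but only after `Gadget.readObject()` has already executed, which is why relying on the cast alone is not enough.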

Read on to see how to allow deserialization only of the class (or classes)
that you expect for your service.

Java serialization binary format

Whitelisting

Even if you're absolutely certain that your service is immune to the
attack discussed in this article, remember that input validation
against a list of known good values (whitelisting) is always
part of good security practices.

After an object is serialized, the binary data contains both metadata
(information about the structure of the data, such as class name, number
of members, and type of members) and the data itself. I'll use a simple
Bicycle class as an example. The class, shown
in Listing 1, contains three members (id,
name, and nbrWheels)
and their corresponding setters and getters:
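
A class matching that description might look like the sketch below, which also serializes an instance and inspects the resulting bytes. Only the class name and the three member names come from the text; the member types, the constructor, the `serialVersionUID` value, and the `BicycleStreamDemo` wrapper with its helper methods are assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical wrapper class used only for this illustration.
class BicycleStreamDemo {

    static class Bicycle implements Serializable {
        private static final long serialVersionUID = 1L; // assumed value
        private int id;            // member types are assumptions
        private String name;
        private int nbrWheels;

        Bicycle(int id, String name, int nbrWheels) {
            this.id = id;
            this.name = name;
            this.nbrWheels = nbrWheels;
        }

        int getId() { return id; }
        void setId(int id) { this.id = id; }
        String getName() { return name; }
        void setName(String name) { this.name = name; }
        int getNbrWheels() { return nbrWheels; }
        void setNbrWheels(int nbrWheels) { this.nbrWheels = nbrWheels; }
    }

    static byte[] serialize(Object object) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(object);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the offset of the first occurrence of needle, or -1 if absent.
    static int indexOf(byte[] haystack, byte[] needle) {
        outer:
        for (int i = 0; i <= haystack.length - needle.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(new Bicycle(0, "Unicycle", 1));
        int magic = ((bytes[0] & 0xff) << 8) | (bytes[1] & 0xff);
        System.out.println("stream magic: " + Integer.toHexString(magic));
        System.out.println("class name offset: " + indexOf(bytes,
                Bicycle.class.getName().getBytes(StandardCharsets.US_ASCII)));
        System.out.println("name value offset: " + indexOf(bytes,
                "Unicycle".getBytes(StandardCharsets.US_ASCII)));
    }
}
```

Running the sketch should show the stream starting with the serialization magic number and the class name sitting at a lower offset than the member value, that is, metadata first and data second.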

Look-ahead class validation

As you can see in Listing 3, when the stream is
read, the class description of the serialized object appears before the
object itself. This structure enables you to implement your own algorithm
to read the class description and decide whether to continue reading the
stream, depending on the class name. Fortunately, you can do this easily
by using a hook Java provides that's normally used for custom class
loading — namely, overriding the
resolveClass() method. This hook fits the bill
perfectly for providing custom validation, because you can use it to throw
an exception whenever the stream contains an unexpected class. You need to
subclass java.io.ObjectInputStream and override
the resolveClass() method. Listing 4 uses this
technique to allow only instances of the
Bicycle class to be deserialized:
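
A minimal sketch of such a subclass follows. The `resolveClass()` override and the `Bicycle` class name come from the text; the `LookAheadDemo` wrapper, the `LookAheadObjectInputStream` name, the exception message, and the helper methods are assumptions for illustration, and the nested `Bicycle` here is a simplified stand-in.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical wrapper class used only for this illustration.
class LookAheadDemo {

    // Simplified stand-in for the article's Bicycle class.
    static class Bicycle implements Serializable {
        private static final long serialVersionUID = 1L;
        private final int id;
        private final String name;
        private final int nbrWheels;

        Bicycle(int id, String name, int nbrWheels) {
            this.id = id;
            this.name = name;
            this.nbrWheels = nbrWheels;
        }
    }

    // Rejects every class descriptor in the stream except Bicycle's.
    static class LookAheadObjectInputStream extends ObjectInputStream {
        LookAheadObjectInputStream(InputStream in) throws IOException {
            super(in);
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            if (!desc.getName().equals(Bicycle.class.getName())) {
                throw new InvalidClassException(desc.getName(),
                        "Unauthorized deserialization attempt");
            }
            return super.resolveClass(desc);
        }
    }

    static byte[] serialize(Object object) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(object);
        }
        return buffer.toByteArray();
    }

    // Round-trips an object through the look-ahead stream and reports the result.
    static String attempt(Object object) {
        try (ObjectInputStream in = new LookAheadObjectInputStream(
                new ByteArrayInputStream(serialize(object)))) {
            return "accepted " + in.readObject().getClass().getName();
        } catch (InvalidClassException rejected) {
            return "rejected " + rejected.classname;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(attempt(new Bicycle(0, "Unicycle", 1)));
        System.out.println(attempt(new File("example.txt")));
    }
}
```

The check happens when the class descriptor is read, so the `java.io.File` payload is refused before any instance of it is created. Strings and primitive members are not written with a class descriptor of their own, which is why a `Bicycle` with a `String` member still passes this single-class check.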

By calling the readObject() method on your
com.ibm.ba.scg.LookAheadDeserializer instance,
you prevent an unexpected object from being deserialized.

As a demonstration, Listing 5 serializes two objects, an instance
of the expected class
(com.ibm.ba.scg.LookAheadDeserializer.Bicycle)
and an unexpected object (a java.io.File
instance), then tries to deserialize them using the custom
validation hook from Listing 4:

Figure 1. Application output

Conclusion

This article has shown you how to stop the Java deserialization process as
soon as an unexpected Java class is found on the stream, without needing
to perform encryption, sealing, or simple input validation on the members
of the newly deserialized instance. See Download
to get the full source code for the examples.

Remember that the entire object tree (the root object with all its members)
gets constructed during deserialization, and each class descriptor found in
the stream passes through resolveClass(). In more-complex configurations,
you therefore might need to allow more than one class to be deserialized.
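
One way to allow several classes is to check each class name against a set of permitted names. The sketch below is an assumption-laden illustration rather than the article's code: `AllowListDemo`, `AllowListObjectInputStream`, the `Wheel` member class, and the helper methods are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical wrapper class used only for this illustration.
class AllowListDemo {

    static class Wheel implements Serializable {
        private static final long serialVersionUID = 1L;
        final int diameter;
        Wheel(int diameter) { this.diameter = diameter; }
    }

    // A root object whose tree contains a second class.
    static class Bicycle implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final Wheel wheel;
        Bicycle(String name, Wheel wheel) { this.name = name; this.wheel = wheel; }
    }

    // Accepts only class names contained in the supplied set.
    static class AllowListObjectInputStream extends ObjectInputStream {
        private final Set<String> allowed;

        AllowListObjectInputStream(InputStream in, Set<String> allowed)
                throws IOException {
            super(in);
            this.allowed = allowed;
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            if (!allowed.contains(desc.getName())) {
                throw new InvalidClassException(desc.getName(),
                        "Unauthorized deserialization attempt");
            }
            return super.resolveClass(desc);
        }
    }

    // Round-trips an object, permitting only the given classes.
    static String attempt(Object object, Class<?>... permitted) {
        Set<String> names = new HashSet<>();
        for (Class<?> c : permitted) {
            names.add(c.getName());
        }
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(object);
            }
            try (ObjectInputStream in = new AllowListObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()), names)) {
                return "accepted " + in.readObject().getClass().getName();
            }
        } catch (InvalidClassException rejected) {
            return "rejected " + rejected.classname;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Bicycle bicycle = new Bicycle("Unicycle", new Wheel(20));
        System.out.println(attempt(bicycle, Bicycle.class, Wheel.class));
        System.out.println(attempt(bicycle, Bicycle.class));
    }
}
```

In the second call the root class is permitted but its member's class is not, so the stream is refused partway through the tree. Member types such as arrays or collections would bring class names of their own, which would also need to be listed.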

CVE-2004-2540: readObject in
JRE allows remote attackers to cause a denial of service using
crafted serialized data.

CVE-2008-5353: The JRE does not properly enforce context
of ZoneInfo objects during
deserialization, which allows remote attackers to run untrusted
applets and applications in a privileged context, as demonstrated
by "deserializing Calendar
objects."

CVE-2010-0094: Unspecified vulnerability in the JRE
allows remote attackers to affect confidentiality, integrity, and
availability through unknown vectors related to deserialization of
RMIConnectionImpl objects, which allows
remote attackers to call system-level Java functions using the
ClassLoader of a constructor that is
being deserialized.
