Using SSL with Non-Blocking IO

Back in 2002, Sun introduced the Java NIO API as part of the
then newly released Java SDK 1.4. This API solved a significant limitation of
the Java SDK, which was the lack of an API for building highly scalable network
applications. Previously, the IO support in Java was limited to stream-based,
blocking IO, which, although elegant and simple, scales poorly because it
requires one active thread for each network connection.
Java NIO introduced support for IO multiplexing and non-blocking IO, which are
necessary tools to build highly scalable applications. In the article
"Building Highly Scalable Servers with Java NIO," I discussed these two new features and, as a proof of concept, presented an IO framework capable of
scaling up to several thousands of simultaneous connections.

After the initial experiments with Java NIO, most
developers start wondering about security; in particular, how to use SSL
with Java NIO. With the traditional blocking sockets API, security is a simple
issue: just set up an SSLContext instance with the appropriate
key material, use it to create instances of SSLSocketFactory or
SSLServerSocketFactory, and finally use these factories to create instances of SSLServerSocket or SSLSocket. And that's all there is to it! After they are created, they can
be used just like plaintext sockets, requiring no changes to the code that uses
them. (For more information, see "Secure Your Sockets with JSSE.") So how hard can it be with Java NIO? The
Javadocs say nothing about the issue, which is enough to make one
suspicious. A typical developer's next step is to Google "SSL
Java NIO." The only results are a few discussions where other developers
complain about the same problem. Reading those discussions provides the answer
to our question: with the Java SDK 1.4, it is not possible to use SSL with Java
NIO! We can have either security or scalability, but not both!

Fortunately, this limitation was corrected by the newly
released JDK 5 with the introduction of the SSLEngine API. (Actually, SSLEngine is
only the name of the main class of the API, but for lack of a better name,
I'll use it to refer to the whole API.) The solution offered by this API
is a bit of a surprise for those, like me, who were expecting a solution similar
to the one used with the stream-based API, which would be an
SSLSocketChannel class that could be used as a drop-in replacement for standard
SocketChannels. Instead of this obvious solution, Sun decided to solve the SSL problem once and
for all, by making the SSLEngine API transport-independent, thereby completely
separating the SSL support from IO. The SSLEngine API is only responsible for
implementing the SSL/TLS state machine, and it exchanges all data with
the outside world through byte buffers. Therefore, developers are free to use any
transport mechanism they find appropriate.

This solution has the significant advantage of supporting
all possible IO and threading models, both existing and future ones.
Unfortunately, as in many other situations, flexibility comes at the price of
complexity. Many of the details of the SSL protocol that are hidden from the
developer by the traditional stream-based SSL API are now exposed to the
developer, who must deal with them directly. This mainly includes handshaking
and reassembling SSL packets before decrypting them. There are other details
that must be dealt with, but these two are the ones likely to cause headaches.

It is no wonder that Sun considers this an advanced API,
recommending that beginners continue using SSLSockets. After having
experimented with it, I couldn't agree more. But if you really need the scalability
offered by Java NIO, you have no choice but to get your hands dirty. And
that's what I've been doing for the last few weeks, while extending the
IO framework presented in my previous article to support SSL.

In this article, I describe the SSLEngine API and present
the main lessons learned during my contact with this API. For those
interested in a deeper understanding of this API, the article includes
the source code of the revised IO framework.

The SSLEngine API

The workhorse of the SSLEngine API is the javax.net.ssl.SSLEngine class, which implements the SSL/TLS state machine and
performs all operations related to the protocol. This includes handshaking,
encryption, and decryption.

Lifecycle

The lifecycle of an SSLEngine is described in Figure 1.

Figure 1. The lifecycle of an SSLEngine

SSLEngine instances are created by the
SSLContext class. This class was introduced with the stream-based SSL API,
where it is used to initialize a security context with cryptographic material
and to create SSL socket factories. In JDK 5, it was extended to also create
SSLEngine instances. The setup process for an SSLContext is
exactly the same as before, so I won't cover it here.
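
As a minimal sketch of that creation step, the helper below obtains an engine from an already initialized SSLContext. The class and method names (EngineFactorySketch, newEngine) are hypothetical and not part of the article's framework, and the JVM default context (SSLContext.getDefault(), which was added after JDK 5) is only an assumption standing in for a context configured with your own key material.

```java
import java.security.NoSuchAlgorithmException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

// Hypothetical helper, shown for illustration only.
final class EngineFactorySketch {

    // Creates one engine per connection from an initialized context.
    // The mode must be chosen before any handshaking starts.
    static SSLEngine newEngine(SSLContext context, boolean clientMode) {
        SSLEngine engine = context.createSSLEngine();
        engine.setUseClientMode(clientMode);
        return engine;
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // Assumption: the JVM default context replaces a custom one here.
        SSLEngine engine = newEngine(SSLContext.getDefault(), true);
        System.out.println("client mode: " + engine.getUseClientMode());
    }
}
```

Each connection needs its own SSLEngine, because the engine holds the protocol state of a single session.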

After it is created, an SSLEngine must
first go through the handshake, where the server and the client negotiate the
cipher suite and the session keys. This phase typically involves the exchange
of several messages.

After completing the handshake, the
application can start sending and receiving application data. This is the main
state of the engine and will typically last until the connection is closed.

In some situations, one of the peers may ask
for a renegotiation of the session parameters, either to generate new session
keys or to change the cipher suite. This forces a re-handshake. Although in the
diagram they are represented as separate states, a re-handshake does not stop
the flow of application data. Therefore, handshake and application data can be
intermixed freely during this stage, which poses a challenge to the developer,
who must be careful to separate the two types of data.

When one of the peers is done with the
connection, it should initiate a graceful shutdown, as specified in the SSL/TLS
protocol. This involves exchanging a couple of closure messages between the
client and the server to terminate the logical session before physically
closing the socket.

Interaction with Applications

Now that the basic lifecycle of an SSLEngine has been described, it's time to take a closer look at its
interaction with applications. Figure 2 presents the typical structure of an
application using an SSLEngine instance.

Figure 2. The flow of data in an application using an SSLEngine

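The two main methods of the SSLEngine are
wrap() and unwrap(). These methods have various overloaded
versions, but the two-buffer forms, wrap(ByteBuffer src, ByteBuffer dst) and unwrap(ByteBuffer src, ByteBuffer dst), each of which returns an SSLEngineResult, are the ones that are likely to be used most often.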

SSLEngine.wrap() receives plaintext data
from the application and encrypts it. It may also generate handshake data, if a
handshake is in progress. The result, containing both encrypted application
data and handshake data, is given back to the application in order to be sent
to the peer. In the opposite direction, SSLEngine.unwrap() processes data read
from the network, which may include handshake and encrypted application data.
The handshake data is used to update the internal state of the SSLEngine and
the application data is decrypted and passed to the application. As a result of
this behavior, a typical application will have the following four buffers:

inNetData
Stores data received directly from the network. This consists of
encrypted data and handshake information. This buffer is filled with data
read from the socket and emptied by SSLEngine.unwrap().

inAppData
Stores decrypted data received from the peer. This buffer is
filled by SSLEngine.unwrap() with decrypted application data and emptied
by the application.

outAppData
Stores plaintext application data that is to be sent to the other
peer. The application fills this buffer, which is then emptied by
SSLEngine.wrap().

outNetData
Stores data that is to be sent to the network, including handshake
and encrypted application data. This buffer is filled by SSLEngine.wrap()
and emptied by writing it to the network.
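
One possible way to allocate these four buffers is sketched below. The class name EngineBuffersSketch is hypothetical; the sizes come from SSLSession.getPacketBufferSize() and SSLSession.getApplicationBufferSize(), which report the largest SSL/TLS packet and the largest application payload the session expects. Using exactly these sizes is an assumption of the sketch, and a real framework may prefer larger or pooled buffers.

```java
import java.nio.ByteBuffer;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLSession;

// Hypothetical holder for the four buffers described above.
final class EngineBuffersSketch {
    final ByteBuffer inNetData;   // filled from the socket, emptied by unwrap()
    final ByteBuffer inAppData;   // filled by unwrap(), emptied by the application
    final ByteBuffer outAppData;  // filled by the application, emptied by wrap()
    final ByteBuffer outNetData;  // filled by wrap(), emptied by writing to the socket

    EngineBuffersSketch(SSLEngine engine) {
        SSLSession session = engine.getSession();
        int packetSize = session.getPacketBufferSize();
        int appSize = session.getApplicationBufferSize();
        // Network-side buffers must hold at least one full SSL/TLS packet.
        inNetData = ByteBuffer.allocate(packetSize);
        outNetData = ByteBuffer.allocate(packetSize);
        // Application-side buffers must hold the payload of one packet.
        inAppData = ByteBuffer.allocate(appSize);
        outAppData = ByteBuffer.allocate(appSize);
    }
}
```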

The buffers must be carefully managed, so that when wrap() and unwrap() are called, there is enough data to process and enough space to store the generated data. To help with this task, those
methods return an instance of SSLEngineResult, containing information about
the overall status of the engine and about the handshake status.

The overall status information is used to notify the developer of the result of the last operation attempted by the engine. It can take the following values:

BUFFER_OVERFLOW
There is not enough space in the output buffer to write all of the data that would be
generated by the method. The application should free some space in the output
buffer.

BUFFER_UNDERFLOW
There is not enough data in the input buffer to perform the operation. The application
should read more data from the network. As far as I understand, this
result happens only in calls to unwrap(). The SSL/TLS protocol is
packet-based and unwrap() can only operate on full packets. If the
input buffer does not contain a full packet, unwrap() will return
this result. In a call to wrap(), the SSLEngine is able to create
an SSL/TLS packet with whatever data is available in the input buffer, so it
should never complain about a buffer underflow.

CLOSED
The SSLEngine is closed. This instance can no longer be used.

OK
The operation was performed successfully. Some data was either consumed or produced, or both.
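
The four status values can be mapped to the reactions described above. The following is only a sketch: the Action enum and the names StatusHandlingSketch, actionFor, and grow are hypothetical, and growing the output buffer is just one possible answer to BUFFER_OVERFLOW (draining it to the network is another).

```java
import java.nio.ByteBuffer;
import javax.net.ssl.SSLEngineResult;

// Hypothetical helper illustrating one reaction per status value.
final class StatusHandlingSketch {
    enum Action { CONTINUE, READ_MORE, FREE_OUTPUT_SPACE, SHUT_DOWN }

    static Action actionFor(SSLEngineResult.Status status) {
        switch (status) {
            case OK:               return Action.CONTINUE;
            case BUFFER_UNDERFLOW: return Action.READ_MORE;          // read from the socket
            case BUFFER_OVERFLOW:  return Action.FREE_OUTPUT_SPACE;  // drain or grow the output
            case CLOSED:           return Action.SHUT_DOWN;
            default: throw new IllegalStateException("Unknown status: " + status);
        }
    }

    // Copies the bytes written so far into a larger buffer.
    // Assumes 'full' is in write mode (position = bytes written).
    static ByteBuffer grow(ByteBuffer full, int newCapacity) {
        ByteBuffer bigger = ByteBuffer.allocate(Math.max(newCapacity, full.capacity()));
        full.flip();       // switch to read mode
        bigger.put(full);  // 'bigger' stays in write mode
        return bigger;
    }
}
```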

The handshake status reports on any handshake
that may be in progress. It can be one of the following:

FINISHED
The last operation terminated the handshake.

NEED_TASK
A lengthy task must be performed in order to continue the handshake. More on this later.

NEED_UNWRAP
unwrap() must be called to proceed with the handshake.

NEED_WRAP
wrap() must be called to proceed with the handshake.

NOT_HANDSHAKING
No handshake is in progress.

The result of a wrap()/unwrap() call also
contains the number of bytes consumed and produced.
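
A single handshake step driven by these values might look like the sketch below. The names HandshakeStepSketch and step are hypothetical, and running delegated tasks inline is a simplification chosen to keep the example short. Note that SSLEngine.getHandshakeStatus() never reports FINISHED; that value only appears in the result of the wrap() or unwrap() call that completes the handshake.

```java
import java.nio.ByteBuffer;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLEngineResult;
import javax.net.ssl.SSLException;

// Hypothetical helper: performs at most one handshake step.
final class HandshakeStepSketch {

    // inNetData and outAppData must be in read mode (flipped);
    // inAppData and outNetData must be in write mode.
    // Returns the engine result, or null if no wrap/unwrap was needed.
    static SSLEngineResult step(SSLEngine engine,
                                ByteBuffer inNetData, ByteBuffer inAppData,
                                ByteBuffer outAppData, ByteBuffer outNetData)
            throws SSLException {
        switch (engine.getHandshakeStatus()) {
            case NEED_WRAP:
                // Produces handshake data that must be sent to the peer.
                return engine.wrap(outAppData, outNetData);
            case NEED_UNWRAP:
                // Consumes handshake data received from the peer.
                return engine.unwrap(inNetData, inAppData);
            case NEED_TASK: {
                // Simplification: run the tasks inline instead of delegating.
                Runnable task;
                while ((task = engine.getDelegatedTask()) != null) {
                    task.run();
                }
                return null;
            }
            default:
                // NOT_HANDSHAKING (or a status this sketch does not handle).
                return null;
        }
    }
}
```

The caller still has to inspect the returned result, both for the overall status and for a handshake status of FINISHED.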

Handling Lengthy Operations

Before giving an example of a handshake
sequence, it is necessary to explain the NEED_TASK status. During handshake,
the SSL/TLS protocol often needs to perform operations that block or that take
a long time to complete. In most situations, this corresponds to the generation
of session keys, but in more complex scenarios, it may involve asking for a
password from the user or validating a certificate with a remote server. In
non-blocking IO models, these operations cannot be performed in the same thread
that is used to service IO requests, since it would block all other connections
serviced by that thread. Therefore, the SSLEngine supports a mechanism to
delegate these tasks to external threads. When such a lengthy task must be
performed, the NEED_TASK status is returned. The developer must then call the
method SSLEngine.getDelegatedTask() to obtain an instance of
Runnable that encapsulates the task and then execute it in the most suitable way. Some of the
more common possibilities are executing it synchronously in the IO thread, or
asynchronously, either in a thread pool or in a new thread. The following code
shows how to execute the tasks in a separate thread using the new
java.util.concurrent package:
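
A minimal sketch of this approach follows. The names DelegatedTaskSketch and submitDelegatedTasks are hypothetical and the code is not taken from the article's framework. The loop reflects the fact that getDelegatedTask() returns one task per call and null once nothing is pending.

```java
import java.util.concurrent.Executor;
import javax.net.ssl.SSLEngine;

// Hypothetical helper for handing delegated tasks to another thread.
final class DelegatedTaskSketch {

    // Submits every pending delegated task to the executor and
    // returns how many tasks were submitted.
    static int submitDelegatedTasks(SSLEngine engine, Executor executor) {
        int submitted = 0;
        Runnable task;
        while ((task = engine.getDelegatedTask()) != null) {
            executor.execute(task);
            submitted++;
        }
        return submitted;
    }
}
```

The executor could be, for example, one returned by Executors.newFixedThreadPool() or Executors.newSingleThreadExecutor(). Once the tasks have completed, the IO thread needs to call SSLEngine.getHandshakeStatus() again to find out how to continue, so a real implementation also needs a way to notify the IO thread of completion, which this sketch leaves out.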

A Sample SSL Session

Now I'll describe part of a typical SSL client session. A client is
responsible for initiating the handshake
sequence by sending a hello message to the server. To do so, the
SSLEngine must be put in client mode and the handshake initiated. This is done
with the following code:

sslEngine.setUseClientMode(true);
sslEngine.beginHandshake();

The server does the same, but initializes the engine in server mode:

sslEngine.setUseClientMode(false);
sslEngine.beginHandshake();

The client then calls wrap() to generate
the initial handshake message. This call will typically return a status of OK and a handshake status of NEED_UNWRAP, with 0 bytes consumed and 100 bytes produced.

The operation was performed correctly, but now the engine is waiting for an unwrap() to proceed with the handshake. Also notice that a message of 100 bytes was produced, although no data was consumed.
This is the engine generating the hello message. Suppose we try to call unwrap()
without having read enough data. The result would be a status of BUFFER_UNDERFLOW,
with no data consumed or produced, which is how the engine asks for more data in the input buffer.
After reading enough data from the network, a call to
unwrap() returns a status of OK and a handshake status of NEED_TASK.

This time no data was produced, but the data read from the network was consumed. Before proceeding, the engine must
perform some lengthy task (most likely, it needs to generate the session keys).
The developer must execute all pending tasks, call wrap(), and
then send the result to the other peer.

The handshake goes on for a few more
messages until a call returns a handshake status of FINISHED.

With the handshake finished, the SSLEngine is finally ready to process
application data.
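
The first two steps of this session can be reproduced without a network connection. The sketch below, built around a hypothetical ClientHelloSketch class, puts an engine in client mode, generates the hello message with wrap(), and then calls unwrap() on an empty input buffer. It assumes the JVM default SSLContext. The number of bytes produced depends on the JDK version and the enabled protocols, so it will usually differ from the 100 bytes mentioned above.

```java
import java.nio.ByteBuffer;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLEngineResult;
import javax.net.ssl.SSLException;

// Hypothetical demo of the first client-side handshake steps.
final class ClientHelloSketch {

    // Returns { result of the first wrap(), result of unwrap() with no input }.
    static SSLEngineResult[] firstSteps(SSLEngine engine) throws SSLException {
        engine.setUseClientMode(true);
        engine.beginHandshake();

        int packetSize = engine.getSession().getPacketBufferSize();
        int appSize = engine.getSession().getApplicationBufferSize();
        ByteBuffer outAppData = ByteBuffer.allocate(0);          // nothing to send yet
        ByteBuffer outNetData = ByteBuffer.allocate(packetSize);
        ByteBuffer inNetData = ByteBuffer.allocate(packetSize);
        inNetData.flip();                                        // read mode, no data received
        ByteBuffer inAppData = ByteBuffer.allocate(appSize);

        SSLEngineResult hello = engine.wrap(outAppData, outNetData);
        SSLEngineResult starved = engine.unwrap(inNetData, inAppData);
        return new SSLEngineResult[] { hello, starved };
    }

    public static void main(String[] args) throws Exception {
        SSLEngine engine = SSLContext.getDefault().createSSLEngine();
        for (SSLEngineResult result : firstSteps(engine)) {
            System.out.println(result);
        }
    }
}
```

In a real client, the contents of outNetData would be flipped and written to the socket before waiting for the server's reply.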

To close the connection, the application must first
inform the SSLEngine that there is no more application data to be sent and,
therefore, the session should be terminated. This is done by calling
SSLEngine.closeOutbound(). After this, a call to wrap()
will generate a close message that must be sent to the other peer. A well-behaved program should wait
for the answer to this message, but the SSL/TLS specification says that it is
acceptable to close the socket after sending the initial close message. And,
typically, this is the easiest solution. After being closed, an SSLEngine
cannot be reused.
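
The outbound half of this shutdown can be sketched as follows. The names ShutdownSketch and closeOutbound are hypothetical, and the code takes the simpler route described above: it generates the close message and does not wait for the peer's answer. It assumes outNetData is in write mode with room for at least one full packet.

```java
import java.nio.ByteBuffer;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLEngineResult;
import javax.net.ssl.SSLException;

// Hypothetical helper for the outbound side of a graceful shutdown.
final class ShutdownSketch {

    // Signals that no more application data will be sent and collects
    // any closure data in outNetData. Returns the number of bytes produced.
    static int closeOutbound(SSLEngine engine, ByteBuffer outNetData)
            throws SSLException {
        ByteBuffer noAppData = ByteBuffer.allocate(0);
        engine.closeOutbound();
        int produced = 0;
        while (!engine.isOutboundDone()) {
            SSLEngineResult result = engine.wrap(noAppData, outNetData);
            produced += result.bytesProduced();
            // Stop if the buffer is full or the engine made no progress.
            if (result.getStatus() == SSLEngineResult.Status.BUFFER_OVERFLOW
                    || result.bytesProduced() == 0) {
                break;
            }
        }
        return produced;
    }
}
```

After this call, whatever was written to outNetData should be flipped and sent to the peer before the socket is closed.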

Conclusion

Java NIO gave developers the tools required
to build highly scalable servers, but not for doing so securely. Developers
were forced to choose between high scalability with Java NIO, and security with
the traditional stream-based API. Now, Java 5 introduces the SSLEngine API,
which solves the problem once and for all, both for existing and future IO and
threading models, by providing a transport-independent approach for protecting
the communication between two peers. Unfortunately, this is a complex API with
a long and steep learning curve. But when scalability and security are not
optional, this is a price developers will have to pay.