Secure Your Sockets with JSSE
Back in 2001, Sun introduced the Java NIO API as part of the then newly released Java SDK 1.4. This API solved a significant limitation of the Java SDK, which was the lack of an API for building highly scalable network applications. Previously, the IO support in Java was limited to stream-based, blocking IO, which although elegant and simple, is significantly impaired in terms of scalability, requiring one active thread for each network connection. Java NIO introduced support for IO multiplexing and non-blocking IO, which are necessary tools to build highly scalable applications. In the article "Building Highly Scalable Servers with Java NIO," I discussed these two new features and, as a proof of concept, presented an IO framework capable of scaling up to several thousands of simultaneous connections.
After the initial experiments with Java NIO, most developers start wondering about security; in particular, how to use SSL with Java NIO. With the traditional blocking sockets API, security is a simple issue: just set up an SSLContext instance with the appropriate key material, use it to create instances of SSLSocketFactory or SSLServerSocketFactory, and finally use these factories to create instances of SSLServerSocket or SSLSocket. And that's all there is to it! After they are created, they can be used just like plaintext sockets, requiring no changes to the code that uses them. (For more information, see "Secure Your Sockets with JSSE.") So how hard can it be with Java NIO? The Javadocs provide no information about the issue, which is enough to make one suspicious. The next step a typical developer takes is to Google "SSL Java NIO." The only results are a few discussions where other developers complain about the same problem. Reading those discussions provides the answer to our question: with the Java SDK 1.4, it is not possible to use SSL with Java NIO! We can either have security or scalability, but not both!
Fortunately, this limitation was corrected by the newly released JDK 5 with the introduction of the SSLEngine API. (Actually, SSLEngine is only the name of the main class of the API, but for lack of a better name, I'll use it to refer to the whole API.) The solution offered by this API is a bit of a surprise for those, like me, who were expecting a solution similar to the one used with the stream-based API, which would be an SSLSocketChannel class that could be used as a drop-in replacement for standard SocketChannels. Instead of this obvious solution, Sun decided to solve the SSL problem once and for all, by making the SSLEngine API transport-independent, thereby completely separating the SSL support from IO. The SSLEngine API is only responsible for implementing the SSL/TLS state machine, which performs all communication with the outside world using byte buffers. Therefore, the developer is free to use any transport mechanism he finds appropriate.
This solution has the significant advantage of supporting all possible IO and threading models, both existing and future ones. Unfortunately, as in many other situations, flexibility comes at the price of complexity. Many of the details of the SSL protocol that are hidden from the developer by the traditional stream-based SSL API are now exposed to the developer, who must deal with them directly. This mainly includes handshaking and reassembling SSL packets before decrypting them. There are other details that must be dealt with, but these two are the ones likely to cause headaches.
It is no wonder that Sun considers this an advanced API, recommending that beginners continue using SSLSockets. After having experimented with it, I couldn't agree more. But if you really need the scalability offered by Java NIO, you have no choice but to get your hands dirty. And that's what I've been doing for the last few weeks, while extending the IO framework presented in my previous article to support SSL.
In this article, I describe the SSLEngine API and present the main lessons learned during my contact with this API. For those interested in a deeper understanding of this API, the article includes the source code of the revised IO framework.
The SSLEngine API
The workhorse of the SSLEngine API is the javax.net.ssl.SSLEngine class, which implements the SSL/TLS state machine and performs all operations related to the protocol. This includes handshaking, encryption, and decryption.
Lifecycle
The lifecycle of an SSLEngine is described in Figure 1.
Figure 1. The lifecycle of an SSLEngine
SSLEngine instances are created by the SSLContext class. This class was introduced with the stream-based SSL API, where it is used to initialize a security context with cryptographic material and to create SSL socket factories. In JDK 1.5, it was extended to also create SSLEngine instances. The setup process for an SSLContext is exactly the same as before, and I'll not mention it here.
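As a minimal sketch of this step, the following code obtains an engine from the JVM's default SSLContext. Using the default context instead of one initialized with your own key material is an assumption made to keep the example short, and EngineFactory and newClientEngine are hypothetical names, not part of JSSE.

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

// Hypothetical helper class; only SSLContext and SSLEngine come from JSSE.
class EngineFactory {
    // Creates a client-mode engine. The host and port are advisory hints
    // stored by the engine; the engine never opens a connection itself.
    static SSLEngine newClientEngine(String peerHost, int peerPort) {
        try {
            // Assumption: the default context is good enough for a demo.
            SSLContext context = SSLContext.getDefault();
            SSLEngine engine = context.createSSLEngine(peerHost, peerPort);
            engine.setUseClientMode(true);
            return engine;
        } catch (java.security.NoSuchAlgorithmException e) {
            throw new IllegalStateException("No default SSLContext", e);
        }
    }
}
```

A server would follow the same pattern with a context that holds its private key and certificate, and with setUseClientMode(false).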
After it is created, an SSLEngine must first go through the handshake, where the server and the client negotiate the cipher suite and the session keys. This phase typically involves the exchange of several messages.
After completing the handshake, the application can start sending and receiving application data. This is the main state of the engine and will typically last until the connection is closed.
In some situations, one of the peers may ask for a renegotiation of the session parameters, either to generate new session keys or to change the cipher suite. This forces a re-handshake. Although in the diagram they are represented as separate states, a re-handshake does not stop the flow of application data. Therefore, handshake and application data can be intermixed freely during this stage, which poses a challenge to the developer, who must be careful to separate the two types of data.
When one of the peers is done with the connection, it should initiate a graceful shutdown, as specified in the SSL/TLS protocol. This involves exchanging a couple of closure messages between the client and the server to terminate the logical session before physically closing the socket.
Interaction with Applications
Now that the basic lifecycle of an SSLEngine has been described, it's time to take a closer look at its interaction with applications. Figure 2 presents the typical structure of an application using an SSLEngine instance.
Figure 2. The flow of data in an application using an SSLEngine
The two main methods of the SSLEngine are wrap() and unwrap(). These methods have various overloaded versions, but the following two signatures are the ones that are likely to be used most often:
public SSLEngineResult wrap(
    ByteBuffer src, ByteBuffer dst)
    throws SSLException;
public SSLEngineResult unwrap(
    ByteBuffer src, ByteBuffer dst)
    throws SSLException;
SSLEngine.wrap() receives plaintext data from the application and encrypts it. It may also generate handshake data, if a handshake is in progress. The result, containing both encrypted application data and handshake data, is given back to the application in order to be sent to the peer. In the opposite direction, SSLEngine.unwrap() processes data read from the network, which may include handshake and encrypted application data. The handshake data is used to update the internal state of the SSLEngine and the application data is decrypted and passed to the application. As a result of this behavior, a typical application will have the following four buffers:
· inNetData
Stores data received directly from the network. This consists of encrypted data and handshake information. This buffer is filled with data read from the socket and emptied by SSLEngine.unwrap().
· inAppData
Stores decrypted data received from the peer. This buffer is filled by SSLEngine.unwrap() with decrypted application data and emptied by the application.
· outAppData
Stores plaintext application data that is to be sent to the other peer. The application fills this buffer, which is then emptied by SSLEngine.wrap().
· outNetData
Stores data that is to be sent to the network, including handshake and encrypted application data. This buffer is filled by SSLEngine.wrap() and emptied by writing it to the network.
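The four buffers can be allocated as in the sketch below. Sizing them from SSLSession.getPacketBufferSize() and SSLSession.getApplicationBufferSize() is one common choice rather than a requirement, and the EngineBuffers class with its forDefaultClient() helper is a hypothetical name introduced only for this illustration.

```java
import java.nio.ByteBuffer;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLSession;

// Hypothetical holder for the four buffers described in the list above.
class EngineBuffers {
    final ByteBuffer inNetData;   // socket -> unwrap()
    final ByteBuffer inAppData;   // unwrap() -> application
    final ByteBuffer outAppData;  // application -> wrap()
    final ByteBuffer outNetData;  // wrap() -> socket

    EngineBuffers(SSLEngine engine) {
        SSLSession session = engine.getSession();
        // Largest SSL/TLS packet the engine expects to read or produce.
        int packetSize = session.getPacketBufferSize();
        // Largest chunk of plaintext a single packet can carry.
        int appSize = session.getApplicationBufferSize();
        inNetData = ByteBuffer.allocate(packetSize);
        inAppData = ByteBuffer.allocate(appSize);
        outAppData = ByteBuffer.allocate(appSize);
        outNetData = ByteBuffer.allocate(packetSize);
    }

    // Demo helper (assumption: the JVM default SSLContext is acceptable).
    static EngineBuffers forDefaultClient() {
        try {
            SSLEngine engine = SSLContext.getDefault().createSSLEngine();
            engine.setUseClientMode(true);
            return new EngineBuffers(engine);
        } catch (java.security.NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Buffers of these sizes guarantee that one full packet fits in each network buffer, which keeps the overflow and underflow handling simple.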
The buffers must be carefully managed, so that when wrap() and unwrap() are called, there is enough data to process and enough space to store the generated data. To help with this task, those methods return an instance of SSLEngineResult, containing information about the overall status of the engine and about the handshake status.
The overall status information is used to notify the developer of the result of the last operation attempted by the engine. It can take the following values:
· BUFFER_OVERFLOW
There is not enough space in the output buffer to write all of the data that would be generated by the method. The application should free some space in the output buffer.
· BUFFER_UNDERFLOW
There is not enough data in the input buffer to perform the operation. The application should read more data from the network. As far as I understand, this result happens only in calls to unwrap(). The SSL/TLS protocol is packet-based and unwrap() can only operate on full packets. If the input buffer does not contain a full packet, unwrap() will return this result. In a call to wrap(), the SSLEngine is able to create an SSL/TLS packet with whatever data is available in the input buffer, so it should never complain about a buffer underflow.
· CLOSED
The SSLEngine is closed. This instance can no longer be used.
· OK
The operation was performed successfully. Some data was either consumed or produced, or both.
The handshake status is used to inform about any handshake that may be in progress. It can be one of the following:
· FINISHED
The last operation terminated the handshake.
· NEED_TASK
A lengthy task must be performed in order to continue the handshake. More on this later.
· NEED_UNWRAP
unwrap() must be called to proceed with the handshake.
· NEED_WRAP
wrap() must be called to proceed with the handshake.
· NOT_HANDSHAKING
No handshake is in progress.
The result of a wrap()/unwrap() call also contains the number of bytes consumed and produced.
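One way to act on these two pieces of status information is sketched below. The ResultAdvisor class and its Action enum are hypothetical names used only for this illustration; the order of the checks (overall status first, handshake status second) is a design choice of the sketch, not something mandated by the API.

```java
import javax.net.ssl.SSLEngineResult;

// Hypothetical helper that maps an SSLEngineResult to the application's next step.
class ResultAdvisor {
    enum Action {
        READ_MORE, DRAIN_OUTPUT, CLOSE_CONNECTION,
        RUN_TASKS, CALL_WRAP, CALL_UNWRAP, CONTINUE
    }

    static Action nextAction(SSLEngineResult result) {
        // The overall status says whether the last call could do its work.
        switch (result.getStatus()) {
            case BUFFER_UNDERFLOW:
                return Action.READ_MORE;        // need a full packet
            case BUFFER_OVERFLOW:
                return Action.DRAIN_OUTPUT;     // free space in the output buffer
            case CLOSED:
                return Action.CLOSE_CONNECTION; // engine is finished
            default:
                break;                          // OK: look at the handshake
        }
        // The handshake status says what the engine wants next.
        switch (result.getHandshakeStatus()) {
            case NEED_TASK:
                return Action.RUN_TASKS;
            case NEED_WRAP:
                return Action.CALL_WRAP;
            case NEED_UNWRAP:
                return Action.CALL_UNWRAP;
            default:
                // FINISHED, NOT_HANDSHAKING, and any value added by later JDKs.
                return Action.CONTINUE;
        }
    }
}
```

The number of bytes consumed and produced is available from the same result through getBytesConsumed() and bytesProduced(), so a caller can update its own bookkeeping before taking the next action.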
Handling Lengthy Operations
Before giving an example of a handshake sequence, it is necessary to explain the NEED_TASK status. During handshake, the SSL/TLS protocol often needs to perform operations that block or that take a long time to complete. In most situations, this corresponds to the generation of session keys, but in more complex scenarios, it may involve asking for a password from the user or validating a certificate with a remote server. In non-blocking IO models, these operations cannot be performed in the same thread that is used to service IO requests, since doing so would block all other connections serviced by that thread. Therefore, the SSLEngine supports a mechanism to delegate these tasks to external threads. When such a lengthy task must be performed, the NEED_TASK status is returned. The developer must then call the method SSLEngine.getDelegatedTask() to obtain an instance of Runnable that encapsulates the task and then execute it in the most suitable way. The most common options are executing it synchronously in the IO thread, or asynchronously, either in a thread pool or in a new thread. The following code shows how to execute the tasks in a separate thread using the new java.util.concurrent package:
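import java.util.concurrent.Executor;
import java.util.concurrent.Executors;
import javax.net.ssl.SSLEngineResult;

// A single background thread runs the delegated tasks.
Executor exec = Executors.newSingleThreadExecutor();
// res is the SSLEngineResult of the last wrap() or unwrap() call.
if (res.getHandshakeStatus() ==
    SSLEngineResult.HandshakeStatus.NEED_TASK) {
  Runnable task;
  // The engine may have several tasks pending; hand over all of them.
  while ((task = engine.getDelegatedTask()) != null) {
    exec.execute(task);
  }
}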
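For comparison, here is a minimal sketch of the synchronous option, which runs the tasks in the calling thread. Once the tasks have run, the status held in the old SSLEngineResult is stale, so the sketch asks the engine for its current handshake status. DelegatedTasks, runAll, and demoClientEngine are hypothetical names, and the demo engine comes from the JVM default SSLContext as an assumption.

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLEngineResult.HandshakeStatus;

// Hypothetical helper that runs delegated tasks in the calling thread.
class DelegatedTasks {
    // Runs every pending task, then returns the engine's fresh handshake status.
    static HandshakeStatus runAll(SSLEngine engine) {
        Runnable task;
        while ((task = engine.getDelegatedTask()) != null) {
            task.run(); // blocks the caller until the task completes
        }
        // The previous result is outdated now; query the engine directly.
        return engine.getHandshakeStatus();
    }

    // Demo helper: a client engine with its handshake already started.
    static SSLEngine demoClientEngine() {
        try {
            SSLEngine engine = SSLContext.getDefault().createSSLEngine();
            engine.setUseClientMode(true);
            engine.beginHandshake();
            return engine;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Running tasks this way is simple but stalls every connection served by the IO thread, so it mostly suits tests and low-traffic applications.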
A Sample SSL Session
Now I'll describe part of a typical SSL client session. A client is responsible for initiating the handshake sequence by sending a hello message to the server. To do so, the SSLEngine must be put in client mode and the handshake initiated. This is done with the following code:
sslEngine.setUseClientMode(true);
sslEngine.beginHandshake();
The server does the same, but initializing the engine to server mode:
sslEngine.setUseClientMode(false);
sslEngine.beginHandshake();
The client then calls wrap() to generate the initial handshake message. The result of this call will typically be:
Status = OK HandshakeStatus = NEED_UNWRAP
bytesConsumed = 0 bytesProduced = 100
The operation was performed correctly, but now the engine is waiting for an unwrap() to proceed with the handshake. Also notice that a message of 100 bytes was produced, although no data was consumed. This is the engine generating the hello message. Suppose we try to call unwrap() without having read enough data. The result would be:
Status = BUFFER_UNDERFLOW HandshakeStatus = NEED_UNWRAP
bytesConsumed = 0 bytesProduced = 0
This is a complaint about a buffer underflow, which is how the engine asks for more data in the input buffer. After reading enough data from the network, the result of a call to unwrap() is the following:
Status = OK HandshakeStatus = NEED_TASK
bytesConsumed = 701 bytesProduced = 0
This time no data was produced, but the data read from the network was consumed. Before proceeding, the engine must perform some lengthy task (most likely, it needs to generate the session keys). The developer must execute all pending tasks, call wrap(), and then send the result to the other peer.
The handshake goes on for a few more messages until finishing with the following result:
Status = OK HandshakeStatus = FINISHED
bytesConsumed = 37 bytesProduced = 0
With the handshake finished, the SSLEngine is finally ready to process application data.
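The first two steps of this session can be reproduced without a server, as in the sketch below: a wrap() that produces the hello message, followed by an unwrap() on an empty network buffer. ClientHelloDemo and its methods are hypothetical names, the engine comes from the JVM default SSLContext as an assumption, and the exact byte counts and handshake statuses depend on the JDK and protocol version in use.

```java
import java.nio.ByteBuffer;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLEngineResult;

// Hypothetical demo of the opening steps of a client handshake.
class ClientHelloDemo {
    static SSLEngine newClient() {
        try {
            SSLEngine engine = SSLContext.getDefault().createSSLEngine();
            engine.setUseClientMode(true);
            engine.beginHandshake();
            return engine;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // wrap() with no application data: the engine emits its hello message.
    static SSLEngineResult wrapHello(SSLEngine engine) {
        ByteBuffer outAppData = ByteBuffer.allocate(0);
        ByteBuffer outNetData = ByteBuffer.allocate(
                engine.getSession().getPacketBufferSize());
        try {
            return engine.wrap(outAppData, outNetData);
        } catch (javax.net.ssl.SSLException e) {
            throw new IllegalStateException(e);
        }
    }

    // unwrap() before anything was read from the network.
    static SSLEngineResult unwrapNothing(SSLEngine engine) {
        ByteBuffer inNetData = ByteBuffer.allocate(0);
        ByteBuffer inAppData = ByteBuffer.allocate(
                engine.getSession().getApplicationBufferSize());
        try {
            return engine.unwrap(inNetData, inAppData);
        } catch (javax.net.ssl.SSLException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling wrapHello() and then unwrapNothing() on the same engine, and printing each result, should give output in the same shape as the listings above, with different numbers.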
To close the connection, the user must first inform the SSLEngine that there is no more application data to be sent and, therefore, the session should be terminated. This is done by calling SSLEngine.closeOutbound(). After this, a call to wrap() will generate a close message that must be sent to the other peer. A well-behaved program should wait for the answer to this message, but the SSL/TLS specification says that it is acceptable to close the socket after sending the initial close message. And, typically, this is the easiest solution. After being closed, an SSLEngine cannot be reused.
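A minimal sketch of the sending half of this shutdown follows. OutboundCloser and its methods are hypothetical names, and the cap on the number of wrap() calls is a safety measure of the sketch rather than part of the API. The returned buffer holds whatever closure data the engine produced. The caller still has to write it to the socket, and an engine that never started a handshake may produce none.

```java
import java.nio.ByteBuffer;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

// Hypothetical helper for the outbound half of a graceful shutdown.
class OutboundCloser {
    // Closes the outbound side and collects the closure message(s).
    static ByteBuffer closeOutbound(SSLEngine engine) {
        engine.closeOutbound(); // no more application data will be sent
        ByteBuffer noData = ByteBuffer.allocate(0);
        // Room for more than one packet, in case several alerts are produced.
        ByteBuffer outNetData = ByteBuffer.allocate(
                2 * engine.getSession().getPacketBufferSize());
        try {
            // Cap the attempts so that a misbehaving engine cannot loop forever.
            for (int i = 0; i < 4 && !engine.isOutboundDone(); i++) {
                engine.wrap(noData, outNetData);
            }
        } catch (javax.net.ssl.SSLException e) {
            throw new IllegalStateException(e);
        }
        outNetData.flip(); // ready to be written to the socket
        return outNetData;
    }

    // Demo helper (assumption: the JVM default SSLContext is acceptable).
    static SSLEngine demoClientEngine() {
        try {
            SSLEngine engine = SSLContext.getDefault().createSSLEngine();
            engine.setUseClientMode(true);
            return engine;
        } catch (java.security.NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After the buffer has been written, the application can either close the socket right away or keep calling unwrap() until the peer's own close message arrives.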
Conclusion
Java NIO gave developers the tools required to build highly scalable servers, but not for doing so securely. Developers were forced to choose between high scalability with Java NIO, and security with the traditional stream-based API. Now, Java 5 introduces the SSLEngine API, which solves the problem once and for all, both for existing and future IO and threading models, by providing a transport-independent approach for protecting the communication between two peers. Unfortunately, this is a complex API with a long and steep learning curve. But when scalability and security are not optional, this is a price developers will have to pay.