A JavaScript Implementation of TLS (Part 2/2)

In our previous article on a JavaScript implementation of TLS, we explained why we created Forge, which we released as open source software. To summarize, before Forge, there was no easy way to access a home computer using just JavaScript and Flash – technologies that exist in 98.9% of all browsers. With Forge, application providers such as Google Docs can now provide access to your home computer in a way that is safe and secure.

Forge Design

We really only expect developers to find this part interesting so we assume that if you’re still reading you are one.

We created the Forge project to provide a means for our customers to securely connect from our website, bitmunk.com, to their Bitmunk P2P application, which they run on their home computers. It is also related to two of our other projects: PaySwarm and Monarch.

In designing Forge, we used a similar approach to how we design many of our products here at Digital Bazaar. First, we start with a bottom-up approach. There are a lot of little individual parts that must all come together to make a complete TLS+HTTP stack. Once we’ve built and tested those little parts, we switch to a top-down approach. Here we discover exactly how the code best flows in a TLS implementation. We study how state-changes are driven in the system and which code paths can be or should be shared and reused. We design the API that a user of our TLS implementation will interact with and integrate it with our system design. This way we understand the bigger picture of what is going on, how the implementation will be used, and we can better ensure that we don’t make poor critical design decisions. Once we’ve got the code flow worked out, we can go in and fill out the middle details by gluing together the pieces we created during the bottom-up design phase.

Next we’ll provide an overview of the pieces you need to build a TLS implementation and explain how we decided the code ought to flow and interact with a user. We won’t really talk about the middle glue, since for the most part it simply involves following the TLS spec, but we will briefly mention how we tested our implementation to sort out any problems we had.

The Cross-domain Problem

Since Flash 9.0.115.0 Adobe has made it possible to create raw sockets in Flash provided that the server they are to communicate with can serve up a cross-domain policy. That policy is in XML format and is, by default, served from port 843. You can, however, specify a different port from which to obtain the policy. There’s another option, which involves serving it directly in-line with the HTTP protocol, but not as part of the protocol itself. We thought that a bit hackish so we opted for using a custom port. This way our application can select a port (or you can configure one) that will serve up the policy. The port that gets selected can be uploaded to our website and stored in a database, along with the SSL certificate generated for the particular application, its current IP address, and access port. Dealing with firewalls is beyond the scope of this document other than to say that we use a UPnP implementation that handles the issue on many routers.
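To make the policy exchange a little more concrete, here is a minimal sketch of what a policy server has to recognize and send back. The element and attribute names follow Adobe's socket policy file format; the function names and parameters are hypothetical and are not taken from our application.

```javascript
// Flash requests a policy by sending this string followed by a null byte.
const POLICY_REQUEST = '<policy-file-request/>\0';

// Hypothetical helper: true if the received data is a policy request.
function isPolicyRequest(data) {
  return data === POLICY_REQUEST;
}

// Hypothetical helper: build a policy that lets SWFs served from `domain`
// open raw sockets to the listed ports. The reply is null-terminated too.
function buildSocketPolicy(domain, ports) {
  return '<?xml version="1.0"?>' +
    '<cross-domain-policy>' +
    '<allow-access-from domain="' + domain + '" to-ports="' + ports.join(',') + '"/>' +
    '</cross-domain-policy>\0';
}
```

A server listening on the policy port would answer any policy request with the output of buildSocketPolicy and then close the connection.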

So what’s in the swf that we serve from our website? An interface to JavaScript that permits it to do the following:

Create or destroy a raw socket.

Send or receive data on that raw socket.

Deflate or inflate data using the DEFLATE algorithm (technically zlib, not just raw DEFLATE).

Store data (read: cookies) on the local disk using Flash’s SharedObject.

That’s all we use Flash for. There are some currently necessary and ugly inefficiencies when communicating between JavaScript and Flash. While, supposedly, both JavaScript and Flash use UTF-16 to store their strings internally, when we’re sending raw bytes between the two, we have to base64-encode them or else Flash gets confused about null bytes in the middle of a string (and possibly other bytes as well). There are some other not-niceties with Flash’s ExternalInterface (read: its bizarre string escaping behavior) that we managed to avoid by returning strings as the properties of objects rather than just directly.

We tried to minimize our Flash usage as much as possible and we aim to replace what we can in the future with features from HTML5.

With this Flash API our JavaScript can create raw bytes and send them to any server that provides a compatible cross-domain policy. Since we create the application that you install, we include such a policy.

A TLS implementation requires a lot of little pieces to come together. We made some choices to try to create or acquire those little pieces as quickly as possible without sacrificing too much performance. Also, we were originally going to start out by implementing the latest version of TLS (1.2), or maybe 1.1 if 1.2 was still too new. It turns out that even 1.1 is still too new: our application uses OpenSSL for its server-side TLS, which hasn’t yet implemented anything higher than 1.0.

The Technical Requirements of a TLS Implementation

Raw byte storage

Here we knew that JavaScript stored strings internally as UTF-16 so we decided to just try writing a class similar to what we use in some of our C++ code called ByteBuffer. Our byte buffer implementation in JavaScript was built around storing bytes in strings using the charCodeAt() and fromCharCode() functions to do byte to character conversions. We tried it out, ran some tests, and found that it actually worked and was pretty quick and compact. That’s what we use for all byte storage on the JavaScript side.
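The idea can be sketched in a few lines. This is a simplified illustration, not Forge's actual class; the method names are assumptions chosen for the example. Each byte is stored as one character of a string, and a read index tracks how much has been consumed.

```javascript
// Hypothetical minimal byte buffer: one byte per string character.
class ByteBuffer {
  constructor(bytes = '') {
    this.data = bytes; // binary string, each char code is 0..255
    this.read = 0;     // index of the next unread byte
  }
  length() { return this.data.length - this.read; }
  putByte(b) { this.data += String.fromCharCode(b & 0xff); return this; }
  putInt16(n) { return this.putByte(n >> 8).putByte(n); } // big-endian
  putBytes(s) { this.data += s; return this; }
  getByte() { return this.data.charCodeAt(this.read++); }
  getInt16() { const hi = this.getByte(); return (hi << 8) | this.getByte(); }
  getBytes(count) {
    const out = this.data.substring(this.read, this.read + count);
    this.read += out.length;
    return out;
  }
  toHex() { // hex dump of the unread bytes
    let hex = '';
    for (let i = this.read; i < this.data.length; i++) {
      hex += this.data.charCodeAt(i).toString(16).padStart(2, '0');
    }
    return hex;
  }
}
```

Writing a record type, a version, and a payload is then a chain of calls such as new ByteBuffer().putByte(0x16).putInt16(0x0301).putBytes('hi').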

At least 1 cipher suite

The TLS 1.0 spec has a mandatory cipher suite: TLS_DHE_DSS_WITH_3DES_EDE_CBC_SHA. That name means that the key exchange uses ephemeral Diffie-Hellman with parameters signed by a DSA key, that data is encrypted using triple DES (in Encrypt-Decrypt-Encrypt Cipher-Block-Chaining mode), and that SHA-1 is used for message authentication. The TLS 1.2 spec has a mandatory cipher suite: TLS_RSA_WITH_AES_128_CBC_SHA. Since we started out implementing TLS 1.2 and AES is quicker, newer, in wide adoption, and believed to be more secure, we preferred it. We also already needed RSA to handle the SSL certificates we generate. So we decided to do a little less work and sacrifice being officially compliant with TLS 1.0, but we are still practically compliant with just about every major TLS server out there.
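On the wire, each cipher suite is just a two-byte identifier, and the client offers a list of them in its ClientHello. The sketch below shows the identifiers the TLS specs assign to the two suites named above and how such a list is encoded; the function name is hypothetical.

```javascript
// Two-byte identifiers assigned by the TLS specifications.
const CipherSuite = {
  TLS_DHE_DSS_WITH_3DES_EDE_CBC_SHA: 0x0013,
  TLS_RSA_WITH_AES_128_CBC_SHA: 0x002f
};

// Hypothetical helper: encode the cipher_suites vector of a ClientHello,
// a two-byte length (in bytes) followed by two bytes per suite.
function encodeCipherSuites(suites) {
  const byteLength = suites.length * 2;
  const out = [byteLength >> 8, byteLength & 0xff];
  for (const id of suites) {
    out.push(id >> 8, id & 0xff);
  }
  return out;
}
```

A client that supports only the AES suite would send encodeCipherSuites([CipherSuite.TLS_RSA_WITH_AES_128_CBC_SHA]).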

AES Encryption

We tested a number of existing open source AES JavaScript implementations to find the most performant. We ended up choosing one from a library, which at the time, was called “jscrypto” and was under a public domain license. It has since been renamed to the “Stanford JavaScript Crypto Library” or SJCL. We thought the implementation was very clever and fast, but the commenting within the source for how it worked was lacking. We made an effort to understand what was going on in the implementation, heavily commented the source, and made a few optimizations and changes. Since that point in time, SJCL has made similar optimizations and we believe their implementation is actually now slightly faster than ours. However, the difference in speed wasn’t large enough to warrant spending that much time analyzing where the differences were or reintegrating their updates with our different interface.

RSA Encryption

We’ve had a lot of experience here at Digital Bazaar working with RSA and DSA public and private keys and RSA encryption. Therefore, we already knew the basics behind how to do RSA encryption, including how to do RSA decryption more quickly using the Chinese Remainder Theorem. However, we didn’t have a good BigInteger implementation in JavaScript in-house. We had researched performant JavaScript BigIntegers for some other projects, so we had a head start on finding one that we liked. The fastest two we considered were Leemon Baird’s BigInt and Tom Wu’s BigInteger. Both were quite fast compared to anything else we found on the net, with Leemon’s out-performing Tom Wu’s in Chrome but Tom Wu’s out-performing Leemon’s in Firefox. However, Tom Wu’s BigInteger had a much more pleasant API. Leemon Baird’s site states that he wrote his BigInt purely for his own entertainment and found it hard to imagine any practical cryptographic use for his code. We found a use, but it turns out that his implementation had a tendency to pollute the global namespace in JavaScript, making it difficult to use alongside other JavaScript. We would have had to rewrite the API, and we just didn’t need to do that with Tom Wu’s BigInteger. So we selected his. It turns out that Tom Wu also had some code for doing RSA encryption, but we had done it before and it only takes a few lines of code (in stark contrast to making a really fast BigInteger library), so we just made use of his BigInteger library within our own APIs.
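To show why the Chinese Remainder Theorem helps, here is a toy sketch of CRT-based RSA decryption. It uses the native BigInt type found in current JavaScript engines rather than Tom Wu's BigInteger (an assumption made to keep the example self-contained), the key is the tiny textbook key p = 61, q = 53, and the function names are hypothetical. Instead of one exponentiation modulo n, it does two cheaper ones modulo p and q and recombines the results.

```javascript
// Square-and-multiply modular exponentiation on BigInt values.
function modPow(base, exp, mod) {
  let result = 1n;
  base %= mod;
  while (exp > 0n) {
    if (exp & 1n) result = (result * base) % mod;
    base = (base * base) % mod;
    exp >>= 1n;
  }
  return result;
}

// Toy key (far too small for real use). qInv is q^-1 mod p, precomputed.
const p = 61n, q = 53n, d = 2753n;
const key = {
  n: p * q, e: 17n, d, p, q,
  dP: d % (p - 1n),
  dQ: d % (q - 1n),
  qInv: 38n
};

// CRT decryption: two small exponentiations, then recombination.
function rsaDecryptCrt(c, k) {
  const m1 = modPow(c, k.dP, k.p);
  const m2 = modPow(c, k.dQ, k.q);
  let h = (k.qInv * (m1 - m2)) % k.p;
  if (h < 0n) h += k.p; // keep h in the range 0..p-1
  return m2 + h * k.q;
}
```

Encrypting with modPow(m, key.e, key.n) and decrypting with rsaDecryptCrt gives back m, the same result as the direct modPow(c, key.d, key.n), but with exponents and moduli about half the size.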

SHA-1 and MD5

There were some implementations of these out on the web that we found but they were all more or less the same and simply followed the MD5 and SHA-1 pseudo code available on Wikipedia. Since they would be easy to write and we had some interest in how they worked we just quickly implemented them ourselves by following the pseudo code.

HMAC (Hash-based Message Authentication Code)

Writing an HMAC implementation is really easy when you’ve already got the actual hash functions. There is pseudo code available on Wikipedia. Our HMAC implementation is just a wrapper around one of the supported message digests (SHA-1 or MD5) that integrates a secret key properly.
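A minimal sketch of that wrapper, following the standard HMAC construction. To keep the example short it borrows SHA-1 from Node's built-in crypto module instead of our own JavaScript digest (an assumption for illustration), and the function names are hypothetical.

```javascript
const nodeCrypto = require('crypto');

// Stand-in digest: SHA-1 over a Buffer, returning a Buffer.
const sha1 = (data) => nodeCrypto.createHash('sha1').update(data).digest();

// HMAC over any digest with the given block size in bytes
// (64 for both MD5 and SHA-1). key and message are Buffers.
function hmac(digest, blockSize, key, message) {
  if (key.length > blockSize) key = digest(key); // long keys are hashed first
  const padded = Buffer.alloc(blockSize);        // zero-filled
  key.copy(padded);
  const ipad = Buffer.alloc(blockSize);
  const opad = Buffer.alloc(blockSize);
  for (let i = 0; i < blockSize; i++) {
    ipad[i] = padded[i] ^ 0x36;
    opad[i] = padded[i] ^ 0x5c;
  }
  const inner = digest(Buffer.concat([ipad, message]));
  return digest(Buffer.concat([opad, inner]));
}
```

Calling hmac(sha1, 64, Buffer.from('key'), Buffer.from('message')) should match what a library HMAC-SHA1 produces for the same inputs.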

Base64

We had written our own Base64 codec in other languages before so we just quickly ported that code and made some JavaScript-specific changes.
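For illustration, here is a small sketch of such a codec working on binary strings (one byte per character), the same form used when shuttling raw bytes between JavaScript and Flash. It is not our ported code; the function names are hypothetical.

```javascript
const B64 = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Encode a binary string (char codes 0..255) as Base64 text.
function encode64(bytes) {
  let out = '';
  for (let i = 0; i < bytes.length; i += 3) {
    const has2 = i + 1 < bytes.length;
    const has3 = i + 2 < bytes.length;
    const b1 = bytes.charCodeAt(i);
    const b2 = has2 ? bytes.charCodeAt(i + 1) : 0;
    const b3 = has3 ? bytes.charCodeAt(i + 2) : 0;
    out += B64[b1 >> 2];
    out += B64[((b1 & 0x03) << 4) | (b2 >> 4)];
    out += has2 ? B64[((b2 & 0x0f) << 2) | (b3 >> 6)] : '=';
    out += has3 ? B64[b3 & 0x3f] : '=';
  }
  return out;
}

// Decode Base64 text back into a binary string.
function decode64(text) {
  let out = '';
  let bits = 0;  // bit accumulator
  let nbits = 0; // number of valid bits in the accumulator
  for (const ch of text) {
    if (ch === '=') break;  // padding marks the end
    const v = B64.indexOf(ch);
    if (v < 0) continue;    // skip whitespace and unknown characters
    bits = (bits << 6) | v;
    nbits += 6;
    if (nbits >= 8) {
      nbits -= 8;
      out += String.fromCharCode((bits >> nbits) & 0xff);
      bits &= (1 << nbits) - 1; // drop the bits just emitted
    }
  }
  return out;
}
```

Because the encoded form contains only printable characters, a string with embedded null bytes such as '\x00\xff\x10' survives the trip intact.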

ASN.1 (Abstract Syntax Notation Number 1)

Like we mentioned earlier, we had already done lots of work with X.509 certificates (which are typically encoded in DER (Distinguished Encoding Rules) ASN.1) so we already had an ASN.1 implementation in a couple of languages other than JavaScript. However, we found a really comprehensive ASN.1 parser online by Lapo Luchini. Unfortunately, we found that its internal data storage wasn’t as efficient as our byte buffer implementation and that it actually supported far more from ASN.1 than we needed so we ported our simple and shorter Java ASN.1 implementation to JavaScript (which was nearly a direct port).
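A small DER reader really can be short. The sketch below parses the tag-length-value structure that DER uses and recurses into constructed types such as SEQUENCE. It is an illustration only, not our ported implementation: it works on plain arrays of byte values instead of our byte buffer, handles only single-byte tags, and the function name is hypothetical.

```javascript
// Parse one DER element starting at `offset` in an array of byte values.
// Returns { tag, constructed, end, children | value }.
function parseDer(bytes, offset = 0) {
  const tag = bytes[offset++];
  let length = bytes[offset++];
  if (length & 0x80) {
    // Long form: the low 7 bits give the number of length bytes that follow.
    const count = length & 0x7f;
    length = 0;
    for (let i = 0; i < count; i++) {
      length = length * 256 + bytes[offset++];
    }
  }
  const end = offset + length;
  const node = { tag, constructed: (tag & 0x20) !== 0, end };
  if (node.constructed) {
    node.children = [];
    while (offset < end) {
      const child = parseDer(bytes, offset);
      node.children.push(child);
      offset = child.end;
    }
  } else {
    node.value = bytes.slice(offset, end);
  }
  return node;
}
```

For example, the bytes 30 05 02 01 05 05 00 parse as a SEQUENCE containing the INTEGER 5 and a NULL.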

A cryptographically-secure PRNG (Pseudo Random Number Generator)

The key to a PRNG for TLS is that the output needs to be unpredictable. The TLS protocol requires some random bytes to be sent in the clear and if an attacker can steal those and figure out what random bytes will be next then they’ll be able to steal the keys generated to communicate securely. This is among the most difficult problems to solve and audit in a JavaScript implementation. It would be an easy problem to solve if web browsers exposed a JavaScript interface to the cryptographically-secure PRNG that they have to use to implement TLS natively, but this just isn’t the case yet.

The PRNG used in the current implementation is based on the Fortuna algorithm, designed by Bruce Schneier and Niels Ferguson. The Fortuna algorithm removes the necessity of having to try to estimate the amount of true entropy in the data you add to your entropy pool. It does this by evenly spreading the entropy from all of your various entropy sources out over 32 pools and then only periodically taking from some subset of those pools based on the number of reseeds that have occurred. To generate its random numbers, the Fortuna algorithm uses a cryptographic function, typically whichever is already available in the system the PRNG will be used in. In our case, we used AES as the cryptographic backend.
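The pool scheduling is the part that is easiest to show in code. In the published Fortuna design, pool i takes part in a reseed only when the reseed count is divisible by 2^i, so pool 0 is used every time and higher pools are used exponentially less often. The sketch below illustrates only that rule and the round-robin distribution of entropy; the names are hypothetical, and a real implementation hashes pool contents rather than storing them in arrays.

```javascript
const POOL_COUNT = 32;

// Hypothetical collector: spreads incoming entropy over the pools in turn.
class EntropyPools {
  constructor() {
    this.pools = Array.from({ length: POOL_COUNT }, () => []);
    this.next = 0;
  }
  collect(bytes) {
    this.pools[this.next].push(bytes);
    this.next = (this.next + 1) % POOL_COUNT;
  }
}

// Which pools feed reseed number `reseedCount` (counting from 1).
function poolsForReseed(reseedCount) {
  const used = [];
  for (let i = 0; i < POOL_COUNT; i++) {
    // Once 2^i fails to divide the count, no higher power can divide it.
    if (reseedCount % Math.pow(2, i) !== 0) break;
    used.push(i);
  }
  return used;
}
```

So the first reseed draws from pool 0 only, the second from pools 0 and 1, the third from pool 0 again, and the eighth from pools 0 through 3.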

The problem with a PRNG in JavaScript is that there aren’t as many sources of good entropy as there are for languages that have access to the hard disk. Our PRNG makes an attempt at grabbing entropy from page load times, mouse movements, keyboard presses, and some of the data available in the navigator object. However, there simply isn’t enough entropy available when the page first loads to immediately start generating numbers. In the future we might be able to start doing something like storing and sending a seed file from the server that hosts the JavaScript TLS, but that’s a lot of added complexity. Right now, if the PRNG needs to be seeded right away, we use the navigator’s information and yet another source of entropy by combining the browser’s built-in random number generator and one based on the Park-Miller “minimal standard” 32-bit PRNG. It isn’t perfect or pretty, but we hope that it’s good enough given our current limitations.
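For reference, the Park-Miller “minimal standard” generator itself is tiny: each step multiplies the state by 16807 modulo 2^31 - 1. The sketch below shows only that generator, with a hypothetical function name; how its output is mixed with the browser’s generator is not shown here, and on its own it is not cryptographically secure.

```javascript
// Park-Miller "minimal standard" generator: state = state * 16807 mod (2^31 - 1).
// The seed must be an integer in the range 1..2147483646.
function parkMiller(seed) {
  const MODULUS = 2147483647; // 2^31 - 1
  if (!Number.isInteger(seed) || seed <= 0 || seed >= MODULUS) {
    throw new RangeError('seed must be an integer in 1..2147483646');
  }
  let state = seed;
  return function next() {
    // The product stays below 2^53, so double-precision math is exact here.
    state = (state * 16807) % MODULUS;
    return state;
  };
}
```

Seeded with 1, the generator starts with 16807 and then 282475249.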

Once all of these pieces were ready we could then begin implementing TLS. To do this we read over the spec to get a general idea of what was going on and then began doing a top-down design. To cut down on time, we decided to implement only the client side of TLS; we didn’t have a use case for the server side. However, should someone need it, extending our current implementation to support server-side TLS doesn’t seem too daunting. Like we mentioned earlier, how each of these pieces fits together is the “middle glue” that we don’t cover in this article but can be easily gleaned from the TLS spec. Next we’ll discuss just a little bit of our top-down design and APIs without going into too many boring details.

Handling TLS Records

TLS traffic is broken down into records. Each record carries at most 16 KiB (2^14 bytes) of plaintext. The records contain application data, alerts (errors or warnings), or a message related to the TLS handshake protocol. The handshake protocol is used to establish a session which will contain and make use of the cipher suite that a client and server agree upon to secure their traffic. The records you receive control how state changes.

Our TLS connection object can either accept incoming records (typically from a web server but this is abstracted) or produce outgoing ones (again typically intended to be sent to a web server). When data comes in, we check to see if we are buffering a record already. If we are, then we add more data to the record. If not, we start a new record and take note of its size. If a record is part of the TLS handshake protocol then its full message size can be found inside of the record’s handshake message header. This can be used to determine whether or not a record has been fragmented so we know how long to keep reading until the full message has arrived.
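The record-level buffering can be sketched as follows. Every TLS record begins with a five-byte header: one byte of content type, two bytes of protocol version, and a two-byte length. The class below is a simplified illustration that works on arrays of byte values; its name and callback are hypothetical, and reassembly of fragmented handshake messages is left out.

```javascript
// Hypothetical record reader: buffers incoming bytes and emits whole records.
class RecordReader {
  constructor(onRecord) {
    this.buffer = [];
    this.onRecord = onRecord;
  }
  push(bytes) {
    this.buffer.push(...bytes);
    // A header is 5 bytes: type, version major, version minor, length (2).
    while (this.buffer.length >= 5) {
      const length = (this.buffer[3] << 8) | this.buffer[4];
      if (this.buffer.length < 5 + length) break; // wait for more data
      const record = {
        type: this.buffer[0],
        version: { major: this.buffer[1], minor: this.buffer[2] },
        fragment: this.buffer.slice(5, 5 + length)
      };
      this.buffer = this.buffer.slice(5 + length);
      this.onRecord(record);
    }
  }
}
```

Data can arrive in arbitrary pieces: the callback fires only once a record's full length has been received, and any leftover bytes start the next record.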

Once a full message has arrived we ship it off to update our current connection state. There are state tables that keep track of what the next valid state is based on the next record or handshake message type that we’re expecting. We enter the record type and message type into our state tables which call the appropriate function to handle the record and its message. If the message is unexpected, our error handler takes over and generates a TLS alert indicating there was an error, which will terminate the TLS connection. Otherwise, we will process either a handshake message, if we are still negotiating our session with the other end, or application data. For details on how the TLS handshake protocol works see the RFC or our source code.
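As a rough illustration of the table-driven approach, the sketch below maps the current state and the incoming handshake message type to a handler that returns the next state. The numeric values for content types, handshake types, and the unexpected_message alert come from the TLS spec; the state names, handler bodies, and function names are hypothetical, and only a small slice of the client handshake is covered.

```javascript
const ContentType = { ALERT: 21, HANDSHAKE: 22, APPLICATION_DATA: 23 };
const HandshakeType = { SERVER_HELLO: 2, CERTIFICATE: 11, SERVER_HELLO_DONE: 14 };
const AlertLevel = { FATAL: 2 };
const AlertDescription = { UNEXPECTED_MESSAGE: 10 };

// state -> handshake message type -> handler returning the next state.
// Real handlers would parse the message body; these only advance the state.
const handshakeTable = {
  awaitServerHello: {
    [HandshakeType.SERVER_HELLO]: () => 'awaitCertificate'
  },
  awaitCertificate: {
    [HandshakeType.CERTIFICATE]: () => 'awaitServerHelloDone'
  },
  awaitServerHelloDone: {
    [HandshakeType.SERVER_HELLO_DONE]: () => 'sendClientKeyExchange'
  }
};

// Look up and run the handler, or raise a fatal alert if none applies.
function dispatch(conn, record) {
  const row = record.type === ContentType.HANDSHAKE ? handshakeTable[conn.state] : undefined;
  const handler = row ? row[record.messageType] : undefined;
  if (!handler) {
    conn.alert = {
      level: AlertLevel.FATAL,
      description: AlertDescription.UNEXPECTED_MESSAGE
    };
    conn.state = 'closed';
    return false;
  }
  conn.state = handler(conn, record.body);
  return true;
}
```

A message that arrives out of order finds no entry in the current state's row, so the connection records a fatal alert and closes.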

Handling SSL Certificates

The only detail we’ll discuss here concerning the handshake protocol is how certificates are handled. This is important because our design provides a useful callback during the certificate verification process.

When the server’s SSL certificate is checked, it is part of a chain of certificates. If the certificate is self-signed there will be only one certificate in the chain. Otherwise, each subsequent certificate in the chain must be the issuer of the previous one and must have digitally signed it. The certificate chain verification process therefore checks for this condition and ensures that some other details about the certificates in the chain are valid (e.g., expiration dates). Every time a certificate in the chain has been examined, an optional user callback is called with four arguments: the TLS connection; a verified flag, which is true if the certificate passed all verification checks and is otherwise the TLS alert value corresponding to how it failed; the depth of the certificate in the chain; and the chain as an array with the server’s certificate at index 0. The user function can return true to indicate that the certificate should be considered verified (trusted) or a TLS alert indicating why it isn’t trusted. This allows customized certificate verification.
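The shape of that callback can be sketched with a simplified chain walk. The alert numbers come from the TLS spec; everything else is an assumption for illustration: certificates are plain objects with subject, issuer, and notAfter fields, signature checking and trust-store lookup are omitted, and the function names are hypothetical.

```javascript
const Alert = { BAD_CERTIFICATE: 42, CERTIFICATE_EXPIRED: 45, UNKNOWN_CA: 48 };

// Walk the chain from the server's certificate (index 0) upward, calling the
// optional `verify` callback after each certificate has been examined.
// Returns true, or the TLS alert value describing the failure.
function verifyChain(connection, chain, now, verify) {
  for (let depth = 0; depth < chain.length; depth++) {
    const cert = chain[depth];
    const parent = chain[depth + 1];
    let verified = true;
    if (now > cert.notAfter) {
      verified = Alert.CERTIFICATE_EXPIRED;
    } else if (parent && cert.issuer !== parent.subject) {
      verified = Alert.BAD_CERTIFICATE; // next certificate is not the issuer
    } else if (!parent && cert.issuer !== cert.subject) {
      verified = Alert.UNKNOWN_CA;      // chain ends without a self-signed root
    }
    if (verify) {
      // The callback may confirm or override the built-in result.
      verified = verify(connection, verified, depth, chain);
    }
    if (verified !== true) return verified;
  }
  return true;
}
```

A callback such as (conn, verified, depth, chain) => depth === 0 && chain[0].subject === 'my-app' ? true : verified would accept one specific application certificate even if the built-in checks rejected it, and defer to them otherwise.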

A similar callback can be found in OpenSSL, so developers who have worked with that project should recognize it.

Once the TLS Handshake is Complete

Once we’ve completed the handshake with the other end, we can start sending application data. When application data is given to our TLS connection it will be automatically fragmented as needed. Furthermore, our TLS connection supports session caching. This is particularly useful in JavaScript because doing a TLS handshake is a costly process and session caching can avoid a large part of it by resuming an existing session. When a session is resumed, the server doesn’t have to send its certificate again and can skip many of the full handshake details. That alone can save a lot of time and data transferred. When application data is ready, a callback will be called and the data can be read from a buffer on the TLS connection. If you don’t want the data just yet, you can wait for more to arrive and it will just be appended to the buffer.
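Automatic fragmentation amounts to cutting outgoing data into pieces no larger than the 2^14-byte record limit. Here is a minimal sketch, assuming data held as an array of byte values and records built as plain objects; the function name is hypothetical, and compression, MAC, and encryption of each fragment are left out.

```javascript
const MAX_FRAGMENT = 16384; // 2^14 bytes of plaintext per record

// Split application data into TLS 1.0 (version 3.1) record descriptions.
function fragmentApplicationData(data) {
  const records = [];
  for (let offset = 0; offset < data.length; offset += MAX_FRAGMENT) {
    const fragment = data.slice(offset, offset + MAX_FRAGMENT);
    records.push({
      type: 23, // application_data
      version: { major: 3, minor: 1 },
      length: fragment.length,
      fragment
    });
  }
  return records;
}
```

Under these assumptions, 40000 bytes of application data become three records of 16384, 16384, and 7232 bytes.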

HTTP in JavaScript

What data will we be sending to our application? In our case, which is probably the common case, it’s HTTP. So that means we need an HTTP implementation in JavaScript. We took a quick look around for one, but didn’t expect to find anything because, of course, it sounds ridiculous.

Luckily for us we have written some HTTP implementations here before so we just ported the simplest parts. One of those parts included chunked-encoding, something our servers often use and we didn’t want to be without.
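For readers unfamiliar with chunked encoding, the body is sent as a series of chunks, each preceded by its size in hexadecimal, and a zero-sized chunk marks the end. The decoder below is a minimal sketch rather than our ported code: it assumes the whole body is already available as a binary string, ignores trailers, and uses a hypothetical function name.

```javascript
// Decode a complete chunked HTTP body held in a binary string.
function decodeChunked(body) {
  let out = '';
  let pos = 0;
  while (true) {
    const lineEnd = body.indexOf('\r\n', pos);
    if (lineEnd < 0) throw new Error('incomplete chunk header');
    // The size is hexadecimal; anything after ';' is a chunk extension.
    const sizeText = body.slice(pos, lineEnd).split(';')[0].trim();
    const size = parseInt(sizeText, 16);
    if (Number.isNaN(size)) throw new Error('bad chunk size');
    pos = lineEnd + 2;
    if (size === 0) break; // last chunk; trailers (if any) are ignored
    if (pos + size + 2 > body.length) throw new Error('incomplete chunk');
    out += body.slice(pos, pos + size);
    pos += size + 2; // skip the chunk data and its trailing CRLF
  }
  return out;
}
```

For example, decodeChunked('4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n') returns 'Wikipedia'.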

Included in our HTTP implementation is a simple client that can use up to N connections to connect to a server. It pools the connections together and tries to make an intelligent decision as to which one to use when queuing up a new request. To create an HTTP client, you give it the JavaScript wrapper for the Flash socket pool (what we ended up calling our little Flash API), the maximum number of connections, and the host to connect to via a url. Each HTTP client will connect to a single host but you can create as many clients as you want. If you want the connections to be secure, you simply specify the url’s protocol as “https” and provide a list of certificates to trust, the optional verify callback discussed earlier, and some other optional parameters.

You can provide optional callbacks for when a connection has been made, an HTTP header has arrived, an HTTP body has arrived, or an error has occurred. Using these callbacks you can handle most of your needs using HTTP.

An XMLHttpRequest API

To make this even easier, we added an XMLHttpRequest API implementation that wraps our HTTP client. Technically speaking, it wraps one HTTP client per domain, since it provides cross-domain support. Using jQuery you can specify a callback to create our XHR object and use it to communicate over HTTP with a cross-domain application using standard APIs.

With our top-level design and API in place, we started filling out the middle glue to connect our low-level pieces with our top-level design. Simply following the TLS spec will cover most of the issues here, with one obvious exception: testing everything with a TLS-compliant server.

Connecting to an OpenSSL Test Server

Since TLS is a conglomerate of many different pieces, we knew that implementations are difficult to get right and that we would make mistakes. The easiest way to correct those mistakes was to have a correctly implemented server to communicate with; one that was built with open source code to which we could add debugging information. Since we have worked with OpenSSL before, we downloaded their source and built their test server. From there, we added whatever debugging information we needed to the entire TLS handshake process until we worked out all of the kinks. This was the easiest way to find out exactly what was going wrong when something wasn’t quite right.

Well, that’s it. Once all of these pieces are put together, you can communicate using HTTP over TLS with a server running on a different domain, provided that you begin with a website that you trust.