Technology, Open Source and Identity

Tag Archives: REST

My last post on RESTful transactions sure seemed to attract a lot of attention. There are a number of REST discussion topics that tend to get a lot of hand-waving by the REST community, but no real concrete answers seem to be forthcoming. I believe the most fundamental reasons for this include the fact that the existing answers are unpalatable – both to the web services world at large, and to REST purists. Once in a while when they do mention a possible solution to a tricky REST-based issue, the web services world responds violently – mostly because REST purists give answers like “just don’t do that” to questions like “How do I handle session management in a RESTful manner?”

I recently read an excellent treatise on the subject of melding RESTful web services concepts with enterprise web service needs. Benjamin Carlyle’s Sound Advice blog entry, entitled The REST Statelessness Constraint hits the mark dead center. Rather than try to persuade enterprise web service designers not to do non-RESTful things, Benjamin instead tries to convey the purposes behind REST constraints (in this case, specifically statelessness), allowing web service designers to make rational tradeoffs in REST purity for the sake of enterprise goals, functionality, and performance. Nice job Ben!

The fact is that the REST architectural style was designed with one primary goal in mind: to create web architectures that would scale well to the Internet. The Internet is large, representing literally billions of clients. To make a web service scale to a billion-client network, you have to make hard choices. For instance, http is connectionless. Connectionless protocols scale very well to large numbers of clients. Can you imagine a web server that had to manage 500,000 simultaneous long-term connections?

Server-side session data is a difficult concept to shoehorn into a RESTful architecture, and it’s the subject of this post. Lots of web services – I’d venture to say 99 percent of them – manage authentication using SSL/TLS and the HTTP “basic auth” authentication scheme. They use SSL/TLS to keep from exposing a user’s name and password over the wire, essentially in clear text. They use basic auth because it’s trivial. Even banking institutions use this mechanism because, for the most part, it’s secure. Those who try to go beyond SSL/TLS/basic auth often do so because they have special needs, such as identity federation of disparate services.

To use SSL/TLS effectively, however, these services try hard to use long-term TCP connections. HTTP 1.0 had no built-in mechanism for allowing long-term connections, but NetScape hacked in an add-on mechanism in the form of the “connection: keep-alive” header, and most web browsers support it, even today. HTTP 1.1 specifies that connections remain open by default. If an HTTP 1.1 client sends the “connection: close” header in a request then the server will close the connection after sending the response, but otherwise, the connection remains open.

This is a nice enhancement, because it allows underlying transport-level security mechanisms like SSL/TLS to optimize transport-level session management. Each new SSL/TLS connection has to be authenticated, and this process costs a few round-trips between client and server. By allowing multiple requests to occur over the same authenticated sesssion, the cost of transport-level session management is amortized over several requests.

In fact, by using SSL/TLS mutual authentication as the primary authentication mechanism, no application state need be maintained by the server at all for authentication purposes. For any given request, the server need only ask the connection layer who the client is. If the service requires SSL/TLS mutual auth, and the client has made a request, then the server knows that the client is authenticated. Authorization (resource access control) must still be handled by the service, but authorization data is not session data, it’s service data.

However, SSL/TLS mutual auth has an inherent deployment problem: key management. No matter how you slice it, authentication requires that the server know something about the client in order to authenticate that client. For SSL/TLS mutual auth, that something is a public key certificate. Somehow, each client must create a public key certificate and install it on the server. Thus, mutual auth is often reserved for the enterprise, where key management is done by IT departments for the entire company. Even then, IT departments cringe at the thought of key management issues.

User name and password schemes are simpler, because often web services will provide users a way of creating their account and setting their user name and password in the process. Credential management done. Key management can be handled in the same way, but it’s not as simple. Some web services allow users to upload their public key certificate, which is the SSL/TLS mutual-auth equivalent of setting a password. But a user has to create a public/private key pair, and then generate a public key certificate from this key pair. Java keytool makes this process as painless as possible, but it’s still far from simple. No – user name and password is by far the simpler solution.

As I mentioned above, the predominant solution today is a combination of CA-based transport-layer certificate validation for server authentication, and HTTP basic auth for client authentication. The web service obtains a public/private key pair that’s been generated by a well-known Certificate Authority (CA). This is done by generating a certificate signing request using either openssl or the Java keytool utility (or by using less mainstream tools provided by the CA). Because most popular web browsers today ship well-known CA certificates in their truststores, and because clients implicitly trust services that provide certificates signed by these well-known CA’s, people tend to feel warm and fuzzy because no warning messages pop up on the screen when they connect to one of these services. Should they fear? Given the service verification process used by CAs like Entrust and Verisign, they probably should, but that problem is very difficult to solve, so most people just live with this stop-gap solution.

On the server side, the web service needs to know the identity of the client in order to know what service resources that client should have access to. If a client requests a protected resource, the server must be able to validate that client’s right to the resource. If the client hasn’t authenticated yet, the server challenges the client for credentials using a response header and a “401 Unauthorized” response code. Using the basic auth scheme, the client base64-encodes his user name and password and returns this string in a response header. Now, base64 encoding is not encrytion, so the client is essentially passing his user name and password in what amounts to clear text. This is why SSL/TLS is used. By the time the server issues the challenge, the SSL/TLS encrypted channel is already established, so the user’s credentials are protected from even non-casual snoopers.

When the proper credentials arrive in the next attempt to request the protected resource, the server decodes the user name and password, verifies them against its user database, and either returns the requested resource, or fails the request with “401 Unauthorized” again, if the user doesn’t have the requisite rights to the requested resource.

If this was the extent of the matter, there would be nothing unRESTful about this protocol. Each subsequent request contains the user’s name and password in the Authorization header, so the server has the option of using this information on each request to ensure that only authorized users can access protected resources. No session state is managed by the server here. Session or application state is managed by the client, using a well-known protocol for passing client credentials on each request – basic auth.

But things don’t usually stop there. Web services want to provide a good session experience for the user – perhaps a shopping cart containing selected items. Servers typically implement shopping carts by keeping a session database, and associating collections of selected items with users in this database. How long should such session data be kept around? What if the user tires of shopping before she checks out, goes for coffee, and gets hit by a car? Most web services deal with such scenarios by timing out shopping carts after a fixed period – anywhere from an hour to a month. What if the session includes resource locks? For example, items in a shopping cart are sometimes made unavailable to others for selection – they’re locked. Companies like to offer good service to customers, but keeping items locked in your shopping cart for a month while you’re recovering in the hospital just isn’t good business.

REST principles dictate that keeping any sort of session data is not viable for Internet-scalable web services. One approach is to encode all session data in a cookie that’s passed back and forth between client and server. While this approach allows the server to be completely stateless with respect to the client, it has its flaws. First, even though the data is application state data, it’s still owned by the server, not the client. Most clients don’t even try to interpret this data. They just hand it back to the server on each successive request. But this data is application state data, so the client should manage it, not the server.

There’s no good answers to these questions yet. What it comes down to is that service design is a series of trade-offs. If you really need your web service to scale to billions of users, then you’d better find ways to make your architecture compliant with REST principles. If you’re only worried about servicing a few thousand users at a time, then perhaps you can relax the constraints a bit. The point is that you should understand the constraints, and then make informed design decisions.

I was reading recently in RESTful Web Services (Leonard Richardson & Sam Ruby, O’Reilly, 2007) about how to implement transactional behavior in a RESTful web service. Most web services today do this with an overloaded POST operation, but the authors assert that this isn’t necessary.

Their example (in Chapter Eight) uses the classic bank account transaction scenario, where a customer wants to transfer 50 dollars from checking to savings. I’ll recap here for your benefit. Both accounts start with 200 dollars. So after a successful transaction, the checking account should contain 150 dollars and the savings account should contain 250 dollars. Let’s consider what happens when two clients operate on the same resources:

This is all well and good until you consider that the steps in these operations might not be atomic. Transactions protect against the following situation, wherein the separate steps of these two Clients’ operations are interleaved:

After both operations, the account should contain 100 dollars, but because no account locking was in effect during the two updates, the second withdrawal is lost. Thus 100 dollars was physically removed from the account, but the account balance reflects only a 50 dollar withdrawal. Transaction semantics would cause the following series of steps to occur:

Web Transactions

The authors’ approach to RESTful web service transactions involves using POST against a “transaction factory” URL. In this case /transactions/account-transfer represents the transaction factory. The checking account is represented by /accounts/checking/11 and the savings account by /accounts/savings/55.

Now, if you recall from my October 2008 post, PUT or POST: The REST of the Story, POST is designed to be used to create new resources whose URL is not known in advance, whereas PUT is designed to update or create a resource at a specific URL. Thus, POSTing against a transaction factory should create a new transaction and return its URL in the Location response header.

At first glance, this appears to be a nice design, until you begin to consider the way such a system might be implemented on the back end. The authors elaborate on one approach. They state that documents PUT to resources within the transaction might be serialized during building of the transaction. When the transaction is committed the entire set of serialized operations could then be executed by the server within a server-side database transaction. The result of committing the transaction is then returned to the client as the result of the client’s commit on the web transaction.

However, this can’t work properly, as the server would have to have the client’s view of the original account balances in order to ensure that no changes had slipped in after the client had read the accounts, but before the transaction was committed (or even begun!). As it stands, changes could be made by a third-party to the accounts before the new balances are written and there’s no way for the server to ensure that these other modifications are not overwritten by outdated state provided by the transaction log. It is, after all, the entire purpose of a transaction to protect a database against this very scenario.

Fixing the Problem

One way to make this work is to include account balance read (GET) operations within the transaction, like this:

The GET operations would, of course, return real data in real time. But the fact that the accounts were read within the transaction would give the server a reference point for later comparison during the execution of the back-end database transaction. If the values of either account balance are modified before the back-end transaction is begun, then the server would have to abort the transaction and the client would have to begin a new transaction.

This mechanism is similar in operation to lock-free data structure semantics. Lock-free data structures are found in low-level systems programming on symmetric multi-processing (SMP) hardware. A lock-free data structure allows multiple threads to make updates without the aid of concurrency locks such as mutexes and spinlocks. Essentially, the mechanism guarantees that an attempt to read, update and write a data value will either succeed or fail in a transactional manner. The implementation of such a system usually revolves around the concept of a machine-level test and set operation. The would-be modifier, reads the data element, updates the read copy, and then performs a conditional write, wherein the condition is that the value is the same as the originally read value. If the value is different, the operation is aborted and retried. Even under circumstances of high contention the update will likely eventually occur.

How this system applies to our web service transaction is simple: If the values of either account are modified outside of the web transaction before the back-end database transaction is begun (at the time the commit=true document is PUT), then the server must abort the transaction (by returning “500 Internal server error” or something). The client must then retry the entire transaction again. This pattern must continue until the client is lucky enough to make all of the modifications within the transaction that need to be made before anyone else touches any of the affected resources. This may sound nasty, but as we’ll see in a moment, the alternatives have less desirable effects.

Inline Transaction Processing

Another approach is to actually have the server begin a database transaction at the point where the transaction resource is created with the initial POST operation above. Again, the client must read the resources within the transaction. Now the server can guarantee atomicity — and data integrity.

As with the previous approach, this approach works whether the database uses global- or resource-level locking. All web transaction operations happen in real time within a database transaction, so reads return real data and writes happen during the write requests, but of course the writes aren’t visible to other readers until the transaction is committed.

A common problem with this approach is that the database transaction is now exposed as a “wire request”, which means that a transaction can be left outstanding by a client that dies in the middle of the operation. Such transactions have to be aborted when the server notices the client is gone. Since HTTP is a stateless, connectionless protocol, it’s difficult for a server to tell when a client has died. At the very least, database transactions begun by web clients should be timed out. Unfortunately, while timing out a database transaction, no one else can write to the locked resources, which can be a real problem if the database uses global locking. Additional writers are blocked until the transaction is either committed or aborted. Locking a highly contended resource over a series of network requests can significantly impact scalability, as the time frame for a given lock has just gone through the ceiling.

My current project at Novell involves the development of a ReSTful web service for submission of audit records from security applications. The server is a Jersey servlet within an embedded Tomcat 6 container.

One of the primary reasons for using a ReSTful web service for this purpose is to alleviate the need to design and build a heavy-weight audit record submission client library. Such client libraries need to be orthogonally portable across both hardware platforms and languages in order to be useful to Novell’s customers. Just maintaining the portability of this client library in one language is difficult enough, without adding multiple languages to the matrix.

Regardless of our motivation, we still felt the need to provide a quality reference implementation of a typical audit client library to our customers. They may incorporate as much or as little of this code as they wish, but a good reference implementation is worth a thousand pages of documentation. (Don’t get me wrong, however–this is no excuse for not writing good documentation! The combination of quality concise documentation and a good reference implementation is really the best solution.)

The idea here is simple: Our customers won’t have to deal with difficulties that we stumble upon and then subsequently provide solutions for. Additionally, it’s just plain foolish to provide a server component for which you’ve never written a client. It’s like publishing a library API that you’ve never written to. You don’t know if the API will even work the way you originally intended until you’ve at least tried it out.

Since we’re already using Java in the server, we’ve decided that our initial client reference implementation should also be written in Java. Yesterday found my code throwing one exception after another while simply trying to establish the TLS connection to the server from the client. All of these problems ultimately came down to my lack of understanding of the Java key store and trust store concepts.

You see, the establishment of a TLS connection from within a Java client application depends heavily on the proper configuration of a client-side trust store. If you’re using mutual authentication, as we are, then you also need to properly configure a client-side key store for the client’s private key. The level at which we are consuming Java network interfaces also demands that we specify these stores in system properties. More on this later…

Using Curl as an Https Client

We based our initial assumptions about how the Java client needed to be configured on our use of the curl command line utility in order to test the web service. The curl command line looks something like this:

The important aspects of this command-line include the use of the –cert, –cert-type, –key and –key-type parameters, as well as the fact that we specified a protocol scheme of “https” in the URL.

With one exception, the remaining options are related to which http method to use (-X), what data to send (–data), and which message properties to send (–header). The exception is the -k option, and therein lay most of our problems with this Java client.

The curl man-page indicates that the -k/–insecure option allows the TLS handshake to succeed without verifying the server certificate in the client’s CA (Certificate Authority) trust store. The reason this option was added was because several releases of the curl package shipped with a terribly out-dated trust store, and people were getting tired of having to manually add certificates to their trust stores everytime they hit a newer site.

Doing it in Java

But this really isn’t the safe way to access any secure public web service. Without server certificate verification, your client can’t really know that it’s not communicating with a server that just says it’s the right server. (“Trust me!”)

During the TLS handshake, the server’s certificate is passed to the client. The client should then verify the subject name of the certificate. But verify it against what? Well, let’s consider–what information does the client have access to, outside of the certificate itself? It has the fully qualified URL that it used to contact the server, which usually contains the DNS host name. And indeed, a client is supposed to compare the CN (Common Name) portion of the subject DN (Distinguished Name) in the server certificate to the DNS host name in the URL, according to section 3.1 “Server Identity” of RFC 2818 “HTTP over TLS”.

Java’s HttpsURLConnection class strictly enforces the advice given in RFC 2818 regarding peer verification. You can override these constraints, but you have to basically write your own version of HttpsURLConnection, or sub-class it and override the methods that verify peer identity.

Creating Java Key and Trust Stores

Before even attempting a client connection to our server, we had to create three key stores:

A server key store.

A client key store.

A client trust store.

The server key store contains the server’s self-signed certificate and private key. This store is used by the server to sign messages and to return credentials to the client.

The client key store contains the client’s self-signed certificate and private key. This store is used by the client for the same purpose–to send client credentials to the server during the TLS mutual authentication handshake. It’s also used to sign client-side messages for the server during the TLS handshake. (Note that once authentication is established, encryption happens using a secret or symetric key encryption algorithm, rather than public/private or asymetric key encryption. Symetric key encryption is a LOT faster.)

The client trust store contains the server’s self-signed certificate. Client-side trust stores normally contain a set of CA root certificates. These root certificates come from various widely-known certificate vendors, such as Entrust and Verisign. Presumably, almost all publicly visible servers have a purchased certificate from one of these CA’s. Thus, when your web browser connects to such a public server over a secure HTTP connection, the server’s certificate can be verified as having come from one of these well-known certificate vendors.

I first generated my server key store, but this keystore contains the server’s private key also. I didn’t want the private key in my client’s trust store, so I extracted the certificate into a stand-alone certificate file. Then I imported that server certificate into a trust store. Finally, I generated the client key store:

Telling the Client About Keys

There are various ways of telling the client about its key and trust stores. One method involves setting system properties on the command line. This is commonly used because it avoids the need to enter absolute paths directly into the source code, or to manage separate configuration files.

$ java -Djavax.net.ssl.keyStore=/tmp/keystore.jks ...

Another method is to set the same system properties inside the code itself, like this:

I chose the latter, as I’ll eventually extract the strings into property files loaded as needed by the client code. I don’t really care for the fact that Java makes me specify these stores in system properties. This is especially a problem for our embedded client code, because our customers may have other uses for these system properties in the applications in which they will embed our code. Here’s the rest of the simple client code:

Getting it Wrong…

I also have a static test “main” function so I can send some content. But when I tried to execute this test, I got an exception indicating that the server certificate didn’t match the host name. I was using a hard-coded IP address (10.0.0.1), but my certificate contained the name “audit-server”.

It turns out that the HttpsURLConnection class uses an algorithm to determine if the server that sent the certificate really belongs to the server on the other end of the connection. If the URL contains an IP address, then it attempts to locate a matching IP address in the “Alternate Names” portion of the server certificate.

Did you notice a keytool prompt to enter alternate names when you generated your server certificate? I didn’t–and it turns out there isn’t one. The Java keytool utility doesn’t provide a way to enter alternate names–a standardized extension of the X509 certificate format. To enter an alternate name containing the requisite IP address, you’d have to generate your certificate using the openssl utility, or some other more functional certificate generation tool, and then find a way to import these foreign certificates into a Java key store.

…And then Doing it Right

On the other hand, if the URL contains a DNS name, then HttpsURLConnection attempts to match the CN portion of the Subject DN with the DNS name. This means that your server certificates have to contain the DNS name of the server as the CN portion of the subject. Returning to keytool, I regenerated my server certificate and stores using the following commands:

At this point, I was finally able to connect to my server and send the message. Is this reasonable? Probably not for my case. Both my client and server are within a corporate firewall, and controlled by the same IT staff, so to force this sort of gyration is really unreasonable. Can we do anything about it? Well, one thing that you can do is to provide a custom host name verifier like this:

When you do this, however, you should be aware that you give up the right to treat the connection as anything but an https connection. Note that we had to change the type of “conn” to HttpsURLConnection from its original type of HttpURLConnection. This means, sadly, that this code will now only work with secure http connections. I chose to use the DNS name in my URL, although a perfectly viable option would also have been the creation of a certificate containing the IP address as an “Alternate Name”.

Is This Okay?!

Ultimately, our client code will probably be embedded in some fairly robust and feature-rich security applications. Given this fact, we can’t really expect our customers to be okay with our sample code taking over the system properties for key and trust store management. No, we’ll have to rework this code to do the same sort of thing that Tomcat does–manage our own lower-level SSL connections, and thereby import certificates and CA chains ourselves. In a future article, I’ll show you how to do this.

Web service designers have tried for some time now to correlate CRUD (Create, Retrieve, Update and Delete) semantics with the Representational State Transfer (REST) verbs defined by the HTTP specification–GET, PUT, POST, DELETE, HEAD, etc.

So often, developers will try to correlate these two concepts–CRUD and REST–using a one-to-one mapping of verbs from the two spaces, like this:

Create = PUT

Retrieve = GET

Update = POST

Delete = DELETE

“How to Create a REST Protocol” is an example of a very well-written article about REST, but which makes this faulty assumption. (In fairness to the author, he may well have merely “simplified REST for the masses”, as his article doesn’t specifically state that this mapping is the ONLY valid mapping. And indeed, he makes the statement that the reader should not assume the mapping indicates a direct mapping to SQL operations.)

In the article, “I don’t get PUT versus POST” the author clearly understands the semantic differences between PUT and POST, but fails to understand the benefits (derived from the HTTP protocol) of the proper REST semantics. Ultimately, he promotes the simplified CRUD to REST mapping as layed out above.

But such a trivial mapping is inaccurate at best. The semantics of these two verb spaces have no direct correlation. This is not to say you can’t create a CRUD client that can talk to a REST service. Rather, you need to add some additional higher-level logic to the mapping to complete the transformation from one space to the other.

While Retrieve really does map to an HTTP GET request, and likewise Delete really does map to an HTTP DELETE operation, the same cannot be said of Create and PUT or Update and POST. In some cases, Create means PUT, but in other cases it means POST. Likewise, in some cases Update means POST, while in others it means PUT.

The crux of the issue comes down to a concept known as idempotency. An operation is idempotent if a sequence of two or more of the same operation results in the same resource state as would a single instance of that operation. According to the HTTP 1.1 specification, GET, HEAD, PUT and DELETE are idempotent, while POST is not. That is, a sequence of multiple attempts to PUT data to a URL will result in the same resource state as a single attempt to PUT data to that URL, but the same cannot be said of a POST request. This is why a browser always pops up a warning dialog when you back up over a POSTed form. “Are you sure you want to purchase that item again!?” (Would that the warning was always this clear!)

After that discussion, a more realistic mapping would seem to be:

Create = PUT iff you are sending the full content of the specified resource (URL).

Create = POST if you are sending a command to the server to create a subordinate of the specified resource, using some server-side algorithm.

Retrieve = GET.

Update = PUT iff you are updating the full content of the specified resource.

Update = POST if you are requesting the server to update one or more subordinates of the specified resource.

Delete = DELETE.

NOTE: “iff” means “if and only if”.

Analysis

Create can be implemented using an HTTP PUT, if (and only if) the payload of the request contains the full content of the exactly specified URL. For instance, assume a client issues the following Create OR Update request:

This command is idempotent because sending the same command once or five times in a row will have exactly the same effect; namely that the payload of the request will end up becoming the full content of the resource specified by the URL, “/GrafPak/Pictures/1000.jpg”.

On the other hand, the following request is NOT idempotent because the results of sending it either once or several times are different:

Specifically, sending this command twice will result in two “new” pictures being added to the Pictures container on the server. According to the HTTP 1.1 specification, the server’s response should be something like “201 Created” with Location headers for each response containing the resource (URL) references to the newly created resources–something like “/GrafPak/Pictures/1001.jpg” and “/GrafPak/Pictures/1002.jpg”.

The value of the Location response header allows the client application to directly address these new picture objects on the server in subsequent operations. In fact, the client application could even use PUT to directly update these new pictures in an idempotent fashion.

What it comes down to is that PUT must create or update a specified resource by sending the full content of that same resource. POST operations, on the other hand, tell a web service exactly how to modify the contents of a resource that may be considered a container of other resources. POST operations may or may not result in additional directly accessible resources.