Apr 9, 2011

Nowadays there's a general reluctance to introduce (more) server-side session state because of scalability. And there's specific reluctance to session state in RESTful web services, due to design principles.

In the stateless requirement of REST we read:

"The client–server communication is constrained by no client context being stored on the server between requests. Each request from any client contains all of the information necessary to service the request, and any session state is held in the client. The server can be stateful; this constraint merely requires that server-side state be addressable by URL as a resource." [Wikipedia]

This is a tough requirement, especially if we want features such as authentication and sessions.

So, can we have session ids without server-side session state? Yes.

The Relation Between Sessions and Authentication
There's often a tight relationship between authenticating users and holding their sessions. Anonymous sessions are not very sensitive whereas authenticated sessions have to be protected against hijacking, fixation, forging, and replay. Actually, a valid session token authenticates the session, so you're basically authenticating yourself every request. Which leads us to the first stateless session solution ...

No Sessions => Authenticate Every Request
If session ids are in fact authentication tokens we might as well use the mental model of no sessions, instead authenticate each request. The old HTTP Basic Authentication does this by storing your username and password for subsequent requests. But there are more advanced versions such as authentication in Amazon's S3 REST API.

They use a custom HTTP scheme based on a keyed-HMAC (Hash Message Authentication Code). To authenticate a request, you first concatenate selected elements of the request to form a string. You then use a shared "AWS Secret Access Key" to calculate the HMAC of that string, i.e. you sign the request. Finally, you add this signature as a parameter of the request.

Canonicalization of course requires some processing such as converting headers to lower-case and sorting them lexicographically. The date works as a timestamp and narrows the replay window.

The server then does the same signing with the shared secret associated with the AWSAccessKeyId. So we're switching from server-side session state to more cycles and latency on both client and server.

Worth noting:

Signed requests are much stronger than mere session ids. Cross-site request forgeries will be mitigated with this scheme.

By authenticating all requests with a shared secret we don't have any time-bound sessions or timeouts. Just fire whenever you want.

The persistent shared secret is much more sensitive than a temporary session id. A cross-site scripting attack will steal the shared secret which is much worse than session hijacking. This means the scheme is less suitable for browsing sessions and more suitable for machine-to-machine communication.

The salt is used for all sessions but only valid for a certain timeframe, say 15 minutes. A new salt is produced every 5 minutes and incoming session ids produced with the previous but still valid salt will be exchanged for a new session id with the fresh salt. That means a session timeout of 15-5=10 minutes.

If we truly want to go stateless we cannot kill such a session since that would require a server-side table of revoked session ids. So in the stateless case an attacker will have a 15 minute replay window in which he/she will refresh the session and have endless access.

Stateless, Encrypted Session ID
By just storing a server-side symmetric crypto key we can effectively decrypt incoming session IDs and trust their contents. Imagine a cookie based on:

This means we don't have to store each session ID. Instead we pay the price of decrypting incoming cookies and checking that the timestamp is within a timeframe, say 15 minutes. For all incoming session IDs older than 5 minutes we regenerate a new cookie to effectively run a 15-5=10 minute session timeout window.

Again, if we want to go stateless we cannot kill such a session since that would require a server-side table of revoked session ids. So an attacker will have a 15 minute replay window in which he/she will refresh the session and have endless access.

Conclusion
There are three competing parameters to prioritize between:

Server CPU cycles per request

Server-side session state

The replay window

The tradeoff between CPU cycles and memory footprint will change with new technologies such as non-blocking IO in node.js. So yesterday's best practice might not be valid today.

The difference between regular session hijacking and hijacking of stateless session ids is that successful theft of a stateless session id authenticates the attacker even if the victim has logged out. Remember, the server doesn't store the session state. And even if the server would store the boolean isLoggedIn for each user, an old session id will still be valid if the user logs in again, as long as it hasn't timed out.

So ask yourselves what your tradeoff between CPU cycles and server-side state is. Then consider the replay+refresh leverage of a successful cross-site scripting attack.

String.prototype.replace, i.e. 'yourString'.replace(), only replaces the first instance of the regexp. So beware. Twitter made the mistake and got vulnerable because of it. Read about it and a suggested patch:

A nice webcast on REST design. For instance brings up the idea of session ids with constant state on the server. But as always, I wonder when the CSRF storm is going to hit all these REST services out there?