One use of an anti-forgery token is to prevent Cross-Site Request Forgery (CSRF) attacks. The attacker doesn't need to sniff the wire to carry out a CSRF attack. The attack relies on the fact that the HTTP request is predictable, and the defense is to add a secret value so that an attacker cannot forge the request.

In order for each request to be validated the server must keep track of each client's secret. This requires the server to use significant resources to keep track of this state for a large number of clients.

Is there a cryptographic method that can prevent forging of requests and that is also very efficient?

1 Answer

If you don't want to store the anti-CSRF tokens on the server, for most purposes it is sufficient to simply store the token as an HTTP cookie on the client. The OWASP wiki calls this technique "Double Submit Cookies".

The reason this works is that, in the standard CSRF attack scenarios, the attacker cannot directly read or modify the user's cookies. Indeed, if the user's authentication credentials are also stored in cookies (as is very commonly done in modern web applications), any leak of cookie data already implies a much more fundamental security failure than a mere CSRF attack.

Of course, an attacker might be able to gain (full or partial) access to the user's cookies via an XSS or other injection attack or through session fixation, but this is mostly outside the scope of CSRF prevention and must be addressed by other means. (One exception is login CSRF, which can be used to carry out session fixation; the solution there is to ensure that authentication entry points are also protected with anti-CSRF tokens.)
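To make the basic Double Submit Cookie check concrete, here is a minimal Python sketch. The function names `new_csrf_cookie` and `double_submit_ok` are hypothetical, and how the cookie and the form field are read from the request depends on your web framework, so that part is left out.

```python
import hmac
import secrets


def new_csrf_cookie() -> str:
    """Generate an unpredictable value to set as the CSRF cookie."""
    return secrets.token_urlsafe(32)


def double_submit_ok(cookie_value: str, form_value: str) -> bool:
    """Accept the request only if the form field repeats the cookie value."""
    if not cookie_value or not form_value:
        return False
    # Constant-time comparison, so timing does not reveal partial matches.
    return hmac.compare_digest(cookie_value.encode(), form_value.encode())
```

The server stores nothing per client here: it only checks that the two values sent with the request agree, which a cross-site attacker who cannot read the cookie cannot arrange.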

One possible improvement on the basic Double Submit Cookie technique I might suggest would be to let the anti-CSRF token included in the submitted form data be derived from the cookie value using a secure MAC (such as HMAC) with a secret key stored on the server. This has a few advantages over simply using the plain cookie value as the token:

Merely learning (or guessing) the cookie will not allow an attacker to mount a CSRF attack, since they won't be able to derive the token from the cookie without knowing the key.

Conversely, leaking the token will not allow an attacker to reconstruct the cookie; in particular, this means that the user's normal authentication credentials may be safely used to derive the anti-CSRF token, without need for a separate cookie.

By including additional data, such as a form identifier and/or a timestamp, in the MAC input, the token can be made specific to that additional data. This can compartmentalize the attack surface e.g. by ensuring that an attacker who learns the token for one form will not be able to attack another form in the same application, and that they will be able to carry out attacks only for a limited time.
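As a rough illustration of the improvement described above, the following Python sketch derives the form token from the cookie value and a form identifier using HMAC-SHA256. `SERVER_KEY`, `derive_token`, and the length-prefixed encoding are assumptions made for this example, not part of any standard.

```python
import hashlib
import hmac

# Hypothetical placeholder: in practice, load a random secret from server config.
SERVER_KEY = b"replace-with-a-random-server-side-secret"


def derive_token(cookie_value: str, form_id: str) -> str:
    """Derive a form-specific anti-CSRF token from the cookie value."""
    parts = [cookie_value.encode(), form_id.encode()]
    # Length-prefix each field so different (cookie, form_id) pairs can
    # never produce the same MAC input.
    msg = b"".join(len(p).to_bytes(4, "big") + p for p in parts)
    return hmac.new(SERVER_KEY, msg, hashlib.sha256).hexdigest()
```

With this, a token leaked from one form does not validate on another form, since `form_id` is part of the MAC input.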

Edit: Something I forgot to clearly mention in my original answer is that, if you derive the anti-CSRF token using a MAC, you don't necessarily need a cookie at all — all you need is something that is unique to the user. If you're using cookie-based authentication, using the user's session ID / auth token as the MAC input is a natural choice, but in principle you might as well use e.g. the username instead.

Of course, it's desirable for anti-CSRF tokens to be short-lived, so if you're using a long-term stable value, such as the user name or ID, as the MAC input, I'd very much recommend combining it with a timestamp. Otherwise each user would be tied to a single token value, such that leaking the token would make the account permanently insecure. A convenient trick, if you don't want to include explicit timestamps in your forms, is to take some value that changes, say, hourly (e.g. the Unix timestamp divided by 3600), include that in the MAC calculation, and accept any tokens that match the MAC for the current or the previous value.
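The hourly trick above might look something like this in Python. This is only a sketch: the function names, the `now` parameter (added so the behavior can be tested), and the `"hour:user"` encoding are all assumptions of the example.

```python
import hashlib
import hmac
import time


def hourly_token(key: bytes, user: str, hour: int) -> str:
    # The hour is all digits, so the first ":" unambiguously ends it.
    msg = f"{hour}:".encode() + user.encode()
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


def make_token(key: bytes, user: str, now=None) -> str:
    now = time.time() if now is None else now
    return hourly_token(key, user, int(now // 3600))


def check_token(key: bytes, user: str, token: str, now=None) -> bool:
    now = time.time() if now is None else now
    hour = int(now // 3600)
    # Accept tokens issued in the current or the previous hour.
    candidates = (hourly_token(key, user, hour),
                  hourly_token(key, user, hour - 1))
    results = [hmac.compare_digest(token.encode(), c.encode())
               for c in candidates]
    return any(results)
```

A token issued this way stays valid for somewhere between one and two hours, depending on when in the hour it was created.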

OK, I guess I should describe the MAC-based scheme explicitly, since it really doesn't bear very much resemblance to the original Double Submit Cookie scheme any more.

Let:

$K$ be a secret key known only to the server (guard this well, and change it if you have any reason to suspect it might have leaked),

$\operatorname{MAC}_K$ be a secure Message Authentication Code (e.g. HMAC) with the key $K$,

$U$ be a unique per-user (or per-session) value which may or may not be public, such as a session ID, a username, etc.,

$S$ be a timestamp of appropriate granularity (either explicitly included in the form data, or implicitly included using the trick described above),

$F$ be an (optional) form / action identifier, and

$D$ be some (optional) extra data included in the form (or used to constrain the form and reconstructible on submission, such as a list of allowed choices in a menu) that you want to protect against modification.

When creating a form, calculate the anti-CSRF token $T$ as $$T = \operatorname{MAC}_K([U, S, F, D])$$ where $[U, S, F, D]$ denotes a string uniquely encoding the tuple $U$, $S$, $F$, $D$. Include $T$ (and $S$, if not using the timestamp trick described above) as a hidden field in the form. (If you have plain links that need CSRF protection, you can use the same method; in that case, it's often perfectly reasonable to use the entire URL — excluding the token, of course — as $D$.)

When you receive a form submission, first check that the user is properly authenticated and that the timestamp in the form (if any) is not too old. If those checks pass, recalculate $$T' = \operatorname{MAC}_K([U, S, F, D])$$ and compare it to the token $T$ included in the submitted form, accepting the submission only if they match. (If using the implicit timestamp trick, you should also calculate $$T'' = \operatorname{MAC}_K([U, S-1, F, D])$$ and accept the form if either $T = T'$ or $T = T''$.)
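Putting the creation and verification steps together, a minimal Python sketch with an explicit timestamp $S$ could look as follows. The function names, the use of HMAC-SHA256, the JSON encoding of $[U, S, F, D]$, and the one-hour `max_age` default are all assumptions chosen for illustration. Checking that the user is authenticated is assumed to happen before `verify_token` is called.

```python
import hashlib
import hmac
import json
import time


def encode_tuple(u: str, s: int, f: str, d: str) -> bytes:
    """Encode [U, S, F, D] unambiguously as a compact JSON array."""
    return json.dumps([u, s, f, d], separators=(",", ":")).encode("utf-8")


def issue_token(key: bytes, u: str, f: str = "", d: str = "", now=None):
    """Return (S, T) to embed as hidden fields in the form."""
    s = int(time.time() if now is None else now)
    t = hmac.new(key, encode_tuple(u, s, f, d), hashlib.sha256).hexdigest()
    return s, t


def verify_token(key: bytes, u: str, s: int, t: str, f: str = "",
                 d: str = "", max_age: int = 3600, now=None) -> bool:
    """Recompute T' and compare it to the submitted T.

    The caller is expected to parse the submitted timestamp into an int.
    """
    now = int(time.time() if now is None else now)
    # Reject timestamps that are too old or lie in the future.
    if not (0 <= now - s <= max_age):
        return False
    expected = hmac.new(key, encode_tuple(u, s, f, d),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), t.encode())
```

Because $F$ and $D$ feed into the MAC, changing either one on submission makes verification fail, even for someone who holds a valid $T$ for the original values.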

A clever reader will notice that there's nothing special about the $F$ and $D$ values in the MAC calculation: they're just arbitrary inputs that, if changed, change the MAC value in a way which, as long as the MAC is secure, an attacker cannot predict without knowing $K$. You may also notice that the optional inclusion of the extra data $D$ in the MAC calculation makes this technique more powerful than just a classic anti-CSRF token, since it can prevent certain attacks (namely, those involving modification of $D$) even if the attacker has access to the entire form, including the token $T$. I included this feature in the spec above since it can sometimes be useful, and since we get it essentially for free, at no extra cost in complexity. You don't have to use it if you don't want to.

Note: Ensuring that an attacker cannot recover $U$ even if they learn $T$ requires the MAC function to satisfy some additional security properties beyond the standard ones (resistance to existential forgery) demanded of such functions. However, in practice, most secure MACs (such as HMAC instantiated with a preimage resistant hash function) do have the necessary properties, and in any case, the standard CSRF attack model assumes that the attacker cannot learn $T$. Still, if you're really paranoid, you may want to choose $U$ such that leaking it will not compromise security.

It requires some per-user identifier, as indeed any anti-CSRF scheme must. With the MAC-based approach, you could, in principle, use e.g. the user name as the identifier. (However, if you do that, I'd strongly suggest including a timestamp in the MAC input, so that a given user will not have the same token forever.)
– Ilmari Karonen, Apr 6 '12 at 18:33