Assumptions / details

There is a secret random value that won't be known outside of the application.

Once a password reset token is used by a user to reset their password, all available tokens for a user are deleted from the database.

Outline

Note that || here is concatenation. The approach given is a modified version of the one used by the Devise authentication library. Specifically, the modification is that a user can now have multiple tokens.

Let S be a fixed secret picked ahead of time.

To construct a reset token for a user U, generate a (secure) random number r and consider the pair

(id, r)

where id is picked from an incrementing counter value. Generate

S' = PBKDF2(S)

with a salt and let S' be the key for the HMAC value

h = HMAC(S', id || r).

Store the reset information

(U, id, h, e),

with e the expiration time, before sending (id, r) to the user.

When someone tries to reset their password, they give us user-provided values for the id and for r:

(id_user, r_user).

Use id_user to find the previously stored (U, id, h, e) information.

If id_user corresponds to no such information, then there is nothing to do. If e is in the past, then the token has expired.

Generate S' = PBKDF2(S) again, and compare

HMAC(S', id_user || r_user)

with the previous value of h, which was

HMAC(S', id || r).

If these two values match (see note below), then the token is valid and we allow the user U to reset their password. When they do, all stored reset information (the (U, id, h, e) entries for the given U) is deleted.
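The outline above can be sketched roughly as follows. This is only an illustration under stated assumptions: the in-memory dictionary stands in for the database, and the salt, iteration count, token lifetime, fixed-width encoding of id, and all function names are hypothetical choices, not part of the scheme as described. For brevity, the sketch deletes the user's entries as soon as a token validates, whereas the outline deletes them once the password is actually reset.

```python
import hashlib
import hmac
import secrets
import time
from itertools import count

# Assumptions for illustration only: names, sizes, and the in-memory "database".
SECRET_S = secrets.token_bytes(32)   # the fixed application secret S
SALT = b"reset-token-salt"           # hypothetical fixed salt for PBKDF2
PBKDF2_ITERATIONS = 100_000          # hypothetical work factor
TOKEN_LIFETIME = 6 * 60 * 60         # seconds (6 hours)

_next_id = count(1)                  # incrementing id counter
reset_db = {}                        # id -> (U, h, e)


def derive_key():
    # S' = PBKDF2(S) with a salt
    return hashlib.pbkdf2_hmac("sha256", SECRET_S, SALT, PBKDF2_ITERATIONS)


def token_mac(key, token_id, r):
    # h = HMAC(S', id || r); id is fixed-width so the concatenation is unambiguous
    return hmac.new(key, token_id.to_bytes(8, "big") + r, hashlib.sha256).digest()


def create_token(user, now=None):
    now = time.time() if now is None else now
    token_id = next(_next_id)
    r = secrets.token_bytes(32)
    h = token_mac(derive_key(), token_id, r)
    reset_db[token_id] = (user, h, now + TOKEN_LIFETIME)
    return token_id, r               # sent to the user; r itself is never stored


def redeem_token(token_id, r, now=None):
    now = time.time() if now is None else now
    entry = reset_db.get(token_id)
    if entry is None:
        return None                  # unknown id: nothing to do
    user, h, expires = entry
    if expires < now:
        return None                  # token has expired
    candidate = token_mac(derive_key(), token_id, r)
    if not hmac.compare_digest(candidate, h):
        return None                  # MAC mismatch
    # Token is valid: delete every stored entry for this user.
    for tid in [t for t, (u, _, _) in reset_db.items() if u == user]:
        del reset_db[tid]
    return user
```

Here `hmac.compare_digest` is used for the comparison so that the check does not depend on how many leading bytes happen to match.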

Notes

Hopefully the first few requirements are obviously met (multiple tokens, etc.). The main one worth explaining is probably: someone with read-only access to a database of reset information cannot easily reset a user's password.

Because the token value itself is not stored in a database, only the corresponding HMAC, someone who can read the database cannot construct a valid reset password URL.

As for the equality checking above: if the user tampers with the values of (id, r), they will generate a radically different value of h which should avoid exposing timing information.

The inclusion of S' = PBKDF2(S) (with a salt) is to make the generation of an HMAC value more costly, since a token should have similar properties to passwords.

Allowing for predictable random numbers

If the random number generator ever produces a number that an attacker can predict, then it's easy enough to find the correct ID value, and thus get a valid reset link.

To be paranoid about it (the system in question involves various pieces of personal information) and get around the issue of a predictable random number, we can try sending over the HMAC value to the user so we can verify authenticity.

Before, we were sending

(id, r)

to the user, which was an incrementing ID counter value and the random number. Instead, send the user

(id, r, h).

When handling a reset request, we'll receive

(id_user, r_user, h_user).

We can take this value and verify that neither id_user nor r_user has been tampered with, by checking that h_user is equal to HMAC(S', id_user || r_user).

With that verification step completed, we can use id_user to find (U, id, h, e) and continue as before.

If the random number r (and thus (id, r)) is ever predicted by an attacker, they won't know how to generate a valid HMAC, since it requires S' which is secret and the pair (id, r) has never been seen before.

Note that we don't ever use h_user to look up the (U, id, h, e) information, so the database shouldn't ever expose a timing attack in its look-up algorithm.
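A minimal sketch of this extra verification step, under the assumption that a random key `KEY` stands in for S' and that id is encoded as a fixed-width integer (both are illustrative choices, as are the function names):

```python
import hashlib
import hmac
import secrets

# Hypothetical stand-in for S' (the PBKDF2-derived key from the outline).
KEY = secrets.token_bytes(32)


def token_mac(token_id, r):
    # h = HMAC(S', id || r)
    return hmac.new(KEY, token_id.to_bytes(8, "big") + r, hashlib.sha256).digest()


def issue(token_id, r):
    # Send (id, r, h) to the user instead of only (id, r).
    return token_id, r, token_mac(token_id, r)


def is_authentic(token_id, r, h_user):
    # Recompute the MAC from the user-supplied (id, r) and compare in constant time.
    # Only if this passes would id_user be used for the database look-up.
    return hmac.compare_digest(token_mac(token_id, r), h_user)
```

An attacker who predicts r but does not know the key cannot produce a matching h_user, so the request is rejected before the database is touched.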

Weirdness and questions

1. Is it actually secure against timing attacks, replay attacks, etc.? This feels borderline roll-your-own-crypto. Are there issues?

2. Is there a vastly simpler way?

3. What security properties are reduced by allowing an attacker to generate an arbitrary number of reset tokens, since we're allowing multiple tokens to exist and anyone can send a reset?

4. Is it unreasonable to try and defend against an ostensibly secure random number generator's hypothetical future bugs? (Edit: reworded for clarity.)

5. If this scheme is secure against predictable r values, then isn't it equivalent to using only (id, h) provided that id is turned into a GUID/UUID?

Update:

I suspect it actually is unreasonable to try and defend against, e.g., OpenSSL randomness bugs, which leads me to another issue with the above: what's wrong with simply storing (U, PBKDF2(r)) and sending r to the user with no HMAC? If there are enough bits in r then isn't this secure enough given that tokens expire within <= 6 hours?
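For comparison, the simpler variant from this update might look like the sketch below. It assumes a fixed salt so that PBKDF2(r) can double as the look-up key, and it adds an expiry alongside (U, PBKDF2(r)); the salt, iteration count, and names are hypothetical.

```python
import hashlib
import secrets
import time

SALT = b"reset-token-salt"   # hypothetical fixed salt, so the digest can serve as a look-up key
ITERATIONS = 10_000          # hypothetical work factor
LIFETIME = 6 * 60 * 60       # seconds (6 hours)

reset_db = {}                # PBKDF2(r) -> (U, e)


def digest(r):
    return hashlib.pbkdf2_hmac("sha256", r, SALT, ITERATIONS)


def create_token(user, now=None):
    now = time.time() if now is None else now
    r = secrets.token_bytes(32)          # 256 random bits
    reset_db[digest(r)] = (user, now + LIFETIME)
    return r                             # sent to the user; only the digest is stored


def redeem_token(r, now=None):
    now = time.time() if now is None else now
    entry = reset_db.get(digest(r))
    if entry is None or entry[1] < now:
        return None
    user = entry[0]
    # Single use: drop every outstanding token for this user.
    for key in [k for k, (u, _) in reset_db.items() if u == user]:
        del reset_db[key]
    return user
```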

2 Answers

That is, the server calculates an authenticator that allows id_user to change their secret from S before the deadline. The key used should be known only to the server and can be changed often, since changing it only invalidates currently active reset tokens.

(You could use S as the HMAC key, but then the reset token would allow brute force attacks on S, which might be undesirable. Now it doesn't allow them without knowing the server key.)

To actually reset a password, the user must provide the above token, their id_user (can be part of the token if needed) – as well as a new S, of course.

The server doesn't need to have a database of reset tokens at all, because all the information they need is included in the token the user sends. It verifies that the deadline is in the future, then recalculates the HMAC and compares to the one in the token.
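A rough sketch of such a stateless token follows. The exact token layout is an assumption here: the token is taken to be the pair (deadline, HMAC(server key, id_user || S || deadline)), where S is the user's current stored secret (for example the stored password hash). The length-prefixed encoding and all names are hypothetical.

```python
import hashlib
import hmac
import secrets
import time

SERVER_KEY = secrets.token_bytes(32)   # known only to the server; rotating it invalidates live tokens


def _mac(user_id, current_secret, deadline):
    # Length-prefix each field so the concatenation is unambiguous.
    msg = b"".join(
        len(part).to_bytes(4, "big") + part
        for part in (user_id.encode(), current_secret, str(deadline).encode())
    )
    return hmac.new(SERVER_KEY, msg, hashlib.sha256).digest()


def make_token(user_id, current_secret, lifetime=6 * 60 * 60, now=None):
    now = int(time.time()) if now is None else now
    deadline = now + lifetime
    return deadline, _mac(user_id, current_secret, deadline)


def verify_token(user_id, current_secret, token, now=None):
    now = int(time.time()) if now is None else now
    deadline, tag = token
    if deadline <= now:
        return False                   # deadline has passed
    # Recompute the HMAC over the claimed deadline and compare in constant time.
    return hmac.compare_digest(_mac(user_id, current_secret, deadline), tag)
```

Because the user's current S is part of the MAC input, any token issued before a successful password change stops verifying once S changes.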

Your requirements were:

1. Multiple tokens per user, to allow for email delays to not cause confusion.

2. Tokens are single-use and expire.

3. Someone with read-only access to a database of reset information cannot easily reset a user's password.

Addressing each in turn:

1. The server can create any number of reset tokens and the user can use any before its deadline.

2. If the user changes their S, even still valid tokens will no longer work.

3. No reset information database, so not an issue.

Replay attacks don't work due to the same reasoning as 2. To avoid timing attacks all comparisons the server does should be constant time. Completely deterministic, so RNG doesn't matter.

Advantages over your original proposal:

Simpler to implement and reason about.

Faster since there's no PBKDF2.

No database that can be attacked.

No per-token randomness required.

Your update is much simpler, which is good, but the other three factors remain. You can also fix "faster" by making r large enough that you need no PBKDF2, but that makes it use even more randomness if you find that problematic.

@RickyDemer, it's in the last "update" paragraph of the original question.
– otus, Jul 21 '14 at 18:14

This is quite clever. On the other hand, it requires that the reset component have access to the password-hash component, which would stop one from isolating the latter. Related to that, RNG does matter, since if the token validity period is decreased by the same amount of time as passed between two token requests by the same user, then an eavesdropper on however the tokens were sent can tell whether or not the same password-salt combination was reused.
– Ricky Demer, Jul 21 '14 at 18:30

@RickyDemer, true about isolation, but I don't see how you could really isolate them anyhow, when a password reset eventually needs to change the stored password hash. I'm not sure I follow your RNG note, but it would probably be best if the system enforced that every token has a strictly later deadline than the ones before, so no two hashes will ever match.
– otus, Jul 21 '14 at 18:42

1. Probably as long as you compare securely, although you're applying a PBKDF to what should be a uniformly random long secret key.

2. Don't bother computing S'; let S be a uniformly random secret key and use it instead. To construct a reset token for a user U, set h = HMAC(S, 0 || (id, U, e)), set H = HMAC(S, 1 || h), and store the reset information (id, e, H, U), before sending (id, h) to the user. When someone tries to reset their password, they give you user-provided values (id_user, h_user) for id and for h respectively. Use id_user to find the previously stored (id, e, H, U) information. If id_user corresponds to no such information, then there is nothing to do. If e is in the past, then the token has expired. Compare HMAC(S, 1 || h_user) with H, which was HMAC(S, 1 || h). If these two values match, then the token is valid ... . ... If the reset being tied to an id is merely incidental rather than also a desired feature, then you can use (U, e) instead of (id, U, e).

3. You're only losing quantitative security. (An adversary can make more queries to the HMAC.)

4. No, since that can be done very easily.

5. It isn't equivalent to using only (id, h) even if id is turned into a GUID/UUID, since there can be a replay whenever you reuse whatever id was a GUID/UUID of.
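The two-layer HMAC construction described above (h = HMAC(S, 0 || (id, U, e)), H = HMAC(S, 1 || h)) could be sketched as follows. The byte encoding of (id, U, e), the in-memory dictionary, and the decision to delete the entry on successful use are assumptions made for illustration; the answer itself elides what happens after a match.

```python
import hashlib
import hmac
import secrets

S = secrets.token_bytes(32)   # uniformly random secret key, used directly (no PBKDF)
reset_db = {}                 # id -> (e, H, U)


def encode(token_id, user, expires):
    # Hypothetical unambiguous encoding of (id, U, e): two fixed-width ints, then the name.
    return token_id.to_bytes(8, "big") + expires.to_bytes(8, "big") + user.encode()


def create_token(token_id, user, expires):
    # h = HMAC(S, 0 || (id, U, e)) is sent to the user and never stored.
    h = hmac.new(S, b"\x00" + encode(token_id, user, expires), hashlib.sha256).digest()
    # H = HMAC(S, 1 || h) is what the database keeps.
    H = hmac.new(S, b"\x01" + h, hashlib.sha256).digest()
    reset_db[token_id] = (expires, H, user)
    return token_id, h


def redeem_token(token_id, h_user, now):
    entry = reset_db.get(token_id)
    if entry is None:
        return None                       # unknown id: nothing to do
    expires, H, user = entry
    if expires < now:
        return None                       # token has expired
    candidate = hmac.new(S, b"\x01" + h_user, hashlib.sha256).digest()
    if not hmac.compare_digest(candidate, H):
        return None
    del reset_db[token_id]                # assumption: make the token single-use
    return user
```

Since h is derived deterministically from S and (id, U, e), no per-token randomness is needed, and a reader of the database sees only H, from which h cannot be recovered without inverting the HMAC.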

As a follow-up to (4) I mean that the random number generator is ostensibly secure but the argument is that it's coming from a user space algorithm (OpenSSL). So my question was more: Is it unreasonable to try and work around an ostensibly secure RNG's hypothetical bugs? Also the supposedly simpler strategy you have seems more complex...
– Adam Prescott, Jul 21 '14 at 0:17

For (5) too, I'm a little confused: wouldn't the contract of a UUID value be that it shouldn't ever be reused anywhere universally? (As in the RFC notion of UUID.)
– Adam Prescott, Jul 21 '14 at 0:19

I'm less suited to address the follow-up to (4) than a large number of other users on this site; I also think that should be a separate question (though perhaps mentioning this application). The RFC on UUIDs states that "The requirements for" name-based "UUIDs are as follows: The UUIDs generated at different times from the same name in the same namespace MUST be equal. ...".
– Ricky Demer, Jul 21 '14 at 0:33