For example, if hypothetically an application requirement was to tell users when their password might have failed because caps-lock was turned on, and hypothetically there was no way for the application to actually know that caps-lock was turned on, is there any inherent security risk in storing a hashed password as well as a hashed "caps lock" version of a password so that a failed password can also be compared against the "caps lock" version?

If a database containing user accounts with this setup was breached, would the existence of the "caps lock" password hash in addition to the normal password hash make the passwords any more vulnerable than they would be otherwise?

Note this is a hypothetical and I'm interested more in the security implications than whether this is a good programming practice.

One quick thought: If the two stored hashes (with and without the caps lock transformation) are the same, we can know that the user has no letters in their password. (Or at least has no lower case letters in their password depending on how you do the transformation.)
– Ladadadada May 10 '12 at 6:24

6 Answers

Interesting question. I guess it depends which and how many versions you are storing. The main concern in doing so is how much you are reducing the search-space / entropy.

If, for example, the caps-lock version of the password is an all-uppercase version of the password, then by storing it this way, you are reducing the entropy of the password. The attacker in this case only has to attack the caps-lock hash, searching over the uppercase character set alone. That gives them a significantly easier job.

If however, the caps-lock version is simply a case-reverse of the letters, i.e. normal password == aXbJklMP and caps-lock version == AxBjKLmp, then you're probably not hugely reducing the search space. There are still two possible passwords instead of one, making the search twice as likely to find one of those passwords though...
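To make the two interpretations concrete, here is a minimal Python sketch. The function names are made up for illustration, and it assumes ASCII passwords, where reversing the case twice gives back the original string.

```python
def caps_lock_reverse(password: str) -> str:
    """Case-reversal model of caps lock: every letter flips case."""
    return password.swapcase()


def caps_lock_upper(password: str) -> str:
    """All-uppercase model of caps lock: every letter becomes uppercase."""
    return password.upper()


original = "aXbJklMP"
print(caps_lock_reverse(original))  # AxBjKLmp
print(caps_lock_upper(original))    # AXBJKLMP

# For ASCII input, case reversal is its own inverse, so no information is lost.
print(caps_lock_reverse(caps_lock_reverse(original)) == original)  # True
```

The all-uppercase version cannot be undone, which is why it collapses many different passwords onto one string.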

Another approach, if you do wish to allow different password variations.

When a user logs in to your application, you would normally get a plaintext version of the password (hopefully over SSL) and then hash it and compare it with your stored hash value.

There's therefore no real need to store different hash versions. You can store one password hash, and when the user tries to log in, you apply those transformations to the user-provided password at login time and then hash the different variations. For example, if the provided password fails to match the hash, apply the caps-lock transformation, hash the result, and check again against the stored hash. If it then matches, you can decide to give the user access to your application.

This way you store only one (canonical) version of the hash, yet you allow several permutations of the same password.
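A rough sketch of that login-time check, assuming the case-reversal model of caps lock. PBKDF2 from the Python standard library stands in for whatever slow hash you actually use; `hash_password`, `check_login`, and the iteration count are illustrative names and values, not part of any particular framework.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, tune for your own hardware


def hash_password(password: str, salt: bytes) -> bytes:
    """Slow, salted hash (PBKDF2-HMAC-SHA256 as a stand-in)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def check_login(candidate: str, salt: bytes, stored_hash: bytes) -> str:
    """Return 'ok', 'ok_caps_lock' or 'rejected' for a login attempt."""
    if hmac.compare_digest(hash_password(candidate, salt), stored_hash):
        return "ok"
    # Second attempt: undo a possible caps-lock case reversal and hash again.
    flipped = candidate.swapcase()
    if flipped != candidate and hmac.compare_digest(
        hash_password(flipped, salt), stored_hash
    ):
        return "ok_caps_lock"
    return "rejected"


salt = os.urandom(16)
stored = hash_password("aXbJklMP", salt)  # the only hash kept in the database
print(check_login("aXbJklMP", salt, stored))
print(check_login("AxBjKLmp", salt, stored))
print(check_login("something else", salt, stored))
```

Only one hash is stored, so an offline attacker sees exactly what they would see without the caps-lock feature.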

In terms of attack scenarios:

Offline attack against your password hashes - remains the same

Online attack trying to authenticate with different passwords - this will become easier relative to the number of possible permutations

However it should usually be easier to limit the number of possible remote brute force or password guess attempts to make it sufficiently hard. You can read more about some techniques here that you should probably implement anyway.

I'd still be very careful with what possible permutations are supported and measure how much they might reduce password entropy of course, but at least you don't need to store multiple hashed versions to do this.

I think you want to undo the caps-lock transformation, not apply it again. (This might be different.)
– D.W. May 10 '12 at 6:04

I just made a comment on the other answer. As far as my keyboard seems to work, caps-lock simply reverses the case of letters, so applying a caps-lock transformation should be the same as undoing it. But that's a small nuance not important to the discussion anyway.
– Yoav Aner May 10 '12 at 6:08

If the two stored hashes (with and without the caps lock transformation) are the same, we can know that the user has no letters in their password. (Or at least has no lower case letters in their password depending on how you do the transformation.)

Since Mac and Windows (and Linux) handle caps lock differently, you will have to store two variations of the caps lock transformation hash: the case reversal (Windows / Linux) and the all uppercase (Mac). Now we can determine even more about the password from the hashes.

If all three hashes are the same, the user has no letters (upper case or lower case) in their password.

If the Windows transformation hash is different but the Mac transformation hash is the same as the main hash then the user has upper case letters in their password but no lower case letters.

If all three hashes are different, the user has both upper and lower case letters in their password.
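The inference an attacker could draw from a breached database can be sketched as follows, assuming all three hashes share one salt. The helper names are hypothetical and a fast SHA-256 is used only to keep the illustration short. The "lowercase only" case is not listed above, but it follows from the same reasoning: the case-reversed and all-uppercase variants then coincide.

```python
import hashlib


def h(password: str, salt: bytes) -> bytes:
    """Illustrative salted hash (fast SHA-256, not suitable for real storage)."""
    return hashlib.sha256(salt + password.encode("utf-8")).digest()


def stored_hashes(password: str, salt: bytes):
    """Main hash, case-reversed hash, all-uppercase hash, all with the same salt."""
    return (
        h(password, salt),
        h(password.swapcase(), salt),
        h(password.upper(), salt),
    )


def what_leaks(main_hash: bytes, reversed_hash: bytes, upper_hash: bytes) -> str:
    """What an attacker learns just by comparing the three stored hashes."""
    if main_hash == reversed_hash == upper_hash:
        return "no letters"
    if main_hash == upper_hash:
        return "uppercase letters only"
    if reversed_hash == upper_hash:
        return "lowercase letters only"
    return "both uppercase and lowercase letters"


salt = b"same-salt-for-all"
for pw in ["12345!", "PASSWORD1", "password1", "aXbJklMP"]:
    print(pw, "->", what_leaks(*stored_hashes(pw, salt)))
```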

Another thought: is it safe to apply the same salt to all three variations? I think it should be, but I don't see any significant advantage in doing so. If you apply a different salt to each of the three hashes, I think the information leaks above are negated. Note that I am not a cryptographer: there shouldn't be any information leakage when hashing three related and possibly identical passwords with three different salts, but it may require a longer salt to be secure.

Other problems:

Since we have three password hash variations that need to be checked whenever a user logs in, all three hashes will need to be calculated. If you are using bcrypt (or similar) with an appropriate number of hashing rounds, you will want to reduce that number of rounds so that the time taken per hash is roughly a third of the old value. (This assumes you set the number of rounds to the highest you can handle and you now need to calculate three times as many hashes per login.)

The reason you need to calculate all three hashes even if the first or the second one match is that if you don't it enables credential enumeration via timing attacks. In theory, to prevent credential enumeration you should provide exactly the same response whether the user doesn't exist or it does but the password is wrong. If you return an identical response for both cases but it takes milliseconds when the user doesn't exist and a whole second when the user does exist you have still enabled credential enumeration.
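As a sketch of "always calculate all three", assuming each stored variant has its own salt: the candidate is hashed once per stored record and the loop never returns early. PBKDF2 is a stand-in for bcrypt here, and `build_records`, `matches_any`, and the iteration count are hypothetical.

```python
import hashlib
import hmac
import os

ITERATIONS = 50_000  # assumed: roughly a third of a single-hash budget


def slow_hash(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def build_records(password: str):
    """One (salt, hash) pair per variant, each with a fresh random salt."""
    records = []
    for variant in (password, password.swapcase(), password.upper()):
        salt = os.urandom(16)
        records.append((salt, slow_hash(variant, salt)))
    return records


def matches_any(candidate: str, records) -> bool:
    """Hash the candidate against every record, with no early exit."""
    matched = False
    for salt, stored in records:
        digest = slow_hash(candidate, salt)  # always computed, match or not
        matched = hmac.compare_digest(digest, stored) or matched
    return matched


records = build_records("aXbJklMP")
print(matches_any("aXbJklMP", records))  # typed normally
print(matches_any("AxBjKLmp", records))  # typed with case-reversing caps lock
print(matches_any("axbjklmp", records))  # not one of the accepted variants
```

To close the enumeration gap fully, a login for an unknown user would also need to burn the same three hashes against dummy records; that part is left out of the sketch.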

All up, I think it should be relatively safe if you use different salts for all three hashes, always calculate all three variations of the supplied password, and reduce the number of rounds so that each hash takes roughly a third of the time.

However, it might be simpler to only store one hash, check the user-supplied password for lower case letters and if there are none and the hash doesn't match, suggest to them that they might have caps-lock on.

Late thoughts

There's no need to store three different hashes as I suggested earlier if you are planning on accepting the Mac-style transformation. Just uppercase all passwords before they are hashed and only store the all-uppercase version. This also prevents the timing attacks since you only need to calculate one hash per login.

By doing this you have reduced the number of possible letters/numbers/symbols that people can use by 26, so you might want to suggest that your users choose a slightly longer password to compensate.
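How much longer is "slightly longer"? A back-of-the-envelope calculation, assuming passwords are drawn uniformly at random from the 95 printable ASCII characters (an assumption; real passwords are far less random), so that uppercasing leaves 69 distinct characters:

```python
import math

FULL_ALPHABET = 95                     # assumed: printable ASCII
REDUCED_ALPHABET = FULL_ALPHABET - 26  # lowercase letters folded into uppercase


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy of a uniformly random password of the given length."""
    return length * math.log2(alphabet_size)


def length_to_compensate(length: int) -> int:
    """Shortest reduced-alphabet length with at least the original entropy."""
    target = entropy_bits(FULL_ALPHABET, length)
    return math.ceil(target / math.log2(REDUCED_ALPHABET))


print(round(entropy_bits(FULL_ALPHABET, 10), 1))     # 65.7
print(round(entropy_bits(REDUCED_ALPHABET, 10), 1))  # 61.1
print(length_to_compensate(10))                      # 11
```

Under these assumptions the loss is about half a bit per character, so roughly one extra character per ten makes up for it.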

Storing a hash of the allcaps version seems like a bad idea. On the other hand, informing a user that the password attempt he has just entered contained mostly uppercase letters and no lower case ones would pose very little security risk (the only danger would be letting shoulder-snoopers know that).
– supercat Mar 18 '14 at 23:43

It so happens that storing both hashes, or reversing the capslock effect and trying both versions, really reduces security by a factor which is between 1 and 2. It is a bit tricky to see, so let's define things clearly.

I am using the caps-reverse notion of capslock. If the capslock effect is really an everything-is-uppercase effect, then you should never accept or store that all-uppercase version; instead, return a warning as @D.W. suggests.

I assume that you use a slow and salted hash function like bcrypt (if you do not then that is a bigger issue and you should fix it first). The "slow" part is configurable with an iteration count that you raise as much as you can, based on two constraints: the CPU budget for the hashing (depending on how much free CPU you have and how many client connections per second), and the user patience (which is never very high). The cost for the attacker is directly proportional to the iteration count. If you store two hash values (for the "normal" and "capslocked" version of the password), then both must have their own salt.

When you "try" the password sent by the user, and then "try again" with a capslocked version of the same password, you are actually hashing twice, so there is an overhead on your constraints. On average, the CPU effort will be multiplied by 1+f, where f is the proportion of wrong passwords (f = 0 if all the users type perfectly, f is very close to 1 if all the users are chimpanzees who must try a dozen times before typing their password correctly). Also, every time you have to try the capslocked version of the password, the user has to wait twice as long before being granted access (if the capslocked version turns out to be correct) or being ignominiously rejected (if the password is really wrong, capslock or not). To some extent, average users are a bit more understanding about delays when they feel it is their fault, because they typed wrong, but I would not count on it.

The net effect is that testing for two versions of the password increases the cost by a factor which is between 1 and 2; correspondingly, you must then decrease the iteration count by that factor, and the attacker's effectiveness is multiplied by that factor.
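The arithmetic can be written down as a small helper. This is only a sketch of the 1+f reasoning above; `adjusted_iterations` is a made-up name, and it assumes hashing cost scales linearly with the iteration count.

```python
def adjusted_iterations(base_iterations: int, wrong_fraction: float) -> int:
    """Iteration count that keeps the average CPU budget unchanged.

    wrong_fraction is f, the proportion of login attempts whose first hash
    fails and therefore triggers a second hash of the capslocked variant.
    """
    if not 0.0 <= wrong_fraction <= 1.0:
        raise ValueError("wrong_fraction must be between 0 and 1")
    return int(base_iterations / (1.0 + wrong_fraction))


print(adjusted_iterations(100_000, 0.0))   # 100000: nobody mistypes, no penalty
print(adjusted_iterations(100_000, 0.25))  # 80000
print(adjusted_iterations(100_000, 1.0))   # 50000: every attempt is hashed twice
```

The attacker's cost per guess drops by the same factor, which is where the "between 1 and 2" figure comes from.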

This is really a trade-off between user experience (which means "helpdesk costs" in many cases) and security. If possible, it is probably better to detect that the capslock is pressed and warn the user before the password is entered; however, this may prove difficult (I know a site which succeeds in doing that when the client is Internet Explorer, but fails with Chrome and Firefox; instead, it emits visible warnings for each uppercase letter, which is quite bad because the warnings are visible from afar, so that's a leak of information about the password).

@Yoav Aner has totally nailed it about the security impact of storing both the hash of the password and the hash of the capslock of the password. Also, @cx42net has a brilliant suggestion: use Javascript on the client side to check whether capslock is on, and if so, warn the user.

I don't have anything to add to those answers, so let me instead try to expand on a different approach (inspired by a suggestion from @Yoav Aner). For this, I need to know whether capslock works by uppercasing all letters that you type, or whether it works by reversing the case of all letters that you type. It appears the answer depends upon the particular client, so I'll give you an answer both ways; you might need to apply both, if you're not sure about which kind of system the client is using.

If capslock uppercases all letters: Store just the hash of the password in your database. When you receive a candidate password P from the client, you can perform the following steps on the server to validate the password:

1. Compute Hash(P) and see if it matches the user's password hash (as stored in the database). If yes, mark the user as authenticated; you're done. Otherwise, continue to step 2.

2. Check whether P has any lowercase letters in it. If yes, reject the authentication attempt. If not (if all letters in P are in uppercase), reject the authentication attempt but warn the user to check whether they have capslock on and ask them to try again.

This scheme certainly does not negatively impact the security of the system in any way.
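The two steps above can be sketched in Python as follows. PBKDF2 stands in for the unspecified Hash function, and the function names and return values are illustrative assumptions.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor


def hash_password(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def validate_uppercase_model(candidate: str, salt: bytes, stored_hash: bytes) -> str:
    """Return 'authenticated', 'rejected' or 'rejected_check_caps_lock'."""
    # Step 1: ordinary check against the single stored hash.
    if hmac.compare_digest(hash_password(candidate, salt), stored_hash):
        return "authenticated"
    # Step 2: a lowercase letter means caps lock cannot explain the failure.
    if any(ch.islower() for ch in candidate):
        return "rejected"
    # No lowercase letters at all (this includes inputs with no letters):
    # still rejected, but worth a hint to the user.
    return "rejected_check_caps_lock"


salt = os.urandom(16)
stored = hash_password("aXbJklMP", salt)
print(validate_uppercase_model("aXbJklMP", salt, stored))
print(validate_uppercase_model("AXBJKLMP", salt, stored))
print(validate_uppercase_model("aXbJklMp", salt, stored))
```

Nothing but the exact password is ever accepted, so the hint costs one string scan and no extra hashing.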

If capslock reverses the case of all letters: Store just the hash of the password in your database. When you receive a candidate password P from the client, you can perform the following steps on the server to validate the password:

1. Compute Hash(P) and see if it matches the user's password hash (as stored in the database). If yes, mark the user as authenticated; you're done. Otherwise, continue to step 2.

2. Let P' denote the result of reversing the case of all letters in P. Compute Hash(P') and see if it matches the user's password hash (as stored in the database). If yes, mark the user as authenticated but warn them to check whether they have capslock on. Otherwise, reject the authentication attempt.

This scheme reduces the effort needed for an online password guessing attack by a factor of 2; not a big deal. It does not reduce the security of the system against offline attacks at all.

Addendum: Here is a crazy idea you could try if capslock uppercases all letters. I don't recommend it, but I mention it just because it is so crazy.

Store just the hash of the password in your database. When you receive a candidate password P from the client, you can perform the following steps on the server to validate the password:

1. Compute Hash(P) and see if it matches the user's password hash (as stored in the database). If yes, mark the user as authenticated; you're done. Otherwise, continue to step 2.

2. At this point, you're going to test the hypothesis that maybe P is the capslock-ified version of the user's password. Check whether P has any lowercase letters. If P has any lowercase letters, then reject the authentication attempt and stop. Otherwise, continue on to step 3, where you will continue to test this hypothesis.

3. You will try un-capslocking P, and then check the un-capslocked version. Unfortunately, there are many possible ways to un-capslock P, so you'll have to try them all. In particular, suppose P has letters in n positions, with all of those letters in uppercase. Try all 2^n subsets of those positions. For each subset, let P' denote the result of taking P and then lowercasing just the letters at that subset of positions; compute Hash(P'); check if Hash(P') matches the user's hashed password; if it does, mark the user as authenticated and warn them to turn off capslock; otherwise continue on to the next subset. If no subset is successful, reject the authentication attempt.

This is pretty crazy, for all sorts of reasons. It reduces the security against online guessing attacks fairly substantially (by a factor of 2^n, for passwords with n letters in them). It doesn't reduce the security against offline guessing attacks. But it also significantly increases the complexity of the server. And given the better answers mentioned elsewhere, there seems to be no reason to use this crazy scheme.
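To see how quickly the number of candidates in step 3 grows, here is a small sketch that only enumerates the un-capslocked candidates (each one would then have to be hashed and compared). `uncapslock_candidates` is a hypothetical helper and assumes P contains no lowercase letters, as step 2 guarantees.

```python
import itertools


def uncapslock_candidates(p: str):
    """Yield every string obtained by lowercasing some subset of P's letters."""
    positions = [i for i, ch in enumerate(p) if ch.isupper()]
    for size in range(len(positions) + 1):
        for subset in itertools.combinations(positions, size):
            chars = list(p)
            for i in subset:
                chars[i] = chars[i].lower()
            yield "".join(chars)


print(sorted(uncapslock_candidates("AB1")))        # ['AB1', 'Ab1', 'aB1', 'ab1']
print(len(list(uncapslock_candidates("PASS12"))))  # 16, i.e. 2**4
```

With a slow hash, a password containing 20 letters would need about a million hash computations per failed login, which illustrates why the scheme is impractical as well as weaker.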

I don't think this is necessary and moreover, it can reduce the security of your application.

For the necessary part, I don't know how you identify your user, but if it's a client/server authentication and the client sends the password in cleartext to the server (the server then hashes the password to compare it with the database), you could check the uppercase here:

(Security note: the security of the above scheme assumes you implement hash_and_find_pwd_in_db() by hashing the password and comparing it against a stored hash value in the database. This way your database only stores password hashes, not cleartext passwords.)

About performance: Imagine you have an "uppercase" version, a "lowercase" version, and so on. You will have to check the hash of the original password against the correct one, and if it isn't found, you'll have to hit the database as many times as you have alternatives, just to let the user know that what he entered is invalid for a specific reason. I'm sure it's not efficient at all.

Now, as @Yoav Aner said, writing aXbJklMP in caps lock mode will result in this string: AxBjKLmp. That means you'll have to implement an invertcase function that changes the case of each letter to match the hypothetical caps lock mode, without any certainty.

Good answer. I just posted another answer which pretty much says the same as you did about applying those transformations on the server. Regarding your last comment however, it's not correct (at least on my keyboard). Try to turn your caps-lock on and enter those letters as you would normally... My caps-lock simply reverses lower and upper case.
– Yoav Aner May 10 '12 at 6:05

@D.W. - not necessarily, if find_pwd_in_db hashes the password and compares it against a stored hash value then you don't need to store passwords in plain-text.
– Yoav Aner May 10 '12 at 6:06

cx42net, your idea of using Javascript to check whether capslock is enabled is a great solution -- I like that a lot! Sweet.
– D.W. May 10 '12 at 6:23

I updated my answer by changing find_pwd_in_db to hash_and_find_pwd_in_db to make it clearer. I also updated the security part to match the comment from @Yoav Aner (you're right, it inverts the case).
– Cyril N. May 10 '12 at 7:17