I've given myself the task of writing code that determines the strength of a password, and really want to break out of a lot of already established ways we do that, as they're often lacking, not designed the right way, or quickly become irrelevant.

Generally, we'll see patterns where you enter a password, and a boolean "you need 7 chars with letters and numbers" is presented. Later, we had a reasonless graphic that shows a "strong to weak" scale ranking (and variations where this is a percentage). We've been getting better at this pattern, a lot better actually, where we think about showing user time to brute-force, explicitly showing reasons for why that score was assigned,

Now, I really like the presentation of these last two ideas, but that's not my primary concern here, and my question has nothing to do with presentation of this tool.

My question is what should a program that generates an estimated range of password security take into consideration?

After looking through the source of howsecureismypassword.net, we see it does a few neat things, like keep a list of top 500 common passwords, groups passwords by character classes that are involved, and relates a meta-score to an estimated time to crack a bunch of hash values in that range. There's some problems with this, such as the top 500 passwords don't reflected present "top 500," but rather the ones at time of writing the code, same goes for password cracking speeds, and attacks that may not end up using brute-force.

I see similar issues with passwordmeter.com, but more in the sense that it runs best-case rather than worst-case scenarios.

So my question is

What do I need to think about to measure a password's security score at time "now"?

Just a cautionary note about usability of passwords. Strong passwords are good only to the point at which people remember them and keep them confidential.
– this.joshJun 19 '11 at 3:08

One of the mathematically cool things about true randomness is that you can't measure it accurately. Because the kolmogorov complexity of a string is noncomputable, you can never say how random a string is. All you can say is how random a string isn't, based on the best compression of it you've found so far.
– user502Jun 20 '11 at 12:56

1

@user502 Thanks for that, searching for untruths is often a powerful problem solving tool that so many people forget to think about.
– IncognitoJun 20 '11 at 13:21

1

@user502 And with true randomness you could have a key of all ones or all zeroes, which we specifically try to avoid. You could even improbably have a hundred keys of all ones or zeroes. Thats why we like pseudorandomness: the great taste of randomness with half the calories!
– this.joshJun 23 '11 at 7:03

3 Answers
3

The best work in this area I've seen is by Matt Weir in Reusable Security: New Paper on Password Security Metrics (2010). He describes the difference between "Shannon entropy" and "guessing entropy". He also has an interesting method of taking a password for a user, analyzing it, and offering suggestions to make it better:

....other methods for password creation policies, including our
proposed method to evaluate the probability of a human generated
password by parsing it with a grammar trained on previously disclosed
password lists. This allows us to build a more robust reject function
compared to a simple blacklist, while attempting to provide the most
user freedom possible, given the security constraints of the system,
when selecting their passwords.

” As
applied to a distribution of passwords the guessing entropy is, roughly speaking, an
estimate of the average amount of work required to guess the password of a selected user,
and the min-entropy is a measure of the difficulty of guessing the easiest single password
to guess in the population.

I think you should account for the various ways in which a password is actually attacked, which is going to take some research. Obviously a password that is wholly composed of an exact match to a common password should have a "very weak" strength (or be disallowed outright). You should probably expand this by searching out "default" or commonly used word lists that the script kiddies will definitely try when cracking a password; there are readily available lists of tens of thousands of passwords (or more) that will definitely be tried against your users if your database is ever leaked. So include those in your "common password" check.

But crackers certainly won't limit themselves to a simple "exact match" search, so neither should your strength meter. Research common patterns used by crackers, such as combining two words from the password dictionary, or substituting letters for numbers or other "1337 speak" types of substitutions (e.g. "p@ssw0rd$ 4r3 4w3$0m3!1"). The howsecureismypassword.net site you mention fails here: it rates "passwordpassword" as taking "345 thousand years" to crack, which is absurdly wrong. I'd guess that will fall in less than a second. These are not the only rules to consider; many passwords follow very simple patterns like {capital letter}{6 lowercase letters}{number}! which are far less secure than 9 random characters (but still slightly better than a simple dictionary match). A variety of such common patterns will also be tried before resorting to brute force.

How you handle these transformations or word matches is up to you; but regardless they should be accounted for somehow. One thing to look into, is how have open-source tools handled this?

As an example, the quality estimation function for KeePass password manager reportedly handles this by calculating an entropy based on how many patterns are used to create the password, and the strength of that pattern, rather than using character counts, if patterns are detected. In older versions of the software, a naive count-based entropy was simply penalized based on recognized patterns. You could probably do well with either method. The trick will be to keep the patterns up-to-date with advances in cracking, but at the very least accounting for the really basic stuff could have drastic improvements in the strength of your users' passwords, especially if you explain in the interface what common patterns their password is going to be guessed by.