With that disclaimer: yes. I’m convinced these meters have the potential to help. According to Mark Burnett’s 2006 book, Perfect Passwords: Selection, Protection, Authentication, which counted frequencies from a few million passwords over a variety of leaks, one in nine people had a password in this top 500 list. These passwords include some real stumpers: password1, compaq, 7777777, merlin, rosebud. Burnett ran a more recent study last year, looking at 6 million passwords, and found an insane 99.8% occur in the top 10,000 list, with 91% in the top 1,000. The methodology and its bias are important qualifiers. For example, since these passwords mostly come from cracked hashes, the list is biased towards crackable passwords to begin with.

These are only the really easy-to-guess passwords. For the rest, I’d wager a large percentage are still predictable enough to be susceptible to a modest online attack. So I do think these meters could help, by encouraging stronger password decisions through direct feedback. But right now, with a few closed-source exceptions, I believe they mostly hurt. Here’s why.

Strength is best measured as entropy, in bits: it’s the number of times a space of possible passwords can be cut in half. A naive strength estimation goes like this:
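As a rough illustration, the naive estimate can be sketched in JavaScript as below. It assumes every character is drawn independently and uniformly from a symbol space of size c; the function name and the example cardinalities are illustrative choices, not taken from zxcvbn.

```javascript
// Naive brute-force estimate (a sketch, not zxcvbn's code).
// n: password length
// c: cardinality, the size of the symbol space, e.g. 26 for lowercase
//    letters only, 62 for lowercase + uppercase + digits (illustrative).
function naiveEntropy(n, c) {
  return n * Math.log2(c); // bits
}

// Eight random lowercase letters: about 37.6 bits.
console.log(naiveEntropy(8, 26).toFixed(1));
```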

This brute-force analysis is accurate for people who choose random sequences of letters, numbers and symbols. But with few exceptions (shoutout to 1Password / KeePass), people of course choose patterns — dictionary words, spatial patterns like qwerty, asdf or zxcvbn, repeats like aaaaaaa, sequences like abcdef or 654321, or some combination of the above. For passwords with uppercase letters, odds are it’s the first letter that’s uppercase. Numbers and symbols are often predictable as well: l33t speak (3 for e, 0 for o, @ or 4 for a), years, dates, zip codes, and so on.

As a result, simplistic strength estimation gives bad advice. Without checking for common patterns, the practice of encouraging numbers and symbols means encouraging passwords that might only be slightly harder for a computer to crack, and yet frustratingly harder for a human to remember. xkcd nailed it in its “Password Strength” comic, the source of correcthorsebatterystaple.

As an independent Dropbox hackweek project, I thought it’d be fun to build an open source estimator that catches common patterns, and as a corollary, doesn’t penalize sufficiently complex passphrases like correcthorsebatterystaple. It’s now live on dropbox.com/register and available for use on github. Try the demo to experiment and see several example estimations.

The table below compares zxcvbn to other meters. The point isn’t to dismiss the others — password policy is highly subjective — rather, it’s to give a better picture of how zxcvbn is different.

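Meter             qwER43@!        Tr0ub4dour&3    correcthorsebatterystaple
zxcvbn            [screenshot]    [screenshot]    [screenshot]
Dropbox (old)     [screenshot]    [screenshot]    [screenshot]
Citibank          [screenshot]    [screenshot]    [screenshot]
Bank of America   (not allowed)   (not allowed)   (not allowed)
Twitter           [screenshot]    [screenshot]    [screenshot]
PayPal            [screenshot]    [screenshot]    [screenshot]
eBay              [screenshot]    [screenshot]    (not allowed)
Facebook          [screenshot]    [screenshot]    [screenshot]
Yahoo!            [screenshot]    [screenshot]    [screenshot]
Gmail             [screenshot]    [screenshot]    [screenshot]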

A few notes:

I took these screenshots on April 3rd, 2012. I needed to crop the bar from the gmail signup form to make it fit in the table, making the difference in relative width more pronounced than on the form itself.

zxcvbn considers correcthorsebatterystaple the strongest password of the 3. The rest either consider it the weakest or disallow it. (Twitter gives about the same score for each, but if you squint, the scores are slightly different.)

zxcvbn considers qwER43@! weak because it’s a short QWERTY pattern. It adds extra entropy for each turn and shifted character.

The PayPal meter considers qwER43@! weak but aaAA11!! strong. Speculation, but that might be because it detects spatial patterns too.

Bank of America doesn’t allow passwords over 20 characters, disallowing correcthorsebatterystaple. Passwords can contain some symbols, but not & or !, disallowing the other two passwords. eBay doesn’t allow passwords over 20 characters either.

Few of these meters appear to use the naive estimation I opened with; otherwise correcthorsebatterystaple would have a high rating from its long length. Dropbox used to add points for each unique lowercase letter, uppercase letter, number, and symbol, up to a certain cap for each group. This mostly has the same only-works-for-brute-force problem, although it also checked against a common passwords dictionary. I don’t know the details behind the other meters, but a scoring checklist is another common approach (which also doesn’t check for many patterns).

For the base word of the second column I picked Troubadour rather than Troubador, the uncommon spelling that appears in xkcd.

Installation

zxcvbn has no dependencies and works on ie7+/opera/ff/safari/chrome. The best way to add it to your registration page is:

<script type="text/javascript" src="zxcvbn-async.js"></script>

zxcvbn-async.js is a measly 350 bytes. On window.load, after your page loads and renders, it’ll load zxcvbn.js, a fat 680k (320k gzipped), most of which is a dictionary. I haven’t found the script size to be an issue; since a password is usually not the first thing a user enters on a signup form, there’s plenty of time to load. Here’s a comprehensive rundown of crossbrowser asynchronous script loading.

zxcvbn adds a single function to the global namespace:

zxcvbn(password, user_inputs)

It takes one required argument, a password, and returns a result object. The result includes a few properties:

result.entropy            # bits
result.crack_time         # estimation of actual crack time, in seconds.
result.crack_time_display # same crack time, as a friendlier string:
                          # "instant", "6 minutes", "centuries", etc.
result.score              # 0, 1, 2, 3 or 4 if crack time is less than
                          # 10**2, 10**4, 10**6, 10**8, Infinity.
                          # (helpful for implementing a strength bar.)
result.match_sequence     # the detected patterns used to calculate entropy.
result.calculation_time   # how long it took to calculate an answer,
                          # in milliseconds. usually only a few ms.

The optional user_inputs argument is an array of strings that zxcvbn will add to its internal dictionary. This can be whatever list of strings you like, but it’s meant for user inputs from other fields of the form, like name and email. That way a password that includes the user’s personal info can be heavily penalized. This list is also good for site-specific vocabulary. For example, ours includes dropbox.
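To make the score thresholds concrete, here is a small sketch of the bucketing described for result.score, plus a label lookup for a strength bar. The helper names and the label strings are hypothetical; only the thresholds come from the description above. The commented lines show how a form might call zxcvbn itself, assuming nameField and emailField are inputs on your page.

```javascript
// Map a crack time in seconds to the 0..4 score described above:
// below 10^2 -> 0, below 10^4 -> 1, below 10^6 -> 2, below 10^8 -> 3, else 4.
function crackTimeToScore(seconds) {
  const thresholds = [1e2, 1e4, 1e6, 1e8];
  for (let score = 0; score < thresholds.length; score++) {
    if (seconds < thresholds[score]) return score;
  }
  return 4;
}

// Hypothetical labels for a five-step strength bar.
const STRENGTH_LABELS = ["very weak", "weak", "so-so", "good", "great"];

function strengthLabel(seconds) {
  return STRENGTH_LABELS[crackTimeToScore(seconds)];
}

// In a page that has loaded zxcvbn (field names are assumptions):
//   const result = zxcvbn(passwordField.value, [nameField.value, emailField.value]);
//   bar.textContent = STRENGTH_LABELS[result.score];
console.log(strengthLabel(5000)); // "weak"
```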

zxcvbn is written in CoffeeScript. zxcvbn.js and zxcvbn-async.js are unreadably closure-compiled, but if you’d like to extend zxcvbn and send me a pull request, the README has development setup info.

The rest of this post details zxcvbn’s design.

The model

zxcvbn consists of three stages: match, score, then search.

match enumerates all the (possibly overlapping) patterns it can detect. Currently zxcvbn matches against several dictionaries (English words, names and surnames, Burnett’s 10,000 common passwords), spatial keyboard patterns (QWERTY, Dvorak, and keypad patterns), repeats (aaa), sequences (123, gfedcba), years from 1900 to 2019, and dates (3-13-1997, 13.3.1997, 1331997). For all dictionaries, match recognizes uppercasing and common l33t substitutions.

score calculates an entropy for each matched pattern, independent of the rest of the password, assuming the attacker knows the pattern. A simple example: rrrrr. In this case, the attacker needs to iterate over all repeats from length 1 to 5 that start with a lowercase letter:

entropy = lg(26*5) # about 7 bits
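The same repeat estimate as a runnable JavaScript sketch. The function name and the explicit cardinality argument are assumptions made for illustration.

```javascript
// Repeat estimate from the rrrrr example: the attacker tries every
// repeat of length 1..token.length for each character in the class.
// cardinality: size of the repeated character's class (26 for lowercase).
function repeatEntropy(token, cardinality) {
  return Math.log2(cardinality * token.length);
}

console.log(repeatEntropy("rrrrr", 26).toFixed(2)); // about 7 bits
```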

search is where Occam’s razor comes in. Given the full set of possibly overlapping matches, search finds the simplest (lowest entropy) non-overlapping sequence. For example, if the password is damnation, that could be analyzed as two words, dam and nation, or as one. It’s important that it be analyzed as one, because an attacker trying dictionary words will crack it as one word long before two. (As an aside, overlapping patterns are also the primary agent behind accidentally tragic domain name registrations, like childrens-laughter.com but without the hyphen.)

Search is the crux of the model. I’ll start there and work backwards.

Minimum entropy search

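zxcvbn calculates a password’s entropy to be the sum of its constituent patterns. Any gaps between matched patterns are treated as brute-force “patterns” that also contribute to the total entropy. For example, a password that reads as a surname, then a few unmatched characters, then a keypad pattern is scored as the surname’s entropy plus the brute-force entropy of the gap plus the keypad pattern’s entropy.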

That a password’s entropy is the sum of its parts is a big assumption. However, it’s a conservative assumption. By disregarding the “configuration entropy” — the entropy from the number and arrangement of the pieces — zxcvbn is purposely underestimating, by giving a password’s structure away for free: It assumes attackers already know the structure (for example, surname-bruteforce-keypad), and from there, it calculates how many guesses they’d need to iterate through. This is a significant underestimation for complex structures. Considering correcthorsebatterystaple, word-word-word-word, an attacker running a program like L0phtCrack or John the Ripper would typically try many simpler structures first, such as word, word-number, or word-word, before reaching word-word-word-word. I’m OK with this for three reasons:

It’s difficult to formulate a sound model for structural entropy; statistically, I don’t happen to know what structures people choose most, so I’d rather do the safe thing and underestimate.

For a complex structure, the sum of the pieces alone is often sufficient to give an “excellent” rating. For example, even knowing the word-word-word-word structure of correcthorsebatterystaple, an attacker would need to spend centuries cracking it.

Most people don’t have complex password structures. Disregarding structure only underestimates by a few bits in the common case.

With this assumption out of the way, here’s an efficient dynamic programming algorithm in CoffeeScript for finding the minimum non-overlapping match sequence. It runs in O(n·m) time for a length-n password with m (possibly overlapping) candidate matches.

backpointers[j] holds the match in this sequence that ends at password position j, or null if the sequence doesn’t include such a match. Typical of dynamic programming, constructing the optimal sequence requires starting at the end and working backwards.
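A minimal JavaScript sketch of the search described above, written from the prose rather than copied from zxcvbn. It assumes each candidate match carries inclusive start and end indices (i, j) and a precomputed entropy, and that uncovered characters cost a fixed number of brute-force bits each. The example entropies for damnation are made up.

```javascript
// matches: candidate matches, each { i, j, entropy } with inclusive
// start/end indices into the password (field names are assumptions).
// bruteforceBitsPerChar: log2 of the symbol-space size for gap characters.
function minimumEntropyMatchSequence(password, matches, bruteforceBitsPerChar) {
  const n = password.length;
  const upToK = [];        // upToK[k]: minimum entropy of password[0..k]
  const backpointers = []; // match ending at k in the best sequence, or null

  for (let k = 0; k < n; k++) {
    // Baseline: extend the best sequence at k-1 by one brute-force character.
    upToK[k] = (k > 0 ? upToK[k - 1] : 0) + bruteforceBitsPerChar;
    backpointers[k] = null;
    for (const match of matches) {
      if (match.j !== k) continue;
      // Best entropy before the match starts, plus the match itself.
      const candidate = (match.i > 0 ? upToK[match.i - 1] : 0) + match.entropy;
      if (candidate < upToK[k]) {
        upToK[k] = candidate;
        backpointers[k] = match;
      }
    }
  }

  // Walk backwards from the end to recover the chosen matches.
  const sequence = [];
  let k = n - 1;
  while (k >= 0) {
    const match = backpointers[k];
    if (match) {
      sequence.push(match);
      k = match.i - 1;
    } else {
      k -= 1;
    }
  }
  sequence.reverse();
  return { entropy: n > 0 ? upToK[n - 1] : 0, sequence };
}

// damnation: one word vs. "dam" + "nation" (entropies are illustrative).
const candidates = [
  { token: "dam", i: 0, j: 2, entropy: 10 },
  { token: "nation", i: 3, j: 8, entropy: 9 },
  { token: "damnation", i: 0, j: 8, entropy: 13 },
];
const best = minimumEntropyMatchSequence("damnation", candidates, Math.log2(26));
console.log(best.sequence.map((m) => m.token)); // [ 'damnation' ]
```

The outer loop visits each of the n positions and the inner loop scans all m candidates, which is where the O(n·m) bound comes from.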

Especially because this is running browser-side as the user types, efficiency does matter. To get something up and running I started with the simpler O(2^m) approach of calculating the sum for every possible non-overlapping subset, and it slowed down quickly. All together, zxcvbn currently takes no more than a few milliseconds for most passwords. To give a rough ballpark: running Chrome on a 2.4 GHz Intel Xeon, correcthorsebatterystaple took about 3ms on average. coRrecth0rseba++ery9/23/2007staple$ took about 12ms on average.

Threat model: entropy to crack time

Entropy isn’t intuitive: How do I know if 28 bits is strong or weak? In other words, how should I go from entropy to actual estimated crack time? This requires more assumptions in the form of a threat model. Let’s assume:

Passwords are stored as salted hashes, with a different random salt per user, making rainbow attacks infeasible.

An attacker manages to steal every hash and salt. The attacker is now guessing passwords offline at max rate.

The attacker has several CPUs at their disposal.

Here are some back-of-the-envelope numbers:

# for a hash function like bcrypt/scrypt/PBKDF2, 10ms is a safe lower bound
# for one guess. usually a guess would take longer -- this assumes fast
# hardware and a small work factor. adjust for your site accordingly if you
# use another hash function, possibly by several orders of magnitude!
SINGLE_GUESS = .010 # seconds
NUM_ATTACKERS = 100 # number of cores guessing in parallel.
SECONDS_PER_GUESS = SINGLE_GUESS / NUM_ATTACKERS
entropy_to_crack_time = (entropy) ->
  .5 * Math.pow(2, entropy) * SECONDS_PER_GUESS

I added a .5 term because we’re measuring the average crack time, not the time to try the full space.
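To answer the 28-bit question under this threat model, here is the same calculation as a JavaScript sketch with a worked example. The constants mirror the numbers above; the camel-case names are just a JavaScript rendering.

```javascript
const SINGLE_GUESS = 0.010; // seconds per guess, the lower bound assumed above
const NUM_ATTACKERS = 100;  // cores guessing in parallel
const SECONDS_PER_GUESS = SINGLE_GUESS / NUM_ATTACKERS;

// Average crack time: half the search space, at the combined guess rate.
function entropyToCrackTime(entropy) {
  return 0.5 * Math.pow(2, entropy) * SECONDS_PER_GUESS;
}

// 28 bits: 0.5 * 2^28 guesses at 10,000 guesses per second.
const seconds = entropyToCrackTime(28);
console.log((seconds / 3600).toFixed(1) + " hours"); // roughly 3.7 hours
```

So under these assumptions a 28-bit password falls in an afternoon, and every additional bit doubles that time.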

This math is perhaps overly safe. Large-scale hash theft is a rare catastrophe, and unless you’re being specifically targeted, it’s unlikely an attacker would dedicate 100 cores to your single password. Normally an attacker has to guess online and deal with network latency, throttling, and CAPTCHAs.

Entropy calculation

Up next is how zxcvbn calculates the entropy of each constituent pattern. calc_entropy() is the entry point. It’s a simple dispatch:

calc_entropy = (match) ->
  # reuse a previously calculated entropy for this match, if any
  return match.entropy if match.entropy?
  # pick the scoring function for this pattern type
  entropy_func = switch match.pattern
    when 'repeat' then repeat_entropy
    when 'sequence' then sequence_entropy
    when 'digits' then digits_entropy
    when 'year' then year_entropy
    when 'date' then date_entropy
    when 'spatial' then spatial_entropy
    when 'dictionary' then dictionary_entropy
  match.entropy = entropy_func match

I gave an outline earlier for how repeat_entropy works. You can see the full scoring code on github, but I’ll describe two other scoring functions here to give a taste: spatial_entropy and dictionary_entropy.

Consider the spatial pattern qwertyhnm. It starts at q, its length is 9, and it has 3 turns: the initial turn moving right, then down-right, then right. To parameterize:

s # number of possible starting characters.
  # 47 for QWERTY/Dvorak, 15 for pc keypad, 16 for mac keypad.
L # password length. L >= 2
t # number of turns. t <= L - 1
  # for example, a length-3 password can have at most 2 turns, like "qaw".
d # average "degree" of each key -- the number of adjacent keys.
  # about 4.6 for QWERTY/Dvorak. (g has 6 neighbors, tilde only has 1.)

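The space of total possibilities is then all possible spatial patterns of length L or less with t turns or less:

$$\text{possibilities} = \sum_{i=2}^{L} \; \sum_{j=1}^{\min(t,\, i-1)} \binom{i-1}{j-1} \, s \, d^{j}$$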

(i - 1) choose (j - 1) counts the possible configurations of turn points for a length-i spatial pattern with j turns. The -1 is added to both terms because the first turn always occurs on the first letter. At each of j turns, there are d possible directions to go, for a total of d^j possibilities per configuration. An attacker would need to try each starting character too, hence the s. This math is only a rough approximation. For example, many of the alternatives counted in the equation aren’t actually possible on a keyboard: for a length-5 pattern with 1 turn, “start at q moving left” gets counted, but isn’t actually possible.
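The count described above can be sketched in JavaScript as follows. This is an illustration built from the parameters s, L, t and d as defined earlier, with the entropy taken as lg of the count; it leaves out the extra entropy zxcvbn adds for shifted characters, and the function names are assumptions.

```javascript
// Binomial coefficient "n choose k".
function nCk(n, k) {
  if (k < 0 || k > n) return 0;
  let result = 1;
  for (let d = 1; d <= k; d++) {
    result *= n;
    result /= d;
    n -= 1;
  }
  return result;
}

// L: pattern length, t: number of turns, s: starting characters,
// d: average number of neighbors per key.
function spatialEntropy(L, t, s, d) {
  let possibilities = 0;
  for (let i = 2; i <= L; i++) {            // every length up to L
    const possibleTurns = Math.min(t, i - 1);
    for (let j = 1; j <= possibleTurns; j++) { // every turn count up to t
      possibilities += nCk(i - 1, j - 1) * s * Math.pow(d, j);
    }
  }
  return Math.log2(possibilities);
}

// qwertyhnm on QWERTY: length 9, 3 turns, 47 starting keys, degree 4.6.
console.log(spatialEntropy(9, 3, 47, 4.6).toFixed(1)); // roughly 18.7 bits
```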

In dictionary_entropy, the first term is the most important: the match has an associated frequency rank, where words like the and good have low rank, and words like photojournalist and maelstrom have high rank. This lets zxcvbn scale the calculation to an appropriate dictionary size on the fly, because if a password contains only common words, a cracker can succeed with a smaller dictionary. This is one reason why xkcd and zxcvbn slightly disagree on entropy for correcthorsebatterystaple (45.2 bits vs 44). The xkcd example used a fixed dictionary size of 2^11 (about 2k words), whereas zxcvbn is adaptive. Adaptive sizing is also the reason zxcvbn.js includes entire dictionaries instead of a space-efficient Bloom filter: rank is needed in addition to a membership test.
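A small sketch of the difference between fixed and adaptive dictionary sizing. It assumes the base entropy of a dictionary match is lg of its rank, with any uppercasing or l33t adjustment passed in as a number; the function name, the extraBits parameter, and the example ranks are all made up for illustration.

```javascript
// rank: 1-based frequency rank of the matched word (1 = most common).
// extraBits: hypothetical stand-in for uppercasing / l33t adjustments.
function dictionaryEntropy(rank, extraBits = 0) {
  return Math.log2(rank) + extraBits;
}

// Fixed sizing, as in the xkcd example: four words from a 2^11-word list.
const fixedBits = 4 * Math.log2(Math.pow(2, 11)); // 44 bits

// Adaptive sizing: each word costs lg of its own rank (ranks are invented).
const exampleRanks = [1800, 950, 2600, 3100];
const adaptiveBits = exampleRanks
  .map((rank) => dictionaryEntropy(rank))
  .reduce((sum, bits) => sum + bits, 0);

console.log(fixedBits, adaptiveBits.toFixed(1));
```

With adaptive sizing, a password built only from very common words scores lower than the fixed estimate, and one containing a rare word scores higher.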

I’ll explain how frequency ranks are derived in the data section at the end. Uppercasing entropy looks like this:
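One plausible shape for such a function, offered as an assumption rather than as zxcvbn’s actual code: common capitalization schemes (first letter, last letter, all caps) cost a single extra bit, and anything else costs lg of the number of ways to pick which letters are capitalized. The function names and regexes are illustrative.

```javascript
// Binomial coefficient "n choose k".
function choose(n, k) {
  if (k < 0 || k > n) return 0;
  let result = 1;
  for (let d = 1; d <= k; d++) {
    result *= n;
    result /= d;
    n -= 1;
  }
  return result;
}

function extraUppercaseEntropy(word) {
  if (word === word.toLowerCase()) return 0; // nothing capitalized
  // Common schemes only double the search space: 1 extra bit.
  const startUpper = /^[A-Z][^A-Z]+$/;
  const endUpper = /^[^A-Z]+[A-Z]$/;
  if (startUpper.test(word) || endUpper.test(word) || word === word.toUpperCase()) {
    return 1;
  }
  // Otherwise count the ways to capitalize up to min(U, L) of U + L letters.
  const U = (word.match(/[A-Z]/g) || []).length;
  const L = (word.match(/[a-z]/g) || []).length;
  let possibilities = 0;
  for (let i = 0; i <= Math.min(U, L); i++) {
    possibilities += choose(U + L, i);
  }
  return Math.log2(possibilities);
}

console.log(extraUppercaseEntropy("Password")); // 1
console.log(extraUppercaseEntropy("paSSword").toFixed(2)); // lg(1 + 8 + 28)
```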

ranked_dict maps from a word to its frequency rank. It’s like an array of words, ordered by high-frequency-first, but with index and value flipped. l33t substitutions are detected in a separate matcher that uses dictionary_match as a primitive. Spatial patterns like bvcxz are matched with an adjacency graph approach that counts turns and shifts along the way. Dates and years are matched with regexes. Hit matching.coffee on github to read more.
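To illustrate the ranked dictionary and a substring-based dictionary matcher, here is a hedged JavaScript sketch. The brute-force scan over every substring, the lowercasing step, and the field names on each match are assumptions made for the example; see matching.coffee for what zxcvbn really does.

```javascript
// Build word -> rank from a frequency-ordered list (most common first).
function buildRankedDict(orderedWords) {
  const ranked = new Map();
  orderedWords.forEach((word, index) => {
    ranked.set(word, index + 1); // rank starts at 1
  });
  return ranked;
}

// Check every substring of the password against the dictionary.
function dictionaryMatch(password, rankedDict) {
  const matches = [];
  const lower = password.toLowerCase(); // so uppercased words still match
  for (let i = 0; i < lower.length; i++) {
    for (let j = i; j < lower.length; j++) {
      const word = lower.slice(i, j + 1);
      if (rankedDict.has(word)) {
        matches.push({
          pattern: "dictionary",
          i,                               // inclusive start index
          j,                               // inclusive end index
          token: password.slice(i, j + 1), // as typed
          matchedWord: word,
          rank: rankedDict.get(word),
        });
      }
    }
  }
  return matches;
}

// Tiny illustrative word list, most common first.
const ranked = buildRankedDict(["the", "nation", "dam", "damnation"]);
const found = dictionaryMatch("Damnation", ranked);
console.log(found.map((m) => m.matchedWord)); // [ 'dam', 'damnation', 'nation' ]
```

The overlapping results here (dam, nation, damnation) are exactly the kind of candidate set the minimum entropy search has to choose between.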

Data

Frequency-ranked names and surnames come from the freely available 2000 US Census. To help zxcvbn not crash ie7, I cut off the surname dictionary, which has a long tail, at the 80th percentile (meaning 80% of Americans have one of the surnames in the list). Common first names include the 90th percentile.

The 40k frequency list of English words comes from a project on Wiktionary, which counted about 29M words across US television and movies. My hunch is that of all the lists I could find online, television and movie scripts will capture popular usage (and hence likely words used in passwords) better than other sources of English, but this is an untested hypothesis. The list is a bit dated; for example, Frasier is the 824th most common word.

Conclusion

At first glance, building a good estimator looks about as hard as building a good cracker. This is true in a tautological sort of way if the goal is accuracy, because “ideal entropy” — entropy according to a perfect model — would measure exactly how many guesses a given cracker (with a smart operator behind it) would need to take. The goal isn’t accuracy, though. The goal is to give sound password advice. And this actually makes the job a bit easier: I can take the liberty of underestimating entropy, for example, with the only downside of encouraging passwords that are stronger than they need to be, which is frustrating but not dangerous.

Good estimation is still difficult, and the main reason is there’s so many different patterns a person might use. zxcvbn doesn’t catch words without their first letter, words without vowels, misspelled words, n-grams, zipcodes from populous areas, disconnected spatial patterns like qzwxec, and many more. Obscure patterns (like Catalan numbers) aren’t important to catch, but for each common pattern that zxcvbn misses and a cracker might know about, zxcvbn overestimates entropy, and that’s the worst kind of bug. Possible improvements:

zxcvbn currently only supports English words, with a frequency list skewed toward American usage and spelling. Names and surnames, coming from the US census, are also skewed. Of the many keyboard layouts in the world, zxcvbn recognizes but a few. Better country-specific datasets, with an option to choose which to download, would be a big improvement.

As this study by Joseph Bonneau attests, people frequently choose common phrases in addition to common words. zxcvbn would be better if it recognized “Harry Potter” as a common phrase, rather than a semi-common name and surname. Google’s n-gram corpus fits in a terabyte, and even a good bigram list is impractical to download browser-side, so this functionality would require server-side evaluation and infrastructure cost. Server-side evaluation would also allow a much larger single-word dictionary, such as Google’s unigram set.

It’d be better if zxcvbn tolerated misspellings of a word up to a certain edit distance. That would bring in many word-based patterns, like skip-the-first-letter. It’s hard because word segmentation gets tricky, especially with the added complexity of recognizing l33t substitutions.

Even with these shortcomings, I believe zxcvbn succeeds in giving better password advice in a world where bad password decisions are widespread. I hope you find it useful. Please fork on github and have fun!

If a web site told me that it wouldn’t accept the first password because it was too weak, despite being the only website in the world I’ve used it on, I would naturally assume that their website is routinely hacked and I should assume the password database is public information. On the other extreme is github, where ‘password1’ is perfectly valid but ‘f^FH]!n’ isn’t because no numbers are present.

Getting people on-board with one-use passwords, and building password stores or generators into browsers, is far more important than having an insanely long or complex password. The only time I need to care about Password Maker’s output is when a site thinks my password is too complex or not complex enough, when the reality is that any of my individual passwords. No, silly web sites, you aren’t as important to my security as you might think you are, and why should I care if it would take Russian hackers months or centuries to crack my Dropbox password when my banking passwords can’t be longer than 8 characters, isn’t case sensitive, and can’t have any symbols, and are probably being hacked nightly in a way the bank is too incompetent to find out? Fine, inform me, but let me sign up on my own terms.

It’s rather too generous with “dictionary words”; a random sequence of a and i characters, of any length, is considered to have an entropy of about length*1.5, because it considers them very obvious dictionary words. 😀 This is true with embedded a and i, too, so that 9SNF has an entropy of 24, but 9aNi has an entropy of 15.

A very simple technique for keeping the password is used. while filling up the online forms the password should be strong otherwise it doesn’t acceptable. Always select the password of good strength before filling online forms.



I found this after reading this: http://arstechnica.com/security/2012/08/passwords-under-assault/
It led me to searching around for a good way to test some new passwords I’m trying out. I’ve been very impressed by zxcvbn’s ability to detect all kinds of words and patterns and thus recognize the password as being easy to crack, given the new algorithms hackers are using now.

I type in the Dvorak keyboard as well as QWERTY, so I’ve been just typing a long, plain, l33tspeak word in the QWERTY pattern on a Dvorak keyboard. It makes it easy to remember, but very random. Even your otherwise very hard to fool zxcvbn tool doesn’t recognize any patterns. I don’t think this is a common enough practice for any hackers to run their normal smart algorithms over again, but this time assuming someone is typing in a QWERTY pattern on a Dvorak keyboard. Being unusual has its advantages!

As a gizmo lover but only a moderately capable techie/programmer, I’m not clear what the final outcome of this article was! Do we go with: (a) Tr0ub4dour&3 or (b) correcthorsebatterystaple or (c) zxcvbn ???
