8 million leaked passwords connected to LinkedIn, dating website

An unknown hacker posted the lists online and asked for help in cracking them.

An unknown hacker has posted more than 8 million cryptographic hashes to the Internet that appear to belong to users of LinkedIn and a separate, popular dating website.

The massive dumps over the past three days came in postings to user forums dedicated to password cracking at insidepro.com. The bigger of the two lists contains almost 6.46 million passwords that have been converted into hashes using the SHA-1 cryptographic function. They use no cryptographic "salt," making the job of cracking them considerably faster. Rick Redman, a security consultant who specializes in password cracking, said the list almost certainly belongs to LinkedIn because he found a password in it that was unique to the professional social networking site. Robert Graham, CEO of Errata Security, said much the same thing, as did researchers from Sophos. Several Twitter users reported similar findings.
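To illustrate why the missing salt matters, here is a minimal Python sketch. The function names and the sample password are made up for illustration and say nothing about how LinkedIn's own code works. Without a salt, everyone who picked the same password ends up with the same hash, so one cracked hash exposes all of them and precomputed tables apply. With a random per-user salt, the same password hashes differently for each account.

```python
import hashlib
import os


def unsalted_sha1(password: str) -> str:
    """Plain SHA-1 of the password, as in the leaked list."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def salted_sha1(password: str, salt: bytes) -> str:
    """SHA-1 over salt + password; the salt is stored next to the hash."""
    digest = hashlib.sha1(salt + password.encode("utf-8")).hexdigest()
    return salt.hex() + ":" + digest


# Two users with the same (hypothetical) password.
same_unsalted = unsalted_sha1("linkedin123") == unsalted_sha1("linkedin123")
print(same_unsalted)  # True: identical hashes, crack one and you have both

salt_a, salt_b = os.urandom(16), os.urandom(16)
same_salted = salted_sha1("linkedin123", salt_a) == salted_sha1("linkedin123", salt_b)
print(same_salted)  # different salts give different stored values
```

Salting by itself does not make SHA-1 slow to compute. Password storage normally also uses a deliberately expensive function, such as Python's hashlib.pbkdf2_hmac or hashlib.scrypt.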

"My [LinkedIn] password was in it and mine was 20 plus characters and was random," Redman, who works for consultancy Kore Logic Security, told Ars. With LinkedIn counting more than 160 million registered users, the list is probably a small subset, most likely because the person who obtained it cracked the weakest ones and posted only those he needed help with.

"It's pretty obvious that whoever the bad guy was cracked the easy ones and then posted these, saying, 'These are the ones I can't crack,'" Redman said. He estimates that he has cracked about 55 percent of the hashes over the past 24 hours. "I think the person has more. It's just that these are the ones they couldn't seem to get."

Update 2:01 pm PDT: In a blog post published after this article went live, a LinkedIn official confirmed that "some of the passwords that were compromised correspond to LinkedIn accounts" and said an investigation is continuing. The company has begun notifying users known to be affected and has also implemented enhanced security measures that include hashing and salting current password databases.

The smaller of the two lists contains about 1.5 million unsalted MD5 hashes. Based on the plaintext passwords that have been cracked so far, they appear to belong to users of a popular dating website, possibly eHarmony. A statistically significant percentage of users regularly pick passcodes that identify the site hosting their account. At least 420 of the passwords in the smaller list contain the strings "eharmony" or "harmony."

The lists of hashes that Ars has seen don't include the corresponding login names, making it impossible for people to use them to gain unauthorized access to a particular user's account. But it's safe to assume that information is available to the hackers who obtained the list, and it wouldn't be a surprise if it was also available in underground forums. Ars readers should change their passwords for those two sites immediately. If they used the same password on a separate site, it should be changed there, too.

eHarmony officials didn't immediately respond to a request for comment.

The InsidePro postings provide a glimpse into the sport of collective password cracking, a forum where people gather to pool their expertise and sometimes vast amounts of computing resources.

"Please help to uncrack [these] hashes," someone with the username dwdm wrote in a June 3 post that contained the 1.5 million hashes. "All passwords are UPPERCASE."

Less than two and a half hours later, someone with the username zyx4cba posted a list that included almost 1.2 million of them, or more than 76 percent of the overall list. Two minutes later, the user LorDHash independently cracked more than 1.22 million of them and reported that about 1.2 million of the passwords were unique. As of Tuesday, following the contributions of several other users, just 98,013 uncracked hashes remained.

While forum members were busy cracking that list, dwdm on Tuesday morning posted the much larger list that Redman and others believe belongs to LinkedIn users. "Guys, need you[r] help again," dwdm wrote. Collective cracking on that list was continuing at the time of this writing Wednesday morning.

By identifying the patterns of passwords in the larger list, Redman said it's clear they were chosen by people who are accustomed to following policies enforced in larger businesses. That is, many of the passwords contained a mix of capital and lower case letters and numbers. That's another reason he suspected early on that the passwords originated on LinkedIn.

"These are business people, so a lot of them are doing it like they would in the business world," he explained. "They didn't have to use uppercase, but they are. A lot of the patterns we're seeing are the more complicated ones. I cracked a 15-character one that was just the top row of the keyboard."

Story updated to add link to Errata Security blog post, and to correct the percentage of passwords Redman has cracked.

Promoted Comments

As the article makes clear, the 6.5 million hashes are likely just those the hackers couldn't crack. What that means is: It means nothing that you don't find your password in the list. Out of an abundance of caution, readers should presume the entire list has been obtained and change their password no matter what.

134 Reader Comments

While the article is interesting, it is missing the most crucial point for readers:
1) Something the average user should be concerned about (i.e. can these be used to access said accounts?)
2) Is there a way to check to see if one's information is on there?
3) What should the average user be doing to react personally to maintain information security.

H2O Rip

Please reread the article, and pay close attention to the following:

The lists of hashes that Ars has seen don't include the corresponding login names, making it impossible for people to use them to gain unauthorized access to a particular user's account. But it's safe to assume that information is available to the dwdm, and it wouldn't be a surprise if it was also available in underground forums. Ars readers should change their passwords for those two sites immediately. If they used the same password on a separate site, it should be changed there, too.

Note, too, that while these lists aren't hard to find online, Ars has decided not to link to them so they don't circulate even further than they already have.

If any of you are stupid enough to use a formulaic password like "EDCrfvTGByhnUJM" or "!qaz@WSX3edc" you deserve to have your fucking password cracked.

You are getting entirely the wrong lesson from this. "Weak" passwords only matter if somebody is bruteforcing/dictionary attack on random logins. This is not the case here. They managed to crack two sites and obtain and decrypt their entire password lists. So nobody is bruteforcing anything. They ALREADY HAVE YOUR PASSWORD. It doesn't matter how random it is.

The important point is to not reuse passwords and/or usernames from one site to another.

Programmers continue to copy paste solutions from existing code whether or not they understand said solutions, because they are almost always under time crunch and their managers have the rush mentality.

I am a senior software architect, I have seen this in every job and at every level.

I don't know how many of these password leaks have to happen before the industry wakes up and realizes passwords are just not working for security. It doesn't help that practically every site you visit wants you to create an account with them, most people are not going to use 100 different passwords, and password vaults are not a valid solution to that problem.

A statistically significant percentage of users regularly pick passcodes that identify the site hosting their account. At least 420 of the passwords in the smaller list contain the strings "eharmony" or "harmony."

Sshh! Aw, man. Now you've told the thieves how we do it! Guess I'll switch back to using a 4-digit PIN.

Since this is likely only a partial list, not finding the password on it is no guarantee that it's not available to other interested parties. So it's probably a good idea to assume your password is out there either way.

I don't know how many of these password leaks have to happen before the industry wakes up and realizes passwords are just not working for security. It doesn't help that practically every site you visit wants you to create an account with them, most people are not going to use 100 different passwords, and password vaults are not a valid solution to that problem.

Until someone publishes a better idea passwords will continue to be used. And anything that ties a single global ID around your neck is not a valid solution.

The important point is to not reuse passwords and/or usernames from one site to another.

They only have the hashes of the passwords. They still have to crack them (with modified dictionary attacks, or brute force) to get your plaintext password.
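As a rough illustration of what a "modified dictionary attack" on unsalted hashes looks like, here is a minimal Python sketch. The word list, the mangling rules, and the function names are assumptions for the example; real cracking tools use far larger dictionaries and rule sets.

```python
import hashlib


def variants(word):
    """Yield a few simple mutations of a dictionary word (illustrative rules only)."""
    yield word
    yield word.capitalize()
    yield word + "1"
    yield word.capitalize() + "1"


def crack_with_wordlist(target_hashes, words):
    """Return {hash: plaintext} for each candidate whose SHA-1 is among the targets."""
    targets = {h.lower() for h in target_hashes}
    found = {}
    for word in words:
        for candidate in variants(word):
            digest = hashlib.sha1(candidate.encode("utf-8")).hexdigest()
            if digest in targets:
                found[digest] = candidate
    return found


# Stand-in for a leaked list: one guessable password, one random-looking one.
leaked = [
    hashlib.sha1(b"Harmony1").hexdigest(),
    hashlib.sha1(b"x7#Qp!9zLw").hexdigest(),
]
cracked = crack_with_wordlist(leaked, ["linkedin", "harmony"])
print(sorted(cracked.values()))  # ['Harmony1']
```

Only the password built from a dictionary word falls to this approach; the random-looking one would need brute force or a much larger rule set.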

Like they did for the guy with "20 plus characters and was random"? Once they have the hash table it's much easier to find out that one specific password to that site. If you didn't reuse that specific one any place else it doesn't matter. Your account on that site is probably already boned anyway.

The published list is (reasonably, imo) presumed in the article to include only hashes the original poster couldn't crack, so the "20 plus characters" guy was still safe at that point. I'd like to know if it was cracked after the list was posted, but no luck on that question so far.

From the Hacker News comments, a script that can tell you whether your password is in the dump and whether or not it was already cracked (the cracked passwords are marked in the file with leading zeros).

My fairly complicated password was in the list and had already been cracked :S

wait, so a hacker news comment tells you to type your password into this to tell you whether or not your password is in the list?

so you're giving them a list of passwords that might not be cracked?

This is a very short python script. You download it to your machine and run it locally against the text file. It's trivially easy to verify that the script is non-malicious -- it just hashes your password then loops through the dump file to see if your hashed password is present.
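For readers who want to see the idea, here is a minimal sketch of such a check in Python. It is not the script from Hacker News. The function names and the in-memory stand-in for the dump are assumptions, and it relies on the commenters' report that cracked entries have their first five hex characters replaced by zeros.

```python
import hashlib


def candidate_hashes(password):
    """Return the full SHA-1 hex digest and the same digest with its first
    five characters zeroed (the reported marker for already-cracked entries)."""
    full = hashlib.sha1(password.encode("utf-8")).hexdigest()
    masked = "00000" + full[5:]
    return full, masked


def check_dump(password, dump_lines):
    """Scan an iterable of hash lines and report whether the password appears."""
    full, masked = candidate_hashes(password)
    for line in dump_lines:
        entry = line.strip().lower()
        if entry == full:
            return "present, not marked as cracked"
        if entry == masked:
            return "present, marked as cracked"
    return "not found"


# Tiny fake dump for demonstration; a real run would pass an open file instead.
fake_dump = [
    candidate_hashes("hunter2")[1] + "\n",        # stored in masked form
    candidate_hashes("correct horse")[0] + "\n",  # stored in full form
]
print(check_dump("hunter2", fake_dump))
print(check_dump("correct horse", fake_dump))
print(check_dump("something else", fake_dump))
```

Everything runs locally and the password never leaves the machine. To check a downloaded file, pass the open file object as dump_lines.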

The smaller of the two lists contains about 1.5 million unsalted MD5 hashes. ... A statistically significant percentage of users regularly pick passcodes that identify the site hosting their account. At least 420 of the passwords in the smaller list contain the strings "eharmony" or "harmony."

All I did on Windows XP (yes, work computer) was compute the SHA1 hash of my password (there are tools online for that with a quick Google search), then copy the hash, and paste it into the quotes in this line on the command line:

# First, collect all the SHA1 checksum suffixes
# This is an optimization for grep, which can use a source file of simple patterns to make one pass over the search.
while read password; do
  echo "$password" | tr -d '\n' | shasum | cut -c 6-40
done < $PSDFILE > $TMPFILE

# Show the user what hash suffixes are applicable
echo '# Searching for...'
cat $TMPFILE

# Run the search...
zgrep -f $TMPFILE $SRCFILE
# If this prints anything, your password has probably been stolen.

The only viable solution I can think of is to go to something like SecurID or Battle.net's mobile authenticator for every site -- but that's not dramatically different from using a password program. And the reality is that nobody's going to agree on a single standard so you'll be carrying a dozen fobs or mobile programs around with you.

We've had the cryptographic primitives for proper, cryptographically secure logins for a couple of decades now. Why it's almost completely unused is a mystery to me. It seems like a no brainer to implement public key authentication into web browsers, and I don't think it would be very difficult to do so in a friendly and easy to use manner. Because the key would have a passphrase as well, it would even be reasonably secure to sync this with e.g. your Google account so all your browsers pull in your auth credentials automagically, without giving Google your password or access to your other accounts. Really there's not even a reason to sync, once the key is installed it will work for any future services you give your public key to without any update.

Then a user's key and passphrase (assuming the cipher remains unbroken) both need to be compromised. The user only needs to remember one set of credentials, and a keylogging, or even phishing attack isn't sufficient to compromise their accounts.

So yeah. We've had this for a loooong time with ssh, and even in SSL (though the whole PKI / certificate registry thing is a big barrier to adoption of that, makes more sense to just do individual trust relationships). So why aren't we using it?!

While the article is interesting, it is missing the most crucial point for readers:
1) Something the average user should be concerned about (i.e. can these be used to access said accounts?)
2) Is there a way to check to see if one's information is on there?
3) What should the average user be doing to react personally to maintain information security.

1) Yes, even as a rainbow table this could be added to brute-force lists.
2) No.
3) Change your passwords to something else if you use eHarmony and/or LinkedIn. These websites have just failed you as a user.

Personally, using one password is as silly as keeping 32,768 complex ones for opposite reasons: on one hand your data is extremely vulnerable, on the other, you'll never be able to sign into your accounts again. (Even using your email to generate a recovery password: if they ask challenge questions, do you really remember all the answers you've typed?)

Much like spam e-mail, having a strategy is a good idea. Separate the sites you visit into groups: at a baseline, three categories work: ones of critical importance, ones of less importance, and ones of least.

(Feel free to shuffle things around: if you're a contestant in Major League Gaming on a regular basis, obviously game passwords will have a higher priority with you than they do for me.)

Then set ONE password for each kind of category. The highest category is the most complex password you can remember, the middle category is the good-ole alphanumeric mix, and the bottom one is something that makes you laugh (e.g. a past password of mine was from a funny moment from Superman: "Otisberg?").

That way, when a blog loses your password info, your banking and personal passwords are unaffected. If your ex-wife decides to steal your Facebook password, she's not able to gain access to your bank accounts. And if someone does manage to steal your bank password, just change the password for that category of sites you visit, instead of ALL of them. And even if no one does steal your password, change them yearly or every 6 months at the LEAST. (Minor variations are cool for less important things: instead of "OklahomaJoe", make it "OKJoe" on blogs.)

You can't remember thousands of passwords. But you can remember three.

And hopefully three is all you should need: for those sites that have asinine rules (You know the type of site: Must be over 12 characters, have one lowercase character, one uppercase character, one number, one symbol, and cannot start with a letter, and cannot end with a symbol or contain any prime, Mersenne numbers, or any number that's expressible as the sum of two squares): write that crap down, and lock it in a safe.

Makes me wonder if my password was one of those as-yet-uncracked 90k passwords. I had a 28 character random (uppercase, lowercase, numbers, symbols) password. Off to download the hash and try that python script on it. For the benefit of other ars users, here's the script (re-posted from Hacker News, originally written by dpritchett):

I've told people for years LinkedIn was a security nightmare waiting to happen. Of course, I expected it to be a major privacy breach, not a password breach, but either way, I just called in my bet, and begrudgingly a colleague is sending me a check.

8m passwords leaked? Man, that's going to be a hell of a cost in identity theft monitoring... I've had 2 companies have to pay for breaches of my data before, I know it's not cheap. Several dollars per user per month, for 1-3 years depending on local laws, this could easily be a $100m cost to LinkedIn. Ouch!

Also goes to prove that no matter if the pass is stored plain or as a hash, it's still crackable, so never use the same password on more than 1 site or system.

As the article makes clear, the 6.5 million hashes are likely just those the hackers couldn't crack. The take-away from this is: It means nothing that you don't find your password in the list. Out of an abundance of caution, readers should presume the entire list has been obtained and change their password no matter what.

I don't know about anyone else but I don't find it difficult to memorize randomized passwords. The initial effort is larger but it pays off.

Just get a random password generator, run it, write down the result, and use it when you enter the password the first few times, until you memorized it. Once you do, shred or burn the paper (or whatever you want) and it will be stuck in your mind.

I have four that I used that method for, right now (three 8 character and one 12 character), and I still remember my 22 character password from school that was a random anagram.

I'm sure some people have a harder time doing this but I expect that most people can handle such a task.
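The "random password generator" step described above can be sketched in a few lines of Python. This is only one possible approach; the function name, default length, and character set are arbitrary choices for illustration. It uses the standard library's secrets module, which is intended for security-sensitive randomness.

```python
import secrets
import string

# Character pool: letters, digits, and punctuation (an arbitrary choice).
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def random_password(length=16, alphabet=ALPHABET):
    """Build a password by drawing each character independently at random."""
    return "".join(secrets.choice(alphabet) for _ in range(length))


print(random_password(12))
```

Each run prints a different 12-character string. Some sites reject certain symbols, so the alphabet may need trimming.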

While the article is interesting, it is missing the most crucial point for readers:
1) Something the average user should be concerned about (i.e. can these be used to access said accounts?)
2) Is there a way to check to see if one's information is on there?
3) What should the average user be doing to react personally to maintain information security.

H2O Rip

Please reread the article, and pay close attention to the following:

The lists of hashes that Ars has seen don't include the corresponding login names, making it impossible for people to use them to gain unauthorized access to a particular user's account. But it's safe to assume that information is available to the dwdm, and it wouldn't be a surprise if it was also available in underground forums. Ars readers should change their passwords for those two sites immediately. If they used the same password on a separate site, it should be changed there, too.

Note, too, that while these lists aren't hard to find online, Ars has decided not to link to them so they don't circulate even further than they already have.

Apparently my reading comprehension was not up to par this morning as I glanced through it. I would consider that to be the most important part of the article though, might be worth pulling it out from the middle instead of sitting under twitter quotes, but thank you for pointing it out.

I already use 2 step auth where available, have changed my linkedin, and try to not re-use un / email / pw combos already (though, with so many accounts everywhere it's really difficult to avoid that entirely). Shame the MD5s weren't salted either :-/

@uninventiveheart I do a variant of the categorical thing, lower importance categories (esp stuff that doesn't have any personal info attached) may be blocked into one category. I'd suggest with the critical stuff though keeping them unique (And 2 step when possible). Just in case there's a breach that isn't known, the hackers wouldn't have access to all critical stuff.

Also, someone posted something about using unique email addresses with gmail on another ars article (using the email+whatever @ gmail) - a practice I think I'll probably have to get into too.

*Edit* I see you put your comment in the bottom of the article - that works, thanks! =]

No problem... I don't anticipate most users in the mainstream crowd will want to use 2-step authorization, since getting out of it when you change phones or forget the main password is usually a multiple day process in order to get your account back. Just thought I'd share advice for those who rip out hair on each news release of a company losing control of their password file... and it seems to be a lot of them lately.

If 2-step authentication doesn't work or isn't available, I like using Roman Rooms (http://www.academictips.org/memory/romanrom.html), using data about three or four objects as elements of a password that can be switched around, but again, too difficult for mainstream users to adopt.

No, but they only contain a subset of all possible passwords, usually limited by length. Which means that yes, weaker passwords are more likely to be cracked first because they will appear in the usual tables.

The ycombinator site comments seem to imply that this is a complete leak of the SHA1 password database and that the duplicates were simply consolidated. They also mention that the ones that were cracked have the first five characters set to 0.

My password was on that list, and it was cracked. I've changed it now but I frankly don't remember where else I might have used it. All I know now is that the password basically can't be ever used again.

$ ./test-stupid.sh < /usr/share/dict/words
00000bca9701606b01b6245d587d26c31b63a433:::: found aardvark
000006b960572398e02f82878e2dfeadb4518899:::: found aardwolf
00000c1e41f74b4e4a5950a0dda602fda275e4a1:::: found abacate
0000058b1c71d517644ff6a4ed5e5421b83c4fca:::: found abacinate
00000267f9f1e4469f8eb7bf45704218293412db:::: found abacus
00000604ba82485d494fbc5fd8365509f36ee259:::: found abalone
0000059e3099495023c7f4c15223e146e3fb6fdd:::: found abandon
00000d0fec22d3282d0e70911e563402b8429cfc:::: found abandoned
00000906d39b74998716738fbb2b6fa3620079f2:::: found abased
0000059e4c7fcba827f22a25fe506baa6d011737:::: found abattoir
000006e2be4ada6c7ce5b76554311a3330855949:::: found abbacy
000001e35b00e6675efeef5d813dbf1ce62300cd:::: found abbasi
0000021312a4ec34d96bce4eca98a879c684878a:::: found abbess

...which lends a little weight to the theory that the file primarily contains hashes which some script kiddie could not crack with basic tools, and hence makes us wonder what he's done with all the ones which he did crack - and how much of the LinkedIn corpus that would represent?

This is precisely why I use LastPass and use unique passwords for every site.

Is LastPass really reliably secure? Maybe it's just paranoia but my first reflex is that that's putting all your eggs in one basket

The problem with the "eggs in one basket" situation that keeps coming up is that it doesn't quite work with password managers. LastPass is essentially a central server for something like 1Password, they keep a copy of your encrypted file full of passwords and make it accessible. They also create a copy of your database on any local machine (it still works if their server is down) and you can rip stuff out of there. But your eggs, or passwords, can be scattered everywhere. Like all good password managers, you should assume that someone can get to your encrypted file and make the encryption strong, then make that file readily accessible wherever you might need it. That's why your password is important. Even if LastPass explodes tomorrow, your local machine still has the database and you still have the password.

There's a little more inherent trust with LastPass, especially since it apparently stores info in memory for a time (it'll keep you logged in without having to reenter your password) plus presumably you're sending your password to them to decrypt the file (they promise they don't keep your password anywhere, because they don't need to).

...You can't remember thousands of passwords. But you can remember three.

And hopefully three is all you should need: for those sites that have asinine rules (You know the type of site: Must be over 12 characters, have one lowercase character, one uppercase character, one number, one symbol, and cannot start with a letter, and cannot end with a symbol or contain any prime, Mersenne numbers, or any number that's expressible as the sum of two squares): write that crap down, and lock it in a safe.

Good ideas. One more to add: salt each password. That way you have three unique bases and if one of the sites gets hacked that specific password is useless for any other.

My password is NOT on the list, with or without the 5 zeros changed on the beginning. I had a much, much simpler password up until about 6 months ago that is on the list, and has already been cracked.

Considering how stupid complex my password is, I'm fairly confident in saying that they haven't omitted it from the list due to it being already cracked. (Not that I care, I will change it; it's just a randomly generated one out of KeePass.) Just figured I'd share.

EDIT: From comments on the ycombinator thread, it sounds like the leaked passwords are somewhere over a year old, if not older. So if you've changed your password in that time frame, you're probably good.

While the article is interesting, it is missing the most crucial point for readers:
1) Something the average user should be concerned about (i.e. can these be used to access said accounts?)
2) Is there a way to check to see if one's information is on there?
3) What should the average user be doing to react personally to maintain information security.

1) Yes, even as a rainbow table this could be added to brute-force lists.
2) No.
3) Change your passwords to something else if you use eHarmony and/or LinkedIn. These websites have just failed you as a user.

Personally, using one password is as silly as keeping 32,768 complex ones for opposite reasons: on one hand your data is extremely vulnerable, on the other, you'll never be able to sign into your accounts again. (Even using your email to generate a recovery password: if they ask challenge questions, do you really remember all the answers you've typed?)

Much like spam e-mail, having a strategy is a good idea. Separate the sites you visit into groups: at a baseline, three categories work: ones of critical importance, ones of less importance, and ones of least.

(Feel free to shuffle things around: if you're a contestant in Major League Gaming on a regular basis, obviously game passwords will have a higher priority with you than they do for me.)

Then set ONE password for each kind of category. The highest category is the most complex password you can remember, the middle category is the good-ole alphanumeric mix, and the bottom one is something that makes you laugh (e.g. a past password of mine was from a funny moment from Superman: "Otisberg?").

That way, when a blog loses your password info, your banking and personal passwords are unaffected. If your ex-wife decides to steal your Facebook password, she's not able to gain access to your bank accounts. And if someone does manage to steal your bank password, just change the password for that category of sites you visit, instead of ALL of them. And even if no one does steal your password, change them yearly or every 6 months at the LEAST. (Minor variations are cool for less important things: instead of "OklahomaJoe", make it "OKJoe" on blogs.)

You can't remember thousands of passwords. But you can remember three.

And hopefully three is all you should need: for those sites that have asinine rules (You know the type of site: Must be over 12 characters, have one lowercase character, one uppercase character, one number, one symbol, and cannot start with a letter, and cannot end with a symbol or contain any prime, Mersenne numbers, or any number that's expressible as the sum of two squares): write that crap down, and lock it in a safe.

That's the approach I've personally taken with respect to my passwords. I have 6 categories:

If any of you are stupid enough to use a formulaic password like "EDCrfvTGByhnUJM" or "!qaz@WSX3edc" you deserve to have your fucking password cracked.

You are getting entirely the wrong lesson from this. "Weak" passwords only matter if somebody is bruteforcing/dictionary attack on random logins. This is not the case here. They managed to crack two sites and obtain and decrypt their entire password lists. So nobody is bruteforcing anything. They ALREADY HAVE YOUR PASSWORD. It doesn't matter how random it is.

The important point is to not reuse passwords and/or usernames from one site to another.

They only have the hashes of the passwords. They still have to crack them (with modified dictionary attacks, or brute force) to get your plaintext password.

Since I had little personal info at stake, I settled for using (only) a 7-character, non-dictionary, numbers&miXedCase pw. No two character sequence would've been in any dictionary.

It was on leakedin.org's cracked list, possibly (per the article's suppositions) since it needed a bit of force. The password that I replaced it with, one character stronger, was not, supporting the belief that the site is legit. Doesn't matter too much, as I'd canceled my linkedin account right after changing it.

Calls into question the whole paradigm of even site-specific, quasi-random passwords for any level of security much greater than your car lock.