Hacked user databases from 500px, Dubsmash, and other sites for sale on the dark web

The Register is reporting that sellers on the dark web are offering user account databases containing information on more than 617 million accounts from websites like Dubsmash, 500px, MyFitnessPal, MyHeritage, and others.

This kind of thing is so common now that the odds are extremely good that anyone who's been using the web for more than a few years has had their user account data compromised. If you're curious whether your data has ever been compromised, Have I Been Pwned may be able to tell you.

This is why it's more important than ever to use strong and, most importantly, unique passwords for every website. I can't stress enough how important it is not to reuse passwords! Some of the websites you use will get hacked, and if you use the same password on multiple sites, that could allow hackers (or anyone who obtains the hacked data) to gain access to your other accounts.

My 500px account was among the accounts compromised in this breach, but I'm not terribly worried because I know that my 500px password was only used on 500px. There's no chance anyone can use it to gain access to my other accounts. I use 1Password to generate strong unique passwords for every service I use (and to remember them for me!). If you don't already use a password manager, I highly recommend checking it out.

As a programmer, one of the things that really annoys me about this latest data dump is that almost every single website involved used weak, insecure hashing strategies to store user passwords.

When storing passwords, it's important to assume that you will one day be hacked, and that the hackers will gain access to your password database. So the best practice is to only store passwords in a form that makes them impossible to use.

To do this, you combine the password with a randomly generated value called a salt (like sprinkling some salt on it, get it?). Then you run the password and salt through a one-way hashing algorithm, which is an algorithm that converts data into other data in such a way that it's practically impossible to convert it back to the original data.

For example, if my password is correct horse battery staple, then a hashed and salted version using the widely recommended bcrypt password hashing algorithm would look like this:

$2a$16$uaq9vdX2puoGl8ZZzLU2EuXtD3GNasBDcYX54nF9F4AAr/5gDfseG

It's mathematically impossible to reverse (or "decrypt") that hash because the hash value doesn't actually contain the original password; it was only derived from the original password. So the result is that even if someone has your password database, they can't see what the actual passwords were before they were hashed, and they can't reverse the hashes, which means they can't use the passwords to log into the hacked website or any other website.
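Here's a minimal sketch of the salt-then-hash idea described above. Note that it does not use bcrypt: to stay within Python's standard library it swaps in PBKDF2-HMAC-SHA256, another deliberately slow password hashing function. The function names and the iteration count are assumptions made for illustration, not anything taken from the sites involved.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

# Assumed work factor for illustration; tune it for your own hardware.
ITERATIONS = 600_000


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for storage. The password itself is never stored."""
    salt = secrets.token_bytes(16)  # fresh random salt for every password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Re-derive the digest from the login attempt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, digest)
```

Logging in never requires reversing anything: the site hashes whatever the user typed with the stored salt and checks whether the result matches. Because each password gets its own salt, two users with the same password end up with different stored values.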

The scary thing is that it seems like almost every website does this wrong or not at all. 😞

Many of the websites involved in this hack used a weak hashing algorithm like MD5, or they failed to include a random salt in the hash. Either mistake can make it feasible for hackers to brute force the hashed passwords: they simply use a fast computer (or fleet of fast computers) to hash one candidate password after another until they find one that matches the stored hash, and then they know your password. With a weak (meaning fast) hash, this computation is much easier. And without a salt, hackers can precompute large tables of hashes that they can then use to rapidly look up the original passwords.
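To make the difference concrete, here's a rough sketch of both attacks against MD5. The five-word list is a made-up stand-in for the enormous password lists real attackers use, and all function names are hypothetical.

```python
import hashlib
from typing import Optional

# Hypothetical, tiny wordlist; real lists contain billions of entries.
WORDLIST = ["123456", "password", "qwerty", "letmein", "dragon"]


def md5_hex(data: str) -> str:
    return hashlib.md5(data.encode("utf-8")).hexdigest()


# Precomputed once, then reusable against every unsalted MD5 database.
LOOKUP = {md5_hex(word): word for word in WORDLIST}


def crack_unsalted(stored_hash: str) -> Optional[str]:
    """A single dictionary lookup per leaked hash."""
    return LOOKUP.get(stored_hash)


def crack_salted(stored_hash: str, salt: str) -> Optional[str]:
    """The table is useless here: every guess must be rehashed for each account."""
    for word in WORDLIST:
        if md5_hex(salt + word) == stored_hash:
            return word
    return None
```

The salt only removes the precomputation shortcut. Since MD5 is so fast, the per-account loop in `crack_salted` is still cheap, which is why a salt needs to be paired with a deliberately slow algorithm like bcrypt.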

There's no reason, in 2019, that websites should still be making these same careless mistakes. It's basically programmer malpractice, and it makes me so angry. 😡

I got an email today from 500px asking me to update my password. Whenever I get an email like that, I usually do a quick search to make sure it isn't another phishing scam before pressing any buttons in the email. This led me to an article that @yaypie posted with even scarier information about the breadth of the breach.

As a 500px user it makes me angry that my personal information is out there to be sold to anyone anonymously with some Bitcoins:

14,870,304 accounts for 0.217 BTC ($780) total

1.5GB of data taken July 2018. Each account record contains the username, email address, MD5-, SHA512- or bcrypt-hashed password, hash salt, first and last name, and if provided, birthday, gender, and city and country. 500px is a social-networking site for photographers and folks interested in photography.

How can I as a customer defend myself against company-wide attacks? I can only choose my password and provide incomplete information about myself, which is important for a social network of photographers. For many of us our real name is our brand.

Yes, indeed. Security best practices are there for a reason, and they are well known and well understood. Not using them should probably be grounds for a malpractice lawsuit.

With that in mind, imagine my feelings when after forgetting the password to my (state owned) electricity co and clicking 'forgot my password' link, I got my original password, in cleartext, in my mail. :-)

Whenever you get a mail of that sort, don't click on it. Instead, good security practice is to go to the website in question directly, by manually typing the address in the browser, and check it out there. If it's legit, you'll see it on their site.

Thanks for bringing this up, @yaypie. Password hashing and salting a hash must be arcane concepts to non-programmers ("First you're not even storing my password, and then you're adding random garbage to what you store - how can this even work?"), but this is very important to every single one of us who has at least some important data secured by one of these things. :)

A quick back-of-the-envelope calculation shows that only a quarter of passwords in this hack were stored in a secure way, with the security of the remaining three quarters being either dubious (MD5+salt?; same salt for the whole table?) or outright crazy. This just isn't enough!

My last encounter with someone not quite understanding the value of hashing and salting correctly is much less spectacular than this, but still shows the general problem:

Last year, I participated in a prize competition where I had to enter my mail address. After doing that, I received a mail stating that I needed to confirm my address for GDPR reasons. This mail contained a link for me to click on, which was built like this:

<domain>/confirm/?mail=factotum@example.com

The URL contained my mail address as plain text! I would have been able to "confirm" any random address I added to their database, even without having access to that mail address. Using the right addresses (some privacy lawyers come to mind), this could have caused them a good amount of trouble.

I contacted them and explained the issue - and they replied, thanking me for bringing this up and mentioning that someone "is working on solving the problem (MD5 hash)". This mention of a hashing algorithm was oddly specific, so I tried participating again from a different address. In fact, the link I received this time was:

<domain>/confirm/?hash=cd13b6a6af66fb774faa589a9d18f906

Using an MD5 hash generator, I quickly confirmed that the hash used was created from the mail address string without any salting. They changed the process slightly, but didn't make it any more secure than it was before. Obviously, they didn't have any real understanding of what hashing (and salting) really is.

(Try it out yourself: how long do you need to find out what this hash stands for?)
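For comparison, here's a small sketch of why the MD5 link is no better than the plain-text one, next to one common alternative: a random, single-use token that the server stores alongside the address. The in-memory `pending` dictionary and all function names are hypothetical; a real site would keep this in its database and probably add an expiry time.

```python
import hashlib
import secrets
from typing import Dict, Optional


def guessable_token(email: str) -> str:
    """Unsalted MD5 of the address: anyone who knows the address can compute it."""
    return hashlib.md5(email.encode("utf-8")).hexdigest()


# Hypothetical in-memory store standing in for a database table.
pending: Dict[str, str] = {}


def issue_token(email: str) -> str:
    """Create a random token that cannot be derived from the address."""
    token = secrets.token_urlsafe(32)
    pending[token] = email
    return token


def confirm(token: str) -> Optional[str]:
    """Return the confirmed address, or None for unknown or already-used tokens."""
    return pending.pop(token, None)
```

With `issue_token`, the only way to learn the token is to read the mail it was sent in, which is exactly what a confirmation link is supposed to prove.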