Disclaimer: I'm a computer programmer, not a security analyst or anything to do with security. I have zero experience in the world of cryptography, so bear with me please.

Situation: I was given the task to integrate a client's site with a data hosting site. While working on that, I stumbled upon a query that dumps all users' data. To make this query, the user has to 'authenticate' to get a session and use that session code to make this query, which then checks that the user has admin privileges before completing the query and responding.

But still... that seems sort of horrible to me. Especially since this is some of the data that is sent back if the query is successfully run:

There is more data returned that is horrible to expose, but that's not my primary concern.

Clearly no person has ever been named "Test Name", and "testemail.blerg" is not a registered domain, let alone "blerg" being a possible top-level domain. Even if it were, that account has been deleted from the data hosting site and cannot log in. The password that was used is a weak, test-case one, and is not in use by anyone.

Is it possible for me to brute-force/rainbow-table (while I don't have experience, I know some flash-words :P) or something to get the password from that? What little (I think) I know is that the first part of the userpassword is the MD5 salt, but I don't know anything else.

If anyone can explain how easy it is, I can prove to my boss that this site is completely horrible and convince our client to migrate from this data hosting site.

There is more that I do know about the salting (i.e. how it gets the salt, what language/function it's using, etc), but I'd like to see how easily someone who doesn't have access to that information can figure it out. Another thing for me to go to my boss with, hopefully.

EDIT: There seems to be a little confusion related to my intentionally being vague in the description, partially for the purpose of preventing any indication of which CRM this is, for legal/etc reasons. I also realize that my calling it 'data hosting' was potentially misleading, so my bad. Hopefully this clarifies:

The work our team is doing is creating a basic website for a company to show off their products. The only interaction I have with the CRM is:

1. When a person fills out the Contact Us form, we POST to the CRM a Contacts object with the user's information.

2. Do a GET on a list of Dealers that sell their products to display.

I started with #2, the GET API call, where I found out that I can query the Users table. I have not created the API, I have only been making requests of it.

The GET call requires a param query= where the value is a SELECT statement in the system's query language, which is then translated to SQL (presumably preventing SQLI attacks, but I don't know how it interprets/translates to SQL, so I'm not touching that with a 10ft pole). By changing SELECT * FROM Dealers; to SELECT * FROM Users; in the query I was able to see every user's data.
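To make the shape of that request concrete, here is a minimal Python sketch. Only the `query=` parameter comes from the description above; the host, the path, and the `session` parameter name are placeholders invented for the example, and nothing is actually sent over the network.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint; the real CRM's URL is intentionally not given.
BASE_URL = "https://crm.example.invalid/api/query"

def build_query_url(statement: str, session_token: str) -> str:
    """Return a GET URL carrying the statement as a query-string parameter."""
    # urlencode escapes spaces, '*' and ';' so the statement survives transport.
    # "session" is an assumed parameter name, not the CRM's real one.
    params = urlencode({"query": statement, "session": session_token})
    return f"{BASE_URL}?{params}"

dealers_url = build_query_url("SELECT * FROM Dealers;", "SESSION")
users_url = build_query_url("SELECT * FROM Users;", "SESSION")

# The only difference between the two requests is the table name.
print(parse_qs(urlsplit(dealers_url).query)["query"][0])
print(parse_qs(urlsplit(users_url).query)["query"][0])
```

The point is only that nothing on the client side distinguishes the intended request from the unintended one; any restriction has to be enforced by the server.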

The way the CRM handles users is with a portal on their site. A user is created in the portal, where there is an "Is Admin" checkbox. This can be edited at any time through the portal. This is the process to making API requests:

1. A user makes a request to the CRM, containing the username, for a token.

2. The token is concatenated with the "secret" access key for that user, the resulting string is MD5 hashed and then sent back to request a session token.

3. This session token is included with every request, as a querystring param, to "verify" that the request is "authorized".
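The hashing step in that process can be sketched in Python as follows. This is an illustration under assumptions, not the CRM's actual code: the concatenation order (token first), the UTF-8 encoding, and the hex output are guesses, and the values are made up.

```python
import hashlib

def session_request_digest(token: str, access_key: str) -> str:
    """MD5 of the token concatenated with the user's access key, as hex."""
    # Assumptions: plain string concatenation, token first, UTF-8, hex digest.
    return hashlib.md5((token + access_key).encode("utf-8")).hexdigest()

# Made-up values for illustration only.
digest = session_request_digest("abc123", "not-a-real-access-key")
print(digest)
```

The access key is the only secret input here, so anyone who learns a user's access key can compute the same digest for any token the server hands out.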

One of the problems is that if an 'admin' user makes a request of the Users table, the response is a list of every user with the information I listed above as well as the user's "secret" access token (and other information). So it's even worse than just exposing a password, it's practically granting access to anyone by impersonating anyone.

Also, this would make me ask "is that application authenticating against system user accounts?", which could be a major WTF if done with the slightest lack of forethought.
– rackandboneman Jun 2 '17 at 21:53

Quick check: you say this information is only available to users who already have admin privileges? If so, the first thing I'd check is how much do you trust people with "admin privileges." There are places where "admin privileges" means "you can run things as root," in which case an exploit like this means nothing at all. On the other hand, if "admin privileges" means "the employee isn't currently incarcerated and part of a work-study program," then you might care more.
– Cort Ammon Jun 3 '17 at 5:10

Given what you've found so far I'd expect a lot of other terrible flaws in the site security too. For example, how well-secured are the admin accounts? Are there many such accounts, all able to access all the user profiles? It looks like the admins don't have a separate login flow - just a flag "this account is an admin" - and sometimes those flags can be overwritten by an attacker.
– CBHacking Jun 3 '17 at 10:46

Can you clarify who does the hashing here, your application or the data hosting service? Is this user database used to login to the data hosting service itself or to your application? Without more information, I'm more inclined to think from your description that this is most likely your own application misusing the data hosting service rather than any fault of the data hosting service.
– Lie Ryan Jun 4 '17 at 2:27

@rackandboneman the fact that it uses crypt() doesn't necessarily mean that it's using system accounts; for example in PHP crypt() is a core function (and defaults to $1$ MD5 hashes)
– Levi Jun 5 '17 at 7:24

5 Answers
The password format in the userpassword attribute looks like the standard format used by various unix services, such as the default system password service which stores hashed passwords in /etc/shadow. The format is basically:

$ type $ salt $ hash

In your example, the type is 1, which designates an md5 hash. There are other well-known types, such as various sizes of the sha-family hash functions.
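As a small illustration of that layout, the following Python sketch splits a `$type$salt$hash` string into its fields. The sample value is a placeholder rather than a real hash, the scheme table lists only a few commonly documented identifiers, and the parser deliberately ignores variants that carry an extra field (such as an optional rounds parameter).

```python
from typing import NamedTuple

class CryptHash(NamedTuple):
    scheme: str
    salt: str
    digest: str

# A few commonly documented identifiers in the crypt() modular format.
SCHEMES = {"1": "md5-crypt", "5": "sha256-crypt", "6": "sha512-crypt"}

def parse_crypt_hash(value: str) -> CryptHash:
    """Split '$type$salt$hash' into named fields."""
    parts = value.split("$")
    # A well-formed "$type$salt$hash" splits into ['', type, salt, hash].
    if len(parts) != 4 or parts[0] != "":
        raise ValueError("not in $type$salt$hash form")
    return CryptHash(SCHEMES.get(parts[1], "unknown"), parts[2], parts[3])

# Placeholder string, not a real password hash.
parsed = parse_crypt_hash("$1$examplesalt$placeholderdigest")
print(parsed)
```

Because the salt is stored in the clear next to the digest, it is not a secret; it only ensures that identical passwords do not produce identical stored values.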

Breaking an md5 hash is almost trivial today, even when it's salted, because md5 is so fast. It's not a hash function which is safe to use for hashing passwords; hashcat and other password crackers can literally hash millions or even billions of password candidates per second.

So this data store service isn't just horrible because it sends user password hashes back, it's even more horrible because it actually uses md5 hashes to store passwords.

To prove to your boss that this really isn't secure, you could download hashcat and a few password lists (you can easily find them online), and then run hashcat on a computer with a very powerful GPU on all the passwords you get from the service. Make a note of how many passwords hashcat can crack (without looking at the passwords themselves - they may reveal private information about the people who chose them, and you don't want to actually know their passwords), and inform the people whose passwords were compromised that they need to choose a new password. You'll probably need to get permission to do all this first, because depending on where you live and work, this might actually be considered a malicious attack, or even be illegal.

Aside: If you were stuck on using this service even after voicing your misgivings, I'd suggest that setting up an automated password cracker that runs without human intervention and informs people (by sending them an e-mail) when it manages to crack their password would be a way to protect your users from this kind of bad design. It would help to weed out weak passwords and increase the amount of time needed by a real attacker to crack passwords.

Edit: Zach pointed out that the aside isn't common practice and shouldn't be attempted by someone not able to assess the risks. I fully agree with that. At the very least, if you did something like that, you should have management okay it.

Still, NOT doing this doesn't make the resulting system more secure. It's the equivalent of sticking your head in the sand out of fear that if you actually did something to improve password security, it might backfire on you. We know that people choose bad passwords, and we know that md5 is not a good way to hash passwords. If we can't change these two facts, and we are in a position to weed out weak passwords, then we should probably do it to make our users safer.

One important reason this isn't seen much, or even considered much, is that we usually aren't in a position to implement it. Good systems don't use fast hash functions to hash passwords, and we usually don't get access to the whole hashed password database.

Is that automated self cracking idea used? It seems like a host wouldn't typically want to put in a significant fraction of the expected effort of an attacker, and if the fact of the self cracking was revealed it means that effort is given to the attacker.
– user123931 Jun 2 '17 at 16:56

Do you help the attacker with the self-cracking idea? Not really, because you don't keep the cracked passwords around for someone to find - the aim is to find weak passwords, not to keep them on record. So if the attacker isn't already on your cracking system (which doesn't need to be accessible from the internet), you should be fine. Is this idea used in practice? I don't know. I'd say it doesn't make sense with professional setups that use password hash functions such as bcrypt, because with them it would be a waste of time. But if I was forced to use weak hash functions, I'd implement it.
– Pascal Jun 2 '17 at 18:40

@Pascal, I find the Aside in this answer concerning. This is simply not a common practice when integrating with a less-secure system. Since you are advising someone who is admittedly new to security, there is a significant risk that they will not have the domain knowledge to determine if this is an appropriate approach for them to take or not. Choosing wrong could be damaging to them (professional embarrassment, violation of the other party's TOS or trust, etc.). If I'm wrong and this practice is used in the wild, I would absolutely love it if you could provide a source.
– Zach Jun 2 '17 at 21:43

The 'aside' just seems extremely likely to get somebody fired, or at the very least panic a bunch of your users (perhaps deservedly, but that's still not going to end well for most people).
– Chris Hayes Jun 3 '17 at 0:00

This password hash seems to use the crypt() format (which, despite its name and what some documentation says, including that very man page, has absolutely nothing to do with encryption; it is hashing). When it starts with $1$, this means that it is a password hashing function based on MD5. Its exact specification is "whatever glibc does". A look at the source code shows some enlightening passages:

    /* The original implementation now does something weird: for every 1
       bit in the key the first 0 is added to the buffer, for every 0
       bit the first character of the key. This does not seem to be
       what was intended but we have to follow this to be compatible. */
    for (cnt = key_len; cnt > 0; cnt >>= 1)
      md5_process_bytes ((cnt & 1) != 0
                         ? (const void *) alt_result : (const void *) key, 1,
                         &ctx, nss_ctx);

    /* Create intermediate result. */
    md5_finish_ctx (&ctx, nss_ctx, alt_result);

    /* Now comes another weirdness. In fear of password crackers here
       comes a quite long loop which just processes the output of the
       previous round again. We cannot ignore this here. */
    for (cnt = 0; cnt < 1000; ++cnt)
      {

From this, we can conclude that this hashing function is not very well documented.

Anyway, as password hashing functions go, it's not very good. It uses a salt, and that's good, because it should prevent use by attackers of precomputed tables (including rainbow tables). It also includes a loop with many (1000) iterations, as an attempt to make password hashing slower, thereby making it harder for attackers to try many potential passwords until a match is found (a process known as "brute force" or "dictionary attack").
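The two ideas in that paragraph, a per-user salt and a loop of repeated hashing, can be shown with a deliberately simplified Python sketch. To be clear, `toy_salted_hash` is a made-up function for illustration and is not the real `$1$` algorithm, whose mixing steps are far more convoluted; the salts and password are invented too.

```python
import hashlib

def toy_salted_hash(password: str, salt: str, rounds: int = 1000) -> str:
    """Simplified salted, iterated MD5. NOT the real $1$ md5-crypt algorithm."""
    digest = hashlib.md5((salt + password).encode("utf-8")).digest()
    for _ in range(rounds):
        # Feed the previous output back in, so each guess costs `rounds` extra MD5 calls.
        digest = hashlib.md5(digest + password.encode("utf-8")).digest()
    return digest.hex()

a = toy_salted_hash("password123", "saltAAAA")
b = toy_salted_hash("password123", "saltBBBB")
print(a != b)  # same password, different salts: compare the stored values
```

A table precomputed for one salt is useless against a hash stored with another, which is what defeats rainbow tables. The loop multiplies the cost of each guess for the attacker and the defender alike.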

Unfortunately, it's not very good either. MD5 is fast, and a thousand nested MD5, while still 1000x slower, is still fast. An off-the-shelf PC with a basic gaming-oriented GPU can compute millions of such hashes per second; an average user password won't last long; and by that, I mean that an attacker spending one minute of computation per hashed password will still crack half of them.
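A back-of-the-envelope calculation shows what "millions per second" means in practice. The rate below (5 million guesses per second) is an assumed figure chosen for illustration, not a benchmark of any particular GPU.

```python
def seconds_to_exhaust(alphabet_size: int, length: int, guesses_per_second: float) -> float:
    """Worst-case time to try every password of exactly `length` characters."""
    return alphabet_size ** length / guesses_per_second

# Assumed rate for illustration: 5 million md5-crypt guesses per second.
RATE = 5_000_000

# Wordlist attack: how many candidates fit in one minute at that rate.
candidates_per_minute = RATE * 60

# Brute force over every 8-character password made of lowercase letters.
hours = seconds_to_exhaust(26, 8, RATE) / 3600

print(candidates_per_minute)
print(round(hours, 1))
```

At that assumed rate, one minute covers 300 million candidates, far more than the common wordlists contain, and the full space of 8-letter lowercase passwords falls in roughly half a day.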

Apart from the hashed password, simply revealing users' email addresses is already a pretty damnable offence -- I mean legally speaking. For instance, in Canada, this is known as "personal information" and all users would be entitled to sue you.

I am in Canada, so that is concerning. I didn't write the API, it's the data hosting service's, so I don't think I would be sued, though (thank god). The API is behind some authentication, though, so wouldn't that mean that this data hosting service has done 'due diligence'? I mean, you have to have the access key of an admin and a session to get this information... but still... ugh...
– Daevin Jun 2 '17 at 16:48

You're talking about "security in depth", an important pillar of secure systems, and you're right to feel a bit queasy: Once someone can get an admin session, he gets huge amounts of information about every user, no matter how worthy of protection this information is. I can see a few reasons to give an authenticated admin access to email addresses, but there is no reason at all to give him the hashed password, that's just stupid. So if the service followed security in depth design principles, the hashed passwords wouldn't be exposed.
– Pascal Jun 4 '17 at 8:59

@Daevin Arguably, the hosting service has not done due diligence. MD5 is laughable for password security in the modern era. The fact you see it regularly derided on an internet site that anyone can post to should tell you that you don't need to have a PhD to know better. Using a strong hash protects against leakage of the hash to unauthorized persons, especially in the event of a stolen database. This happens regularly, even to major entities where security is paramount. Using a weak hash is negligent, especially if someone informs them that their current hash is insecure. nudge, nudge
– jpmc26 Jun 5 '17 at 4:34

Stop. Go no further. Do no more testing, demonstration, etc. until you have been explicitly asked to do so in writing, and even then beg off and suggest that there are many more qualified people than you to do an audit/pentest/etc.

I know of 2 people who did similar with less information (no passwords, just details that they didn't need access to and shouldn't have had access to) just a few months ago and they are still going through the process of dealing with felony charges.

That format looks like a standard unix-style password hash. Fields are separated by dollar signs: first the algorithm, then the salt, then the actual hash. Assuming it's a Linux system, the $1$ indicates the md5 password algorithm (which is a little more complex than pure md5). Field lengths are also consistent with the MD5 password hashing algorithm.

There are many tools out there for attacking Linux password hashes. Whether you have any success with those tools will depend largely on how strong the passwords are.

I also note that your salt has a lot of zeros in it. If you have a static salt or a relatively small (compared to the number of users) range of salts then this may open up options for precomputed attacks.
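Whether salts are actually being reused is easy to check once you are authorized to look at the data. The following Python sketch counts repeated salts in a list of `$type$salt$hash` strings; the sample entries are placeholders, not real hashes.

```python
from collections import Counter

def salt_reuse(hashes: list[str]) -> dict[str, int]:
    """Return salts that appear on more than one $type$salt$hash entry."""
    # Entries with fewer than three '$' are skipped as malformed.
    salts = Counter(h.split("$")[2] for h in hashes if h.count("$") >= 3)
    return {salt: n for salt, n in salts.items() if n > 1}

# Placeholder entries, not real hashes.
sample = [
    "$1$00000000$digestA",
    "$1$00000000$digestB",
    "$1$x9Kq20aa$digestC",
]
print(salt_reuse(sample))
```

If a single salt covers many users, an attacker can hash each candidate password once and compare the result against all of those users at the same time.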

Finally I would note that attacking real passwords may bring legal issues.

All of this information is available to an admin user on a typical Linux system in /etc/passwd (readable by anyone) and /etc/shadow (readable by root). It is difficult under standard Linux to prevent a root user from accessing /etc/shadow. As the query requires admin privileges it does not seem like it is exposing much given that an admin account has been compromised. As other answers have mentioned, if the system is really using md5 hashes, that is a problem, but unrelated to the question.