Category: Web security

Plain text is NEVER an option

Around Christmas 2011, news broke (reported: 1, 2, 3) that over 6 million user credentials had been leaked from CSDN, a popular Chinese online community for programmers that stored user credentials in plain text. Because users often reuse the same credentials on multiple websites, including social networks and personal email, the leak put 50 million user accounts at risk.

Since we never know when or how hackers might steal the user records in a database, storing user credentials in plain text is never an option, for any reason.

Encryption of password

As a good practice, user credentials should be encrypted before they are written to the database, so that even if hackers expose the database records, the credentials themselves are at less risk. There are many ways to encrypt a password, namely using a hash function, adding a static salt, and adding dynamic information.

Hashing

Hashing a password converts it from a human-recognizable string (e.g. “QuoKa88$%”) into a non-recognizable string (e.g. “398db0fdd8a26857080a77fd4996377d”), so that hackers who steal the user credentials cannot recognize the actual password.

Hashing the password is the first and most basic step in storing password information. Commonly used hash functions in PHP include sha1(), md5() and hash().

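<?php
// Sample value; in practice this comes from the registration form.
$password = 'QuoKa88$%';
// Hash the password with MD5 (illustration only, see the caveat below).
$password_hashed = md5($password);
?>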

However, the PHP documentation clearly states that relying solely on these functions to secure passwords is not recommended, as they can easily be cracked with the computing power of modern computers.
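As a point of comparison, newer PHP versions (5.5 and later) ship password_hash() and password_verify(), which the PHP documentation recommends for storing passwords. Here is a minimal sketch, assuming such a PHP version is available; the sample password and variable names are only for illustration:

```php
<?php
// Sample input; in a real application this comes from the registration form.
$password = 'QuoKa88$%';

// password_hash() generates a random salt and embeds it, together with the
// algorithm and cost, in the returned string.
$stored = password_hash($password, PASSWORD_DEFAULT);

// At log-in, compare the submitted password against the stored string.
$ok  = password_verify('QuoKa88$%', $stored);   // correct password
$bad = password_verify('wrong-guess', $stored); // wrong password
```

Only $stored needs to be written to the database; the salt does not have to be kept in a separate column.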

Adding salt

One of the attacks that breaks hashed passwords is called a “dictionary attack”. Users generally tend to pick meaningful words as passwords so that they can remember them easily. Hackers collect millions of pairs of meaningful words and their hashed strings to build a “hash dictionary”. So when they obtain a hashed password, they can look it up in the “hash dictionary” and find the original password.

Adding salt to the hashed password makes the password more secure against a dictionary attack. A salt is an additional string, generated randomly and added to the password, so that the result is unlikely to exist in the “hash dictionary”.

There are two ways to add salt to a password. The first is to add the salt to the original password, which makes the original password longer (increasing its complexity) and less likely to be found in an ordinary dictionary. The second is to add the salt to the hashed password, which makes the “hash dictionary” useless.
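The two ways of salting could be sketched roughly as follows. The function names, the sample salt and the choice of SHA-256 through hash() are all assumptions made for illustration, and "adding the salt to the hashed password" is read here as appending it to the digest and hashing once more:

```php
<?php
// Way 1: mix the salt into the password before hashing.
function hash_with_salt_before($password, $salt) {
    return hash('sha256', $salt . $password);
}

// Way 2: hash the password first, then append the salt to the digest
// and hash the result again.
function hash_with_salt_after($password, $salt) {
    return hash('sha256', hash('sha256', $password) . $salt);
}

// Sample value only; a real salt should be generated randomly once
// and kept out of the database.
$salt = 'x7#Lq9!vT2';

$a = hash_with_salt_before('QuoKa88$%', $salt);
$b = hash_with_salt_after('QuoKa88$%', $salt);
```

Whichever way is chosen, the same function and the same salt have to be applied to the submitted password at log-in before comparing it with the stored value.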

Adding dynamic information

Adding salt to a password, either before or after hashing, has one potential vulnerability: the salt is a static string (no matter how complicated it is), so a hacker can identify it by comparing multiple hashed passwords. Once the table of user credentials has been stolen, hackers need only a few days (if not a few hours) to break the salt.

To fix this vulnerability, we could use dynamic information (such as the account creation date, record row ID, user ID, a checksum of the password itself etc.) which is fixed for one user but differs between users.
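One possible way to use such dynamic information is to derive a per-user salt from it. This is only a sketch: the function name, the field choices (user ID and creation date) and the sample values are hypothetical.

```php
<?php
// Derives a per-user salt from fields that never change for one account
// but differ between accounts, then hashes the password with it.
function hash_password_for_user($password, $user_id, $created_at, $site_salt) {
    $user_salt = hash('sha256', $site_salt . '|' . $user_id . '|' . $created_at);
    return hash('sha256', $user_salt . $password);
}

$site_salt = 'x7#Lq9!vT2'; // sample static salt

// Two users who happen to choose the same password.
$h1 = hash_password_for_user('QuoKa88$%', 1001, '2012-01-05 10:22:31', $site_salt);
$h2 = hash_password_for_user('QuoKa88$%', 1002, '2012-01-07 18:03:12', $site_salt);
```

Because the salt differs per account, $h1 and $h2 are different strings, so comparing stored hashes no longer reveals which users share a password.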

Adding interference

Finally, we can add interference to the whole process to further secure the password encryption. Commonly used interference includes re-ordering the characters of the string and cutting the hashed password down to a long-enough length.

There are many reasons why we don’t want the same user to hold multiple accounts. By using different accounts, users may vote themselves up, create fake discussions and opinions, consume extra resources on the website etc. So we want to ensure just one account per user.
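As a rough illustration of these two kinds of interference, the hypothetical helper below swaps the two halves of a hash, reverses the result and keeps only part of it. The function name, the re-ordering scheme and the 48-character length are arbitrary assumptions, not a recommended recipe:

```php
<?php
function obfuscate_hash($hex_hash, $keep = 48) {
    // Re-order: swap the two halves of the string, then reverse it.
    $half = (int) (strlen($hex_hash) / 2);
    $reordered = strrev(substr($hex_hash, $half) . substr($hex_hash, 0, $half));

    // Cut: keep only the first $keep characters.
    return substr($reordered, 0, $keep);
}

// The value that would be written to the database.
$stored = obfuscate_hash(hash('sha256', 'QuoKa88$%'));
```

The same steps must be repeated on the submitted password at log-in, since the stored value can no longer be compared with a plain hash.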

Prevent single-user-multiple-accounts

This is usually done by charging users for creating accounts, or by verifying the user against some real-name identifier (such as a personal identification number or business registration number).

Charging users for creating accounts does help reduce duplicated accounts, dramatically. Whenever it comes to money, people are very careful and thoughtful (or “mean” if you like), so they prefer to stick with one account. However, there is one exception: if the user’s benefit from holding multiple accounts is greater than the cost of creating extra accounts, they will be happy to pay for them.

Verifying real-name identifiers is the most reliable approach to preventing duplicated accounts. But it is also the most difficult approach in practice. There is no universal standard for real-name identifiers. Each country has its own identifier, so unless your website targets only one or two countries, maintaining the format of each identifier is impractical. Even if you do so, you are still taking on a huge workload, because verifying that the identifier is in the right format and checking that it is really associated with a real entity is not easy, and this is usually where manual processes are involved. Even worse, users can easily get around your checking by borrowing identifiers from others (usually family members or friends) to create multiple accounts.

Currently I don’t see any website that is doing well in preventing users from holding multiple accounts.

Detect the duplicated accounts

Since we cannot totally stop users from holding multiple accounts, we have to work hard to prevent the situation from getting worse. We need to find the duplicated accounts (i.e. accounts held by the same user) and, more importantly, do this on a regular basis.

The process of finding duplicated accounts is based on one assumption: there is always something in common between two accounts that are held by the same user. For example, they may show the same name (not the user name but the name of the account holder), same email address, same password, same address etc. What we need to do is quantify the level of similarity between two accounts and pick out the pairs with high similarity. Then we investigate those suspicious accounts further.

The approach to quantifying the level of similarity varies depending on the nature of the website and the user account information. From my personal experience, I prefer to use this algorithm:

1. The value of similarity between two accounts is initialized as ZERO.

2. Go through the check for each item listed below:

Password: Usually, a user who holds multiple accounts prefers to use the same password for all of them, to avoid remembering too many passwords. So if two accounts have the same password (or hashed password string), Similarity += 30;

Email address: If the email addresses of two accounts are exactly the same, Similarity += 100; it is almost 100% certain that these accounts are used by the same user. However, sometimes the user does not register the accounts with the same email address, and instead uses another email address with a different domain name (i.e. the string after @) and the same local-part (i.e. the string before @). So if the two email addresses have the same local-part (e.g. peter.johnson@gmail.com vs. peter.johnson@hotmail.com), Similarity += 40;

Mailing Address, Phone Number, Answer to forget-password questions : If they are the same for two accounts, each one adds 40 to the Similarity.

UserID or User name : Calculate the string similarity with PHP’s similar_text(), convert the result to a percentage of the longer string’s length, and add it to the Similarity. That is: Similarity += round(similar_text(UserID1, UserID2) / max(strlen(UserID1), strlen(UserID2)) * 100)

Date of birth, Sex : add 10;

Last log-in IP : If the IP of the last log-in is the same for the two accounts, add 40;

Last log-in User-Agent String : add 10;

3. Finally, if the Similarity is greater than or equal to 50, highlight the two accounts.
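The scoring steps above could be sketched in PHP roughly as follows. The account records are assumed to be associative arrays, and every key name and helper function is hypothetical. "Date of birth, Sex" is read here as 10 points for each matching field.

```php
<?php
// Returns the part of an email address before the last "@".
function local_part($email) {
    $at = strrpos($email, '@');
    return $at === false ? $email : substr($email, 0, $at);
}

// True if both accounts have a non-empty, identical value for $key.
function same_field($a, $b, $key) {
    return isset($a[$key], $b[$key]) && $a[$key] !== '' && $a[$key] === $b[$key];
}

function account_similarity(array $a, array $b) {
    $score = 0; // step 1

    if (same_field($a, $b, 'password_hash')) {
        $score += 30;
    }
    if (same_field($a, $b, 'email')) {
        $score += 100;
    } elseif (isset($a['email'], $b['email'])
        && local_part($a['email']) !== ''
        && local_part($a['email']) === local_part($b['email'])) {
        $score += 40; // same local-part, different domain
    }
    foreach (array('address', 'phone', 'security_answer', 'last_ip') as $key) {
        if (same_field($a, $b, $key)) {
            $score += 40;
        }
    }
    foreach (array('birth_date', 'sex', 'user_agent') as $key) {
        if (same_field($a, $b, $key)) {
            $score += 10;
        }
    }
    // User name similarity as a percentage of the longer name.
    $longest = max(strlen($a['username']), strlen($b['username']));
    if ($longest > 0) {
        $score += (int) round(similar_text($a['username'], $b['username']) / $longest * 100);
    }
    return $score;
}

$a = array('username' => 'peterj',  'email' => 'peter.johnson@gmail.com',   'last_ip' => '203.0.113.7');
$b = array('username' => 'peterj2', 'email' => 'peter.johnson@hotmail.com', 'last_ip' => '203.0.113.7');

$score = account_similarity($a, $b);
$suspicious = $score >= 50; // step 3
```

Fields missing from either record are simply skipped, so the function can be run on partially filled profiles.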

Don’t ask me how the scores for each item came about. They are just from my experience. My only concern is whether these scores work well, and fortunately they do work quite well in my case. You should alter the scores and the checked items to fit your projects. Nevertheless, the detection always produces some false alarms, so the highlighted accounts should go through manual screening and confirmation.

Again, we are not experts on web security; we just discuss what we think is possible. Comments are always welcome.

A few days ago we were asked a question: How do you know that nobody else is using the same account at the same time? This question is often asked when a website charges its users by the number of accounts. That is, if two users want to access the website, they have to pay for two accounts instead of using one shared account.

Also, from the security perspective, the webmaster has a responsibility to stop multiple users from using the same account.

So back to the question: How?

No Solution

Yes. There is no complete solution to the question. There is no perfect way to stop people from sharing their accounts.

A basic level of checking

However, that doesn’t mean there is nothing we can do about it. One of the most commonly used approaches is to keep track of the IP and the time of each activity of each account. If the same account shows different IPs within almost the same time span, it is a hint that the account is being shared.

This solution assumes that access from two different IPs is made by two different users, while access from the same IP is made by the same user.

To break it into more detailed steps, here is the process flow:

1. When there is a request from a user account, record the IP and the time of the action.

2. Compare the IP of this action with the IP of the last action of this user account.

3. If the two IPs in step 2 are the same, the account is being used by the same person and the server should respond normally. Otherwise, go on to the next step.

4. Compare the time of the last request with the time of this request. If the time difference exceeds a limit (say 10 minutes), the last action has “expired” and its IP is no longer relevant, so the server should respond normally. Otherwise, account sharing is happening; do whatever is needed to stop this request.
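The process flow above could be sketched as a single check. The function name, the shape of the $last record and the sample IPs and timestamps are hypothetical; the 600-second window corresponds to the 10-minute example:

```php
<?php
// $last is the previous request of the same account, given as
// array('ip' => ..., 'time' => Unix timestamp), or null if there is none.
function is_account_shared($last, $ip, $now, $window = 600) {
    if ($last === null || $last['ip'] === $ip) {
        return false; // first request or same IP: treat as the same person
    }
    // Different IP: suspicious only if the previous request has not expired.
    return ($now - $last['time']) <= $window;
}

$last = array('ip' => '203.0.113.7', 'time' => 1000);

$same_ip  = is_account_shared($last, '203.0.113.7', 1100);  // same IP
$other_ip = is_account_shared($last, '198.51.100.9', 1100); // new IP, 100 s later
$expired  = is_account_shared($last, '198.51.100.9', 2000); // new IP, 1000 s later
```

After the check, the caller would store the current IP and time as the new $last record for the account, whatever the outcome.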

This is only the most basic level of checking to prevent account sharing. Additional checks, such as checking the browser’s User-Agent string, cookies etc., should be used along with it.

Also, please be aware that this approach will not work when users access the website through a proxy. Some proxies show a fixed IP while the requests actually come from different computers. Some proxies show a varying IP even when the requests are actually coming from the same computer.

We are not experts on web application security; we just share whatever we think is possible. Comments are always welcome!