Investigating the strength of the 4-digit PIN

If we wanted to take a look at the statistics behind 4-digit PINs, how could we do such a thing? After all, it’s not like people are just going to tell you the code they like to use. It turns out the databases of leaked passwords that have been floating around the Internet are the perfect source for a little study like this one. One such source was filtered for passwords that were exactly four characters long and contained only digits. The result was a set of 3.4 million PINs, which were analysed for statistical patterns.
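For the curious, here is a rough sketch of what that filtering step might look like. This is not the original author's code; the `count_pins` helper and the `sample` list are made up for illustration, and a real run would read a leaked password dump instead.

```python
import re
from collections import Counter

# Exactly four ASCII digits, nothing else.
FOUR_DIGITS = re.compile(r"[0-9]{4}")

def count_pins(passwords):
    """Count only the passwords that consist of exactly four digits."""
    return Counter(p for p in passwords if FOUR_DIGITS.fullmatch(p))

# Hypothetical stand-in for a leaked password list.
sample = ["1234", "qwerty", "1111", "12345", "1234", "12a4", "2580"]
pin_counts = count_pins(sample)

# Print the most frequent PINs with their counts.
for pin, count in pin_counts.most_common(3):
    print(pin, count)
```

Using `fullmatch` rather than `search` matters here: it rejects longer strings such as 12345 that merely contain four digits.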

As the clichéd movie joke tells us, 1234 is by far the most commonly used PIN, to the tune of 10% (*facepalm*). That’s followed relatively closely by 1111. But if plain old frequency were as deep as this look went, it would make for boring reading. You’ll want to keep going with the article, which then looks into issues like ease of entry; 2580 is straight down the center of a telephone keypad. Dates are also very common, which greatly limits what the first and last pair of digits in the PIN might be.

We’ll leave you with this nugget: Over 25% of all PINs are made of just 20 different numbers (at least from this data set).
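A figure like that comes from adding up the counts of the most popular PINs and dividing by the total. Here is a minimal sketch, assuming the PIN frequencies are already in a `Counter`; the `top_n_share` helper and the toy counts are invented for illustration and are not the real data.

```python
from collections import Counter

def top_n_share(counts, n):
    """Fraction of all observed PINs covered by the n most common ones."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(c for _, c in counts.most_common(n)) / total

# Toy frequencies, NOT the real data set.
toy_counts = Counter({"1234": 5, "1111": 3, "0000": 1, "8068": 1})
share = top_n_share(toy_counts, 2)
print(f"Top 2 PINs cover {share:.0%} of this toy sample")
```

On the real data set you would call `top_n_share(pin_counts, 20)` and expect something over 0.25.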

I’ve noticed traveling in the mid-west or south (of the U.S.) that could come up a lot.

Most other places I’ve been to the pronunciation of “pen” and “pin” is different enough that I honestly don’t think it’s ever a problem.

In fact, around here (Pacific Northwest, known for our lack of an accent, which is not to say our speech patterns aren’t identifiable) I rarely hear anyone refer to PIN numbers or ATM machines, everyone just says PIN or ATM and stops talking.

At any rate, I would be surprised if you don’t have an attachment to the mid-west or south simply because you brought up the confusion between the two.

Or in other words: When is there a time that you need to enter a PIN and can replace it with a pen? And when do people ever ask you to give a pin when they aren’t crooks? If it’s ever asked the question is ‘enter your PIN’, which is usually on screen and not verbally.

All of these conclusions are invalid, the data set is useless. The simple fact is these aren’t actually PINs but passwords to badly coded websites. I’ll often use the same incredibly simple password like qwerty or maybe say 1234 and a mailinator email on many websites which force you to sign up.
That said, there are of course always the odd few idiots who will use it, but I don’t believe it’s as much of a problem as the article suggests it to be.

Agreed. The Hack-a-day article’s also misleading. If you actually read the article in the link (which does have some interesting information) it’s firmly established that the database is from leaked passwords, not actual PINs.

That’s about as unbiased as polling attendees at a Gun Show about how they feel about gun control.

I suspect that in the real world the data is similar, although hopefully the 1234 combo won’t be quite as high a bias.

I agree. Many websites force you to join them, just to be a pain in the arse, and possibly for marketroid purposes. If the login’s worthless to you, didn’t cost you any money, and you can get a new one easily, then it’s more important to pick an easily memorable one than a safe one.

This is completely different from real bank PINs. It’d be interesting to see the same analysis of those (and why not? They mean nothing without account numbers). I bet it’d be very different.

Considering redundancy, since there are only 10,000 possibilities, and over 100 million users, it really has little statistical significance. Maybe it’s just that we favor some numbers more than others. Just a thought…

This appears to be just another proof of Benford’s Law: http://en.wikipedia.org/wiki/Benford%27s_law
The first paragraph of the Wikipedia article is telling. I’m too lazy to cut and paste it here, but you can read it for yourself at the link provided above.

Actually not a particularly good match for Benford’s Law (at least not for the first digit, which is the only one that was graphed).

Benford’s law predicts 2 to be a bit over half as common as 1, for the first digit, and 3 to be a bit less than half as common as 1.
In this distribution 2 is less than 1/4 as likely as 1, and 3 is even less likely than that.
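The Benford ratios quoted above are easy to check. Benford's law gives the probability of leading digit d as log10(1 + 1/d); the function and variable names below are just for this sketch.

```python
import math

def benford_first_digit(d):
    """Benford's law probability that the leading digit is d (1 to 9)."""
    return math.log10(1 + 1 / d)

p1 = benford_first_digit(1)
# How common 2 and 3 are as a first digit, relative to 1.
ratios = {d: benford_first_digit(d) / p1 for d in (2, 3)}

for d, r in ratios.items():
    print(f"first digit {d} is {r:.3f} times as common as 1")
```

The ratios come out near 0.585 and 0.415, which is the "a bit over half" and "a bit less than half" mentioned above. Note that Benford's law has no leading 0, while PINs can start with one.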

When I first got a debit card the PIN was assigned. After several changes in ownership of the bank I was REQUIRED to select it myself; having one assigned wasn’t an option. I just chose the year of a major event in my life. Secure enough, I think. One would have to first figure out my first step, figure out what the event was, and finally remember the year if they got that far. My guess is few of my family who were alive at the time would remember the year if they correctly guessed the event. I wonder if anyone uses 2525, like in “the year 2525 if anyone is still alive if woman can survive”

Yup. In fact for lotteries where the winners share the prize and the prize increments each week the winning numbers are not selected it is possible (barely) to actually make money on the lottery in the long run if you select numbers which are rarely selected by others.

Your numbers won’t be drawn any more than any other combination, but when you do win you won’t share the pot with as many other people.

Of course, once you take taxes into account, and the fact that most lotteries give out the winnings in a series of payments over time instead of a lump sum, that math advantage is much smaller.
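To put a rough number on the pot-sharing argument: if the number of other winners holding your combination is modeled as Poisson with mean λ (a modeling assumption made for this sketch, not something from the lottery rules), then the expected fraction of the jackpot you keep when you win is (1 - e^-λ) / λ. The λ values below are invented.

```python
import math

def expected_share(mean_other_winners):
    """Expected fraction of the jackpot kept when you win, assuming the
    number of other winners is Poisson with the given mean."""
    lam = mean_other_winners
    if lam == 0:
        return 1.0
    return (1 - math.exp(-lam)) / lam

# Hypothetical means: a popular combination vs. a rarely picked one.
popular = expected_share(2.0)
unpopular = expected_share(0.2)
print(f"popular pick keeps about {popular:.0%}, unpopular about {unpopular:.0%}")
```

The odds of winning are identical in both cases; only the size of your slice changes.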

You want to get a bank’s attention? Look into the proprietary TCP/IP crypto protocols on their ATM WANs, or publish docs on fourth-track cloning and analysis. RFID and bruting stuff is all boring to people who actually know about financial systems and security.

I suspect the reason for the extremely frequent appearance of the PIN 1234 is the fact that it’s very easy to guess. The author of the site said he got the numbers from CRACKED PINs on the net. Maybe there are many totally random numbers around, but nobody guessed them, so they don’t appear in his statistics… I can’t believe that 11% of the human population with access to modern technology is dumb enough to use such a PIN on something security-critical…

For all the drama queens, there are hundreds of millions of citizens in the US alone and a key space of 10,000 (0000 to 9999) where it’s usually selective per-holder per-network.

And again, there are logistics firewalls that make these all literally and completely worthless. The only real threat is if someone gets the algos to reverse PINs off track 3 and 4, which are usually encoded and run through custom block or stream ciphers.

Bank cards have an eight bit checksum of the PIN encoded in them. If you assume a four digit PIN, then the checksum will provide you with, on average, 39 possible matches. These can be ranked for common patterns or matches to other numbers associated with your person (e.g. birth dates, phone numbers, etc).
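The "39 possible matches" figure is just 10,000 PINs spread over 256 checksum values. Here is a sketch of that arithmetic. The real checksum used on cards isn't described here, so `toy_checksum` (the PIN modulo 256) is a made-up stand-in; any 8-bit function splits the PIN space into at most 256 buckets in the same way.

```python
from collections import defaultdict

def toy_checksum(pin):
    """Hypothetical 8-bit checksum: NOT the scheme used on real cards."""
    return int(pin) % 256

# Group every 4-digit PIN (0000 to 9999) by its checksum value.
buckets = defaultdict(list)
for n in range(10_000):
    pin = f"{n:04d}"
    buckets[toy_checksum(pin)].append(pin)

sizes = [len(pins) for pins in buckets.values()]
average = sum(sizes) / len(sizes)
print(len(buckets), min(sizes), max(sizes), average)
```

With this toy function every bucket holds 39 or 40 PINs, for an average of 39.0625 candidates per checksum value.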