Don't store user passwords in plain text in your database. No matter how many security measures you take, no security system is perfect. The usual advice is to store a hash of each password instead, using a function like SHA1 or MD5.

However, SHA1 and MD5 are no longer secure. Because of projects like passcracking, we can't trust these hash functions as one-way functions.

In fact, experts suggest that:

"Given the number of practical attacks on MD5, it may be time to move to a Federal Information Processing Standards (FIPS) approved hash algorithm, such as SHA-256, or SHA-512. Note that vulnerabilities have recently been found in SHA-1, however, and NIST is already planning to phase it out by 2010." (Quoted from cn.bbs.comp.security.)

Update - here are some reactions to these issues:

Microsoft is banning certain cryptographic functions from new computer code, citing increasingly sophisticated attacks that make them less secure, according to a company executive.

The Redmond, Wash., software company instituted a new policy for all developers that bans functions using the DES, MD4, MD5 and, in some cases, the SHA1 encryption algorithm, which is becoming "creaky at the edges," said Michael Howard, senior security program manager at the company. (Source)

To understand the consequences, this article first explains what one-way hash functions are and shows one of their common uses, password storage. It then describes the nature of the current attacks and their consequences, and suggests alternative hash functions that are stronger at present.

One Way Hash Functions

The following definitions are taken from Bruce Schneier's Book: Applied Cryptography Second Edition:

Other Security Side Effects

In recent work, three researchers, Lenstra, Wang and Weger, showed that it is feasible to build colliding electronic X.509 certificates using the MD5 collision techniques developed by Wang. You can read their paper here.

For example, digital signature schemes like RSA, DSA/ElGamal, and elliptic curve never sign the data directly, but rather a hash of the data, and the hash chosen is often MD5. Also consider DRM (Digital Rights Management) implementations that use MD5. All these signatures and checksums are at risk because of these findings.

If you read the paper, you can learn that it is possible to add a payload to the data, or alter the data without being noticed.

A Proof of Concept

In this article, I wrote about how to implement the attack in Microsoft.NET.

Finding Collisions for MD5

The typical way to search for a collision is brute force. Given a hash value h = MD5(m) for a plain message m written in an alphabet A, we try every possible combination of characters in A until we find a message m' such that MD5(m') = h. The message m' may or may not be equal to m.
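
The brute-force search described above can be sketched in a few lines. This is a minimal illustration in Python, not the code used by any cracking project; the function name `brute_force_md5`, the tiny alphabet and the length limit are assumptions chosen so that the search finishes instantly.

```python
import hashlib
from itertools import product

def brute_force_md5(target_hex, alphabet, max_len):
    """Try every string over `alphabet` of up to `max_len` characters.

    Returns the first message m' with MD5(m') == target_hex, or None
    if no such message exists within the length limit.
    """
    for length in range(1, max_len + 1):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            digest = hashlib.md5(candidate.encode("utf-8")).hexdigest()
            if digest == target_hex:
                return candidate
    return None

# h = MD5(m) for a short message m over a tiny alphabet A (illustrative values)
h = hashlib.md5(b"cab").hexdigest()
found = brute_force_md5(h, "abc", 3)
print(found)
```

The number of candidates grows as |A|^n with the message length n, which is why this approach only works against short messages over small alphabets.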

RainbowCrack uses precalculated tables for the intermediate steps of the process, which accelerates cracking considerably. For example, a password of up to 14 characters over the charset "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!@#$%^&*()-_+=" can be cracked in a few minutes.
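
The idea of paying the hashing cost once, ahead of time, can be illustrated with a plain lookup table. Note the simplification: a real rainbow table stores chains of hashes and reduction functions to trade time for memory, while the sketch below stores every hash outright. The helper name `build_lookup_table` and the three-letter lowercase search space are assumptions made for the example.

```python
import hashlib
from itertools import product

def build_lookup_table(alphabet, max_len):
    """Precompute a dict of MD5 hex digest -> plaintext for every short string."""
    table = {}
    for length in range(1, max_len + 1):
        for chars in product(alphabet, repeat=length):
            word = "".join(chars)
            table[hashlib.md5(word.encode("utf-8")).hexdigest()] = word
    return table

# Built once, then reused for every stolen hash.
table = build_lookup_table("abcdefghijklmnopqrstuvwxyz", 3)

stolen_hash = hashlib.md5(b"dog").hexdigest()
recovered = table.get(stolen_hash)  # a single dictionary lookup, no hashing
print(recovered)
```

Once the table exists, each additional hash costs one lookup, which is what makes precomputation attractive against unsalted hashes.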

Better One Way Hash Functions

Some people suggest using more complicated applications of MD5, for example stored_password = crypt(plain_password + salt), where the salt is a fixed value, the user id, or some other predictable value. This scheme is not stronger, and the approach can be shown to have several flaws.

Other alternatives are:

Use a key based hash function

Combine algorithms

Use other functions

The first one relies on a key, which can be stolen. Combining algorithms is better, but it probably results in more CPU usage rather than real protection.
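
As an illustration of the first alternative, a key-based hash can be built with HMAC. This is only a sketch under assumptions: the article does not name a specific construction, and `keyed_hash`, `verify` and the key value are hypothetical names introduced here. It also makes the weakness visible: everything depends on `server_key` staying secret.

```python
import hashlib
import hmac

def keyed_hash(secret_key: bytes, password: str) -> str:
    """Return the HMAC-SHA256 of the password under a secret key, as hex."""
    return hmac.new(secret_key, password.encode("utf-8"), hashlib.sha256).hexdigest()

def verify(secret_key: bytes, password: str, stored_hex: str) -> bool:
    """Recompute the keyed hash and compare it in constant time."""
    return hmac.compare_digest(keyed_hash(secret_key, password), stored_hex)

# Hypothetical key: in practice it must live outside the password database.
server_key = b"hypothetical-server-side-key"
stored = keyed_hash(server_key, "correct horse")

print(verify(server_key, "correct horse", stored))
print(verify(server_key, "wrong guess", stored))
```

An attacker who steals only the database cannot test guesses without the key, but an attacker who also steals the key is back to an ordinary hash search.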

Alternative Functions

Using other hash functions can be a solution, but what criterion should be used to choose a good one?

The answer is simple: use hash functions with a bigger output domain. For example, MD5 generates a 128-bit value, so the space of possible resulting values is 2^128 in size. By simple logic, a hash function whose output domain is bigger than that is a good alternative, since a brute-force search has more values to cover.

SHA-2 is available in the Crypto API and in Microsoft.NET, so I suggest you use it. SHA-2 is a family of functions; in Microsoft.NET you have the following classes:

System.Security.Cryptography.SHA256Managed

System.Security.Cryptography.SHA384Managed

System.Security.Cryptography.SHA512Managed
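
To make the output sizes concrete, the sketch below hashes the same message with MD5, SHA-1 and the three SHA-2 variants and reports the digest length in bits. It uses Python's `hashlib` purely for illustration; it does not call the .NET classes listed above, and the sample message is an arbitrary choice.

```python
import hashlib

message = b"password"  # arbitrary sample input

digest_bits = {}
for name in ("md5", "sha1", "sha256", "sha384", "sha512"):
    digest = hashlib.new(name, message).digest()
    digest_bits[name] = len(digest) * 8  # bytes -> bits

for name, bits in digest_bits.items():
    # The output space has 2**bits possible values.
    print(f"{name}: {bits} bits")
```

Each extra bit of output doubles the space a brute-force search has to cover, so moving from 128 to 256 bits multiplies it by 2^128.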

Change Log

September 7th, 2005: Added some quotes from Chinese security groups, and more links on papers about the MD5 collisions. Also a section about other effects on DRM and checksums is included. Some grammar corrections.

September 8th, 2005: Some title changes, to be more precise about cryptology terminologies. Some typos corrected.

September 9th, 2005: Added list of SHA-2 algorithms available on Microsoft.NET 1.1.

Comments and Discussions

Hello.
We've all heard "oh noes md5 is dead!!!!!!1111"
Let me say something. With a big enough rainbow table, no one-way hashing algorithm is safe. Let's take this for example.

You have Bob, who stores his passwords as SHA1 hashes after hearing that MD5 is dead.
You have Steve, who stores his passwords as MD5 hashes, but is a good crypto guy and modifies the hashes with a Pre-Shared-Modifier.

Now you have H4x, who comes along and has a 49 gig rainbow table. He knows that Bob stores his passwords as SHA1s because he looks at them, does some cryptanalysis and goes "Hmmm he uses an SHA1 hash! cool!" So H4x SHA1's EVERY value in his rainbow table, then using a basic walk-through-and-check finds ALL of Bob's passwords.
Then H4x goes and looks at Steve's. Steve alters his MD5s by 1 (a common thing in crypto), but they still look like MD5 hashes. So H4x MD5's all the entries in his rainbow table, then tries the same thing he did with Bob's. H4x finds maybe 1 entry that works, tries it and is fruitless because of the Pre-Shared-Modifier.
H4x is defeated by proper crypto and MD5 lives on.

----
Morgan Gangwere
Lead programmer, Unknown Software

Salting does NOT improve MD5.
Salting does NOT protect against brute forcing other than to add 32bits of searchspace.
Salting ONLY disrupts users of Rainbow tables by making storage size and access times prohibitive.
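
The point about salts and precomputed tables can be shown with a short sketch. It assumes a random 32-bit salt stored next to each hash, matching the size discussed here, and uses SHA-256 only as a stand-in (the hash function is interchangeable for this purpose). The helper names are hypothetical.

```python
import hashlib
import os

def hash_with_salt(password: str, salt: bytes) -> str:
    """Hash salt || password; the salt is stored in clear next to the result."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()

def new_record(password: str):
    """Create a (salt, hash) pair with a fresh random 32-bit salt."""
    salt = os.urandom(4)
    return salt, hash_with_salt(password, salt)

def check(password: str, salt: bytes, stored: str) -> bool:
    return hash_with_salt(password, salt) == stored

# A table precomputed WITHOUT salts, as an attacker might carry around.
unsalted_table = {
    hashlib.sha256(w.encode("utf-8")).hexdigest(): w
    for w in ("dog", "cat", "password")
}

salt, stored = new_record("password")
print(stored in unsalted_table)         # the precomputed table no longer matches
print(check("password", salt, stored))  # the legitimate check still works
```

As the comment says, this does nothing against a direct brute-force run: an attacker who knows the salt simply hashes salt plus guess for each candidate.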

But rainbow tables are NOT ALWAYS faster than brute-forcing! That's a myth!

Anyone that thinks that a 32bit binary-salted MD5 cannot be cracked in short order has probably made a simple mistake. They've evaluated the algorithm based upon processor cycles required per processing round and accounting for register width. IE, they are calculating the crack in 'Von Neumann' time.

MD5 is simple - just ADDs RORs and XORs - No MULs or DIVs

How fast can you fixed-ROR, saaay, 128 bits. Think about it...

Well, the answer is... no time at all. It's free! Sure, a Von Neumann processor has to chug through it piece by piece according to register width, and loses cycles in the fetching, register loading, the ROR itself and then the writeback...
But how about in terms of LEDs and Switches. Well, you wire the LEDs up to the switches and all 128 bits are RORed according to your fixed wiring. How does that apply to your CPU? It doesn't - But it does apply to FPGAs.

A single MD5 pipeline in a 2mm high programmable logic chip can handle 64 MD5 hashes AT THE SAME TIME (that's one in each of the 64 bounded stages), spends no gate time on fetch and writeback, and spends no gate time on the RORs or ABCD shuffles. The only thing it spends time on is the ADDs and XORs... and well, hardly even the XORs.

The ADDs

With carry lookahead the adds are way faster than a PC, and the latter stages can be shared as the adds are redundant and can be compressed into a single ADD. Not only this but the ADD is a constant ADD too. So, the final stages can be shared across multiple MD5 chains using a multiplex.

The XORs

Well, if you're familiar with your gates you can figure out that nBit XOR takes only one gate delay, for any value of n that fits on the chip. How fast is that? Well, faster than your CPU can begin to prepare a XOR... An FPGA can perform a few thousand bits of XOR simultaneously... before a PC can even fill the first register in preparation.

So, the result? A chip as large as 2cm square and a coupla mil high makes a high-end PC look like a chump. And if you farm the job across a cluster of these chips then you can sit the equivalent of a 100,000xPC supercluster on one corner of your desk... and it will power from a single 13A outlet. Silently too!

The great thing is, these will crack UNsalteds faster than you can look them up in a rainbow table... yet they are bruteforcing! No storage at all! And salteds? Well... feed in 200 salted hashes and you can have the result the same week - for binary salted hashes... and definitely under an hour for non-binary salts of < one block width.

And where do you get these FPGAs? Well, safe disposal points are great for free low-mid range FPGAs. When companies change their STBs this is a perfect time to pull a hundred FPGAs and, if you ain't into etching your own boards, you can either have them made up at a budget PCB outlet using a batch process to group snapoff boards for lower costs - or... if your needs ain't so great you can dead-bug them while glued upside down to a sheet of MDF or ply.

If you use the JTAG boundary register cells as your interface for setting up the job then you're looking at a 4-wire interface regardless of the boundary size or even whether the array is using mixed devices.

Now that's a hacker's take on MD5 bruteforcing. And if you think that MD5 with a 32bit salted hash is safe then you're probably a C/ASM coder looking at the world through Von Neumann's glasses - go learn Verilog/VHDL and grab yourself a copy of either Altera Quartus Webpack (free) or Xilinx Foundation ISE (free version) and start simulating designs without the hardware. Honestly, you'll scare yourself straight. Some things really are far faster than ASM ; )

CPU cycles have NEVER been an effective way to ascertain cryptographic duration.

BTW, forget governments and corporations. A limited pre-production of ASIC based crackers can fit 32 MD5 pipelines on a single chip even with a cheaper coarse process. It costs less to do a run of 60,000 of these than it takes to pay a fully popped Cray's electricity bill for a year... and they would definitely make my FPGA cluster look like a snail. There is apparently one stacked on 16x16 frames in a tower in a London dockside which can crack bins of 32bit bin-salted MD5 faster in a forever cycling bruteforce... Cracking salted hashes with that is kinda like throwing logs in a woodchipper. I wouldn't be overly surprised if there's a couple of equally effective SHA1 crackers floating around.

Think a million pounds is a lot of money? Well... for me it is, but for corporates it is lunch money. But me? Well, I can still give salted MD5 a good run for its money.

So, use *AT LEAST* SHA1 if that's the only alternative offered! Seriously! Forget MD5+SALT completely... it's the only way to stay safe from kids with too much time on their hands. The MD5 algorithm runs way too fast for its own good and that's something that was recognised very early on - long before the emergence of rainbow tables clouded the issue. The latest SHA1 issues are nothing in comparison.

I totally agree with you that MD5 is not secure enough for passwords, but I disagree with what you say about it being dead. If MD5 were only useful for passwords, then I'd say yes, it's dead. But this is not the case. In fact MD5 can be used (and is used) for many other purposes.

With that said, I think that you raise important issues about security that, sadly, are not taken seriously enough by most people.

How do I use SHA2 in .NET? I checked MSDN and there was only a description of SHA1, and according to your article it's not a good choice. Can you give an example or just a link to check for details? Thanks.

I don't want to create panic. My article is informative because, as I say in the article, MD5 is under heavy attack, as are other hash algorithms like SHA-1 and RIPEMD (the 128-bit version).

There are 2 problems:

1. Passwords usually use a weak alphabet, for example only alphanumeric characters. The Rainbow Project (http://www.antsight.com/zsl/rainbowcrack/) allows breaking all these passwords in seconds. Using a salt with the hash prevents this kind of attack to some extent, but one has to store the salt somewhere, or predefine one, and this makes it weak.

The article is a warning about weak schemes for password storage, and a warning about the fact that, for many human-made passwords, MD5 is no longer secure.

"The point is not that MD5 has collapsed. It hasn’t. The point is that there’s a very clear trend regarding the security level of MD5, and it isn’t good. It is now undeniable that the selection of MD5 matters – the constraint that deployed implementations of the one-way hash primitive be functionally identical has been broken. The failures detected are not merely algorithmic or theoretical, rather new capabilities above and beyond what the primitive specifies are made available by the selection of MD5."

That's the whole point of the article: it is meant to be informative, and to show that we must look for stronger hash functions.

I just put on my blog a post found on Google regarding collision findings on a notebook. I suggest reading this post: http://tinyurl.com/9uu3z.

I want to make my point very clear: we must say goodbye to MD5 and other 128-bit hash functions, because it is possible to produce collisions, and in time all those functions will be broken.

MD5 is OK. All hash algorithms are crackable, so if we wait for one that is not, we will never find it. I wrote about this in much more detail in my blog: http://spaces.msn.com/members/JorgeVaras/ (although it is intended for novices).

I just read your blog. Good article, but you don't take time into account: trying 2^128 combinations is not the same as trying 2^512 combinations.

Suppose, very optimistically, that a supercomputer like IBM's Blue Gene (which can reach 100 teraflops) could search the 2^128 combinations in 1 hour.

The search of the 2^512 values is a problem about 4 × 10^115 times harder (a factor of 2^384). That is, if the 128-bit problem takes Blue Gene 1 hour, the 512-bit problem would take about 4.5 × 10^111 years!!!
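
The arithmetic behind this comparison is easy to check with exact integers. The sketch below takes the one-hour figure for the 128-bit search as given and, as a separate sanity check, estimates the 128-bit search itself under the assumption (not from the comment, and generous) that a 100-teraflop machine tests 10^14 candidates per second.

```python
SECONDS_PER_YEAR = 60 * 60 * 24 * 365
HOURS_PER_YEAR = 24 * 365

# How many times larger the 512-bit search space is than the 128-bit one.
ratio = 2**512 // 2**128  # exactly 2**384

# If the 128-bit search took 1 hour, the 512-bit search takes `ratio` hours.
years_for_512 = ratio // HOURS_PER_YEAR

# Sanity check on the one-hour premise (assumption: 10**14 candidates per second).
ASSUMED_RATE = 10**14
years_for_128 = 2**128 // (ASSUMED_RATE * SECONDS_PER_YEAR)

def magnitude(n: int) -> int:
    """Return the power of ten of a positive integer (number of digits - 1)."""
    return len(str(n)) - 1

print(magnitude(ratio), magnitude(years_for_512), magnitude(years_for_128))
```

The ratio is on the order of 10^115 and the 512-bit search on the order of 10^111 years. Under the assumed rate, even the full 128-bit search is on the order of 10^17 years, so the one-hour figure is best read as a thought experiment.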

Agreed, the correct terminology is collision. By "break", I mean a method to find a collision in an adequate time. Brute force always works, and as you point out, it might take way too long.

I did care about time. Towards the end of the blog I said: "Information, most of the time, is time sensitive. Meaning that after a while, there is no point in descrambling a message that does not bring new information. If I am planning to celebrate my wife's bday with a surprise party and I scramble the invitations, she will gain nothing by decoding the invitations after the party." With this I try to point out (maybe badly) that finding that collisions exist (my blog already shows that collisions WILL exist, no matter the algorithm) is not as important as finding meaningful collisions in time, which comes down to finding the right one. Not just any collision, but the right one, and in that respect one-way hashing functions are still working (or in different words: "Of course we can find a collision, but can we find in a reasonable time not just any collision, but the exact message that produced the hash?").

My whole point is: "So MD5 had collisions, but we knew all along that it WILL have collisions, and it does not mean that we have to toss it out and that the world is coming to a crashing end. It only means that hashes cannot be the single security measure, which we also knew already, so nothing to see here, let's move along" ;P

Now, I hear your point that current supercomputers will still take too long... but I don't think that we really know how much longer this will hold. We can only expect that computers will maybe get twice as fast and half as expensive every so often.

Right, we always knew that collisions happen. If you limit your message space you can limit collisions, but that is not practical, so we accept this fact.

I'm not saying that hashing functions are useless. What I say is that, given the recent findings in cryptology, hash functions of 128 bits or less are more "insecure" (not the exact word for what I want to express).

BTW, I'm realizing now that the most critical effect of these attacks on hash functions has to do with message authentication, checksums, and digital signatures. With passwords, finding a collision will remain hard for a while, provided the passwords are strong enough.

I'm thinking of writing about the other consequences of the MD5 attack soon.

It depends on how strong your password is. Using a small x86 cluster (a few hundred nodes) a distributed cracker can hit over 1.8 billion guesses per second (http://rpisec.net/news/show/28) and by using GPUs the crack time can be reduced by orders of magnitude.

When subjected to such an attack, a typical 8-character password will last minutes at best, and for a few million dollars an FPGA/ASIC system can probably break 10-14 characters, even if salted (as long as the salt is known).
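
A quick back-of-the-envelope calculation shows what the quoted rate means for 8-character passwords. The guess rate is the figure from the comment above; the alphabet sizes (26 lowercase letters, 62 alphanumeric characters) and the function name are assumptions chosen for illustration, and the result is the worst case of exhausting the whole space.

```python
GUESSES_PER_SECOND = 1_800_000_000  # rate quoted in the comment above

def seconds_to_exhaust(alphabet_size: int, length: int,
                       rate: int = GUESSES_PER_SECOND) -> float:
    """Time to try every password of exactly `length` characters."""
    return alphabet_size ** length / rate

lower8 = seconds_to_exhaust(26, 8)  # lowercase letters only
alnum8 = seconds_to_exhaust(62, 8)  # upper + lower + digits

print(f"lowercase, 8 chars:    {lower8 / 60:.1f} minutes")
print(f"alphanumeric, 8 chars: {alnum8 / 3600:.1f} hours")
```

At this rate a lowercase-only password falls in about two minutes and a mixed-case alphanumeric one in roughly a day and a half, before any GPU speedup is applied.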

"Information, most of the time, is time sensitive. Meaning that after a while, there is no point in descrambling a message that does not bring new information. If I am planning to celebrate my wife's bday with a surprice party and I scramble the invitations, she will gain nothing by decoding the invitations after the party."

This is an excellent point. But let's examine it in the context of passwords. Most users do not change their passwords after they first create them. This is especially true in web applications. So for these people, decoding a message (a.k.a. password) even 2 years after you receive it can prove useful. I know there are methods of securing logins (SSL), but wouldn't those have problems as well (stolen keys)?

I guess the point here and in this article is that your security only holds up for so long until computing power catches up to it. Some people may try to counter this point by saying that we will eventually reach a ceiling on computer speed. That may be true for the current semiconductor based computers, but what about quantum computers?

In summary, I really enjoyed this article as well as all the responses to it. Good stuff.

Teraflops mean nothing. Nothing at all. Can you tell me where the floating point operations come into this? Besides which, the BG/L is closer to 320 TFlops when running on all 64 racks ... yes, 64 racks ; )

Similarly, a Cray may have 4000+ processors... but that's like trying to hammer 4 trillion nails using 4000 bananas... it might be better than a single banana.... but what's that worth when what you really need is a hammer.

The right tool for the Job isn't even benchmarkable. Simply because it has no clearly defined hardware limits as regards speed. The limit is purely related to the job and how thin you can spread it/thick you can layer it per device... and how much pipelining it will support in order to maximise the gate duty.

For a twentieth of the cost of BG/L (and JUST its hard cost - not its development cost) I could blow it out of the water on MD5+SALT. Primarily because MD5 loves dedicated logic and parallelizes so well in an ASIC/FPGA/CPLD design that general purpose F/E cycled processors just cannot come close.

Don't get me wrong, BG/L is impressive... and it knocks socks off dedicated logic when it comes to complex issues... But seriously, Blue Gene doesn't even belong in the arena when it comes to MD5. That's like expecting a 200ft giant to win a web-spinning competition against a colony of a few million tiny spiders. And mentioning the TFlops like it matters is like betting on the giant cos he's got bigger muscles... it's just silly.