Care to explain? The hashes are what is compared so it seems it's important.

duskwuff · on Dec 14, 2016

The existence of crafted collisions -- being able to create a pair of M1 and M2 such that MD5(M1) = MD5(M2) -- is primarily relevant to situations where MD5 is being used as the hash inside a signature scheme, such as in certificate issuance. In these applications, being able to generate a pair of documents with the same hash is catastrophic, because a signature on one document is also valid for the other.

Being able to generate a pair of passwords that are treated as equal, on the other hand, is useless from a security perspective. It's a neat party trick, but it's not dangerous.

Now, if there were a preimage attack -- being able to take MD5(M1) and come up with an M2 such that MD5(M2) = MD5(M1) -- that'd be a much bigger deal, and it'd break MD5 password hashing wide open. But nobody's done that yet.
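To make the distinction concrete, here is a rough sketch in Python. It is not how the real MD5 collision attacks work (those exploit the structure of MD5 itself); it only brute-forces a toy hash, assumed here to be the first 16 bits of MD5, to show that "find any two inputs that match" is a much easier problem than "find an input matching this specific hash". The names `tiny_hash`, `find_collision` and `find_preimage` are made up for illustration.

```python
import hashlib
from itertools import count

def tiny_hash(data: bytes) -> bytes:
    """Toy hash (assumption for the demo): first 2 bytes of MD5, i.e. 16 bits."""
    return hashlib.md5(data).digest()[:2]

def find_collision():
    """Collision: the attacker picks BOTH messages, so any repeated hash will do."""
    seen = {}
    for i in count():
        msg = b"msg-%d" % i
        h = tiny_hash(msg)
        if h in seen:
            return seen[h], msg, i + 1  # two different messages, same hash
        seen[h] = msg

def find_preimage(target: bytes):
    """Preimage: the target hash is fixed in advance and must be hit exactly."""
    for i in count():
        msg = b"guess-%d" % i
        if tiny_hash(msg) == target:
            return msg, i + 1

m1, m2, collision_tries = find_collision()
target = tiny_hash(b"hunter2")  # stands in for a stored password hash
m3, preimage_tries = find_preimage(target)

print("collision after", collision_tries, "tries:", m1, m2)
print("preimage after", preimage_tries, "tries:", m3)
```

On a 16-bit hash the collision search typically ends after a few hundred tries, while the preimage search typically needs tens of thousands. For a full 128-bit hash the same gap is roughly 2^64 versus 2^128 for generic brute force.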

rev_bird · on Dec 15, 2016

I'm a total greenhorn when it comes to cryptography, but the difference between these two situations was totally lost on me until I read this comment. When I see, "It's easy to create MD5 collisions," my first thought is, "If you give me a hash, it's easy to find a string that results in an identical hash." If I'm understanding this right, that would be a "preimage attack," and would be bad for all the reasons being discussed in this thread.

However, it seems like "It's easy to create MD5 collisions," at least as it is true today, actually means something different: That, given a string, it's easy to find a second string that shares the same hash. If that's the case, I have two questions:

* I am totally lost as to how these are different scenarios. There's no difference I can see between "Here's string A" and "here's the hash of string A," if the goal is to find a "string B" that shares the hash. Are these "crafted collisions" generated by modifying string A and string B, until a collision pops out?

* If that's the case... what's everyone freaking out about? Why were people saying MD5 is unsafe 20 years ago, if even now, we can't achieve a preimage attack that can get you into an account based on the valid password's hash? Yahoo could have printed these hashes out and hung them up on posters in the mall and no one would have been able to get into accounts from it. There are dozens of comments lamenting how stupid this was, but... it seems like there's no actual problem?

duskwuff · on Dec 15, 2016

> However, it seems like "It's easy to create MD5 collisions," at least as it is true today, actually means something different: That, given a string, it's easy to find a second string that shares the same hash.

Very early MD5 collision attacks were even weaker, actually: given nothing, it was possible to find a pair of arbitrary garbage strings which had the same hash as each other. It wasn't until later that it became possible to pick what the strings would "look like".

> Are these "crafted collisions" generated by modifying string A and string B, until a collision pops out?

Generally speaking, yes.

> If that's the case... what's everyone freaking out about?

The issue with using MD5 as a password hash function actually has nothing to do with collisions. That's a red herring. :) The real problem is that using any fast and/or unsalted hash function for passwords is unsafe!

A fast hash function is unsafe because it makes it easy to generate a bunch of potential passwords, calculate their hashes, and look for a match.
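A minimal sketch of that guess-and-check loop, assuming a leaked unsalted MD5 hash and a tiny made-up wordlist (the `crack` helper, the wordlist and the "leaked" password are all hypothetical):

```python
import hashlib

def md5_hex(password: str) -> str:
    """Unsalted MD5 of a password, as a hex string."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()

# Hypothetical leaked value: the unsalted MD5 of some user's password.
leaked = md5_hex("letmein")

# Hypothetical wordlist; real attacks use lists with millions of entries.
candidates = ["123456", "password", "qwerty", "letmein", "dragon"]

def crack(leaked_hash: str, wordlist):
    """Hash every candidate and return the first one that matches, else None."""
    for guess in wordlist:
        if md5_hex(guess) == leaked_hash:
            return guess
    return None

print(crack(leaked, candidates))  # prints "letmein"
```

Each guess costs a single MD5 computation, which is why the speed of the hash translates directly into the attacker's guess rate.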

An unsalted hash function is unsafe because it makes it possible to build a "rainbow table" ahead of time, covering likely passwords (or every password up to some length) and their hashes, and then look up password hashes in that table.

As used in this situation, MD5 is both fast and unsalted.
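Here is a rough illustration of the salt part, assuming a plain precomputed dictionary rather than a true rainbow table (which uses hash chains to save space). The wordlist and the 16-byte random salt are assumptions for the demo:

```python
import hashlib
import os

def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()

# Attacker's "homework": hash a (hypothetical) wordlist once, reuse it forever.
wordlist = ["123456", "password", "qwerty", "letmein", "dragon"]
table = {md5_hex(w.encode("utf-8")): w for w in wordlist}

# Unsalted: every user with this password has the same hash, so one lookup wins.
unsalted = md5_hex(b"letmein")
print(table.get(unsalted))  # prints "letmein"

# Salted: a random per-user salt is mixed in and stored next to the hash.
salt = os.urandom(16)
salted = md5_hex(salt + b"letmein")
print(table.get(salted))    # prints None: the precomputed table no longer applies
```

The salt only defeats precomputation. An attacker who knows the salt can still run a guess-and-check loop per user, so a fast hash like MD5 stays a problem even when salted.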

leereeves · on Dec 15, 2016

Most people here don't seem to understand the difference between collision and preimage attacks, so they're overreacting to the fact that Yahoo used MD5.

Storing unsalted passwords, however, would be a huge mistake, if Yahoo did so as someone here claimed.

There are precomputed lookup tables for the unsalted hashes of many, many passwords (both MD5 and more secure hashes) and cracking unsalted passwords is simply a database lookup.

rev_bird · on Dec 15, 2016

Ah ha! There's the weakness I was missing, thank you so much for responding. I hadn't even thought of it that way. I knew salts shook up the resulting hashes, but the actual benefit is that they make it pretty much impossible to do any "homework" (rainbow tables) ahead of time.

supergreg · on Dec 15, 2016

Google(MD5(M1)) = MD5(M2) is more than enough for most users.