People have odd ideas about how serious the vulnerabilities in hashes are. These...

tptacek · on April 1, 2015

He's not using "broken" in the sense of "arbitrary preimages", is he? He's using "broken" in the sense of "easy to brute-force", right?

espadrine · on April 1, 2015

I meant it in the brute-force sense, yes. I'm not sure how practical it would be when the input to SHA-2 is random, however.

emn13 · on April 1, 2015

Yeah - brute forcing is really only a relevant attack if the number of possibilities is fairly limited (even with a really fast hash). Scrypt output is likely to be effectively random (if salted) - and without salt, you'd need to build an expensive (but potentially feasible) rainbow table of possible scrypt output.
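To put rough numbers on "fairly limited", here is a back-of-the-envelope sketch. The guess rate is an assumed figure picked for illustration, not a measured benchmark, and `years_to_exhaust` is a hypothetical helper name.

```python
# Rough brute-force estimate. GUESSES_PER_SECOND is an assumption
# chosen for illustration, not a benchmark of any real hardware.
GUESSES_PER_SECOND = 10**12
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_to_exhaust(candidates: int, rate: int = GUESSES_PER_SECOND) -> float:
    """Years needed to try every candidate at the given guess rate."""
    return candidates / rate / SECONDS_PER_YEAR


# A list of a million common passwords versus a 128-bit random value.
small_space = 10**6
random_space = 2**128

print(f"10^6 candidates:  {small_space / GUESSES_PER_SECOND:.6f} seconds")
print(f"2^128 candidates: {years_to_exhaust(random_space):.3e} years")
```

Even with a generous guess rate, the small dictionary falls almost instantly while the random 128-bit space stays far out of reach, which is the distinction being drawn here.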

ademarre · on April 1, 2015

> you'd need to build an expensive (but potentially feasible) rainbow table of possible scrypt output

I don't think that's feasible. "Expensive" is an understatement. To compute the rainbow table you need to do the work of actually calculating the scrypt output for each password, even though you don't have to store it. The space is just too big, and therefore so is the time cost.

emn13 · on April 1, 2015

It's a question of size - clearly a rainbow table of 1 is feasible. I'm guessing a rainbow table of a million won't be too expensive either, and with a million common passwords, you'd likely crack something.

But yeah, it's not likely to be a major threat. In any case, salting makes it irrelevant.
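A minimal sketch of why the salt makes the precomputation irrelevant. Assumptions: this is a plain lookup table rather than a true rainbow table with hash chains, SHA-256 stands in for the hash so the example runs quickly, and the password list and function names are made up for illustration.

```python
import hashlib
import os

# Hypothetical stand-in for a list of common passwords.
COMMON_PASSWORDS = ["123456", "password", "qwerty", "letmein"]


def unsalted_hash(password: str) -> str:
    return hashlib.sha256(password.encode()).hexdigest()


def salted_hash(password: str, salt: bytes) -> str:
    return hashlib.sha256(salt + password.encode()).hexdigest()


# Precomputed once, then reusable against every unsalted database.
lookup = {unsalted_hash(p): p for p in COMMON_PASSWORDS}

stolen_unsalted = unsalted_hash("qwerty")
print(lookup.get(stolen_unsalted))  # table hit: the password is recovered

# With a random per-user salt, the same precomputed table no longer applies.
salt = os.urandom(16)
stolen_salted = salted_hash("qwerty", salt)
print(lookup.get(stolen_salted))
```

The attacker can still guess common passwords against a salted hash, but has to redo the work for each salt instead of amortizing one table across every database.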

ademarre · on April 1, 2015

Is scrypt even implemented without a salt? If not, then it doesn't matter. And if so, why?

emn13 · on April 2, 2015

Good point - it is implemented with a salt! However, the way that's done depends on the implementation. The underlying KDF neither requires nor generates a salt, so if you're implementing it client-side, you'd need to ensure salting works.
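As one concrete case, Python's `hashlib.scrypt` takes the salt as a parameter and leaves generating and storing it to the caller. A minimal sketch of that, where the cost parameters are illustrative assumptions rather than a recommendation and `derive`, `new_record`, and `verify` are hypothetical helper names:

```python
import hashlib
import hmac
import os

# Cost parameters are illustrative assumptions, not a recommendation.
N, R, P = 2**14, 8, 1


def derive(password: str, salt: bytes) -> bytes:
    """Run scrypt over the password with a caller-supplied salt."""
    return hashlib.scrypt(password.encode(), salt=salt, n=N, r=R, p=P, dklen=32)


def new_record(password: str) -> tuple[bytes, bytes]:
    """Generate a fresh random salt and return (salt, derived key) for storage."""
    salt = os.urandom(16)
    return salt, derive(password, salt)


def verify(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute with the stored salt and compare in constant time."""
    return hmac.compare_digest(derive(password, salt), expected)


salt, key = new_record("correct horse battery staple")
print(verify("correct horse battery staple", salt, key))
print(verify("wrong guess", salt, key))
```

The salt is stored next to the derived key; it does not need to be secret, only unique per password.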

vectorjohn · on April 1, 2015

If collisions are not too hard to find, isn't the process just:

1: Steal database
2: Find collision
3: Authenticate with the input that hashes to the same value

What stops that from happening?

teraflop · on April 1, 2015

Re-read that comment, particularly the part about the difference between a collision attack and a preimage attack.

MD5 is (currently) vulnerable to collisions, but not to preimages. So you can find two inputs that have the same hash, but not an input that hashes to some particular value in a database.

vectorjohn · on April 1, 2015

@teraflop - Oh, so does that just mean basically that you can, e.g. generate a bunch of MD5 hashes and some will be the same? But if you have a target you're basically SOL finding a collision for that? That would make sense if that's what it means.

teraflop · on April 1, 2015

Yep.

More specifically: if you just hash a bunch of arbitrary strings that aren't specially-constructed for the purpose of colliding, then collisions are basically random, and extremely improbable. But you can fairly easily generate two files, differing only in a small number of bits, with the same MD5 hash by taking advantage of the structure of the algorithm.

Examples here: http://www.mscs.dal.ca/~selinger/md5collision/
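To make the collision/preimage distinction concrete, here is a toy sketch. It does not reproduce the structural MD5 attack; it uses a deliberately weak 16-bit hash (the first two bytes of MD5, an assumption made so the searches finish quickly) to show that finding any two colliding inputs typically takes far fewer tries than finding an input for one specific target. All function names are hypothetical.

```python
import hashlib
from itertools import count


def tiny_hash(data: bytes) -> bytes:
    """First 2 bytes of MD5: a deliberately weak 16-bit toy hash."""
    return hashlib.md5(data).digest()[:2]


def find_collision():
    """Birthday search: any two distinct inputs with the same toy hash."""
    seen = {}
    for i in count():
        msg = f"msg-{i}".encode()
        h = tiny_hash(msg)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg


def find_preimage(target: bytes):
    """Targeted search: an input that hashes to one specific value."""
    for i in count():
        msg = f"guess-{i}".encode()
        if tiny_hash(msg) == target:
            return msg, i + 1


a, b, collision_tries = find_collision()
target = tiny_hash(b"stored password")
m, preimage_tries = find_preimage(target)
print(f"collision after {collision_tries} tries")
print(f"preimage after {preimage_tries} tries")
```

With a 16-bit output the birthday search needs on the order of a few hundred tries while the targeted search needs tens of thousands on average. At MD5's full 128 bits that gap is what separates the two attacks, and the known collision techniques only shorten the first one.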

emn13 · on April 1, 2015

Another consequence of that is that even MD5 collisions aren't at all trivial to exploit in general. An attacker can create a collision, but it's a slow process, and the colliding content is pretty constrained. You'd probably need to be very well informed and invest quite a bit of creativity to find two messages that are "valid" to whatever system is processing them and have sufficiently different meanings to be useful to you.

Clearly doable in specific instances, but it's not going to be an addition to the script-kiddie attack handbook anytime soon.

espadrine · on April 1, 2015

> These hashes, even old and obsolete ones, are still entirely secure for password like hashing if the password is random enough

Ah, got you, I'm wrong. Thanks!