Hacker News new | past | comments | ask | show | jobs | submit login
I wrote BozoCrack to show why plain MD5 is a horrible way to hash passwords. (github.com/juuso)
227 points by aparadja on Nov 7, 2011 | hide | past | favorite | 117 comments



I don't think it's demonstrating anything other than:

   1) many developers don't use salts / HMAC
   2) MD5 is popular
   3) hashed passwords end up on Google
From these 3 points I don't think it follows that MD5 is horrible.

Any hashing function would have the same issues, simply because a hashing function is a mathematical function, so for any X from the domain of definition, H(X) will always have the same value, on every call. Therefore, if a hashing method is popular enough and developers don't salt their hashes, then hashes for common passwords will inevitably end up on Google. However, if you salt your hash calls with your own key, the hashes produced will be different from everybody else's.


Think of (3) as saying without salting, with Google and the internet, the precomputed tables that are used for brute-force attacks are already out there and can be queried with a simple Ruby program.


Why ruby in particular?


Because this particular program is in ruby.


Consider the program in vimscript


Perhaps I should have written it as "unsalted MD5" instead of "plain MD5" to avoid confusion. Unsalted MD5, in my opinion, is horrible. MD5 plays it's part in the mess: it's quick to calculate, which means that anybody can churn out huge lookup databases. Missing salts make those databases universally usable.


The salt does not matter. Neither does the specific hash; you'd be just as boned using SHA256. All cryptographic hash functions are designed to be fast.

The vulnerability is "not using a password hash construction", of which the best known are bcrypt and PBKDF2.


Hate to be pedantic, and I'm sure you already know this, but you do realize that PBKDF2 uses SHA in HMAC mode right? There's nothing inherently slow about that, it's the repeated iterations of hashing.

You could slap MD5 into PBKDF2 with a high iteration count and achieve comparable security. The problem is that devs often use a hash function a single time.


I'm drawing a line between "cryptographic hash functions" and "password hash constructions". One is a "function", the other a "construction".

Password hash constructions do more than simply run the hash function multiple times.

We are, obviously, saying much the same thing.

Again: the key point here is, don't DIY this part of your application.


Fair enough, I suppose I didn't read your original comment closely enough. I appreciate your essays on the topic, but I fear that too many people have, without a full understanding, drawn the conclusion from them that SHA = bad, bcrypt/pbkdf2 = good, without fully realizing that SHA is an integral part of pbkdf2.


Man I'd be thrilled if everyone just thought "SHA=bad, bcrypt=good".


Really? How would you pay your bills then? :)


Dance lessons.


Sure it matters.

If you don't have a reasonable salt then you're vulnerable to time/space tradeoff attacks like rainbow tables/etc.

Also, in case you haven't used it. Scrypt is pretty damn awesome. It's designed to be memory hard, which makes it very hard to crack even with GPUs.


> huge lookup databases

"Huge" being the key word here.

Try searching for the md5sums of arbitrary 8-character alphanumeric passwords. You won't find many results. 62^8 is a big number.


You can download rainbow tables as torrents.

8-character alphanumeric are the ones that are being offered today, it's only going to go up as GPUs get better and HD space/bandwidth increases.


Covering the whole solution space would be an enormous (as compared to just huge) undertaking. But if you limit yourself to the types of passwords people usually use, you can prune it down quite a bit.

The currently available databases are surprisingly large, even if not in the scale of 62^8. Consider a random pick from decrypt.fr: "phytostrote972". It's far from a random string, but not exactly a trivial one either.


> Covering the whole solution space would be an enormous (as compared to just huge) undertaking

A HD 5870 can churn through MD5s at about 2.8 billion/sec - 62^8 inside 24 hours on a single unremarkable GPU.


I believe that with GPUs it's now faster to calculate possible MD5s on-the-fly than use rainbow tables.

Hence the current advice is to use "alongpassphraseasyourpassword" rather than "L33$Pa55wd"


This is where password management gets ridiculous, because you'll find a lot of registration forms limit the length of your password input to, say, 12 or 16 characters.

Why? Is it not being hashed? I have a (possibly very wrong) inkling that a longer phrase might increase the chance of collision but even so, so many places enforce a strong password but force you to keep it short.


The worst offender is NVidia. They have half-a-dozen different developer logins for different bits of their site - and they all have different rules = your CUDA one must have a symbol but the parralel Nsight one must not etc


I greatly doubt that it is faster to compute on the fly then to use a rainbow table.


I think the record is just under 1Billion hashes/sec on a couple of NVidia Tesla cards linked by SLI.

You need a fairly expensive SAN to pull 1Billion MD5 hashes/second just from the raw disk - nevermind the DB lookup.


If using your own salt is all you do, you'd still be susceptible to easy statistical attacks once someone compromises your database. But, yes, they probably wouldn't be available on google.


If using MD5 is all you do, you'd still be susceptible to brute force attacks. MD5 is a really fast hash to compute, salting or not.

The solution is to pick a better algorithm and learn how to use it securely. That probably won't happen unless all the ridiculous PHP 'security' tutorials are erased from the history of the internet and only correct methods are shown.


    If using MD5 is all you do, you'd still be 
    susceptible to brute force attacks.
Only if you know the algorithm and salt used (i.e. your source code is also compromised, not just your database).

Otherwise, demonstrate to me how you can find the passwords that relate to these hashes (all of them use the same salt):

     23C206503ABD36FCB575FC8F12791CF0
     D82BDB4160F60B657D6F994B553D2E63
     0DA0572E042F822F91772F14269548E6
     CB8BF6C16029400885F9A68A17576FA7
     7CE4F679810409CF1477CF142B481EF9
It should be easy, right?


A standard assumption in cryptography, known as Kerckhoff's principle (http://en.wikipedia.org/wiki/Kerckhoffs%27_principle), is: "A cryptosystem should be secure even if everything about the system, except the key, is public knowledge." or "Don't rely on security by obscurity.".


In this case, the salt is effectively part of the key. They know how you use the salt, sure. Just not the big arbitrary hex string that is the salt.


No it is not, because the salt is stored in a plaintext or easily reversible format somewhere, and has to be in order for authentication to work, if you are in a position to grab the hashes, it will also be trivial to grab the salts. The point of the salt is not to add some padding bits to the password. Salts should be publicly knowable without it causing a loss of security.

The one and _only_ use of a salt is to prevent precomputation attacks (rainbow tables).


But isn't the SSH private key also stored in plain text?

I mean, yeah, you could also have a password for that key, but then most people use ssh-agent because typing that key every single time is annoying, which means the password is somewhere in memory. Or you could just install a keylogger on it and wait for the user to login.

If the user's computer is compromised, a hacker could gain access to his SSH credentials. Isn't that still security by obscurity?

And if you gain access to the source-code and to the database, I'm pretty sure you'll end up in a position to modify that source-code anyway, thus finding user passwords by simply logging them somewhere.

Security is a complex topic and relying on the slowness of an algorithm like bcrypt doesn't make me feel any safer, as people can always come up with a faster bcrypt. Instead we should rely on computational complexity, because no matter what we do, unless quantum computers become a realy, there are limits to what we can compute when exponential complexity is involved.


> Instead we should rely on computational complexity, because no matter what we do, unless quantum computers become a realy, there are limits to what we can compute when exponential complexity is involved.

Quantum computing does not help with exponential problems in general.


> But isn't the SSH private key also stored in plain text?

Not if you passphrase-protect it.


From throwaway64, who is dead:

">But isn't the SSH private key also stored in plain text?

no, passwording the key encrypts it

>I mean, yeah, you could also have a password for that key, but then most people use ssh-agent because typing that key every single time is annoying, which means the password is somewhere in memory. Or you could just install a keylogger on it and wait for the user to login.

ssh-agent does not expose the private key to clients requesting it, thats part of its design, you can however get a login session to whatever hosts it holds keys for.

> Isn't that still security by obscurity?

That is much more involved than a hit and run attack where you download the DB, and much more likely to be detected/detectable before any harm is done, via such things as IDS, or just plain not possible due to how a system is locked down (stuff like ssh gateways that are heavily secured, etc).

>Instead we should rely on computational complexity ... there are limits to what we can compute when exponential complexity is involved.

the point of bcrypt isnt JUST that it is slow, its that each step requires the data from the previous step, many thousands of times over, that makes it impossible to parallelize across many GPU cores or similar arangements, thats a huge part of the weakness of stuff like plain hashing/salting, its trivial to parallelize and to scale up that parellelization till you are generating billions, or even trillions of hashes a second, that approch is totally useless on bcrypt."


It's somewhat optimistic to assume that the adversary knows no passwords - even if you can't create a user, phishing one isn't that hard. That makes recovering the salt much easier.


Well yeah, but then you're proposing a brute force against the salt and the salt can be anything of any length.

So if you're assuming a salt that can contain standard latin letters and digits and is 256 in length, that's 62^256 possible variations.

Do note that I'm not saying here that MD5_HMAC doesn't have flaws, but it definitely doesn't have the same flaws as MD5, cracking it ain't easy and I can't find a reference for an instance in which this was actually done.


Where did the HMAC come from? I thought we were discussing plain MD5.

And yes, a huge secret salt will help. Of course, if your source code is ever visible to anyone you'll have to lock all accounts.


Well, I thought it was assumed :) ... since salting a hash has inherent vulnerabilities if you're not doing it right and HMAC is a sane way to do it.

I really don't get this argument that since MD5 is fast to compute, that's the reason it is vulnerable. That's not true, MD5 is vulnerable for other reasons, like collision attacks are possible, preimage vulnerabilities were demonstrated, huge rainbow tables are precalculated and so on and so forth.

However, HMAC_MD5 does not have the same vulnerabilities and increasing the size of the salt increases the time of a brute force attack exponentially. 62^256 is a freakishly big number and it doesn't matter much if you divide the work by 100,000 computation units. But if that makes you feel uncomfortable, you can always increase the salt to 1000 chars.

And I simply don't buy that we have enough computation power in this world to brute force something with 62^1000 computational complexity.


A salt will not remain secret if your hashes are compromised, its always going to be a trivial step to also grab the salts.


except you shouldn't have 'the' salt, you should have 'a' salt for each user...


The parent to my post specified "all use the same salt".


Phishing the password from one user and recovering the salt shouldn't be useful in the first place. The parent example was only meant to show how difficult it is to recover a salt even with multiple examples of its use, not to give a real life example of password hash use. (Which was my point)

That said, I don't know how you would obtain a list of hashed passwords without also getting the associated list of salts (wouldn't they be in the same database?), so it is kind of a moot point. The different salts are intended to prevent against the ability to have a single rainbow table to crack every password in the database.

Instead, you need a table for each salt, which means that you basically have to brute force the entire database. This still doesn't really help if you are using something fast like MD5, as brute force solution will be possible with that algorithm. Which is why you want something reasonably slow.


Having the exact salt in the same database as the user data defeats the purpose of the salt.

Normally you have a global salt, somewhere in your source-code, which you combine with the per-user generated salt. It also doesn't have to be something obvious in the database (like a column named user_salt :)), you could just use something like HMAC_MD5(global_salt, email + username + joined_date) for each user.

Of course, this may seem like security by obscurity, but even in the case of SSH you keep your private key safe and as a business if you have both your database and your source-code compromised, you're fucked anyway.


How does it defeat the purpose of the salt?

The purpose of the salt is to defeat time/space tradeoff attacks by inflating the required space to the point of impracticality. ie. 20 bits of salt will increase the size of the rainbow table required a million times.


immutable fields are probably recommended for salts ... I quite like being able to change my email address on a site ;-)

but yes, a hashed (global + immutable-user-specific) combination seems to be best practice.


Mostly, there will be a column conveniently labeled 'salt'. And in e.g. a MySQL database, you can bet the native hashing format has been used.

Ignoring that, if someone got your database, you should assume they got your code. If you care about passwords not being lost, you use bcrypt or something similar.


You ommited this:

  > MD5 is a really fast hash to compute, salting or not.
Just spend some time thinking about this.


Yes MD5 is fast to compute and ... NO, in the example I've given it really does not matter much.

As I said, it should be easy cracking those hashes right? Prove me wrong.


"Prove me wrong."

This is not how discussions in this topic work.


If I have your hashes, I probably have your salts too. (Unless you use a single salt, in which case, I can easily get your salt if I've already got a username/password for your site.)


Do you have a reference that you would consider "the minimum you should know?" And, perhaps, maybe one somewhat better than that one? TIA.


Yes: http://codahale.com/how-to-safely-store-a-password/

This is the answer to this question. You should bookmark it and share it with your peers when this question inevitably comes up on another thread.


OK, so this is still the definitive thinking. I think you've told us about this before. Thanks again.


Just remember "use bcrypt" and you should be fine. There is almost certainly a library for it for your language.


Definitely. I'm just saying that I could recover the common passwords with a 486 instead of a GPU :)


I don't believe you understand what the salt is for.

You use a salt so that time/space tradeoff attacks are no longer viable. All you need to do is provide a few bits (say 20) of entropy to make them infeasable.


Obligatory http://codahale.com/how-to-safely-store-a-password/ link which taught me a ton.


Thank you. This article is quick and easy to digest and understand.

There's always a lot of whining and complaining about how {crappyhash} algorithm is bad and how salts don't matter but not a lot a of {dothisinstead}.


This is the HN discussion and comments about this article (How To Safely Store A Password): http://news.ycombinator.com/item?id=2004833


I see that link referenced a lot and don't think that's a good thing. He's right, but he doesn't explain why we should use bcrypt (or any other adaptive password hashing function). Picking bcrypt without knowing why is just as bad as picking MD5 without knowing why.


No, it's really not, as long as you follow current guidance on counts for iteration. There are people smarter than <x> random person and sometimes (often with crypto) it's better to follow their advice. Anyone can make a system that they themselves cannot break. Don't be that person.


I strongly disagree. If you are the person tasked with implementing password verification, then you really should learn enough about the field to give a rough explanation of why you've used bcrypt (or whatever you use). If you know what MD5 (and other simple hashes)'s deficiency is (and Hale explains it), then you should know how bcrypt, etc. solve that problem. One should follow expert advice but not blindly.


For those who don't want to read the source code, BozoCrack has a simple algorithm. It googles the MD5 hash and hopes the first result page contains the plaintext password. It usually does.


"It usually does" isn't quite accurate here. Common or weak plaintexts might work, but for the vast majority of input you're SOL. Sure "nicetry" comes back, but "nicetry99" produces 0 results and for every "nicetry" there are an infinite number of "nicetry"+i hashes.


Ah, you are technically incorrect in saying there are an infinite number of "nicetry"+i hashes. There are an infinite number of "nicetry"+i passwords, but eventually there will be collisions as the hash set stays a constant size and the password set grows without bound. "Infinite" isn't a term to throw around too lightly.


There are 2^128 possible MD5 hashes. When it becomes impossible to increment a counter to a number, that's as good as infinite.


The point is that it's unnecessary to find the plaintext; all you need is some plaintext that produces the same hash value. It doesn't matter if your actual password is "zipobibrok5x10^8" when "fordprefect" also gets you into the system. (That, of course, only applies to a single system -- or to a cluster of systems all using something like an unsalted MD5. It would matter if you're trying to leverage a password found on a cat fanciers' site to empty someone's bank account.)


I actually think you're missing the point here. While it is true that an infinite number of strings correspond with each md5 hash, the question was about trying to actually find a match. With a suitably large hash, say 256 bits, it becomes physically impossible to even count that high, let alone compute that many test hashes. A problem that is too large to evaluate is effectively infinite.

(Yes, md5 is 128 bits and might be possible if an entire country dedicated itself to the effort. Or an attack on its flaws could be used. But both these points are tangential to themouth's use of infinite.)


With the technique under discussion (using Google to search for the MD5 hash), it doesn't really matter what the computational cost of finding a plaintext for the every possible hash value is -- you're not brute forcing a collision, you're doing a search using someone else's enormous resources. That's always going to be O(1) from your point of view (with a lot of overhead, of course).


In my experience, it is accurate. Google doesn't know "nicetry99", but it does know "octopus99", "octopus98", "octopus97", "hacker99", "hacker98", "hacker97" etc.


Anecdotal evidence might as well be no evidence. I'm not saying that you don't have a unique or clever idea, but it should be judged on its merits. This tool will be great for finding common passwords and variants where a system uses unsalted MD5 hashes; but it relies on known hashes, so for the vast majority of possible input strings that have no reason to be generated or indexed by Google it's ineffective.


Anecdotal evidence is perfectly fine to say 'usually' here.


Tried it with a bunch of passwords that I know my friends use (yes, they treat me like tech support) and it failed miserably with all of them.

It did however get a password that I use regularly (though not anymore) which I though was pretty complex.

So, friends with "idiot" passwords 1. Their so called computer-expert mate (me) 0.

I'm not so smug anymore.


I've used Google before at work.

We inherited a system, without inheriting the administrator passwords, that we had to work on. It was a spaghetti mess, so creating a new account didn't seem obvious, and I wasn't sure how the passwords were hashed, but they did seem md5-like to me, so I googled one.

Turns out, 90% of them, including most of the admin passwords, were just four numeric characters, like 9678.


Whenever one of these posts comes up it seems like there's a lot of comments rushing to defend salted MD5 or SHA1.

What's actually wrong with bcrypt that prevents people from using it? Is it not available on all platforms? Too computationally expensive?


When people rush in to defend salted MD5/etc, they aren't actually doing it because they objectively think it's ok. They're doing it because presumably at some point in the past they have done it or allowed it to be done, they're just defending themselves. Unfortunate, since this is meant to be constructive criticism.


I would use bcrypt but it's not available as a encryption option for the realms in Apache Tomcat. One possibility would be to write an own realm but it's not that easy. Plus additional work is needed to update existing hashes, currently I therefore use salted MD5-hashes.


Integrate your email systems with Google mail or MS mail. You'll quickly find that they do not accept bcrypt. Plain md5 or plain sha1 is all they support (at least that was the case two years ago). When you are forced to inter-operate with the big guys, you'll find not many actual use bcrypt.


If you need to authenticate from another party's hashes, why not run bcrypt on those?


I dub this “shameware” and declare it awesome.


    wordlist = response.split(/\s+/)
Thank God I use spaces liberally in my passwords.


Unfortunately, the same sites that are naive enough to use MD5 for cryptographic hashing are also likely the same sites naive enough to use oversimplified regexes that fail to validate all possible inputs.

(If I had a dollar for every time the 'emailaddress+foo@gmail.com' failed to validate....)


Seriously? Who is still using md5? There are strong hashing libraries for like every language. Anyone reading this uses md5?

Can we find these people and just let them know?


> Anyone reading this uses md5?

I use MD5 all the time, just not for security.


Ultimately, data integrity is the same thing as data security. If you cannot trust your data not be detectably corrupt in the face of a malicious collision attack, you do not have data integrity. Collided data can be used to cause a DoS, to overrun buffers, any number of nasty things that arbitrary user data can cause when trusted implicitly.


You are making assumptions that are unwarranted. There are other uses for MD5 besides data integrity and data security. For instance, I generate an MD5 hash of a user's email address and use that hash for Gravatar. While someone who knows quite a bit about MD5 and the other person's email address, can force a collision ... all that would get him would be the other person's avatar image ... which is public anyway. In other words, using MD5 is perfectly fine in situations where collisions don't pose a serious problem.


Wasn't MtGox, the Bitcoin exchange that was hacked a while ago, using md5?


Take a look at the PHP code in this question and weep:

http://stackoverflow.com/questions/1581610/how-can-i-store-m...

I think we need more code examples of how to do things right, because that code is very, very wrong.


Microsoft Active Directory Server systems use MD4 in their central distributed credential database (ntds.dit). Yes, that's a 4, not a 5... MD4. This is the strongest storage hash that one can use in Active Directory.

Shock and horror when geeks meet the real-world. Yes, I know.


This kind of work is one way we let people know.


37signals used to store passwords in plaintext.

http://blog.jgc.org/2009/05/can-you-trust-37signals-with-you...


Majority of voip connections and some simple http authentication for example: http://en.wikipedia.org/wiki/Digest_access_authentication

That means they're both md5-hashed on the wire and in the database.


Your post feels rather optimistic to me. Not only do a lot of people still use md5, I'd argue a sizable number of sites still store passwords in plain text.


A simple salt would fix the issue with it finding on google (unless your salt is incredibly common).


Yep. I find it fascinating why plain unsalted md5 hashes are as common as they are. Developers go through the trouble of hashing, but don't go the single necessary step further.


Salted md5 is still surely laughably weak in an age of GPU cracking?


Sure. I guess there's not a whole lot of excuses to avoid bcrypt these days.

Bozo's idea was to show that unsalted MD5 is, for most passwords, as bad as no encryption at all. An attack doesn't get much easier than a lookup table.


To the other posts saying how MD5 is not bad and/or it's stupid to use MD5 without salt (or whatever):

See this script more as a fun little hack rather than a "Formal proof". Thanks aparadja for sharing. That being said, using MD5 without salt is asking for trouble. I mean, I know security is usually just a time vs $ vs quality problem, but it costs almost nothing more to add a salt in front of the password. Why not do it?


Using MD5 with a salt is asking for trouble.

Using SHA256 with a salt is asking for trouble.

Use bcrypt, scrypt, or PBKDF2. Do not DIY your password hash.


PostgreSQL does this right by supporting adaptive Blowfish hashing in their pg_crypto extension[1]. I checked to make sure since we do use pg_crypto.

1. http://www.postgresql.org/docs/current/static/pgcrypto.html


Thanks for the enlightenment. I have to agree that once you have access to the db, salt or not doesn't really matter security wise. But at the end, it really boils down to: Do you have a strong password? If yes, even md5 is "somewhat" safe. But to be fair, I don't know how big the rainbow tables are.. and could become.

For instance, I've got lots of password with over 20 characters mixed with upper/lower cases with a few symbols in it. (i.e. Seems hard but: phzb0xis@mynickname.on.hackernews.com is already pretty long and hard to crack but still easy for me to type/remember), btw that is not my password ;) I wonder if I encrypt that with MD5.. Could it be contained in a rainbow table somewhere?

Your search - 822d6c6e12b26dd30161967354aa4302 - did not match any documents.


A question on the practicality of expensive compute time for password hashes:

If somebody got read-level access to your password hashes, it follows (based purely on the assumption that any app with the rights to read the hash will probably have the right to change it when applicable) that one could simply overwrite the password hash with a new one that is already known, gain unauthorized access, and change the hash back to prevent the user from finding out. Unless you really need that password to crack multiple accounts that might be reusing it, it seems unnecessary. (And honestly I wouldn't care to implement these special password hashes just to protect extraneous accounts of my customers which I don't control)

edit I should clarify that while I agree that expensive compute time for password hashes helps prevent the ultimate compromise of a user's password, I find it a much more worrisome prospect that somebody got access to the database in the first place. To me, a person's password strength is almost irrelevant compared to the importance of preventing brute-force attacks on a login API or ensuring the integrity of the password database and db apps.

That being said, for those wishing to implement a bcrypt-type hash:

  perl -le'print crypt("something", "\$2a\$random")'
should give you blowfish-encrypted password hashes on systems with it patched into glibc. ("6" instead of "2a" for SHA-512)


Strong password hashes don't protect your users from the consequences of security flaws in your own application.

They protect the (majority of) users that re-use passwords on multiple services, and, more importantly, they protect you from the PR shitstorm of having dumps of cracked passwords posted to Pastebin after a compromise.


security is about defense in depth, and doing as much as is feasible to mitigate damage. Proper password hashing wont protect you from a break in on a primary system, but it could very easily prevent a break in on secondary ones.

Also, a note to anyone reading the above post, that is not how bcrypt works and is incredibly insecure.


What is not how bcrypt works? The example that I gave? That comes from the 'crypt' man page explaining how to use blowfish encryption patched into libcrypt?


Great. I think `bcrypt` is another way of hashing passwords. Read this: How to safely store a password http://codahale.com/how-to-safely-store-a-password/


Note to self: poison search results for md5 hashes of my passwords.


Because that way, anyone sniffing or monitoring your traffic doesn't even need to crack the systems to steal the hash, and so you'll be saving everyone some time?

Edit: I'm trying to illustrate the general principle, that you shouldn't take any action thats visible outside your secure perimeter, that depends on knowledge of your password.

What you define as 'outside the perimeter' depends. In the case of your corporate systems, its probably everything outside the corporate network. In the case of your gmail password, its everything outside of [your computer, the SSL connection to google's auth servers, and those servers].

You shouldn't ever leak any information outside that perimeter, that reveals knowledge of your password.

Its generally pretty hard to steal the password hash; if you start revealing what your password hash is to someone doing passive analysis, you compromise a lot.

If its worth thinking about poisoning hashes to protect, then don't try and poison the hashes!


If someone is sniffing his traffic he has much bigger things to worry about...


Poison over TOR?


I don't understand. This is pretty much the definition of a dictionary attack. The dictionary is just stored and accessed through Google rather than packaged with the program.


#1 rule - don't use unsalted md5. I think md5(md5($pass) + $seperator + $pass) works great.


You need an app for this?


An app proves automated cracking is possible. Doing it manually is often unfeasible, therefore many don't worry about such security issues.


There's a bunch of MD5 search engines with rainbow tables etc. plugged in, it could tap into that easily as well.


I think this is exactly why this method is so effective. Google's index all of those sites.

The first result when I searched the hash for superman (84d961568a65073a3bcf0eb216b2a576) was a link to a page titled literally "Google Hash: md5(superman) = 84d961568a65073a3bcf0eb216b2a576", the page is hosted at(http://www.nth-dimension.org.uk/utils/ghash.php), basically someone's gone through the trouble of making a rainbow table that's easily crawlable that makes this method of lookup via Google even easier.

The page has a description:

  > Google Hash is a PoC implementation of an hash search engine using Google.  
  > Unlike other implementations, the aim here is to get Google to store the  
  > word and associated hash. We do this by putting them into the title where it  
  > will always be stored by Google's spider....
The next top hit is md5rainbow.com etc. etc. I would guess that most of the positive results from BozoCrack.rb are thanks these sites.


The exact same thing could occur if another (unsalted) hash becomes as popular?


Yes, as long as it's designed for speed, it's not a good fit for storing passwords. That's why people should use bcrypt or an equivalent. It has both built-in support for salting & a work factor.

(Source: http://codahale.com/how-to-safely-store-a-password/)




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: