What is the point of including the domain tied to the address? It just decreases the anonymity of what you've hashed, and actually does a disservice. There are corporate domains in there and the namespace of what to search for becomes a lot smaller.
In addition, my domain is my name. I saw many others in the file that this was the case for. It's not a big leap to compute my e-mail from 'jedsmith.org', and I'm sure it isn't for those guys either. You're leaking data with this view.
Because I'd prefer not to help distribute the leaked information? If you want the information, nothing stopping you from getting it, true. However, me throwing up an entire list of e-mails just adds to the problem.
Ah - true - we're talking about different things though (which is my fault to start with, but I see confusion in others too).
MD5 is what's used in the linked spreadsheet's email address fields, which is what I thought we were talking about. SHA-256 is used in jedsmith's lists.
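For anyone who wants to check an address against either kind of list without a third-party tool, here is a minimal sketch. It assumes the lists hash the trimmed, lowercased address with no salt, which neither list confirms; `email_digests`, `appears_in`, and the sample data are made-up names for illustration.

```python
import hashlib
from typing import Dict, Set


def email_digests(email: str) -> Dict[str, str]:
    """Return the MD5 and SHA-256 hex digests of an email address."""
    # Assumption: the published lists hashed the trimmed, lowercased address.
    normalized = email.strip().lower().encode("utf-8")
    return {
        "md5": hashlib.md5(normalized).hexdigest(),
        "sha256": hashlib.sha256(normalized).hexdigest(),
    }


def appears_in(email: str, published_hashes: Set[str]) -> bool:
    """Check whether either digest of the address is in a set of hashes."""
    digests = email_digests(email)
    return any(d in published_hashes for d in digests.values())


# Hypothetical stand-in for a downloaded list of hashes.
published = {hashlib.md5(b"alice@example.com").hexdigest()}

print(appears_in("Alice@Example.com ", published))
print(appears_in("bob@example.com", published))
```

If the real lists normalize addresses differently, the digests will not match and the check will report a false negative.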
I found it useful because I use site-specific email addresses (as in username+domain@example.com). I didn't know if I had a Gawker login, so I searched for my domain. If I hadn't found it, I could've saved myself digging through my saved email.
I don't mean "trust that they won't compromise it" (though in this case it's a salted DES password -- on the gradient of exposures, this one isn't terribly high for anything but ultra-simple dictionary tests like "password"), I mean "trust anyone working at or for Gawker".
That's the point of these releases that really surprises me: People get paranoid because they use the same password everywhere...yet they provide every agent at every one of those places with their credentials and assume it's safe.
What's the point of these tools that let you know whether Gawker's database held my email address? It's no secret that I have a Gawker account, and a Twitter account, and a Facebook account... What I would like to know is how likely it is that my password could be compromised. How were the passwords stored? Hashed? Salted and hashed?
If your email is in that list, expect it to get spammed heavily in the coming days.
While it's nice of random social whatever startup hint.io to warn people that their password is compromised, linking to their landing page multiple times in the email makes me think they have ulterior motives.
I got a "We've detected unusual activity on your account" lockout on my Gmail account this morning. Since I can't find anything suspicious on the account I assume it's just bots hammering away, trying the Gawker password and other guesses. Anyone else get that?
Interesting that people only have a work account, or interesting that people who work for the government also slack off at work?
.gov is not just Obama and super-secret crypto scientists. It's also the person who makes sure that every form they send out has an OMB Control Number.
(I wasn't going to reply but since I'm getting pretty heavily down-voted I will)
I was commenting on how I thought it was interesting that so many government employees would use their work email for (most likely) non-work related sites.
Also, I find it interesting how much you assumed from such a small comment.
For some reason every md5 from this spreadsheet I try to decrypt, I get nothing. I'm using online tools like md5decrypter.com to do this. Am I missing something?
md5 is a hash function, and hash functions are designed to have two properties:
1) they are hiding. You (theoretically) can't reverse the function by any method other than brute-force.
2) they are binding. You (theoretically) can't find any other input that hashes to the same output by any method other than brute-force.
Any tool that "decrypts" md5 hashes most likely does so with a precomputed lookup table (loosely called a rainbow table) -- a giant list of many possible inputs and the hashes they generate. If a hash in the spreadsheet also appears in your table, voila, you know what it came from. To make such tables harder to use, any security-conscious site will "salt" the passwords before hashing them, by adding a random string prefix. The point is for the random "salt" to be different for each password you are hashing, so a standard (unsalted) table won't work, and further, the same table won't work for every password.
(md5 itself has been shown to be vulnerable to collision attacks, which is why I said "theoretically")
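To make the salting point concrete, here is a rough sketch. It uses MD5 only to mirror the discussion, and the function names and the 16-byte salt length are my own choices, not anything Gawker or the spreadsheet is known to use.

```python
import hashlib
import os


def unsalted_hash(password: str) -> str:
    """Hash the password alone: equal passwords give equal hashes."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def salted_hash(password: str, salt: bytes) -> str:
    """Hash the salt as a prefix followed by the password."""
    return hashlib.md5(salt + password.encode("utf-8")).hexdigest()


# Two users pick the same password; each gets a fresh random salt.
# The salt is stored next to the hash so the site can recompute it later.
salt_a = os.urandom(16)
salt_b = os.urandom(16)

same_unsalted = unsalted_hash("password") == unsalted_hash("password")
same_salted = salted_hash("password", salt_a) == salted_hash("password", salt_b)

print("unsalted hashes match:", same_unsalted)
print("salted hashes match:", same_salted)
```

Without a salt, one table entry for "password" cracks every account that uses it. With per-user salts, the attacker has to redo the work for each account.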
Okay thank you for the explanation. Let me try and apply my rudimentary knowledge here...
So a hash function is used to encrypt data by translating it with a certain rule set--I've learned about a simple key%b type function before. But with md5 this hashing function isn't the same each time a new code is created? How is the system able to decode it then? _Something_ out there has to know how to translate that back into a readable string right?
And collision is when different strings end up with the same encrypted code (except if you use a hash chain structure). So how is this used in an attack?
Sorry for all the questions. I know I could probably google this but I always learn better through instruction. Thanks!
"_Something_ out there has to know how to translate that back into a readable string right?"
Wrong. That's exactly your misunderstanding - MD5 is not an encryption function, but a hashing function.
The way it works is, given some string, it will output a new, random-looking string. It's not practical to go backwards, i.e. given only the output of running MD5, you can't compute the input.
In a nutshell, the way password authentication works is this: when you sign up to a site, a hash of your password is saved. At this point no one, not even the site itself, can tell what your password was.
When you want to log in, you send the password over to the site, they hash it again, and compare the output with the saved hash. If you put in the same password, the hash will come out the same. And it's very, very hard to find a different string which isn't your password which will get you the same hash output.
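Here is a bare-bones sketch of that sign-up and login flow. The in-memory `stored_hashes` dict and the function names are invented for illustration, and it uses plain SHA-256 with no salt just to keep it short; a real site would want a salted, deliberately slow password hash.

```python
import hashlib
import hmac
from typing import Dict

# Hypothetical user store: username -> hex digest. The password is never kept.
stored_hashes: Dict[str, str] = {}


def hash_password(password: str) -> str:
    """One-way step: the same input always yields the same digest."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def sign_up(username: str, password: str) -> None:
    """Save only the hash of the password."""
    stored_hashes[username] = hash_password(password)


def log_in(username: str, password: str) -> bool:
    """Hash the submitted password again and compare with the saved hash."""
    stored = stored_hashes.get(username)
    if stored is None:
        return False
    # Constant-time comparison of the two hex digests.
    return hmac.compare_digest(stored, hash_password(password))


sign_up("alice", "hunter2")
print(log_in("alice", "hunter2"))
print(log_in("alice", "wrong guess"))
```

Note that nothing in `log_in` ever turns the stored digest back into a password; it only repeats the forward computation.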
>How is the system able to decode it then? _Something_ out there has to know how to translate that back into a readable string right?
Wrong. Password hashes are meant to be one-way and chosen specifically so that getting the plaintext (readable string) from the hash output (hashed gibberish) is very hard. When you create an account, the plaintext of your password is hashed and stored. When you subsequently want to log in, the system only needs to use the exact same hashing steps and see if they produce an identical hash to the one stored.
Things are done this way specifically so that if a compromise such as the one at Gawker happens, it is harder for the attacker to get people's actual passwords.
This is the primary difference between encryption, where you want to be able to recover the plaintext, and hashing, where you want to make it very hard to recover the plaintext.
An MD5 hash is one-way: you can't reverse it to reveal what created it. The quickest way to "decrypt" an MD5 hash is for someone to calculate and store the hashes of many candidate inputs so you can look yours up (this is what rainbow tables are for); otherwise you just have to use brute force. Even so, the value returned could be a different input with the same hash, as there are only so many outputs that MD5 can produce (although for something the length of a password or email address, that's highly unlikely).
md5decrypter would just be a big database of tested strings and their MD5 hashes.
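A toy version of such a database might look like the sketch below. The candidate list and function names are invented, and a plain dict is a simple lookup table, not the more compact chain structure that a true rainbow table uses.

```python
import hashlib
from typing import Dict, Iterable, Optional


def md5_hex(text: str) -> str:
    """MD5 hex digest of a UTF-8 string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def build_lookup(candidates: Iterable[str]) -> Dict[str, str]:
    """Precompute digest -> original string for every candidate input."""
    return {md5_hex(candidate): candidate for candidate in candidates}


def reverse_lookup(table: Dict[str, str], digest: str) -> Optional[str]:
    """Return the known input for a digest, or None if it was never tested."""
    return table.get(digest)


# Hypothetical database of previously tested strings.
table = build_lookup(["password", "123456", "letmein"])

print(reverse_lookup(table, md5_hex("letmein")))              # letmein
print(reverse_lookup(table, md5_hex("someone@example.com")))  # None
```

That second case is why the online tools return nothing for most hashes in the spreadsheet: an email address nobody has hashed before simply isn't in their database.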
Here's a version that is far more anonymous (and easier, I think): http://undertow.jedsmith.org/gawker/