Yahoo discloses hack of 1B accounts

Arubis · on Dec 14, 2016

Fittingly, attempting to change my password to a 32-character random string generated by 1Password returns an error that the password "cannot contain my email or username", regardless of the contents of that random string (I tried several).

It does, however, _happily_ accept `passwordpassword` and cheerily move along to confirming that my recovery email account from 2003 is still valid.

duaneb · on Dec 14, 2016

Gonna guess that's a bad message for a password length violation or something else.

Not that it's much better. Is it so hard to allow 50 character passwords?

bagacrap · on Dec 15, 2016

I'm guessing it detected an @ symbol?

divanvisagie · on Dec 15, 2016

has anyone tried password@password ?

AsyncAwait · on Dec 15, 2016

if the password is stored properly, (i.e. bcrypt), the number of characters shouldn't matter at all, be it 50 or 5000.

sk5t · on Dec 15, 2016

It sort of does matter for bcrypt, surprisingly: http://security.stackexchange.com/questions/39849/does-bcryp...

In the interests of hewing closest to cryptographic reality, my design is not to allow a password longer than the algorithm can usefully use.

dlubarov · on Dec 15, 2016

I think it's best to allow longer passwords for those who use long phrases. It's easier to remember the full phrase than a truncated version. You could show a warning that the extra chars beyond 50-55 will be ignored.

kijin · on Dec 15, 2016

Or you could SHA256 the original password and feed the hash to bcrypt. Remember to use the 64-byte hexadecimal hash, not the 32-byte binary because bcrypt chokes on null bytes.

Everyone's been saying "just use bcrypt", but bcrypt has too many gotchas to be the default choice. We really need to work on getting scrypt and argon2 into the most popular programming languages and frameworks a.s.a.p.
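
For the SHA-256 pre-hash idea above, here is a minimal sketch of just the pre-hash step. It assumes the result would then be handed to whatever bcrypt binding you use; that call is left out because Python's standard library has no bcrypt. The function names are made up for illustration.

```python
import hashlib

def prehash_for_bcrypt(password: str) -> bytes:
    """Return the SHA-256 of the password as 64 ASCII hex characters."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest().encode("ascii")

def first_digest_with_nul(limit: int = 10_000):
    """Search a few inputs for a raw SHA-256 digest that contains a 0x00 byte."""
    for i in range(limit):
        candidate = f"password{i}"
        raw = hashlib.sha256(candidate.encode("utf-8")).digest()
        if b"\x00" in raw:
            return candidate, raw
    return None

# A long passphrase always collapses to a fixed-size, NUL-free string.
long_phrase = "correct horse battery staple " * 10
prehashed = prehash_for_bcrypt(long_phrase)
print(len(prehashed), b"\x00" in prehashed)  # 64 False

# Raw binary digests, by contrast, can contain NUL bytes.
found = first_digest_with_nul()
print(found is not None)
```

The hex form costs nothing in entropy relative to the binary digest; it only changes the encoding so that a NUL-terminated C string API sees the whole value.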

blowski · on Dec 15, 2016

> Everyone's been saying "just use bcrypt", but bcrypt has too many gotchas to be the default choice

This has got to be the underlying problem of modern security. By the time a best practice is well known, it's no longer best practice.

dajohnson89 · on Dec 15, 2016

I think that's a good observation. The implication seems to be that we're not iterating fast enough, or not sufficiently fast in implementing changes/improvements.

On the flipside, isn't there a risk of moving too quickly? There's a certain culture of caution because there's something to be said for "if it aint broke, don't fix it." and even if something is broke, how certain are we that cool new encryption algorithm is better or safer?

kaeluka · on Dec 16, 2016

Like nutrition!

paulmd · on Dec 15, 2016

You would probably want to use PBKDF2 as a key-stretching function rather than just naive SHA256. Otherwise you're clipping your bcrypt input from "56 arbitrary bytes" down to "56 hexadecimal characters".

I haven't looked deeply at this, but using "key stretching" that clips your output characters to such a small space smells very suspect to me.

Remember: there are only 32 bytes of actual output there, regardless of whether you represent it as hex or binary. And since bcrypt can't take more than 56 bytes of input, you are clipping that down to the equivalent of 28 bytes (56 hex characters at 4 bits each).

bschwindHN · on Dec 15, 2016

Is "just use scrypt" an acceptable answer then? I'm not a security expert and I don't know the advantages of one over the other.

amenghra · on Dec 15, 2016

Yes, scrypt is a perfectly fine password hash.

If you are currently using something else (say salted md5 or even just plain md5), you can migrate your passwords to scrypt(current_hash()) without having to change everyone's password and/or wait for everyone to log in.

See also this comment thread: https://news.ycombinator.com/item?id=12549110

aaronbasssett · on Dec 15, 2016

Don't do that. You've essentially just turned the old hashes into plain-text passwords, and how sure are you that those hashes don't exist in backups anywhere?

rietta · on Dec 15, 2016

No, not exactly. An adversary who has the old hash, but not the plaintext that it represents cannot login because scrypt(H(H(value))) != scrypt(H(value)). This is not considering the offline crackability of a compromised hash. But there are legitimate situations where upgrading the password backing to a modern slow hash is preferable to continuing to use the old hash or worse storing the old hash as a field for a long time so that when a breach happens both the new and old hashes are available.

There are user experience battles when talking about forcing a million users to change their passwords in a real system. Hashing the hash may be vastly preferable to management nixing the security upgrade. A password updating schema that changes the hash as users login and eventually locking the accounts of users who have not logged in for an extended period of time can accomplish rolling the hashes without having to tell users to change their passwords.

stavros · on Dec 15, 2016

Not if you mark the converted versions and try scrypt(oldhash()) on users authenticating with them.
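
A rough sketch of the "wrap the old hash and mark it" approach discussed in this subthread, using only the Python standard library. The record layout, the scheme names (`scrypt-md5`, `scrypt`) and the cost parameters are assumptions for illustration, not anything a real site is known to use.

```python
import hashlib
import hmac
import os

SCRYPT_PARAMS = dict(n=2**14, r=8, p=1, dklen=32)  # illustrative cost settings

def legacy_md5(password: str) -> bytes:
    """The old scheme being migrated away from: unsalted MD5, as hex bytes."""
    return hashlib.md5(password.encode("utf-8")).hexdigest().encode("ascii")

def wrap_legacy_hash(md5_hex: bytes) -> dict:
    """Bulk migration step: needs only the stored MD5, not the password."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(md5_hex, salt=salt, **SCRYPT_PARAMS)
    return {"scheme": "scrypt-md5", "salt": salt, "hash": digest}

def hash_password(password: str) -> dict:
    """Target scheme for new or upgraded records: scrypt over the password."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return {"scheme": "scrypt", "salt": salt, "hash": digest}

def verify(password: str, record: dict) -> bool:
    """Check a submitted password, applying MD5 first for marked records."""
    if record["scheme"] == "scrypt-md5":
        material = legacy_md5(password)
    else:
        material = password.encode("utf-8")
    digest = hashlib.scrypt(material, salt=record["salt"], **SCRYPT_PARAMS)
    return hmac.compare_digest(digest, record["hash"])

def login(password: str, record: dict):
    """Return (ok, record); a wrapped record is upgraded on a successful login."""
    if not verify(password, record):
        return False, record
    if record["scheme"] == "scrypt-md5":
        record = hash_password(password)
    return True, record
```

With this layout, submitting a stolen MD5 hex string as the password fails, because the server hashes it with MD5 again before scrypt. The stolen MD5 is still crackable offline at MD5 speed, which is the objection raised above about old hashes lingering in backups.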

fractal618 · on Dec 15, 2016

Woah! Very good point!

kijin · on Dec 15, 2016

scrypt is okay if you use it correctly. It's too easy to use it incorrectly, though, because scrypt is a low-level algorithm that wasn't specifically designed for password storage. [1]

[1] http://blog.ircmaxell.com/2014/03/why-i-dont-recommend-scryp...

In order to be able to tell people to "just use scrypt", we would need to have a sort of standard wrapper that uses the correct parameters by default and produces identical results in every common programming language.

zeveb · on Dec 15, 2016

> You could show a warning that the extra chars beyond 50-55 will be ignored.

Or you could use a better KDF, e.g. scrypt (even PBKDF2 is better on this metric). Artificial password-length restrictions are symptomatic of poor design.
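
As a small illustration of the "no truncation" property, the sketch below uses PBKDF2-HMAC-SHA256 from Python's standard library. The fixed salt and the iteration count are placeholders for the demo only; a real system would use a random per-user salt and tune the cost.

```python
import hashlib

def pbkdf2_hash(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """Derive a 32-byte hash; every byte of the password affects the result."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

salt = b"fixed-salt-for-demo-only"
shared_prefix = "a" * 100

# Two 101-character passwords that differ only in the final character.
h1 = pbkdf2_hash(shared_prefix + "X", salt)
h2 = pbkdf2_hash(shared_prefix + "Y", salt)
print(h1 != h2)  # True
```

A KDF that silently ignored input past some cutoff would produce identical hashes for these two passwords.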

baby · on Dec 15, 2016

This is surprising, do you know how Argon2 behaves compared to this?

pjscott · on Dec 15, 2016

Argon2 does the right thing. No silly upper bounds.

KMag · on Dec 15, 2016

That depends on how silly you consider 2^32 - 1 bytes, if I recall correctly.

Dylan16807 · on Dec 15, 2016

That's just a bug. Truncation invalidates the 'stored properly' part of the statement.

mannykannot · on Dec 15, 2016

Could you expand on that? I did not think bcrypt was responsible for storing the resultant hash. The limit appears to be in calculating the hash.

Dylan16807 · on Dec 15, 2016

The original phrasing was "stored properly, (i.e. bcrypt)". That's including the hashing as part of the 'storing'. Bcrypt has a size limit, but a size limit is not the same thing as truncating. It's broken code on the front end that truncates instead of doing something like sha512.

sk5t · on Dec 15, 2016

Bcrypt spits out a string that the caller must store somewhere. I presume the parent post means that Bcrypt "stores" in its output string a value that, for all practical purposes, varies reliably with the same salt but different plaintext.

drodgers · on Dec 15, 2016

If the password is stored properly, (i.e. bcrypt) then there does need to be some length limit or it becomes too easy to DoS a service by sending it hundreds of megabytes of password to bcrypt. There's no reason for that length limit to be less than 100 characters though.

dogma1138 · on Dec 15, 2016

You are going to be limited by the max http request size way before that.

To upload 100s or even more than a few megs you need a multipart message, a password form won't accept MP http requests.

patates · on Dec 15, 2016

On the back-end there usually is a naive POST handler which happily accepts anything it can parse, unless a mature framework with sane defaults is used.

cookiecaper · on Dec 15, 2016

Yep, people who've run marginally popular sites have dealt with this before. Give someone a text box and watch them try to stuff 4GB of content in it. There has to be a cutoff somewhere, but as you note, it should be well outside of the realm of reasonable password lengths (hundreds of characters).
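
A generous cap checked before any hashing could look something like this. The 1024-byte limit and the function name are arbitrary choices for the sketch; limiting the size of the HTTP request body is a separate concern handled further up the stack.

```python
MAX_PASSWORD_BYTES = 1024  # hypothetical cap, far above any realistic passphrase

def bounded_password_bytes(password: str) -> bytes:
    """Encode the password, refusing oversized input before any hashing happens."""
    encoded = password.encode("utf-8")
    if len(encoded) > MAX_PASSWORD_BYTES:
        raise ValueError(f"password exceeds {MAX_PASSWORD_BYTES} bytes")
    return encoded
```

Measuring the encoded bytes rather than the character count keeps the bound meaningful for non-ASCII passwords.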

jsjohnst · on Dec 15, 2016

GitHub is the only website I can think of off the top of my head that doesn't limit to an arbitrarily small number (aka <100). Can you name any other "major" websites that allow 100 character passwords?

snowpanda · on Dec 15, 2016

Amazon.com allows 128 characters: https://www.amazon.com/gp/help/customer/display.html?nodeId=...

amichal · on Dec 15, 2016

Anything built with the popular rails gem devise allows 128 by default [1]

[1] https://github.com/plataformatec/devise/blob/88724e10adaf9ff...

XorNot · on Dec 15, 2016

Hash the password locally (you are serving JavaScript over SSL right?) and only send the SHA256.

zeveb · on Dec 15, 2016

That would lock out anyone who chooses not to execute your JavaScript.

Requiring me to trust your code in order for you to decide whether or not to trust me is asking too much.

0xfeba · on Dec 15, 2016

How would you know that the hash is of a password of sufficient entropy?

jensvdh · on Dec 15, 2016

Never trust the client.

XorNot · on Dec 15, 2016

This isn't about trusting the client: it's about your endpoint being able to only accept a SHA256 hash sum from the client (thus: length limited) while allowing the user to input arbitrarily long passwords.

They hash in the browser: the only way they can mess with it is by producing silly outputs, but that only hurts them.

libeclipse · on Dec 15, 2016

I can't think of any security implications of hashing on the client-side. What's your thinking?

fnordsensei · on Dec 15, 2016

Does salting work if you hash in the browser?

libeclipse · on Dec 15, 2016

Well in this case the hash would be passed to Bcrypt or Scrypt, which have built in salt support, so client side salting wouldn't matter.

shawnz · on Dec 15, 2016

If the hashes are leaked, you could log in with them.

XorNot · on Dec 16, 2016

Well serverside you store them as plaintext equivalents - i.e. salt+hash the hash. So a leak doesn't leak the user-side.
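
To make the scheme in this subthread concrete, here is a sketch that simulates both sides in Python. The browser-side hashing would really be JavaScript; `client_side` stands in for it. The function names, record layout and scrypt parameters are assumptions for illustration.

```python
import hashlib
import hmac
import os
import re

HEX_SHA256 = re.compile(r"[0-9a-f]{64}")

def client_side(password: str) -> str:
    """What the browser would send: a fixed-length hex SHA-256 of the password."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def _slow_hash(client_hash: str, salt: bytes) -> bytes:
    return hashlib.scrypt(client_hash.encode("ascii"), salt=salt,
                          n=2**14, r=8, p=1, dklen=32)

def server_store(client_hash: str) -> dict:
    """Treat the received hash as the password: validate its shape, salt, hash."""
    if not HEX_SHA256.fullmatch(client_hash):
        raise ValueError("expected 64 lowercase hex characters")
    salt = os.urandom(16)
    return {"salt": salt, "hash": _slow_hash(client_hash, salt)}

def server_verify(client_hash: str, record: dict) -> bool:
    """Recompute the salted slow hash and compare in constant time."""
    if not HEX_SHA256.fullmatch(client_hash):
        return False
    return hmac.compare_digest(_slow_hash(client_hash, record["salt"]), record["hash"])
```

The endpoint only ever accepts 64 hex characters, however long the typed password is. The client hash is still a password equivalent, which is why the server stores a salted slow hash of it rather than the value itself.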

chrischen · on Dec 15, 2016

It's probably a naive substring detection check.

dmckeon · on Dec 15, 2016

I hit this on the last Yahoo hack go-round, and it seemed that having a name in the form 'F Lastname' (for example) disallowed use of the letter F in the password.

I say "seemed" as I did not go through the exercise of testing with multiple combinations of name, initial, and password.

dbg31415 · on Dec 15, 2016

* Defending Against Hackers Took a Back Seat at Yahoo, Insiders Say - The New York Times || http://www.nytimes.com/2016/09/29/technology/yahoo-data-brea...

Time to update that article from September. Hooray for Yahoo, they made it 76 days without a 500M+ user security breach.

(No, I don't know the actual dates... just making a joke.)

johnchristopher · on Dec 15, 2016

Last time I tried changing my Yahoo password it took me days before it accepted something (and I had password generator scripts and my brain). Now it's back to something along the lines of `letmein`.

raverbashing · on Dec 14, 2016

Just leave it at passwordpassword, it will be leaked eventually anyway

Strong passwords that need to be memorized shouldn't be wasted on security bozos

rovr138 · on Dec 15, 2016

He doesn't need to memorize it. He mentioned he used 1Password to generate it. I'd assume he's storing it there too.

_ao789 · on Dec 15, 2016

I've always wondered when 1Password is going to get hacked..

oddevan · on Dec 15, 2016

Won't do much; AFAIK everything is encrypted client-side with your master password. So a hacker could, in theory, get my encrypted database, but by the time they crack my strong password, I will, at the very least, have changed all those passwords.

developer2 · on Dec 16, 2016

That's not what a hack against 1Password, LastPass, or similar product will look like. When it happens, it will be because someone manages to commit to the VCS repository of one or more of their client applications (iOS, Android, desktop, etc.). All it takes is a few lines of code to dump the unencrypted contents on the device itself, and post them to some API endpoint or email address.

One commit to a VCS by a disgruntled employee, or an attacker who social engineers credentials to the VCS, and the client applications themselves - which must be trusted to decrypt the contents locally - will be compromised.

This is the problem with proprietary password managers, where the client applications are provided by the company. You cannot vet that software which is running on your device today, let alone all the app updates coming down the pipeline.

SnacksOnAPlane · on Dec 16, 2016

Thank you for writing this. I use a password manager, and whenever I see someone say "it's unhackable because of the encryption" I want to tell them this, exactly. All someone needs to do is to surreptitiously send your password to their own server and all your passwords are owned. It's not difficult.

_brnu · on Dec 16, 2016

I've often wondered about this. Is there a preferable alternative?

raverbashing · on Dec 15, 2016

At the very least you need to memorize the 1pw password, but I do memorize some others as well

contravariant · on Dec 15, 2016

I can kind of understand that reasoning, but one of the nicer things about strong passwords is that there are a lot of them. In some sense that's what makes them strong passwords.

Buge · on Dec 15, 2016

Why do you assume the password would be leaked eventually? Usually hashes are leaked (as in this case), not passwords.

throwaway729 · on Dec 15, 2016

leaking unsalted md5 passwords == leaking passwords

Buge · on Dec 15, 2016

Not if it's my password. I use ~100 bits of entropy.

throwaway729 · on Dec 17, 2016

Hence "Strong passwords that need to be memorized" in OP's comment. Or else your memory is way better than mine (or I care way less, or probably both).

Buge · on Dec 17, 2016

Whether they need to be memorized or not does not make the statement "it will be leaked eventually anyway" more true.

I use a password database so I don't memorize most of my passwords.

throwaway729 · on Dec 18, 2016

Well, it does, because memory puts some limitations on length and complexity...

Buge · on Dec 19, 2016

It is possible to memorize a 100 bit password. I once had a 1000 word poem memorized, and could write it down flawlessly from memory.

I agree that it's not worth memorizing, you should instead use a password database. But I still maintain my original point that there's no reason to assume that your password will be leaked eventually if you use a strong password.

supergreg · on Dec 15, 2016

Could the attacker find an easier to find string that matches the same md5 hash?

Dylan16807 · on Dec 15, 2016

The current best attack wrt matching an existing hash brings MD5's 128 bits of security down to 123. So no, that's not going to happen.

Dylan16807 · on Dec 15, 2016

MD5 is terrible for human passwords because it's fast. But md5 is not actually broken for password storage purposes. If you use a long random password, md5 is enough.

raverbashing · on Dec 15, 2016

Yes, if you add a (long - at least 32 bit) salt and something like at least 10^9 rounds of MD5 then, yeah, it's probably ok

Dylan16807 · on Dec 15, 2016

No. I mean single unsalted MD5. You will not crack a 20-random-char password. You cannot process 2^120 guesses, and MD5 is not broken for this use.
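
The 2^120 figure can be reproduced with a little arithmetic. The sketch below assumes each of the 20 characters is drawn uniformly from a 64-symbol alphabet, and uses a made-up attacker speed of 10^12 MD5 guesses per second.

```python
import math

def entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy of a uniformly random string: length * log2(alphabet size)."""
    return length * math.log2(alphabet_size)

def years_to_exhaust(bits: float, guesses_per_second: float) -> float:
    """Time to try every candidate at a fixed guessing rate."""
    seconds = 2 ** bits / guesses_per_second
    return seconds / (365 * 24 * 3600)

bits = entropy_bits(20, 64)
print(bits)  # 120.0
print(f"{years_to_exhaust(bits, 1e12):.1e} years")
```

Even at that hypothetical rate the search takes on the order of 10^16 years, so the speed of MD5 only matters for passwords drawn from a much smaller space.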

late2part · on Dec 15, 2016

Just follow NIST guidelines and never change it. That way when the servers in Utah crack your password, they don't have to recrack it later.

syntheticnature · on Dec 15, 2016

I've run into off-by-one issues in password length requirements in the past, so if 32 characters is the stated maximum it might only be capable of 31 on the validation side.

KMag · on Dec 17, 2016

That asymmetry in length support strongly suggests they're storing passwords in plain text.

ajanuary · on Dec 15, 2016

I tried to change it to a 64 character random 1Password string with numbers, characters and symbols. It complained it was too easy to guess. I submitted the exact same password and it accepted it.

niftich · on Dec 14, 2016

> August 2013

> hashed passwords (using MD5)

I don't even know what to say.

> investigating the creation of forged cookies that could allow an intruder to access users' accounts without a password. Based on the ongoing investigation, we believe an unauthorized third party accessed our proprietary code to learn how to forge cookies

How is this possible? Aren't most auth cookies just a session ID that can be used to look up a server-side session? Did they not use random, unpredictable, non-sequential session IDs?

jsjohnst · on Dec 14, 2016

1) As Yahoo "upgraded" all password storage in UDB (where all login / registration details are stored) to be bcrypt before 2013, I'm curious how this was possible.

2) Yahoo doesn't use a centralized session storage. If you know a few values (not disclosing the exact ones) from the UDB, it's theoretically (guess not so theoretical now) possible to create forged cookies if you steal the signing keys. To my knowledge, the keys were supposed to only be on edit/login boxes (but it's been a while so I may be forgetting something), so this is a pretty big breach.
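
For readers unfamiliar with cookies that are validated by signature rather than by a session table, here is a generic sketch. It is not Yahoo's cookie format, which is not public in this thread; the field layout, `account_value` and the function names are invented for illustration.

```python
import hashlib
import hmac

def make_cookie(user_id: str, account_value: str, signing_key: bytes) -> str:
    """Mint a cookie: anyone holding signing_key can do this for any user."""
    body = f"{user_id}|{account_value}"
    sig = hmac.new(signing_key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{body}|{sig}"

def verify_signature(cookie: str, signing_key: bytes):
    """Stateless check: return (user_id, account_value), or None if invalid."""
    try:
        user_id, account_value, sig = cookie.split("|")
    except ValueError:
        return None
    body = f"{user_id}|{account_value}"
    expected = hmac.new(signing_key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None
    return user_id, account_value

def verify_against_store(cookie: str, signing_key: bytes, user_store: dict) -> bool:
    """Stricter check: the signed value must also match the account record."""
    parsed = verify_signature(cookie, signing_key)
    if parsed is None:
        return False
    user_id, account_value = parsed
    return hmac.compare_digest(user_store.get(user_id, ""), account_value)
```

In a design like this, a stolen signing key is enough to pass the stateless check for any user, while the stricter check additionally requires knowing a per-account value held in the user database.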

dsl · on Dec 14, 2016

On a number of engagements I've come across password databases that have been migrated to bcrypt. In one case I checked CVS to see who made the code change, and found the MD5 passwords on his dev box. In another I tracked down a MySQL slave that had broken replication for over a year.

In both cases I tried to track down backups, but discovered neither company was keeping them. That is another possible vector.

jsjohnst · on Dec 15, 2016

1) I'd be flabbergasted beyond belief if there was ever a Yahoo! engineer who had user passwords on their laptop / Dev box. The technical hurdle for that would be a stretch, let alone the fact of the other ramifications of doing this.

2) there's no SQL database involved with Yahoo!'s storage of passwords. It's a custom built db system with proprietary access and replication protocols.

dsl · on Dec 15, 2016

I wasn't saying either possibility was the cause of the Yahoo breach. Simply pointing out that there is always another way.

The NSA's MUSCULAR program for example decoded proprietary secret squirrel cross datacenter replication protocols designed by both Google and Yahoo, so that isn't much of a safeguard against state level actors.

mturmon · on Dec 15, 2016

Yet, somehow they did get out.

jsjohnst · on Dec 15, 2016

Apologies, I've heard the details at this point and I can't disclose them. The limit of what I can do is poke holes in the theories that are wrong.

CobrastanJorji · on Dec 15, 2016

Aren't the details "three years after we were hacked, law enforcement told us that we had been hacked, and we believe them?"

The press release explicitly says "We have not been able to identify the intrusion associated with this theft." I especially noticed that the "What are we doing to protect our users?" section doesn't mention anything about Yahoo fixing any security issues.

Presumably, then, as a Yahoo engineer, you know what your security practices are but you don't know what you did wrong or whether you've fixed it.

jsjohnst · on Dec 15, 2016

Do you honestly believe a press release covers every detail, especially ones with strong legal implications, and might not have rather been worded very carefully?

mustacheemperor · on Dec 15, 2016

The contrast between your statements and the press statement is great enough to imply Yahoo is being dishonest.

jsjohnst · on Dec 15, 2016

"Dishonest", not in the slightest. From what I'm told, they really don't know how they got in. But that's only the part of the story discussed in the press release, what's not discussed is how the data existed in that format.

normaljoe · on Dec 15, 2016

From my experience if Paranoids did know they would have locked it down at the expense of engineers or others. I know since I have made breaking changes to infrastructure which did lock out some engineers and cause plenty of headaches.

Every Yahoo I have ever known has cursed the Paranoids for getting in the way. Every Yahoo that has actually been in a situation has also blessed the Paranoids for the same reasons.

Simple fact is that Yahoo has a mega butt ton of code from several decades. There are going to be holes and when they are found they are fixed pretty damn quick. Last one I dealt with was solved in hours with all hands on deck. Sometimes it just sucks to be as old as Yahoo is.

geekone · on Dec 15, 2016

If they do not know how the adversaries got in, how do they know the adversaries are not still in to some degree?

jsjohnst · on Dec 15, 2016

Good point. I don't know if they do know that for sure.

a3n · on Dec 15, 2016

> the "What are we doing to protect our users?" section doesn't mention anything about Yahoo fixing any security issues.

"We continuously enhance our safeguards and systems that detect and prevent unauthorized access to user accounts."

At the end of the same paragraph. They're already continuously updating security, before they even knew they were hacked. Three years have passed, so for all they know something in those continuous updates covered this hack.

normaljoe · on Dec 15, 2016

I am taking a WAG here but if they got code then they might be able to take educated guesses at the UDB values without actual access to UDB. Those guesses are more likely to be true with bot registered accounts where there is duplication of information.

This goes back to my theory that a good portion were junk accounts.

Not saying this is acceptable, just saying garbage in garbage out.

jsjohnst · on Dec 15, 2016

You can't guess the XX (anonymized for obvious reasons) key without access to the UDB.

normaljoe · on Dec 15, 2016

I'm guessing by your handle I know who you are :). Ex-Yahoo super chat moderating guy here, which should let you know me.

Wouldn't the upgrade require the accounts to actually log in to migrate the password? Last I was at Yahoo there were at least 3B junk accounts in UDB. Without knowing details I am guessing that many of the "compromised" accounts fall into that bucket.

I get that membership can't just trash junk accounts but marketing was very aware of them. Paranoids also can't just say a compromised junk account is not a compromise, they are too paranoid for that.

This unfortunately sounds bad PR-wise, with little knowledge of actual impact. On the flip side I'm pretty sure I am not on the radar of the state actor since they would more than likely be looking at their own.

jsjohnst · on Dec 15, 2016

Just to confirm, purple Yahoo! car in YEF spot ;)

As to your question, no, they didn't need to login due to how the hash "upgrade" was done (unlike how Tumblr did it around the same time). I was one of the people in the billion accounts and I definitely have logged in and also changed my password multiple times (also have very high entropy passwords and use TFA).

normaljoe · on Dec 15, 2016

It wasn't me despite your DR Ycan't photos. :)

Tumblr was indeed what I was thinking about.

alibee · on Dec 15, 2016

What's funny is that there's someone currently working at Yahoo with a name scarily similar to yours and I was pretty sure for a moment that you were some random ycombinator person faking being him.

Although...he IS cool.

yuhong · on Dec 15, 2016

bcrypt(md5(password)) allows the existing password hash to be reused.

bazzargh · on Dec 15, 2016

No. They've stolen the hash, so if they crack it, you've just let them waltz back in.

The correct response is force a password reset, and _delete_ weak hashes so that they cannot be stolen in a subsequent breach. At worst, store a bcrypted md5 password as you suggest, but only as a check for a password the user must not be allowed to use again; it _cannot_ be used to sign them in.

One of the attacks you're preventing is on _other_ sites, where the user has reused the passwords. Keeping around weak hashes even to let that user perform a reset is risking that hash being taken, cracked and used in a breach elsewhere.

jsjohnst · on Dec 15, 2016

When they did the bcrypt(md5(password)) there were no leaks of Yahoo!'s md5'd passwords. That's obviously changed now and thus why the billion passwords were invalidated (I'm one of those folks btw, but I also had TFA on my account and my password had sufficient entropy you won't brute force the md5).

CWuestefeld · on Dec 15, 2016

> Keeping around weak hashes even to let that user perform a reset is risking that hash being taken, cracked and used in a breach elsewhere.

We're currently working on PCI compliance. In pen testing, we got dinged for not preventing re-use of prior passwords, and that bothers me for exactly this reason (plus the new NIST standards say NOT to force periodic changing).

I believe that our hashes are strong (using scrypt, salt, etc.). But the belief that you're getting it right shouldn't let you be lax in other areas, hence security in depth.

So I really object to the requirement that we keep around those old hashes.

normaljoe · on Dec 15, 2016

Good point. Thanks for pointing out my mistake.

niftich · on Dec 14, 2016

Is the info about the Y and T cookies in this pdf [1][2] accurate?

[1] (EDIT: now with screenshots) http://imgur.com/a/g61VZ

[2] (Not affiliated with link, but the risk-averse may wish to open in a sandbox) ftp://hackbbs.org/milworm/270

jsjohnst · on Dec 14, 2016

Doing a google search for the link showed me the title of the document which I remember reading in the past. The overall coverage of Y&T cookies is more or less accurate at the time of writing back in like 2010/2011, but there's a bunch of mostly minor technical inaccuracies too. I don't want to comment on much without rereading it, but I remember the description of Sled ID made me laugh (which btw I'd guess less than 1% of current Yahoo employees know what that is).

jsjohnst · on Dec 14, 2016

Also, the video that goes with the PDF is too funny! Just watched it on YouTube [0] again. Notice how he doesn't actually sign into Web Messenger, just goes to the login page? If he had, it would've failed. Same thing with him closing the browser before Yahoo Mail loaded. "Sensitive" reads and everything that did a write operation always (unless there was a bug) validated the cookie against the UDB. So even if you stole the signing key, without the values from the UDB, you would have very limited ability to do anything other than the trivial things shown in the video.

[0] https://m.youtube.com/watch?v=n2CNp_zmje8

fergie · on Dec 15, 2016

It seems that Yahoo has a problem with moribund accounts- many people had a Yahoo ID 10-20 years ago, and then abandoned it.

If these accounts are not deleted (and there are a bunch of organisational reasons not to), then the MD5 hash has to be kept around somewhere, until the user re-enters a password and a better hash is generated.

chinathrow · on Dec 15, 2016

> Yahoo doesn't use a centralized session storage. If you know a few values (not disclosing the exact ones) from the UDB, it's theoretically (guess not so theoretical now) possible to create forged cookies if you steal the signing keys. To my knowledge, the keys were supposed to only be on edit/login boxes (but it's been a while so I may be forgetting something), so this is a pretty big breach.

Isn't that highly confidential company information?

jupp0r · on Dec 15, 2016

> 1) As Yahoo "upgraded" all password storage in UDB (where all login / registration details are stored) to be bcrypt before 2013, I'm curious how this was possible.

You check the plaintext password sent to the backend against the md5, on success you rehash it as bcrypt, insert it in the table.

xanadohnt · on Dec 14, 2016

Web tokens, for example, don't necessarily include just a session ID. Some include the full session details within their payload. This can be quite useful, actually, because it offloads session-lookup onto the client.

camus2 · on Dec 14, 2016

How do you invalidate a JWT server-side without the user interacting with the server ?

Normal_gaussian · on Dec 14, 2016

My preferred method:

Add an "expires" field to the token, this should contain a date after which the token is no longer valid. Now all tokens auto-invalidate after a certain period.

Allow some or all tokens to "refresh" by calling a particular endpoint (call with valid token and get a token with expiry from now).

Optionally add some form of identifier to the token (user_id works great) so that you can push a message out to your servers that looks like this: "All tokens for x expiring before y are invalid". Once time y has passed your server can forget about the message. This will be a very small set (often 0) as very few people use the "log out my devices" features.

Logouts should be done client side by deleting the token.

If you are worried about your token being sniffed you are either not using HTTPS, or sticking it somewhere stupid.
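
The method above can be sketched roughly as follows, with an HMAC-signed token standing in for a real JWT library. The token layout, the `SECRET` value, the in-memory `revocations` dict and all function names are assumptions made for the example; in a multi-server setup the revocation notices would be pushed to every server.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-secret"   # placeholder signing key
revocations = {}          # user_id -> cutoff: tokens expiring before it are invalid

def issue(user_id: str, ttl: int, now=None) -> str:
    """Sign a token carrying the user id and an expiry time."""
    now = time.time() if now is None else now
    payload = json.dumps({"sub": user_id, "exp": now + ttl}).encode("utf-8")
    body = base64.urlsafe_b64encode(payload).decode("ascii")
    sig = hmac.new(SECRET, body.encode("ascii"), hashlib.sha256).hexdigest()
    return f"{body}.{sig}"

def validate(token: str, now=None):
    """Return the claims if the token is authentic, unexpired and not revoked."""
    now = time.time() if now is None else now
    body, _, sig = token.rpartition(".")
    expected = hmac.new(SECRET, body.encode("ascii"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None
    claims = json.loads(base64.urlsafe_b64decode(body))
    if now >= claims["exp"]:
        return None
    cutoff = revocations.get(claims["sub"])
    if cutoff is not None and claims["exp"] < cutoff:
        return None
    return claims

def refresh(token: str, ttl: int, now=None):
    """Exchange a still-valid token for one whose expiry is counted from now."""
    claims = validate(token, now)
    return None if claims is None else issue(claims["sub"], ttl, now)

def revoke_user(user_id: str, ttl: int, now=None) -> None:
    """'All tokens for x expiring before y are invalid', with y = now + ttl."""
    now = time.time() if now is None else now
    revocations[user_id] = now + ttl

def prune(now=None) -> None:
    """Forget notices whose cutoff has passed; those tokens have expired anyway."""
    now = time.time() if now is None else now
    for user_id in [u for u, cutoff in revocations.items() if cutoff <= now]:
        del revocations[user_id]
```

Because every token issued before the revocation expires before `now + ttl`, the notice never needs to outlive one token lifetime, which is what keeps the revocation set small.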

knz · on Dec 15, 2016

> Add an "expires" field to the token, this should contain a date after which the token is no longer valid. Now all tokens auto-invalidate after a certain period.

Doesn't JWT already have this - "exp" is a reserved claim for expiration time?

https://tools.ietf.org/html/rfc7519#section-4.1.4

4.1.4. "exp" (Expiration Time) Claim

The "exp" (expiration time) claim identifies the expiration time on or after which the JWT MUST NOT be accepted for processing. The processing of the "exp" claim requires that the current date/time MUST be before the expiration date/time listed in the "exp" claim.

spydum · on Dec 15, 2016

Yes but that is more for standard idle time expiration.. The problem being addressed above is for actively invalidating an existing JWT for a user once they already have it (and before the default/original expiry is met).

danielweber · on Dec 15, 2016

> Now all tokens auto-invalidate after a certain period.

You need to make sure that there is some process that will refuse to keep on re-upping the cookie lifetime. Otherwise an attacker could indefinitely keep the stolen cookie alive.

Normal_gaussian · on Dec 15, 2016

If you see a suspicious usage pattern then force a login by invalidating the tokens. Allowing indefinite refreshing is a feature and a drawback of this method.

merb · on Dec 15, 2016

You can combine a session cookie with a JWT token that gets sent over a header.

Normal_gaussian · on Dec 15, 2016

Which gives you the worst of both worlds

corford · on Dec 15, 2016

Tokens have in-built expiry dates (cryptographically signed by the server upon issuance). Once that date has passed the token becomes useless.

If you meant "how can you prematurely invalidate a specific user's JWT without needing a server side lookup", you can't.

I think the best you can do is issue different classes of JWT to a user based on what actions you wish to grant them. This lets you reduce load going to backend lookups to only a subset of JWTs where the ability to invalidate them earlier than planned on a per user basis is necessary/desired.

For JWTs that aren't tied to backend lookups the only solution if one or more users are accessing resources they no longer should be via one of these tokens is to invalidate all of them.

xanadohnt · on Dec 14, 2016

The client can hold onto the token indefinitely, the server doesn't care. But next time a request comes in with that token it will be expired. The server validates the timestamp which is part of the encrypted payload that only the server can decrypt; instant validation and no DB lookup.

edibleEnergy · on Dec 15, 2016

This is possible if you support the 'jti' claim[1]. There's a discussion of an implementation of it here[2].

[1]: http://self-issued.info/docs/draft-ietf-oauth-json-web-token... [2]: https://auth0.com/blog/blacklist-json-web-token-api-keys/

Alex3917 · on Dec 14, 2016

Each JWT has an issued at date, so you just need to reject all tokens issued before that time. In addition to invalidating all tokens if there is a breach, each user account can have its own datefield to invalidate all the tokens for that account if a user changes their password or whatever.
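
A minimal sketch of that cutoff check, assuming the token's signature has already been verified and its claims parsed into a dict. The `state` dict stands in for a global setting plus a per-account date column; those names are made up for the example.

```python
# Hypothetical server-side state: one global cutoff plus one per account.
state = {
    "global_not_before": 0.0,   # bump this after a breach
    "user_not_before": {},      # user id -> timestamp of last password change
}


def is_still_valid(claims: dict) -> bool:
    """Reject tokens issued before the global or per-user cutoff."""
    user_cutoff = state["user_not_before"].get(claims["sub"], 0.0)
    cutoff = max(state["global_not_before"], user_cutoff)
    return claims["iat"] >= cutoff


def on_password_change(user: str, now: float) -> None:
    """Invalidate every token the account was issued before now."""
    state["user_not_before"][user] = now
```

Note that the per-user cutoff is itself a server-side lookup on each request, although a much smaller one than a full session table.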

arpa · on Dec 15, 2016

I'm not too familiar with JWT, but I have some hands-on experience with Macaroons; the simplest way would be to have a custom caveat of validity set in the token, let's say, a validity GUID, which is the id of a server-side record of validity (true/false), e.g. in some database table. Once you set that record of validity to false, the token bearing that GUID automatically becomes invalid.

Otherwise, without server-side changes (such as change of secret key used for signature generation), it is impossible.
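
The validity-record idea could look roughly like this. The in-memory `validity` dict stands in for the database table, and the `vid` claim name is an assumption for the sketch (a JWT implementation would more likely use `jti`).

```python
import uuid

validity = {}  # validity GUID -> True/False, standing in for a DB table


def issue_claims(user: str) -> dict:
    """Create claims carrying a fresh validity GUID and record it as valid."""
    vid = str(uuid.uuid4())
    validity[vid] = True
    return {"sub": user, "vid": vid}


def is_valid(claims: dict) -> bool:
    """A token is good only while its server-side record says so."""
    return validity.get(claims.get("vid"), False)


def revoke(claims: dict) -> None:
    """Flip the record; every token bearing this GUID stops working."""
    validity[claims["vid"]] = False
```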

perlin · on Dec 14, 2016

With JSON web tokens (JWT), the client or server must know the secret key used to sign the token in order to validate it, but anyone can view its payload.
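
To illustrate the point about readable payloads: in the common HS256 compact form, the payload is only base64url-encoded, so it can be read without the key. The sketch below builds such a token by hand and then decodes it; the helper names and the example key are assumptions.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(claims: dict, key: bytes) -> str:
    """Build header.payload.signature, signing with HMAC-SHA256."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = b64url(hmac.new(key, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"


def read_payload_without_key(token: str) -> dict:
    """Anyone holding the token can do this; no secret is involved."""
    return json.loads(b64url_decode(token.split(".")[1]))
```

So the signature protects integrity, not confidentiality; anything sensitive should stay out of the payload.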

bpicolo · on Dec 14, 2016

Could do it if you knew the JWT token text in theory?

imaginenore · on Dec 15, 2016

MD5 is still not too bad, if properly salted. And if you use multiple rounds of hashing, it can be as slow as Bcrypt. As far as I know, MD5 is still not generally broken, we only found some weaknesses.

To prove me wrong you can try and reverse this one (unsalted , just one round):

27c8ac15df9357d92385f59aea2049e0

joatmon-snoo · on Dec 15, 2016

Even so, the fact that we have the knowledge to generate collisions in MD5 means you really shouldn't be relying on it when there are better alternatives.

imaginenore · on Dec 15, 2016

Try and generate a collision with the hash I gave. You can't, as far as I'm aware.

We can only generate collisions of carefully crafted sources, not arbitrary ones.

So MD5 is fine, as long as you follow the standard procedure for storing password hashes:

1) Unique salts + long master salt (to prevent rainbow table lookups).

2) Enough rounds of hashing.

3) Don't allow the most common passwords.

4) Don't allow very short passwords.

I'm not saying MD5 is ideal, I use Bcrypt / Scrypt myself. But it's not MD5's fault Yahoo's engineers are lame.

baby · on Dec 15, 2016

I'm wondering if this is one of the reasons Alex Stamos left...

kstrauser · on Dec 15, 2016

DO NOT delete your Yahoo account! In their disclaimer when you delete it, they state:

> "[...] we may allow other users to sign up for and use your current Yahoo! ID and profile names after your account has been deleted"

Bummer if you forget that it was the password reset email for your Facebook account, huh? Instead of deleting your account, purge it of all data: https://honeypot.net/purge-your-yahoo-account/

ProbabilityMoon · on Dec 15, 2016

I just deleted all data from my account and set an automatic responder stating that, due to security concerns, I no longer use that account. I created my Y! account in 1998, it's a shame it has come to this. There were a lot of memories I had to purge along with my account (even though I had a different main account in the last decade). Shame!

2sk21 · on Dec 15, 2016

This is a terrible policy. Do other email providers have a similar policy?

mkj · on Dec 15, 2016

Probably not that terrible if they only do it for accounts that were created and never used. Like all the good GitHub usernames that seem to be abandoned.

hobarrera · on Dec 15, 2016

GitHub usernames and emails are very different things. You don't get password reminders sent to your github profile, but you can get those via email.

BTW, no, most email providers never allow the reuse of closed account names.

kstrauser · on Dec 15, 2016

Microsoft seems to, although I can't find a specific statement from them confirming it: http://windowsitpro.com/blog/recycled-email-addresses-and-ou...

b_emery · on Dec 15, 2016

If someone knows how to delete more than 100 emails at a time, let me know. I have more than 10k emails, 80% of which are probably spam!

b_emery · on Dec 15, 2016

... And the answer is, scroll to the very bottom, then delete. I was able to delete over 1000 that way.

allenz · on Dec 16, 2016

The other way is to search before:"2016/12/15", and delete all the search results.

adam12 · on Dec 15, 2016

They used to automatically put an email address back into circulation if you failed to log in for 6 months.

alyandon · on Dec 14, 2016

"Separately, we previously disclosed that our outside forensic experts were investigating the creation of forged cookies that could allow an intruder to access users’ accounts without a password. Based on the ongoing investigation, we believe an unauthorized third party accessed our proprietary code to learn how to forge cookies."

So that exactly explains how my Yahoo account was used to send spam despite having a password that can't be reasonably brute forced (despite them using MD5). :-/

dsl · on Dec 14, 2016

The forged cookie attack was used on a limited number of accounts, by a state sponsored actor. Going to this amount of effort and then sending spam would be on par with breaking into a bank just to steal the printer paper from the office.

Most likely either: 1) you were phished and didn't realize it 2) logged in to your Yahoo account from a device that had malware on it

joering2 · on Dec 15, 2016

> just to steal the printer paper from the office

Or stealing $6,000 with $100,000 gun :)

http://www.fenrir.com/free_stuff/columns/callcops/ctc-436.ht...

alyandon · on Dec 15, 2016

I'm willing to accept that perhaps that was not how my account was compromised but the time frame when this happened was well in line for when this breach supposedly occurred.

Regardless, it was some sort of automated spam/phishing emails that were sent from Yahoo's network using my account to contacts on my list. I analyzed the headers of multiple bounced messages that were sent to email addresses no longer in use and confirmed the origin of the traffic.

I'm not going to fall for a phishing attack and I only access email from devices I personally control. Could one of them have had some sort of malware infection? I guess it is possible but I am security conscious and it is highly unlikely. I also would expect a hacker that has compromised one of my devices would be far more interested in using my banking credentials than using my Yahoo account to send spam.

mike_hearn · on Dec 15, 2016

You reused the password on other websites, I'm guessing. Especially likely if it was a strong (i.e. hard to memorise) password.

The bulk hacking attacks that began around Spring 2010 hit all the big webmail providers. The source of the passwords was always, without fail, reversed hashes from breakins at other big websites:

https://googleblog.blogspot.ch/2013/02/an-update-on-our-war-...

Source: was a tech lead on the Google anti-hijacking team during this period.

alyandon · on Dec 15, 2016

Nope, not password re-use either. I learned that lesson the hard way over a decade ago.

Regardless, it's something that has always continued to eat at me since I can't say for certain how it happened.

0x0 · on Dec 14, 2016

Are you sure they actually logged in to your account to send spam (are the spam emails visible in your sent folder), or could it be that someone is just spoofing the SMTP MAIL FROM / email From: header?

alyandon · on Dec 15, 2016

As far as I can tell it wasn't someone spoofing my email address. Emails were sent to people on my contact list and the numerous bounce messages to contacts that no longer had valid email addresses confirmed the origin of the traffic.

Buge · on Dec 15, 2016

It's possible that a contact of yours was compromised, and that contact had many contacts in common with you. And then they spoofed your address.

alyandon · on Dec 15, 2016

That's a good theory but in my case the sets of common contacts would be almost nil for that account.

Dr0Dre · on Dec 14, 2016

I had the same issue; I could see the email in my sent folder. This happened about a year ago and I was very surprised.

cptskippy · on Dec 14, 2016

Given Yahoo's security policies, who's to say someone wasn't just sending it from Yahoo's SMTP servers without any access to users' email accounts?

ubercore · on Dec 14, 2016

What do you mean by a password that can't be reasonably brute forced?

EDIT: To clarify, I mean specifically with md5. I'm by no means an expert, just curious because I had considered md5 so broken that this comment caught my attention.

parenthephobia · on Dec 14, 2016

Rumours of MD5's death have been greatly exaggerated.

MD5's weakness is that it's (relatively) easy to produce two strings which have the same hash. However, given an MD5 hash, it's not easy to produce a string which also has that hash.

In principle, one could intentionally construct two passwords which have the same hash. It's hard to see how that could be exploited maliciously - any attacker knows both passwords to begin with. Even then, making colliding strings that would make acceptable passwords hasn't been done yet, AFAIK: the shortest colliding strings found so far are 64 bytes long and contain several unprintable characters.

OTOH, computers are fast enough now that brute-forcing MD5 is practical for short strings with a limited set of characters, which is what passwords tend to be. One should use algorithms like PBKDF2, scrypt, and bcrypt which can increase their complexity as the computation capacity of potential attackers increases. This isn't because of a particular weakness in MD5, though, and one should equally avoid storing passwords as SHA-512 hashes, say.

The thing you definitely shouldn't use MD5 for is digitally signing a file you didn't make, because it's possible that whoever did make it also made another file with the same MD5 hash, for which your signature would also be valid.
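
As a rough illustration of an algorithm whose cost can be raised over time, here is a sketch using PBKDF2 from Python's standard library. The storage format, the function names, and the default iteration count are assumptions for the example, not a recommendation of specific parameters.

```python
import hashlib
import hmac
import os


def hash_password(password: str, iterations: int = 200_000) -> str:
    """Derive a salted hash; the iteration count is stored with the result."""
    salt = os.urandom(16)
    dk = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return f"pbkdf2_sha256${iterations}${salt.hex()}${dk.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute with the stored salt and iteration count, then compare."""
    _, iterations, salt_hex, dk_hex = stored.split("$")
    dk = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(dk.hex(), dk_hex)
```

Because the iteration count travels with each hash, new hashes can use a higher count later while old ones still verify.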

manquer · on Dec 15, 2016

On a side note: You can use such crafted strings as a black box testing tool to verify if a site does in fact use md5 or other weak algorithms to store the passwords. This can perhaps be used in conjunction with other factors to craft an attack.

As a corollary, this can also be used as a testing tool by anyone for any third party site to determine known vulnerabilities in their password storage.

ergot · on Dec 18, 2016

https://technet.microsoft.com/en-us/library/security/2862973...

terrib1e · on Dec 14, 2016

Definitely check this episode of 'Hacked' out for a simple overview. I just started listening to this show. It's a shame there are so few episodes.

http://www.hackedpodcast.com/episode-3-the-problem-with-pass...

metafunctor · on Dec 14, 2016

A preimage attack for MD5 has complexity of about 2^123. So, even if you get the MD5 hash for a password, it will be exceedingly hard to find a password that has the same hash (assuming the original password is long and random).

jlarocco · on Dec 14, 2016

I don't think that's true.

This site from 2006 claims they could find collisions in an average of 45 minutes on a 1.6 Ghz Pentium 4: http://www.bishopfox.com/resources/tools/other-free-tools/md...

If you account for speed increases over the last 10 years and assume the password thief has access to a botnet, then it wouldn't surprise me if they've found collisions for the entire list.

Edit: Nevermind, the link finds two strings that hash to the same thing; it does not find a string that hashes to an existing hash.

metafunctor · on Dec 14, 2016

The collision generator behind that link does not implement a preimage attack (given a string X, come up with another string Y with the same MD5 hash).

Instead, it implements the much easier collision attack (come up with two strings that have the same MD5 hash).
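
The gap between the two problems shows up even with plain brute force. The toy below truncates MD5 to 16 bits so both searches finish quickly; it demonstrates the generic birthday effect only, not the MD5-specific collision attack discussed in the thread, and `tiny_hash` is an invention for the demo.

```python
import hashlib
from itertools import count


def tiny_hash(data: bytes) -> bytes:
    """First 2 bytes (16 bits) of MD5: a toy hash small enough to search."""
    return hashlib.md5(data).digest()[:2]


def find_collision():
    """Find ANY two messages with the same tiny hash (collision attack)."""
    seen = {}
    for i in count():
        msg = str(i).encode()
        h = tiny_hash(msg)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg


def find_preimage(target: bytes):
    """Find a message matching ONE given tiny hash (preimage attack)."""
    for i in count():
        msg = f"guess-{i}".encode()
        if tiny_hash(msg) == target:
            return msg, i + 1
```

For a 16-bit output, a collision is expected after a few hundred tries, while a preimage is expected to take on the order of 2^16 tries. At MD5's full 128 bits the same generic searches would cost roughly 2^64 and 2^128.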

JonDav · on Dec 14, 2016

I thought the whole point of the MD5 vulnerability was that the limit was 2^128 and as such there are more inputs than possible output hashes, meaning more possible input collisions.

metafunctor · on Dec 14, 2016

All hash functions have collisions. The point is that a good cryptographic hash function makes it very hard to find collisions.

The “preimage attack” on a cryptographic hash function tries to find a message that has a specific hash value. That is, you lock down a hash value (the MD5 hash for a password) and try to find a message that hashes to that value (the original password, or any other input that happens to have the same hash).

The best known preimage attack against MD5 has complexity 2^123. It's better than brute forcing, but still impractical. Thus, if I come up with a good password that is long and random, you will have a very hard time coming up with a string that has the same MD5 hash value.

The practical attacks against MD5 are collision attacks. A collision attack tries to find two messages with the same hash value. With MD5 in particular, there's a chosen prefix collision attack, where you choose two messages and append to them so that the hashes will match. This was particularly devastating with X.509 signatures and certificates, where the attacker could have the MD5 hash signed by a certificate authority, and then use the same signature with their other message that has the same MD5 hash.

bartl · on Dec 15, 2016

What about Rainbow Tables? (https://en.wikipedia.org/wiki/Rainbow_table#Precomputed_hash...)

Instead of computing the MD5 of a huge number of passwords looking for a match, you simply store the precomputed password and hash pairs in a database table.

metafunctor · on Dec 15, 2016

A rainbow table is just a precomputed table of hashes for a lot of passwords. Some tricks are used to make the table smaller, but you can think of it as just a lookup table. Only the passwords that were precomputed and put into the table will be found.

Rainbow tables are usually computed for short passwords (1-10 characters) and limited character set (say, alphanumerics). They are good for finding the bad passwords if you get your hands on a set of MD5 hashed passwords. But they are of no help if you need to reverse a good, long, random password.
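
Thinking of it "as just a lookup table" can be sketched directly. This builds a plain precomputed table for every lowercase password of up to three characters; real rainbow tables use hash chains to save space, which this example deliberately leaves out. The function names and the tiny search space are assumptions.

```python
import hashlib
import string
from itertools import product


def build_table(max_len: int = 3, alphabet: str = string.ascii_lowercase) -> dict:
    """Precompute md5 -> password for every short password over the alphabet."""
    table = {}
    for length in range(1, max_len + 1):
        for chars in product(alphabet, repeat=length):
            password = "".join(chars)
            table[hashlib.md5(password.encode()).hexdigest()] = password
    return table


TABLE = build_table()


def crack(md5_hex: str):
    """Return the password if it was precomputed, otherwise None."""
    return TABLE.get(md5_hex)
```

A hash of `abc` is found instantly, while a long random passphrase simply is not in the table, which matches the limitation described above.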

geofft · on Dec 15, 2016

Every hash has a finite output length, and therefore a finite number of possible outputs. 2^128 is a very large finite number. It's not that large in the grand scheme of things (there are over 2^260 or so atoms in the universe), and it's definitely better to use a hash with 2^256 outputs now that there exist good 256-bit hashes that are faster than MD5, but 2^128 is still quite a large number. The internets are quoting me about 10 billion hashes per second on a good GPU from a few years ago, which comes out to about one sextillion years to find an input for every possible output. (It divides linearly if you have more GPUs, but that clearly won't help very much.)

What's broken about MD5 is that, due to an algorithmic flaw, it's very easy to generate two inputs of your choice that have a matching output. That's great if you want to do things like spoof an SSL certificate (you generate two certificate signing requests, get one of them signed, apply the signature to the other), but not directly helpful for attacking a password hash where someone else chose the password.

What is conceptually broken is that such an algorithmic flaw exists, and also due to algorithmic flaws it takes a bit under 2^128 tries to find an input for a specific possible output. That worries mathematicians, because it's a sign the hash isn't behaving as randomly (speaking informally) as one would hope, and that people are starting to understand its structure. If that understanding continues, it might be broken more in the future, so you absolutely shouldn't build new systems on MD5 because we expect the research to happen at some point.

But, at least today, it's still true that you can have a password that can't be brute-forced despite the use of MD5. Maybe someone will present a paper tomorrow that disproves that.
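
The "one sextillion years" figure can be checked with a few lines of arithmetic. The hash rate is the 10 billion per second quoted above; the function name and the 365.25-day year are choices made for this sketch.

```python
OUTPUTS = 2 ** 128                    # possible MD5 outputs
HASHES_PER_SECOND = 10_000_000_000    # rate quoted in the comment above
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_enumerate(outputs: int = OUTPUTS,
                       rate: int = HASHES_PER_SECOND,
                       gpus: int = 1) -> float:
    """Years needed to compute `outputs` hashes at `rate` per GPU."""
    return outputs / (rate * gpus) / SECONDS_PER_YEAR
```

With one GPU this comes out to roughly 1.08e21 years, and a million GPUs still leaves about 1.08e15 years.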

kreetx · on Dec 15, 2016

This is a very clear explanation, thanks!

semicolon_storm · on Dec 14, 2016

All hashing algorithms that I am aware of have more inputs than outputs. By the pigeon hole principle, there will always be collisions. MD5 is weak, but it still isn't trivial to find an input that hashes to the same thing as a high entropy password.

danielweber · on Dec 15, 2016

> that hashes to the same thing as a high entropy password.

To be clear, it's not the entropy of the original password that matters, except for the fact that all common low-entropy passwords already have their MD5s stored in public databases. (What hashes to 5f4dcc3b5aa765d61d8327deb882cf99? You can look it up with Google.)

You can come up with two plaintexts that hash to the same thing in MD5. You can't come up with something that hashes to a new MD5 value given to you, aside from finding it in one of those databases.

agf · on Dec 14, 2016

If the password is long and complex enough, it wouldn't be in any rainbow table computable in reasonable time. While MD5 can be computed quickly, there is still a limit to how many you can compute -- and there are an infinite number of possible passwords if they aren't length limited.

DougBTX · on Dec 15, 2016

Interestingly even if the password has infinite length, an MD5 hash has a fixed finite length. You can think of it as a glorified modulus operator, beyond some point the longer passwords will have hashes that match shorter ones.

agf · on Dec 16, 2016

True -- but assuming these passwords aren't stored the same (very, very wrong) way on another site, and they're no longer useful on Yahoo, what's important is finding the real password, not just a password that happens to match the given hash.

manarth · on Dec 14, 2016

Rainbow tables are attacks against secure algorithms.

MD5 is recognised as an insecure algorithm: given a known hash, there are multiple possible passwords that would resolve to the same hash, therefore appearing to be the correct password.

With MD5, it's not necessary to compute an infinite number of possible passwords, and it is possible that, given a particular hash, a collision can be found within a reasonable time.

jsjohnst · on Dec 15, 2016

Either a) you don't have a clue about the complexity involved in finding a collision for a specific hash or b) your definition of "reasonable time" is longer than the age of the universe and/or using 100 trillion state of the art GPUs is realistic.

I'm leaning towards option a, you read a blog post once and think you're an expert on cryptography now.

manarth · on Dec 15, 2016

  > the complexity involved in finding a collision for a specific hash

If it can be shown that a preimage collision can be computed in less time than an exhaustive search, the algorithm is generally regarded as having a weakness, even if the given "less time" is still a very very long time.

The theoretical complexity of MD5 is 2^128, but a preimage attack was discovered in 2009 which showed that a collision can be found in 2^123.4. [1]

Collision attacks against MD5 have become more practical; there are even frameworks for it [2]. The complexity of 2^123.4 still makes a preimage attack against MD5 computationally infeasible, but given that it's been shown to be weaker than its theoretical 2^128, it's possible that MD5 has other weaknesses which would allow the complexity to be reduced to a level that is computationally feasible.

[1] https://www.iacr.org/archive/eurocrypt2009/54790136/54790136...

[2] https://marc-stevens.nl/p/hashclash/

mypalmike · on Dec 15, 2016

To be fair, pretty much every MD5 discussion I've ever seen or been involved in (including with "security expert" former coworkers) has had someone making the same claim.

SamBam · on Dec 15, 2016

What you're describing is the same for every hashing algorithm in existence. All hashes can represent multiple (indeed, infinite) passwords. So they all have collisions. This is because all hashes are fixed-length, and so finite, while the possible inputs are infinite.

This isn't the reason that MD5 is weaker than other algorithms.

Buge · on Dec 15, 2016

You are describing a first preimage attack. There have not been any computable first (or second) preimage attacks on md5.

https://stackoverflow.com/questions/822638/does-any-publishe...

There are collision attacks, but that is not relevant for password cracking.

manarth · on Dec 15, 2016

From 2009: a preimage attack reduced the complexity from 2^128 to 2^123.4 [1].

It's still a big number, but it's less than the theoretical complexity.

[1] https://www.iacr.org/archive/eurocrypt2009/54790136/54790136...

Buge · on Dec 17, 2016

What I meant by "computable" is something that can be computed with today's hardware.

gamapuna · on Dec 15, 2016

https://codahale.com/how-to-safely-store-a-password/

JonDav · on Dec 14, 2016

Pretty much even if you choose a high entropy password like say:

  `]{;&<C9v98QO#]M~Ff$>rQQQjoJkxm0ayM+gG,@vf*>#-{X4E>aZG(A1~tf<Wu

the MD5 algorithm can be broken using various techniques like collisions, unsalted I believe means that their database would accept the hashes the third party has. End result is they should have migrated away from MD5 after it was declared unsafe.

danielweber · on Dec 15, 2016

No it can't.

Two principles here:

1. If your password is very very good (a Diceware password would suffice), then any method of storing passwords that is better than storing them in plaintext will stop someone from brute forcing it.

2. If your password is very bad, then even an excellent password hashing algorithm will not save you.

"Just use bcrypt" is meant to save people who are in the middle.

Buge · on Dec 15, 2016

No, a collision attack would not give you the plaintext from a hash. A first preimage attack would do that, but no computable first (or second) preimage attacks against md5 have been found.

https://stackoverflow.com/questions/822638/does-any-publishe...

jsjohnst · on Dec 15, 2016

Nope, that doesn't explain it. Without Yahoo! UDB access to get a couple values unique to your login, you can't forge a cookie that allows you access to Yahoo! Mail.

kajecounterhack · on Dec 14, 2016

Related: former Yahoo security engineer talks about a backdoor Yahoo installed for the NSA to read private emails...behind their security teams' backs...

https://diracdeltas.github.io/blog/surveillance

ilarum · on Dec 14, 2016

In case you are looking for the important information, it seems to be MD5 hash without salt.

chillydawg · on Dec 14, 2016

Bloody hell. Sloppy and incompetent.

dopamean · on Dec 14, 2016

I'm genuinely curious how the decision to use MD5 gets made. Who says, "hey, maybe we should use MD5." And then who responds, "that sounds like a great idea Bob." Seriously. I've known for years that MD5 is insufficient for hashing passwords and I'm just some random guy. This kind of thing really baffles me.

stanleydrew · on Dec 14, 2016

Yahoo has been a company for a long time. I imagine your conversation happened round about 1999 when using MD5 wasn't insane. And then they were just slow to upgrade.

It's still bad, I'm just saying the conversation about what hash algo to use didn't happen yesterday.

throwaway34916 · on Dec 14, 2016

I'd like to believe that. However, I was recently asked to test a new website for an organization I volunteer for, and discovered their "forgot password" flow emailed me my plaintext password. I wrote an explanation of why this was bad, and how it could be fixed, to a non-technical friend of mine who works there; he passed my email to the (Bay Area based!) consulting shop that did their website. The shop sent this response:

"We do not store passwords as a plain text in database. We have functionality which encrypts and decrypts passwords. We have only ecnrypted passwords in the database.

Almost all other servers use one-way encryption. In this case, passwords cannot be decrypted from hashing."

Again, this is a Bay Area based shop. For code written in 2016.

I was shocked to receive this, but it (among other things) leads me to suspect that there are lot of people out there, in positions of power, who aren't just ignorant, but who actively cling to password-storage anti-patterns.

I'm at a loss for how to fix this.

crazypyro · on Dec 15, 2016

Just for clarity, the "forgot password" flow emailed you the current password of the account (not a temporary one)?

That's insane...

throwaway34916 · on Dec 15, 2016

Yes, the current password.

wtfishackernews · on Dec 15, 2016

submit the website to http://plaintextoffenders.com/

syncsynchalt · on Dec 15, 2016

Ironically, hosted on a Y! site.

cm2187 · on Dec 14, 2016

But it's not as if we haven't had a pretty much continuous stream of major data leaks for the past 5 years. Surely Yahoo engineers occasionally open a newspaper...

Endy · on Dec 14, 2016

From everything I've read, the engineers did. The problem was that the security team had to go head-to-head with the budget team. And unfortunately, the budget team won - since the upper levels didn't feel that the IT security salaries were a necessary expenditure. And beyond that, there was concern that making people actually change their passwords regularly and requiring anything like security in said passwords was going to discourage users from using Yahoo and send them over to GMail.

Unfortunately... that argument wasn't wrong.

pbhjpbhj · on Dec 15, 2016

> The problem was that the security team had to go head-to-head with the budget team. //

Wouldn't engineers at such a big corp whistle-blow such incompetent decision making?

Apparently [1] they had a $1.37B net income in 2013. Given using bcrypt with a Blowfish hash and salting was pretty much a de facto standard by that point (I think that's what Wordpress were doing, hardly revolutionary security work) it seems the relative cost for Yahoo was approximately zero.

All I can imagine is that those in control were asked to leave the system open for government snooping? Why else would engineers working there not [anonymously] bring this to press attention - "hey, Yahoo security amounts to a piece of sticky tape holding a bank-vault shut".

- - -

[1] http://www.marketwatch.com/investing/stock/yhoo/financials#

danielweber · on Dec 15, 2016

It's not that hard to implement something at the start. It's more work to retrofit it on top of an existing system in a way that doesn't reduce the total security.

cm2187 · on Dec 14, 2016

But would it require users to change their password?

The way I would have implemented it, but would be keen to know how secure it is, is that you start with the md5 of the password (md5(password)). You then bcrypt or scrypt that md5 (bcrypt(md5(password))) and replace the md5 in your database with the bcrypt hash.

When a user logs in, all you need to do is to calculate the md5 first then check that md5 against the bcrypt hash you have stored.

I am not a crypto expert but intuitively it doesn't look like I would have weakened the security that way. You can't really attack bcrypt(md5(password)) much more than bcrypt(password). Can you?

lotyrin · on Dec 14, 2016

The method I've used is to add the column for the new stronghash, then update the old column to stronghash(<oldhash>), where <oldhash> is dumbhash(password). On login you check against stronghash(dumbhash(password)); while you have the plaintext password in memory, you generate just stronghash(<password>), update the row to add the new hash (simple and interoperable, not dependent on dumbhash), and drop the stronghash(<oldhash>). After a <longtime> limit (to optimize both maintenance overhead of the additional column / behavior and limit exposure to only the minority of users that haven't logged in for <longtime>), you drop the stronghash(<oldhash>) from everyone and do a "we sent you a reset email" for anyone that's trying to log in but has no <stronghash> password hash.

danielweber · on Dec 15, 2016

This is a fine workflow, but keep in mind:

> and do a "we sent you a reset email" for anyone that's trying to log in but has no <stronghash> password hash.

Yahoo is an email provider so many of these users won't have an external provider to refer to.

sah2ed · on Dec 15, 2016

This workflow is much better than the other proposals I've read up-thread.

user5994461 · on Dec 15, 2016

It's one way to do it, which is okay sometimes.

The other way is to add a new empty column for bcrypt. The next time the user logs in, you save the bcrypt hash and you remove the MD5 hash.

Over time, the active users will be migrated to the new scheme. The only issue is the abandoned accounts, they'll keep the old weak scheme.

danielweber · on Dec 15, 2016

There are other migration techniques. If you know md5(password), you can create bcrypt(md5(password)).

syncsynchalt · on Dec 15, 2016

That's what I do, though care should be taken that you can't then log in against the old passwords by putting md5(password) in the password field.

Usually you do this by decorating the bcrypt(md5(p)) entries in some way so you can recognize which ones are tested with bcrypt() vs bcrypt(md5()).
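
The wrap-then-upgrade migration discussed above, including the "decoration" that marks wrapped entries, might be sketched like this. PBKDF2 from the standard library stands in for bcrypt/scrypt here, and the record layout, the scheme labels, and the iteration count are all assumptions for the example.

```python
import hashlib
import hmac
import os

ITERATIONS = 50_000  # assumed work factor for the sketch


def md5_hex(password: str) -> str:
    return hashlib.md5(password.encode()).hexdigest()


def kdf(data: str, salt: bytes) -> bytes:
    # Stand-in for bcrypt/scrypt: any slow, salted KDF fits this slot.
    return hashlib.pbkdf2_hmac("sha256", data.encode(), salt, ITERATIONS)


def wrap_legacy(md5_hash: str) -> dict:
    """Bulk step: wrap a stored MD5 hash without knowing the password."""
    salt = os.urandom(16)
    return {"scheme": "kdf-md5", "salt": salt, "hash": kdf(md5_hash, salt)}


def new_record(password: str) -> dict:
    salt = os.urandom(16)
    return {"scheme": "kdf", "salt": salt, "hash": kdf(password, salt)}


def login(password: str, record: dict) -> bool:
    """Check according to the record's scheme tag; upgrade wrapped entries."""
    if record["scheme"] == "kdf-md5":
        candidate = kdf(md5_hex(password), record["salt"])
    else:
        candidate = kdf(password, record["salt"])
    if not hmac.compare_digest(candidate, record["hash"]):
        return False
    if record["scheme"] == "kdf-md5":
        # The plaintext is in memory right now, so drop the MD5 layer.
        record.update(new_record(password))
    return True
```

Because wrapped entries are only ever tested as `kdf(md5(input))`, submitting a leaked MD5 hex string as the password does not match; without the scheme tag, a server that tried both forms would accept it.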