> investigating the creation of forged cookies that could allow an intruder to access users' accounts without a password. Based on the ongoing investigation, we believe an unauthorized third party accessed our proprietary code to learn how to forge cookies
How is this possible? Aren't most auth cookies just a session ID that can be used to look up a server-side session? Did they not use random, unpredictable, non-sequential session IDs?
1) As Yahoo "upgraded" all password storage in UDB (where all login / registration details are stored) to be bcrypt before 2013, I'm curious how this was possible.
2) Yahoo doesn't use a centralized session storage. If you know a few values (not disclosing the exact ones) from the UDB, it's theoretically (guess not so theoretical now) possible to create forged cookies if you steal the signing keys. To my knowledge, the keys were supposed to only be on edit/login boxes (but it's been a while so I may be forgetting something), so this is a pretty big breach.
On a number of engagements I've come across password databases that have been migrated to bcrypt. In one case I checked CVS to see who made the code change, and found the MD5 passwords on his dev box. In another I tracked down a MySQL slave that had broken replication for over a year.
In both cases I tried to track down backups, but discovered neither company was keeping them. That is another possible vector.
1) I'd be flabbergasted beyond belief if there was ever a Yahoo! engineer who had user passwords on their laptop / dev box. The technical hurdle alone would be a stretch, let alone the other ramifications of doing so.
2) there's no SQL database involved with Yahoo!'s storage of passwords. It's a custom built db system with proprietary access and replication protocols.
I wasn't saying either possibility was the cause of the Yahoo breach. Simply pointing out that there is always another way.
The NSA's MUSCULAR program, for example, decoded the proprietary secret-squirrel cross-datacenter replication protocols designed by both Google and Yahoo, so that isn't much of a safeguard against state-level actors.
Aren't the details "three years after we were hacked, law enforcement told us that we had been hacked, and we believe them?"
The press release explicitly says "We have not been able to identify the intrusion associated with this theft." I especially noticed that the "What are we doing to protect our users?" section doesn't mention anything about Yahoo fixing any security issues.
Presumably, then, as a Yahoo engineer, you know what your security practices are but you don't know what you did wrong or whether you've fixed it.
Do you honestly believe a press release covers every detail, especially ones with strong legal implications, and might not rather have been worded very carefully?
"Dishonest", not in the slightest. From what I'm told, they really don't know how they got in. But that's only the part of the story discussed in the press release, what's not discussed is how the data existed in that format.
From my experience if Paranoids did know they would have locked it down at the expense of engineers or others. I know since I have made breaking changes to infrastructure which did lock out some engineers and cause plenty of headaches.
Every Yahoo I have ever known has cursed the Paranoids for getting in the way. Every Yahoo that has actually been in an incident has also blessed the Paranoids for the same reasons.
Simple fact is that Yahoo has a mega butt ton of code from several decades. There are going to be holes, and when they are found they are fixed pretty damn quick. The last one I dealt with was solved in hours with all hands on deck. Sometimes it just sucks to be as old as Yahoo is.
> the "What are we doing to protect our users?" section doesn't mention anything about Yahoo fixing any security issues.
"We continuously enhance our safeguards and systems that detect and prevent unauthorized access to user accounts."
At the end of the same paragraph. They're already continuously updating security, before they even knew they were hacked. Three years have passed, so for all they know something in those continuous updates covered this hack.
I am taking a WAG here, but if they got code then they might be able to take educated guesses at the UDB values without actual access to UDB. Those guesses are more likely to be true for bot-registered accounts, where there is duplication of information.
This goes back to my theory that a good portion were junk accounts.
Not saying this is acceptable, just saying garbage in garbage out.
I'm guessing by your handle I know who you are :). Ex-Yahoo super chat moderating guy here, which should let you know me.
Wouldn't the upgrade require the accounts to actually log in to migrate the password? Last I was at Yahoo there were at least 3B junk accounts in UDB. Without knowing details, I am guessing that many of the "compromised" accounts fall into that bucket.
I get that membership can't just trash junk accounts but marketing was very aware of them. Paranoids also can't just say a compromised junk account is not a compromise, they are too paranoid for that.
This unfortunately sounds bad PR-wise, with little knowledge of actual impact. On the flip side, I'm pretty sure I am not on the radar of the state actor, since they would more than likely be looking at their own.
As to your question, no, they didn't need to login due to how the hash "upgrade" was done (unlike how Tumblr did it around the same time). I was one of the people in the billion accounts and I definitely have logged in and also changed my password multiple times (also have very high entropy passwords and use TFA).
What's funny is that there's someone currently working at Yahoo with a name scarily similar to yours and I was pretty sure for a moment that you were some random ycombinator person faking being him.
No. They've stolen the hash, so if they crack it, you've just let them waltz back in.
The correct response is force a password reset, and _delete_ weak hashes so that they cannot be stolen in a subsequent breach. At worst, store a bcrypted md5 password as you suggest, but only as a check for a password the user must not be allowed to use again; it _cannot_ be used to sign them in.
One of the attacks you're preventing is on _other_ sites, where the user has reused the passwords. Keeping around weak hashes even to let that user perform a reset is risking that hash being taken, cracked and used in a breach elsewhere.
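A minimal sketch of what's being argued for: wrap the legacy digest in a slow hash so the raw MD5 can be deleted, and use the result only to reject reuse, never to sign anyone in. PBKDF2 stands in for bcrypt here since bcrypt isn't in the Python stdlib; the function names are hypothetical.

```python
import hashlib
import hmac
import os

def _slow_hash(data: bytes, salt: bytes) -> bytes:
    # Stand-in for bcrypt (not in the stdlib): PBKDF2 with a high
    # iteration count. A real system would use bcrypt/argon2.
    return hashlib.pbkdf2_hmac("sha256", data, salt, 100_000)

def make_reuse_check(old_md5_hex: str) -> tuple[bytes, bytes]:
    """Wrap the legacy MD5 digest so the weak hash itself can be deleted.

    The result is stored ONLY to reject reuse of the old password;
    it must never be accepted as a login credential.
    """
    salt = os.urandom(16)
    return salt, _slow_hash(old_md5_hex.encode(), salt)

def is_reused(candidate: str, salt: bytes, stored: bytes) -> bool:
    # Re-derive MD5 of the candidate, then compare through the slow wrapper.
    candidate_md5 = hashlib.md5(candidate.encode()).hexdigest()
    return hmac.compare_digest(_slow_hash(candidate_md5.encode(), salt), stored)
```

If the database leaks, an attacker gets bcrypt-work-factor hashes rather than raw MD5s, and even a cracked one only reveals a password the user is no longer allowed to use.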
When they did the bcrypt(md5(password)) there was no leaks of Yahoo!'s md5'd passwords. That's obviously changed now and thus why the billion passwords were invalidated (I'm one of those folks btw, but I also had TFA on my account and my password had sufficient entropy you won't brute force the md5).
Keeping around weak hashes even to let that user perform a reset is risking that hash being taken, cracked and used in a breach elsewhere.
We're currently working on PCI compliance. In pen testing, we got dinged for not preventing re-use of prior passwords, and that bothers me for exactly this reason (plus the new NIST standards say NOT to force periodic changing).
I believe that our hashes are strong (using scrypt, salt, etc.). But the belief that you're getting it right shouldn't let you be lax in other areas, hence security in depth.
So I really object to the requirement that we keep around those old hashes.
Doing a Google search for the link showed me the title of the document, which I remember reading in the past. The overall coverage of Y&T cookies is more or less accurate as of the time of writing back in like 2010/2011, but there's a bunch of mostly minor technical inaccuracies too. I don't want to comment on much without rereading it, but I remember the description of Sled ID made me laugh (which, btw, I'd guess less than 1% of current Yahoo employees know about).
Also, the video that goes with the PDF is too funny! Just watched it on YouTube [0] again. Notice how he doesn't actually sign into Web Messenger, just goes to the login page? If he had, it would've failed. Same thing with him closing the browser before Yahoo Mail loaded. "Sensitive" reads and everything that did a write operation always (unless there was a bug) validated the cookie against the UDB. So even if you stole the signing key, without the values from the UDB, you would have very limited ability to do anything other than the trivial things shown in the video.
It seems that Yahoo has a problem with moribund accounts- many people had a Yahoo ID 10-20 years ago, and then abandoned it.
If these accounts are not deleted (and there are a bunch of organisational reasons not to), then the MD5 hash has to be kept around somewhere, until the user re-enters a password and a better hash is generated.
> Yahoo doesn't use a centralized session storage. If you know a few values (not disclosing the exact ones) from the UDB, it's theoretically (guess not so theoretical now) possible to create forged cookies if you steal the signing keys. To my knowledge, the keys were supposed to only be on edit/login boxes (but it's been a while so I may be forgetting something), so this is a pretty big breach.
Isn't that highly confidential company information?
> 1) As Yahoo "upgraded" all password storage in UDB (where all login / registration details are stored) to be bcrypt before 2013, I'm curious how this was possible.
You check the plaintext password sent to the backend against the md5, on success you rehash it as bcrypt, insert it in the table.
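That rehash-on-login flow might look like the sketch below. The record shape and field names are hypothetical, and PBKDF2 stands in for bcrypt (which isn't in the Python stdlib):

```python
import hashlib
import hmac
import os

def verify_and_upgrade(user: dict, password: str) -> bool:
    """Check a login attempt; if the record still holds the legacy MD5
    hash, transparently upgrade it to the strong algorithm on success.

    `user` is a hypothetical record: {"algo": "md5"|"pbkdf2", "salt", "hash"}.
    """
    if user["algo"] == "md5":
        candidate = hashlib.md5(password.encode()).hexdigest()
        if not hmac.compare_digest(candidate, user["hash"]):
            return False
        # Correct password seen in plaintext: rehash it with the
        # strong algorithm and overwrite the stored record.
        user["salt"] = os.urandom(16)
        user["hash"] = hashlib.pbkdf2_hmac(
            "sha256", password.encode(), user["salt"], 100_000)
        user["algo"] = "pbkdf2"
        return True
    expected = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), user["salt"], 100_000)
    return hmac.compare_digest(expected, user["hash"])
```

The drawback raised elsewhere in the thread applies: accounts that never log in keep their MD5 hash forever, unless you instead wrap the stored digest as bcrypt(md5(password)) in one offline pass.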
Web tokens, for example, don't necessarily include just a session ID. Some include the full session details within the payload. This can be quite useful, actually, because it offloads session storage onto the client and spares the server a lookup.
Add an "expires" field to the token; this should contain a date after which the token is no longer valid. Now all tokens auto-invalidate after a certain period.
Allow some or all tokens to "refresh" by calling a particular endpoint (call with valid token and get a token with expiry from now).
Optionally add some form of identifier to the token (user_id works great) so that you can push a message out to your servers that looks like this: "All tokens for x expiring before y are invalid". Once time y has passed your server can forget about the message. This will be a very small set (often 0) as very few people use the "log out my devices" features.
Logouts should be done client side by deleting the token.
If you are worried about your token being sniffed you are either not using HTTPS, or sticking it somewhere stupid.
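The scheme above can be sketched with nothing but the stdlib: a signed, self-contained token with an expiry and a refresh endpoint. This is a toy (JWT-shaped, but not a real JWT library), and the key handling is deliberately simplistic:

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"server-side signing key"  # hypothetical; lives only on auth boxes

def issue(user_id: str, ttl: int = 3600) -> str:
    """Mint a self-contained token: payload + HMAC, no server-side session."""
    payload = json.dumps({"sub": user_id, "exp": int(time.time()) + ttl})
    body = base64.urlsafe_b64encode(payload.encode())
    sig = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return body.decode() + "." + sig

def verify(token: str):
    """Return the payload if the signature checks out and it hasn't expired."""
    body, _, sig = token.rpartition(".")
    expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None
    payload = json.loads(base64.urlsafe_b64decode(body))
    if payload["exp"] < time.time():
        return None  # auto-invalidated by the "expires" field; no lookup
    return payload

def refresh(token: str, ttl: int = 3600):
    """Exchange a still-valid token for a fresh one with a new expiry."""
    payload = verify(token)
    return issue(payload["sub"], ttl) if payload else None
```

Note this is also exactly the design the thread's breach discussion is about: whoever steals `SECRET` can mint tokens for any user, which is why the signing keys were supposed to live only on the login/edit boxes.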
> Add an "expires" field to the token, this should contain a date after which the token is no longer valid. Now all tokens auto-invalidate after a certain period.
Doesn't JWT already have this - "exp" is a reserved claim for expiration time?
The "exp" (expiration time) claim identifies the expiration time on or after which the JWT MUST NOT be accepted for processing. The processing of the "exp" claim requires that the current date/time MUST be before the expiration date/time listed in the "exp" claim.
Yes, but that is more for standard idle-time expiration. The problem being addressed above is actively invalidating an existing JWT for a user once they already have it (and before the default/original expiry is met).
> Now all tokens auto-invalidate after a certain period.
You need to make sure that there is some process that will refuse to keep on re-upping the cookie lifetime. Otherwise an attacker could indefinitely keep the stolen cookie alive.
If you see a suspicious usage pattern then force a login by invalidating the tokens. Allowing indefinite refreshing is a feature and a drawback of this method.
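One way to bound indefinite refreshing, as a sketch: carry the *original* issue time through every refresh (a hypothetical `orig_iat` claim), and refuse to re-up once the session exceeds a hard cap, however many refreshes happened in between.

```python
import time

MAX_SESSION_AGE = 30 * 24 * 3600  # hypothetical 30-day hard cap

def may_refresh(payload: dict, now: float = None) -> bool:
    """Refuse to re-up a token forever.

    "orig_iat" is preserved verbatim across refreshes, so it bounds the
    total session lifetime: a stolen token can be kept alive only until
    the cap, no matter how diligently the attacker refreshes it.
    """
    now = time.time() if now is None else now
    return now - payload["orig_iat"] < MAX_SESSION_AGE
```

Suspicious-usage detection can then revoke earlier, but even without it the stolen cookie dies on its own.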
Tokens have in-built expiry dates (cryptographically signed by the server upon issuance). Once that date has passed the token becomes useless.
If you meant "how can you prematurely invalidate a specific user's JWT without needing a server side lookup", you can't.
I think the best you can do is issue different classes of JWT to a user based on what actions you wish to grant them. This lets you reduce load going to backend lookups to only a subset of JWTs where the ability to invalidate them earlier than planned on a per user basis is necessary/desired.
For JWTs that aren't tied to backend lookups the only solution if one or more users are accessing resources they no longer should be via one of these tokens is to invalidate all of them.
The client can hold onto the token indefinitely; the server doesn't care. But the next time a request comes in with that token, it will be expired. The server validates the timestamp, which is part of the signed payload that only the server can mint; instant validation and no DB lookup.
Each JWT has an issued at date, so you just need to reject all tokens issued before that time. In addition to invalidating all tokens if there is a breach, each user account can have its own datefield to invalidate all the tokens for that account if a user changes their password or whatever.
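That per-account cutoff can be sketched as a tiny map from user to "reject tokens issued before this moment" (empty for almost everyone), plus a global cutoff for the breach case. Names here are hypothetical:

```python
import time

# Hypothetical store: user_id -> reject tokens issued before this timestamp.
# Written only on password change / "log out my devices"; empty otherwise.
tokens_invalid_before: dict = {}

GLOBAL_CUTOFF = 0.0  # bump this after a breach to kill every outstanding token

def is_token_live(payload: dict) -> bool:
    """Check the token's issued-at ("iat") against the global and
    per-account cutoffs, instead of storing a session per token."""
    iat = payload["iat"]
    if iat < GLOBAL_CUTOFF:
        return False
    return iat >= tokens_invalid_before.get(payload["sub"], 0.0)

def logout_everywhere(user_id: str) -> None:
    # Any token minted from now on is newer than the cutoff and stays valid.
    tokens_invalid_before[user_id] = time.time()
```

This keeps the lookup set small (one timestamp per user who recently invalidated, rather than one record per live session), which is the trade-off the parent comments describe.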
I'm not too familiar with JWT, but i have some hands-on experience with Macaroons; the simplest way would be to have a custom caveat of validity set in the token, let's say, a validity GUID, which is an id of server-side record of validity (true/false), e.g. in some database table. Once you set that record of validity to false, the token bearing that GUID automatically becomes invalid.
Otherwise, without server-side changes (such as change of secret key used for signature generation), it is impossible.
With JSON web tokens (JWT), the client or server must know the secret key used to sign the token in order to validate it, but anyone can view its payload.
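To make that last point concrete: a standard (JWS-signed) JWT's payload is just base64url-encoded JSON, so anyone holding the token can read the claims without any key. A minimal sketch:

```python
import base64
import json

def peek_payload(jwt: str) -> dict:
    """Read a JWT's claims without the signing key.

    The payload is only base64url-encoded, not encrypted; the secret is
    needed to *validate* the signature, not to *see* the contents.
    """
    payload_b64 = jwt.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))
```

This is why secrets must never go in a JWT payload; encrypted tokens (JWE) exist for that, but plain signed JWTs are the common case.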
MD5 is still not too bad, if properly salted. And if you use multiple rounds of hashing, it can be as slow as bcrypt. As far as I know, MD5 is still not generally broken; we've only found weaknesses (practical collisions, not preimages).
To prove me wrong you can try and reverse this one (unsalted, just one round):
Even so, the fact that we have the knowledge to generate collisions in MD5 means you really shouldn't be relying on it when there are better alternatives.
> hashed passwords (using MD5)
I don't even know what to say.