Disqus Security Alert: User Info Breach (disqus.com)
177 points by sashk on Oct 6, 2017 | 162 comments



This is a good time to remind everyone to use a password manager and to have each password generated. Each service should have its own password so that if it gets compromised, your exposure is limited to that one service.
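To make "generated" concrete, here is a minimal sketch using Python's standard-library `secrets` module. The function name `generate_password`, the default length of 20, and the character set are assumptions chosen for illustration; real password managers have their own generators and options.

```python
import secrets
import string

# Illustrative alphabet (an assumption): letters, digits, and ASCII punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 20, alphabet: str = ALPHABET) -> str:
    """Return a random password with each character drawn uniformly from `alphabet`."""
    if length < 1:
        raise ValueError("length must be positive")
    # secrets.choice draws from the OS cryptographic RNG, unlike random.choice.
    return "".join(secrets.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    print(generate_password())
```

Because nothing ties one site's password to another's, a leak at one service says nothing about the rest.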

It is clear this is going to be a staple of internet life, so might as well be prepared.


Can you suggest a good password manager?

Ideally, I'd really like something that's deterministic - ie, I can provide a seed, and then that seed plus the domain name becomes the basis for the password.

That way, it's trivial to recover passwords when sitting at a new computer. And I imagine if the seed is sufficiently long (say, a 10-word sentence that includes 2 or 3 non-dictionary words) then it'd be highly impractical to brute-force.

Right?


A better argument against deterministic password generators than the current replies to your post is that if you have to change your password on a given site - due to password expiration, ‘forgot password’, security breaches, etc. - you have to change the seed, and remember which seed you’re using for which site.

There are some others:

https://tonyarcieri.com/4-fatal-flaws-in-deterministic-passw...

But if you want to use one anyway, that blog post links to a bunch of them.


One idea for a deterministic password manager to avoid having to change the seed when a single site is compromised would be to give each site a simple counter, which is hashed with the domain name (or whatever is used to uniquely identify that site) as well as the secret to produce a password. When that site is compromised or a password change is called for, its counter is simply incremented by one.

That counter can be exposed publicly, as can (most likely) a normalization mapping that coalesces variants of the domain name into a single value; having those is pretty worthless without the secret. So I think only the secret needs to be kept secret, and everything else can be kept in Dropbox or wherever.
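A rough sketch of that idea in Python, using only the standard library. Everything here is an assumption made for illustration: the names `normalize_domain` and `derive_password`, the fixed PBKDF2 salt and iteration count, and the base64 output encoding. It is not how any existing deterministic manager works.

```python
import base64
import hashlib
import hmac


def normalize_domain(domain: str) -> str:
    """Hypothetical normalization: lowercase and drop a leading 'www.'."""
    d = domain.strip().lower()
    return d[4:] if d.startswith("www.") else d


def derive_password(secret: str, domain: str, counter: int = 1, length: int = 20) -> str:
    """Derive a site password from the secret, the normalized domain, and a counter."""
    # Stretch the memorized secret so that each guess at it is expensive.
    key = hashlib.pbkdf2_hmac("sha256", secret.encode("utf-8"), b"site-password-sketch", 200_000)
    # The domain and counter are public inputs; only `secret` must stay private.
    message = f"{normalize_domain(domain)}:{counter}".encode("utf-8")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    # URL-safe base64 yields letters, digits, '-' and '_'; truncate to the wanted length.
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")[:length]
```

Bumping the counter from 1 to 2 gives an unrelated password for the same site. Note that the output alphabet is fixed, so this sketch does nothing about sites with unusual password rules.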

Edit: I missed it at first glance, but I see that this is referenced in the link you posted, so never mind then.


Ahh, thank you. This is the kind of refutation I need. I'll read up and decide.

As for your specific qualm: I have to imagine that this can be solved by adding a numerically-determined nonce (so that I can say, "this is my 2nd password on disqus", etc) without having to fashion a new key.

edit: OK, so I have read the article. Here are my reactions to the author's four concerns:

#1: Password schemes

> Unfortunately, sites have wildly varying and often conflicting password requirements: non-alphanumeric symbols are mandatory! Passwords must be alphanumeric only! Capital letters required! Passwords must be lower-case only! Passwords must be at least 12 characters long! Passwords must be at most 8 characters long!

> While many of these requirements seem silly and in a perfect world all sites would adopt the new NIST password guidelines, reality is messy and there is no single deterministic password generation scheme which can accommodate the password policies of all sites.

There aren't that many different schemes. Maybe a grand total of three dozen? I think that the program can simply have a user-updated registry (perhaps that's shared) of schemes, and the generator accounts for this. This seems like... maybe 150 lines of Python.

#2: Revocation and new passwords

> We could ask the user to remember the site-by-site counter and input the correct counter value to derive the correct password. But I think this is silly.

Umm, that's a strange argument. I don't think that's silly. I think it's highly practical. Currently, users remember a shitload of different details for different domains, including different usernames (this one was taken here, that one didn't meet the scheme there, etc) and often slightly different passwords.

Making them remember an integer that will rarely be larger than 3 instead of a password seems awesome. Worst-case scenario, they can keep trying until they get the right one.

> You can’t store credit card numbers or bank account numbers in such a vault.

> You can’t put arbitrary cryptographic keys in such a vault.

> You can’t store randomly selected answers to security questions in such a vault.

> I consider this to be part of the basic functionality of a password manager.

I don't know what to say except that... I don't. I'm happy to have a password manager literally just manage passwords and allow me to use other tooling (like form memory in a browser) for this other stuff.

> Exposure of the master password alone exposes all of your site passwords...If you accidentally type or paste your master password into email, IM, or social media, an attacker can leverage that alone to derive all of your site-specific passwords.

Yeah, I get it - that's the same argument the others in this thread are making.

But I'm not going to accidentally type a long seed, like a 10-word sentence with punctuation, into any of those places. This seems like a completely absurd argument to me.


That's effectively no different from changing the seed data (say, going from "password1" as the seed to "password2"). You still have to remember which iteration you're on for each site as part of the seed data.

Edit: and this point is actually made in the article the GP linked.


> There aren't that many different schemes. Maybe a grand total of three dozen? I think that the program can simply have a user-updated registry (perhaps that's shared) of schemes, and the generator accounts for this.

> Making them remember an integer that will rarely be larger than 3 instead of a password seems awesome.

Ok, so, we have the "master seed" that has to be remembered. We also now have to remember X from [1..36] of schemes, and a Y version integer.

There are two ways to "remember" these triplets. The first is in your head, in which case you have to remember, for all 50 [1] different logins to all the sites you have, a triplet of <master,scheme,version>. Master will be easy, as it will be typed enough times to be easily remembered. But remembering that site A, which has not been visited for 4 months, used <master,14,2> while site B, which has not been visited for 6 months, used <master,27,3>, and so on for 50+ logins, will be untenable.

Or your 'deterministic' generator really has to store the "scheme" and "version" parts of the triplet somewhere in a local file/db, associated with each site, to allow it to work. At which point, you are 80% of the way to a proper password manager that would allow each password to simply be a proper randomly generated string of characters, tailored as need be, for each site.

> Worst-case scenario, they can keep trying until they get the right one.

Unless, because of breaches or general password rotation, they are now up to "version 12", but the site only allows three tries until lockout. Now you have to start trying "within 3 steps" of 12 or else you are guaranteed to be locked out.

[1] In the manager I use (Password Gorilla, https://github.com/zdia/gorilla, which is one of the many Password Safe compatible managers) I've got 371 stored entries, of which 285 are actual passwords for various sites that wanted a registration of some form or another along the way. That is something you learn quickly when you start using a password manager. The sheer number of websites that want you to 'register' in some way or another is actually huge. This is after about 13 or so years of using a password manager.


You should use 1Password or "pass" or, if a glutton for punishment, KeePassX, and honestly, I think you should use them to the exclusion of all other password managers. You can lower your security with a bad password manager.

I agree with everything Tony Arcieri wrote about deterministic password managers here: https://tonyarcieri.com/4-fatal-flaws-in-deterministic-passw....

Before you consider a deterministic password manager, just use "pass".


Does your KeePassX recommendation include the Windows equivalent it was ported from (KeePass)?


I've been using Keepass on Windows for several years now since I switched from LastPass. I like it a lot and as long as you are using it alone, syncing works great :)


Except this all but negates the benefits of a password manager. Now if someone steals your master password, they can generate every single password you use.

Use something like Keepass / KeepassXC. Yes, it's somewhat less convenient, but remember that security and convenience are always a trade-off.


> Except this all but negates the benefits of a password manager.

It preserves the main benefit: that I no longer need to remember a separate password for each domain in order to stop sharing passwords across them.

> Now if someone steals your master password, they can generate every single password you use.

So it's just a matter of keeping the seed in the user's head and nowhere else. Right?

That doesn't seem to be any more serious a security challenge than keeping secure an encrypted datastore full of passwords.


> So it's just a matter of keeping the seed in the user's head and nowhere else. Right?

Until you enter it on a compromised system to generate a password...


Cloud managed (but still pretty damn secure): LastPass or 1Password

You manage security: KeePass

I use KeePass with a keyfile and password. The DB is stored and sync'd across devices in DropBox, I have my DB and key regularly backed up on a USB drive. The key is also present on some machines that I cannot easily add the USB to (eg. phone). It's the only password I need to remember -- I have literally hundreds (400+) of accounts. I do not regret it at all -- but it was a huge pain to migrate everything to it.

I also store my Google authenticator initial QR codes on it, in PNG format.


1Password doesn't have to be "cloud managed" the way LastPass (ick) is. You can do pretty much everything in 1Password native clientside, and that's what you should do. 1Password is basically the gold standard for commercial password managers, and I think probably the only commercial password manager you should consider.


Thoughts on syncing 1Password database via Dropbox? I have multiple machines and that's what I do, but it worries me. Haven't found a more elegant solution though.


I use a home-brew script with rsync to synchronize. I don't use Dropbox on critical machines.


KeePass can also generate TOTP codes. Kind of ruins the point of these codes, but if you're already storing the QR codes on it, you can go a step further and just get KeePass to generate the output code for you.

https://bitbucket.org/devinmartin/keeotp/wiki/Home
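For context on what such a plugin has to compute: the code is derived from the shared secret in the QR code plus the current time. Below is a minimal sketch of TOTP (RFC 6238, built on HOTP from RFC 4226) with the common defaults of HMAC-SHA1, 30-second steps, and 6 digits. The function names are illustrative, and this is not taken from KeePass or the plugin linked above.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret_b32: str, counter: int, digits: int = 6) -> str:
    """HOTP (RFC 4226) with HMAC-SHA1 for a base32-encoded shared secret."""
    cleaned = secret_b32.upper().replace(" ", "")
    cleaned += "=" * (-len(cleaned) % 8)  # restore base32 padding if it was stripped
    key = base64.b32decode(cleaned)
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble of the last byte
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret_b32: str, now: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """TOTP (RFC 6238): HOTP where the counter is the current time window."""
    if now is None:
        now = time.time()
    return hotp(secret_b32, int(now // step), digits)
```

Anyone holding the base32 secret can produce valid codes, which is why keeping it next to the password weakens the "second factor".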


If you wish to retain the "two factor" aspect, you can put your 2FA codes in a separate keepass database. You do lose the "two separate devices" bit of extra security (your threat is now keepass-aware malware or keyloggers with filesystem access), but if your one db is compromised the 2fa codes are not (and vice versa).

However, storing 2fa codes alongside your password is still more secure than not. If the service password is compromised without the 2fa seed (main threats: MITM, phishing), the one-time password prevents the account from being immediately compromised.


"One of these days" I'm going to get a safe and print them off and remove them digitally.


I've been using a manager called Enpass [1] for over a year now and really like it.

It has browser extensions (Chrome, FF, & Safari), local storage (with many cloud-sync options), and apps for Mac, Windows, Linux, iOS, Android, etc.

The desktop versions are free and the mobile version is about $10/platform (lifetime license).

I have no connection with the company, just a very pleased user.

[1] https://www.enpass.io/


Another recommendation for Enpass. I started using it because they had a great desktop Linux version which worked perfectly with their Firefox & Chrome plugins. After that, paying $10 for the Android version was a no-brainer.


What if you need to update the password of one of the sites (e. g. this disqus breach) ? What if the domain name changes?


Maybe the program can take an integer nonce in addition. Then you have to track how many times you've used it for a particular domain.

And if the domain name changes: you just keep generating for the old domain and tell the program to override.

These seem like relatively simple challenges.


I just use my browser's password manager. (It's not deterministic though; other responses cover that angle well.)

Everyone seems to treat password managers as an arcane mysterious tool and acts temporarily amnesiac to the fact that every browser that anyone has used asks if you want to save your password. Both Firefox and Chrome let you sync your passwords between multiple computers, and I believe both do it in an end-to-end encrypted way.


>and I believe both do it in an end-to-end encrypted way.

in chrome, the end-to-end part is opt-in. https://support.google.com/chrome/answer/1181035?hl=en


Relying on your browser's password storage is very risky:

Sorry, But Your Browser Password Manager Probably Isn't Enough https://www.wired.com/2016/08/browser-password-manager-proba...

Why you shouldn't use your browser to store online passwords https://scotthelme.co.uk/storing-passwords-in-browser/


If it's deterministic then anyone who gets your seed has all your passwords.


But it's the same in any case, right? It's always, "if an attacker gets your <x>, they have all of your passwords."

The question is: are they any more likely to gain a seed that I keep only in my head vs. a list of passwords in an encrypted datastore?


They're still different, the two schemes. In the case of a traditional password manager, an attacker would need to get your master password and the password datastore. In the case of datastores stored locally, rather than online, that's still two separate steps.

In the case of your idea, once an attacker knows the seed, they have everything. Furthermore, if there's some weakness in the algorithm you use to generate passwords from the seed, then some site which has one of your deterministically-generated passwords could potentially breach the weak algorithm to reverse the process and obtain your seed, along with all your other passwords.

Lastly, with a traditional password manager, you can change the master password at any time. Would you be able to change your seed when needed, or would you need to regenerate your passwords for EVERY site if you need to change the seed because it's been compromised?


Except you won't keep it only in your head: you'll be typing it into a computer at some point. Really, in either case, you should also be using token-based 2FA as well.


OK, so a surreptitious keylogger still defeats this config. I'm OK with that. My entire life presumes trust from my keyboard to my computer.

I mean, if that were my main concern, I'd have to rethink most of the bedrock security precautions I take, like passphrases for private SSH keys, etc.

At the end of the day: I understand the risks and view them as absolutely acceptable in order to gain reasonable portability in a password manager.


In the case of a password manager they would need the database and the master password. In the deterministic case they only need the seed.


The problem with this style of password management is twofold:

1. You can't change your master password without changing every other password (upon refresh - looks like others mentioned that already)

2. You can't modify the output of your deterministic password algorithm to adhere to sites with weird password requirements


I generate unique passwords using command-line tools like apg and pwgen and store them in an encrypted Emacs org-mode file [1]. Emacs can open files transparently via SSH [2], so I have easy remote access to my password store without having to rely on any external service.

[1] http://orgmode.org/worg/org-tutorials/encrypting-files.html

[2] https://www.emacswiki.org/emacs/TrampMode


Personally, as an OSX user I use KeePassXC[1]. It's a fork of the original KeePass better designed to be cross platform.

If you want to sync across multiple computers, you can throw it in whatever file sharing service (Dropbox, Google Drive, etc) you use. Personally, I prefer Spideroak[2] due to their strong privacy policy.

[1] https://keepassxc.org [2] https://spideroak.com/one/


The problem with the seed-only approach is you have to remember the password policy of every website, at the time you registered, in order to regenerate the password used.


If that's all you're after then nullpass fits the bill exactly.

http://www.nullpass.org/

Warning though, I stopped using it because seed + domain doesn't guarantee that the password will fit the site's password requirements.


Replying to your comment (that wasn't here when I started writing) directly - I just wrote my suggestions and a starting guide here: https://news.ycombinator.com/item?id=15421444

Hope this helps.


It seems that you might not need a good password manager for this type of scenario. A password manager, for me, is one that stores unique passwords for each login. I don't know the majority of my passwords.


For me, I use pass cli. Gets the job done.


Safari on Mac already has a strong built-in password manager. It syncs with Safari on iOS.


Password Safe is a good one...


Enpass.


Someone please build this if it doesn't exist already!


IIRC someone has before, either here or on reddit, but it was terribly implemented.

Use LastPass, 1Password, or KeePass.


And to provide different email addresses for every website, so that your identity is difficult to correlate between accounts, which partially mitigates the privacy implications of a breach, and fully mitigates the spam / phishing consequences (since you can delete that email alias).


Generating forwarding email addresses is something password managers like 1Password should do. They should generate a new username, password, and email per service in one click as you please.

Thanks to social engineering, email address reuse is the new password reuse.


Thanks for your suggestion, but I am not sure how to implement this in practice? How do you generate different email addresses for different websites for a typical user who uses gmail?


I run my own domain and the mail server I use does that for me (smartermail). If you don't control your domain there must be third party services. I googled "free email alias", most results seem to be for temporary aliases, but there must be a service for permanent aliases. It's something major providers like gmail should provide.


For some websites, you can put a "+" sign and whatever you want after your gmail username. For example: username+dropbox@gmail.com, and you will receive mails at username@gmail.com. Unfortunately, this doesn't work with websites that have aggressive regexp checking on email fields.
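A small sketch of the "+" trick in Python. Both helper names (`plus_alias`, `strip_alias`) are made up for this example. The second helper shows how little work it takes to recover the base address from an alias.

```python
def plus_alias(address: str, tag: str) -> str:
    """Build a per-site alias such as username+dropbox@gmail.com."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    base = local.split("+", 1)[0]  # drop any tag that is already present
    return f"{base}+{tag}@{domain}"


def strip_alias(address: str) -> str:
    """Recover the underlying mailbox by removing everything after '+'."""
    local, _, domain = address.rpartition("@")
    return f"{local.split('+', 1)[0]}@{domain}"


if __name__ == "__main__":
    alias = plus_alias("username@gmail.com", "dropbox")
    print(alias, "->", strip_alias(alias))
```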


I don't see the benefit of using name+alias@gmail.com if the point is to obscure your email address


Not everyone realises that '+' is a valid character in an email address, at least not when I tried this several years ago.


If you don't want bots to try email+password from a leak on a bunch of websites, it's good enough, but I agree on the fact that it won't hide your identity.


Plus it leaks your underlying email address so it offers very limited protection. Also you can't delete the alias to stop any future spam/phishing.


I think with gmail it's possible to filter based on which alias it was sent to.



I use 33mail


A better reminder would be not to use services like Disqus.



If you're just starting, here's some guidance on setting up a password manager.

First of all: Don't be afraid of using one. It's not just more secure, it's super convenient. Never again will you ask yourself: Did I make an account for this website/service? What email did I use? Never again will you have to remember a password. Using a password manager is a quality of life improvement.

KeepassXC is what I recommend to people at this point. It's free and you own your data (your passwords). They live wherever you want them to live. There are plenty of online services that are supposedly more convenient but I have to say I trust them less -- YMMV (1Password is the best I'm aware of).

https://keepassxc.org

If you do use keepassxc, you get the added benefit of being able to store 2FA settings in it as well (if you store them in the same database as your passwords, be aware that you lose the security benefit of a second factor, however it is still more secure than not having 2FA enabled due to the One-time password component).

Put every account you ever made and ever make into keepass. Enable 2fa wherever you don't have it enabled. Add login URLs and notes. Generate your passwords from keepass itself; the password generator is really powerful and lets you very easily deal with site-specific shitty password limitations. I'm telling you this because, seriously, it's incredibly convenient to have this stuff as long as you're rigorous about maintaining it.

Oh, also, keepass has the full history of all your passwords. Need to look up an old password? Go into details and look at "History". You can also attach files to items (items don't have to be accounts at all, you can use keepassxc as a simple encrypted storage db).

Mobile support: Keepass2Android. Best android client, with google drive support. iOS I have no idea, suggestions welcome.

IMPORTANT: BE STUPIDLY PARANOID AND RIGOROUSLY CAREFUL ABOUT YOUR MASTER PASSWORD. That thing, together with your keepass database, unlocks all your accounts ever. Use a really long passphrase that you will never have to write down (if you do decide to write it down because you don't trust yourself, store it in a safety deposit box, don't put it in a bloody drawer). Make sure the device you unlock the database on is malware-free.

PS: Wondering what's up with Keepass vs. KeepassX vs. KeepassXC? Keepass is the original app, written in .NET but with poor multi-platform support. KeepassX is a rewrite in Qt and is a fantastic password manager, but has gone unmaintained recently. The open source community picked up the slack in the KeepassXC fork (after countless attempts to upstream the patches) and has implemented lots of powerful features. I've switched to it and at this point I strongly believe it's the better client.


What if you need to update the password of one of the sites (e. g. this disqus breach) ? What if the domain name changes?


Sorry. Replied to the wrong comment.


You can delete your comments if you get to them quickly enough.


as they become the staple the password manager becomes the target


But that's a much much smaller target to aim at than the vast number of websites that you trust with a password


[flagged]


Again, HN, please explain to me why “This.” was downvoted here.

Why is it wrong for someone to acknowledge and underscore what someone else says?

It’s the kind of thing that happens in real world conversation and we don’t rebuke people for agreeing and underscoring what someone else says in that context so why do it here?

I’m genuinely curious.


There is a significant difference between HN and real life; in real life, there are both a limited number of participants, and you are more likely to be interested in what each participant thinks. But consider a very large group, like hundreds of people, in a room. One person gets their turn to speak, and merely says "I agree with the previous speaker" and leaves the podium. Did this unknown person contribute meaningfully to the debate, or did they just senselessly waste everybody’s time? It’s obviously the latter, and it’s the same here on HN; nobody knows who you are, and merely agreeing or disagreeing without backing it up with arguments is a waste of everybody’s time.


That’s a very good answer to a genuine question, thanks!


It adds nothing and wastes space. If you agree, just upvote.


> It’s the kind of thing that happens in real world conversation

HN isn't real world conversation. These link comment threads do not function at all like conversations in the real world.

And if they did.... Where in the real world would you walk into a physical space, in which a crowd of 187 people are gathered to read and discuss an article (or broadly a topic), and then pull out a bullhorn and declare one word: "this" and then walk back out the door? Your premise doesn't make sense.


At the risk of increasing an unnecessary thread, the parent comment adds nothing to the discussion and the upvote exists for this purpose.


Frankly, I don't trust password managers. Perhaps I'm naive and they're perfectly safe, but still.

My solution is to develop an algorithm for all my passwords that is pretty quick and simple to memorize but allows me to "generate" a unique password for each service that I use.

Although I seriously doubt anyone could figure out my algorithm by learning one of my passwords, I'll admit that if they were to gather 3 or more then they'll probably be able to figure out my system. But since I never write down or record any passwords whatsoever (all I need to remember is my algorithm), someone would have to steal my credentials from multiple sources and also be able to know which credentials from one service match a user from another, etc.


1. Linkedin, Yahoo, Disqus. Probably a few others for good measure.

2. Search for any combinations of john, garrison, and any other names I can determine by looking up your comment / post history here

3. Check the list of results to see which appear to be reasonably complex and of similar construction to determine which are likely yours. People with "good" passwords are very much in the minority, so this should be pretty straight-forward and mostly automatable.

4. Manually try to determine your scheme, which according to you is probably doable with this information.

---

Not saying it'd get you for sure, but if your replacement of a password manager is hamstrung by knowledge of at most three of your previously used passwords, you're probably doing yourself a disservice.


I just learned about this thanks to an email from https://haveibeenpwned.com/ alerting me that my account info was leaked.


I had three emails from HIBP today, for 3 different breaches: Disqus, Kickstarter and Bitly. Three, in a day. These breaches are all years old and at least one of them was set to one of my old passwords I reused in a lot of places (before moving to a pwd manager), so it could have done some real damage. The fact that it didn't happen, makes me think that salted hashing did what it was supposed to do, discouraging bruteforce decryption just enough.

Still, the state of the industry is pretty shocking. We are doing more and more stuff (super-encryption at rest, constant password rotation, password managers etc) and still breaches happen with regularity. It feels a bit like when Office started having a system to recover documents after a crash - yeah, great, but why is the crash happening so often that we need it in the first place?


And I learned it from your comment. Just tried haveibeen and confirmed disqus leak. No emails from disqus!


I got an email too. The thing is, I don't have a Disqus account. The only mention of Disqus in my gmail account is the haveibeenpwned email. Any idea what's going on?


Same happened to me. I think I may have used the social media login via Google, which I'm guessing could get your email into the leak without any password being present.


Big thank you to Troy Hunt for this service, I just got my email too.

I read your comment just after you posted it and thought, "oh I haven't gotten an email yet, I must be safe". 50 minutes later it came through :-p


> Right now there isn’t any evidence of unauthorized logins occurring in relation to this. No plain text passwords were exposed, but it is possible for this data to be decrypted (even if unlikely)

Where by "unlikely" we mean "with almost complete certainty".


That was my first thought upon reading that. Someone at Disqus should take those lists of hundreds of millions of hacked passwords, hash 'em with SHA1, and see how many aren't in their own list.
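Something like the following Python sketch would do it. The record layout is an assumption: it treats each stored value as the hex SHA-1 of salt followed by password, which may not match the real database format. The names `sha1_salted` and `count_cracked` are invented for this example.

```python
import hashlib
from typing import Iterable, Tuple


def sha1_salted(password: str, salt: str) -> str:
    """Assumed storage format: hex SHA-1 of salt followed by password."""
    return hashlib.sha1((salt + password).encode("utf-8")).hexdigest()


def count_cracked(records: Iterable[Tuple[str, str]], wordlist: Iterable[str]) -> int:
    """Count stored (salt, digest) records whose password appears in `wordlist`."""
    words = list(wordlist)
    cracked = 0
    for salt, digest in records:
        # Each salt forces a full pass over the wordlist for that one record.
        if any(sha1_salted(word, salt) == digest for word in words):
            cracked += 1
    return cracked
```

The per-record loop is the only thing a salt costs the attacker: no shared precomputed table, but still one fast SHA-1 per guess.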


You mean, like Troy Hunt, of “haveibeenpwned” fame?


We just have to concede centralization is a dark pattern online. It creates incentives for obsessive profiling and surveillance, and makes a complete mockery of privacy.

In many ways the old world of dispersed forums was much better, and it is still better content-wise than anything Reddit can throw up.

You can share photos and remain connected to family and friends without needing anything like facebook. This kind of centralization only favours self promoters and exhibitionists and shifts tremendous personal data patterns to entities who have no business having or analyzing this data. No one should have access to this kind of data as the consequences can only be negative.


This highlights an important aspect of data security that is rarely talked about:

Data Minimization.

It seems they had an old copy of the database, likely a backup, just sitting around somewhere. The question is why this old copy, from 4 years ago, still existed. Why was it not deleted under a standard data retention policy?

Companies not only need to spend time on how they will secure data, they need to spend time on how they will age out and delete old data.


We really need to get companies to see user data as a liability, rather than an asset. Our incentivisation system is completely backwards for this.


The rate at which these security breaches are happening is very alarming to me.

I don't know what the effects are of them (including this one), but my gut tells me they'll only increase over time; perhaps the severity of them as well.

It's honestly making me re-consider my use of the internet, technology, and electronics.

What's the future hold with regard to security, breaches, and keeping our data and information safe?


> The rate at which these security breaches are happening is very alarming to me.

It is the rate at which they are being published that is alarming to you, the rate at which they are actually happening is much higher still.


> The rate at which these security breaches are happening is very alarming to me.

This should be no surprise if you're a developer. "Lean/'agile' companies" have tried to cut cost so aggressively that we don't have QA teams, tickets have to go as fast as possible, etc.

Businesses won't care about this until it hurts their bottom line. Even then they'll deny that it's possible.


For a long time people didn't care because they thought it couldn't happen or wouldn't affect them but I'm optimistic that people are learning, albeit slowly.


From the philosophical point of view, we cannot avoid these security breaches in the digital world. We need to leap forward to analog or quantum computing to leak proof our information.

I wonder if there will be any information private by then.


SHA1..... That's bad

Current gpus can hash tens of millions of passwords per second with SHA-1.


Some back of the envelope calculations...

[A GTX 1080 can do ~8.5 billion hashes/second][1].

An 8-character, truly random password with mixed case, numbers, and symbols has (26 * 2+10+32)^8 possible combinations.

That means an attacker with a single GTX 1080 could crack such a password in ((26 * 2 + 10 + 32)^8 / 2) / (8538.1 * 10^6) seconds ([approximately 4 days][2]) on average. And most users' passwords are most certainly not truly random (i.e. they're significantly weaker than that).

[1]: https://gist.github.com/epixoip/a83d38f412b4737e99bbef804a27...

[2]: https://www.wolframalpha.com/input/?i=((26*2%2B10%2B32)%5E8%...
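The same back-of-the-envelope arithmetic as a few lines of Python, for anyone who wants to plug in other lengths or hash rates. The 8538.1 MH/s figure is the one quoted above; the variable names are just for this sketch.

```python
ALPHABET_SIZE = 26 * 2 + 10 + 32    # upper + lower + digits + symbols = 94
LENGTH = 8                          # password length in characters
HASHES_PER_SECOND = 8538.1 * 10**6  # SHA-1 rate quoted above for one GTX 1080

keyspace = ALPHABET_SIZE ** LENGTH
# On average the attacker finds the password after searching half the keyspace.
avg_seconds = (keyspace / 2) / HASHES_PER_SECOND
avg_days = avg_seconds / 86_400

print(f"{keyspace:.3e} candidates, about {avg_days:.1f} days on average")
```

Each extra random character multiplies the time by 94, so the same estimate for 10 characters lands around a century for a single card.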


For the record, bcrypt has existed since 1999.
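To illustrate the difference a deliberately slow hash makes, here is a minimal sketch of salted, iterated password hashing. bcrypt is not in Python's standard library, so this uses PBKDF2-HMAC-SHA256 from `hashlib` as a stand-in. The iteration count and function names are assumptions, not a recommendation for any particular system.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # assumed work factor; tune to your hardware


def hash_password(password: str, salt: Optional[bytes] = None,
                  iterations: int = ITERATIONS) -> Tuple[bytes, bytes]:
    """Return (salt, digest) using PBKDF2-HMAC-SHA256 with a random 16-byte salt."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = ITERATIONS) -> bool:
    """Recompute the digest and compare it in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)
```

With a work factor like this, each guess costs the attacker hundreds of thousands of hash operations instead of one.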


I am always puzzled when a company reports a breach and says "X account details were compromised, and Y passwords were obtained..." when Y is a smaller percentage of X.

I would assume that there is always a 1 to 1 relationship of user account details to passwords, or that the passwords are stored within the user table in the DB, so at best, X=Y at all times?

I could understand it if this were a current breach and the DBAs managed to stop a transfer of data mid-query, but in a 4-year-old database (as in this case), how do the hackers have only partial data for passwords but full data for user accounts?


Disqus provides options for single-sign-on. So it's possible that some of the accounts breached didn't have a password associated with them, only OAuth tokens.


How can an independent security researcher be aware of such a breach?


One guy in particular, I can't recall his name (Chris Vickery, maybe?), has been behind A LOT of these "discoveries"; specifically, a bunch of Amazon S3 buckets that are misconfigured (WRT permissions) and wide open to anyone who wants to download the data. (I'll admit that I'm kinda curious as to how he enumerates all of them!)

Based on the little information published, it sounds like that's what has happened in this case as well.

The same guy has also found tons of wide open MongoDB instances and such.

The guy you're referring to, Troy Hunt (HIBP), writes about these cases but, AIUI, doesn't typically find them on his own. He's usually notified by the actual "researcher" -- this Chris guy in a lot of recent cases -- and they share the info with him.


You can enumerate S3 buckets (and those of other cloud operators) by DNS brute force, or by looking at how the company names them on its website and working through some common names.

Did this recently and found thousands of public buckets.


Check out his YouTube channel, he explains things quite often: https://www.youtube.com/channel/UCD6MWz4A61JaeGrvyoYl-rQ

And many of his blog posts explain how he does it: https://www.troyhunt.com/

Pretty sure this is the one he mentioned in his video today as having been given to him by someone else.


They have contacts and ears in the underworld I suppose :=)


It's possible the researcher managed to get into a position to download the same database snapshot.


All these leaks are making me feel anxious to go around deleting anything I no longer use and be more careful about signing up for anything without FB/Google login. It seems like only giant companies have the competence to maintain security BUT even then, Yahoo proves me wrong on that one.

That said, I'm sort of happy 1password switched to subscription billing - hopefully this lets them pay top dollar for the best security experts.


Someone posted on HN that FastMail includes a very unusual and nice way to stop spam using aliases; basically it's Mailinator baked in: https://www.fastmail.com/help/receive/aliases.html


I pay for FastMail for this very reason. When I transferred off my Google Apps email account, I set up a unique ___@ for every website. If my email ever leaks, I'll know who leaked it (or sold it).


Same here. I simply set up a catchall alias so <anything>@somedomain forwards to foo@regulardomain; very convenient. The fact that FastMail supports multiple custom domains, at least with standard accounts, is also nice -- even more so since you can (optionally) use their own nameservers and set up whatever DNS records you want (I use that for some personal websites), while they make sure SPF/DKIM records are always correct.

(Having a separate domain is not really necessary, but I registered it through nearlyfreespeech.net and use their WHOIS privacy service; at least J. Random can't easily figure out my identity based on the email address alone).



I don't think Outlook.com supports catchall/wildcard aliases though, which makes using unique addresses for everything a bit annoying, which makes address reuse more likely.


Interesting point! I hadn’t thought about wildcards in aliases.


It's not just your passwords of course but also email and sometimes phone and credit cards you have potentially stored at sites as well.


Is the info available online somewhere? I’m curious what my password was in 2012.


Sounds like somebody left an S3 bucket wide open yet again.


SHA1 hash? You’ve gotta be fucking kidding


It's from a database from 2012. Wasn't that cutting edge in 2012? What kind of sick shit will we be encoding our stuff 6 years from now, that bcrypt will seem laughably ill-suited.


The most popular post that ever ran on Matasano's security blog was the one where I encouraged people to migrate to bcrypt. In 2007. Bcrypt, of course, is much older; Niels and David invented it as the standard password format for OpenBSD back in 1999 --- and bcrypt was a response to FreeBSD's iterated salted hash format, which also had a work factor, and is years older still.

Today, in 2017, bcrypt remains a sound recommendation. You can do better, but for password databases on websites, not materially better.

Salted SHA-1 hashes (salted SHA-anything hashes) were malpractice in 2012.


> You can do better, but for password databases on websites, not materially better.

Do you mean using scrypt? What do you mean by materially better?


No, I mean bcrypt.

Scrypt is better than bcrypt, but mostly not in ways that make much of a difference in 2017.

PBKDF2 comes close to being materially worse than bcrypt and scrypt, because it's especially straightforward on modern hardware, but even PBKDF2 is fine.

For the most part, as long as you're using anything with a KDF-like design for your password hash, a compromise of your password database is going to reveal the very terrible passwords and only those passwords; the rest will be too costly to crack.

Right now given the choice I'd use scrypt and go slightly out of my way to get it (if there was a good 3rd party library for it and bcrypt was in the standard library and I was like a "yarn add" away from having it, I'd take that step), but I would not convert a bcrypt site to scrypt.


Considering that PBKDF2 has adjustable difficulty parameters would you still say it's worse if very high difficulty parameters are chosen?


It has to do with the defender's vs the attacker's costs. PBKDF2, which is usually instantiated with SHA-2, is still a lot cheaper for the attacker than for the defender even with a huge number of rounds, since the attacker can use GPUs/ASICs, which need fewer transistors per hash and run many calculations in parallel, while defenders usually use CPUs. On the other hand, bcrypt, scrypt, and Argon2 don't give the attacker much of an advantage over a CPU, since GPU and ASIC implementations are expensive and memory-bound.

PS My measurements show that a pure JavaScript implementation of scrypt is better than the fast native PBKDF2 provided by the WebCrypto API or Node.js at the same running time.

PPS But yeah, if you can't use bcrypt/scrypt/Argon2, but can use PBKDF2 with a high number of rounds, sure, do it.
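To make the "memory-bound" point a bit more concrete, here is a rough sketch using Python's standard `hashlib`. The cost parameters are illustrative assumptions, not a recommendation, and `128 * N * r` is only the approximate size of scrypt's main working buffer.

```python
import hashlib
import os

password = b"correct horse battery staple"   # example input only
salt = os.urandom(16)

# scrypt cost parameters (illustrative assumptions, tune for your hardware)
N = 2 ** 14   # CPU/memory cost
R = 8         # block size
P = 1         # parallelism

# scrypt needs roughly 128 * N * r bytes of RAM per guess, which is what
# makes massively parallel GPU/ASIC guessing expensive.
approx_memory_bytes = 128 * N * R

scrypt_key = hashlib.scrypt(password, salt=salt, n=N, r=R, p=P, dklen=32)

# PBKDF2 only iterates a fast hash: more rounds cost more time,
# but each guess still needs almost no memory.
PBKDF2_ROUNDS = 100_000   # assumption for illustration
pbkdf2_key = hashlib.pbkdf2_hmac("sha256", password, salt, PBKDF2_ROUNDS, dklen=32)

print(f"scrypt working memory per guess: ~{approx_memory_bytes // 2**20} MiB")
print(f"scrypt key: {scrypt_key.hex()}")
print(f"PBKDF2 key: {pbkdf2_key.hex()}")
```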


Thanks for the reply. I appreciate the added explanation.


Yes to scrypt, but these days, Argon2[1] is the best.

"better" means "cannot be calculated much faster on GPU or even FPGA because it requires a lot of RAM".

"materially better" probably means "fewer than a million hashes per second on eight Nvidia GTX 1080s running hashcat"[2], so Django and scrypt are both good (adjust work factor as needed, of course).

1 - https://en.wikipedia.org/wiki/Argon2

2 - https://gist.github.com/epixoip/a83d38f412b4737e99bbef804a27...


Thanks for the reply, and added info on being materially better.


> Salted SHA-1 hashes (salted SHA-anything hashes) were malpractice in 2012.

I'm pretty sure this is still the only option on Google App Engine. You can't upload C code, so bcrypt isn't an option.


You can't store passwords as salted SHA-x hashes. It's not OK to do that. If you have SHA-anything, you have PBKDF2; use that.
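As a minimal sketch of that advice, assuming only Python's standard library is available: `hashlib.pbkdf2_hmac` builds PBKDF2 on top of a SHA-2 hash. The function names and the round count here are hypothetical choices for illustration.

```python
import hashlib
import hmac
import os

ROUNDS = 200_000   # assumed work factor; raise it as hardware allows


def hash_password(password: str, rounds: int = ROUNDS) -> tuple[bytes, int, bytes]:
    """Return (salt, rounds, digest) to store alongside the account."""
    salt = os.urandom(16)   # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, rounds)
    return salt, rounds, digest


def verify_password(password: str, salt: bytes, rounds: int, expected: bytes) -> bool:
    """Recompute the digest and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, rounds)
    return hmac.compare_digest(candidate, expected)


salt, rounds, digest = hash_password("hunter2")
print(verify_password("hunter2", salt, rounds, digest))   # correct password
print(verify_password("hunter3", salt, rounds, digest))   # wrong password
```

Storing the round count next to the salt lets you raise the work factor later and re-hash on the next successful login.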


It has never been the only option on Google App Engine (GAE). Bcrypt is exactly what I used in my GAE app back in 2009, though I have since moved on to scrypt. Bcrypt can be implemented in non-C languages, and there are libraries available for all the languages that are supported on GAE. If you're worried about Python performance, then you can have the bcrypt function in a separate module written in Java/Go.


Malpractice? Someone using SHA-1 for password storage wouldn't even be a medium severity issue on a modern pentest.

I agree with you, but it was very strange to discover that password storage is basically ignored in pentests. Especially after years of you drumming it up as a big deal.


Using SHA-1 for password storage would be sev:low in a pentest. There are a lot of other sev:low things that you would certainly agree are signs of incompetence. Unsoundness of engineering and vulnerability impact are almost orthogonal.


The issue is that companies can basically ignore sev:low findings. "Malpractice" implies that they need to care; they do not.

I wish they did. It would be nice if they were forced to care. But it wouldn't block them from being declared secure by a pentest. Low-severity findings are findings, yes, but they don't have the same pull as medium or high severity vulns.

All of this is true for storing passwords in plaintext, too. If some company leaked plaintext passwords, people would be outraged. Yet pentests would still give that company a pass, because plaintext password storage is sev:low.


I understand what you're saying, but second-order findings on pentests don't get high severity, no matter how important a sign of unsoundness they are. Severity and importance are also somewhat orthogonal.


It was never cutting edge. It was half informed lazy coder homemade crypto in 2012.

SHA1 is a fast hash. It's designed to be tractable to calculate lots of SHA1 in a small time. This is independent of whether it has collisions and is considered broken. It was fast from day 1. Fast hashes are not suitable for protecting passwords. They were never suitable for protecting passwords.


I can only speak to what was mainstream. In my sphere at the time SHA1 was cutting edge; most of my peers were on MD5. The best among us were recommending SHA1.


I don't want to be too much of a jerk about this because I get that this is an expert subject but if the best among you were recommending salted SHA-anything in 2012, the best among you were committing professional malpractice.

Honestly, I feel like when we wrote that dumb bcrypt post in 2007, it was already a bit negligent to be using unstretched general purpose hashes for password storage. The BSDs used better hashes in the 1990s.


It was not at all cutting edge. In 2012 I helped a number of companies move from bcrypt to scrypt.


Which is what the article says Disqus did: they moved /from/ sha1 /to/ bcrypt in 2012. Same as the companies you helped, in 2012.


No, those companies did not use SHA1 in 2012 or any time close to then. They used bcrypt until they upgraded to scrypt.

SHA1 was useless for passwords long before then.


SHA-1 has been known to be vulnerable since 2005, and even in 2012 SHA-2 and SHA-3 were recommended.

Nonetheless, you have a point!


>SHA-1 has been known to be vulnerable since 2005, and even in 2012 SHA-2 and SHA-3 were recommended.

FYI, the requirements for a password hash function are significantly different from those for a cryptographic hash function. The vulnerabilities you're talking about don't affect any of those properties. Password hashes only need to have preimage resistance, and (more importantly) be slow so as to limit offline attacks.


This is pretty much correct. It doesn't much matter what cryptographic hash you use to store secrets, and all of the general-purpose cryptographic hashes are bad password hashes. Salted SHA-3 would not be materially better than salted SHA-2 here.


Significantly better than using MD5 or storing in plaintext in 2012 (both of which would have been likely in 2012).

And in 2012 the current breaks from this year were not yet known. Some considered sha1 to be in its twilight, but it was not 'broken' yet at that time.


It is in fact not significantly better, for this purpose, than MD5.


That assertion is much easier to make now, with the knowledge we have in 2017, five years later.

But without knowledge of what was coming for sha1 in five years, back in 2012 it would have been a much better choice than either MD5 or plaintext storage.

However, even today, with the knowledge we now have regarding sha1, if one's choices are limited (for some strange reason) to only sha1 or MD5, sha1 is still a better choice than MD5. Yes, sha1 is weak, and it should clearly not be used for any new designs, but sha1 is still stronger than MD5.

Also note, the 2012 date was when they last used sha1, not when they started using it. That fact is somewhat critical to keep in mind. They last used sha1 in 2012. What got leaked were some leftover hashed passwords that never got updated to bcrypt that were still hanging around in their database (probably because those accounts have never logged in for the last five years and been forced through a password change).


No. For similar reasons, salted SHA-2 is also not materially better than MD5. You think this is about the strength of the underlying cryptographic hash, but that has in fact very little bearing on the strength of the password hash construction.


Clearly there is some critical piece of knowledge that I'm lacking, so please help me understand where my misunderstanding lies.

The article announcing the breach contains the term "SHA1" in exactly two places: "passwords (hashed using SHA1 with a salt;" and "password hashing algorithm from SHA1 to bcrypt".

Absent evidence to the contrary (and the article provides none), I am reading "hashed using SHA1 with a salt" to mean they used this construction:

    Hp = H(S||P) or
    Hp = H(P||S)
    where:
    S is a salt (derivation method unstated)
    P is the plaintext password
    || is byte concatenation
    H( ) is a hash function (sha1 in this specific case)
         applied only once to the input bytes
    Hp is the "hashed salted password"

How does the strength of the chosen hash not have a direct bearing on the strength of the construction H(S||P) (or H(P||S))? It is nothing but the chosen hash. What am I misunderstanding here?
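For reference, the H(S||P) reading of "hashed using SHA1 with a salt" would look roughly like this in Python. This is a sketch of the construction as described in this comment, not Disqus's actual code; the function name and salt length are assumptions.

```python
import hashlib
import os


def salted_sha1(salt: bytes, password: bytes) -> bytes:
    """Hp = H(S || P): one SHA-1 pass over the salt followed by the password."""
    return hashlib.sha1(salt + password).digest()


salt = os.urandom(8)   # derivation method and length are unstated in the article
hp = salted_sha1(salt, b"password")
print(salt.hex(), hp.hex())
```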


Forget about the strength of the underlying hash. That's not how you recover passwords from hashed password databases. In reality, the way you recover passwords is to take a dictionary starting at AARDVARK and work your way to ZEBRA and every alphanumeric string in between, hashing each one and comparing it to the target password. Because MD5, SHA1, SHA2, Blake, Blake2, and SHA3 are all designed to be as fast as possible, this attack is extremely effective, and can be accelerated dramatically with GPUs.

The "password hashes" PBKDF2, bcrypt, scrypt, and Argon2 are all designed, the same way a KDF is designed, to mitigate this attack. All of them have a "work factor" that requires you to iterate the underlying hashing primitive (which might very well be SHA2) many times before arriving at the answer.

SHA1 and SHA2 aren't password hashes. That's what people here keep trying to explain. None of the well-understood flaws in MD5 and SHA1 are really relevant to the password hash setting. They're a disaster for cryptographic signature constructions, but they do not matter at all for passwords.
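A toy version of the attack described above, assuming a single-pass salted SHA-1 record and a made-up word list. Nothing here exploits a weakness in SHA-1 itself; the loop simply hashes guesses, which is why the speed of the hash is what matters.

```python
import hashlib
from typing import Iterable, Optional


def salted_sha1_hex(salt: bytes, password: bytes) -> str:
    """Single-pass salted hash, as in H(S || P)."""
    return hashlib.sha1(salt + password).hexdigest()


def dictionary_attack(salt: bytes, target_hex: str,
                      candidates: Iterable[str]) -> Optional[str]:
    """Hash every candidate with the leaked salt and compare to the leaked hash."""
    for word in candidates:
        if salted_sha1_hex(salt, word.encode("utf-8")) == target_hex:
            return word
    return None


# Hypothetical leaked record: the salt is stored next to the hash.
leaked_salt = b"\x01\x02\x03\x04"
leaked_hash = salted_sha1_hex(leaked_salt, b"zebra")

wordlist = ["aardvark", "password", "zebra"]   # stand-in for a real dictionary
recovered = dictionary_attack(leaked_salt, leaked_hash, wordlist)
print(recovered)
```

With a password hash that has a work factor, each pass through that loop costs the attacker many thousands of primitive hash calls instead of one.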


Sha1 hasn't been the recommended best practice for a very long time. (Really ever.) Bcrypt dates back to 1999. Even if you give it 10 years for evaluation it would have to be considered in 2009. And indeed it was recommended in 2007, 5 years before this breach. RFC2898 (PBKDF2) came out in 2000, 12 years before this breach. Scrypt was released in 2009, so I could understand not adopting it by 2012 out of concern for insufficient vetting. Sha1 would only have been acceptable between 1995 (its release) and 2000 or so. Though even then the practice of key stretching was known: IIRC /etc/shadow has done that since the beginning, running 1000 iterations of MD5 by default. Looking it up that was released in 1987. 25 years!


http://valerieaurora.org/hash.html

that's BS to think sha1 was the best hash you could pick in 2012


This is a chart of general-purpose hashes, not password hash constructions. All the hashes on Valerie's chart are bad password hashes.


I said nothing at all about sha1 being "the best ... you could pick". You read that in from somewhere.

I said it (sha1) was significantly better than MD5 or plaintext. That neither says nor implies that sha1 is best, just that it was better than other options that some might have chosen in 2012.


And that is false, sorry to say. Plainly false. The weaknesses unique to MD5 (in 2012) and SHA1 (in 2016) don't matter for password hash constructions. The weaknesses shared by salted MD5, SHA1, SHA2, and SHA3 --- each a distinct construction from the underlying hash --- matter hugely for password storage.

The problem is that MD5, SHA1, SHA2, and SHA3 are not password hashes. The password hash constructions in common use are PBKDF2, bcrypt, scrypt, and Argon2. Some of them use SHA2 as a primitive, some of them don't, but none of them work by simply concatenating a salt with a password and hashing.


It doesn't matter if it's a "password hash" if it's a cryptographically secure hash and a long enough password. If it can withstand all the attacks that give you shortcuts to finding out what the input was, given the output, it's fine.

Password hashes only help protect against brute force searches by increasing the cost to attack linearly with the cost to verify. But that isn't a great tradeoff and isn't future-proof.


All the crypto engineering that goes into password hashes is about the fact that passwords aren't long enough, so your "if" caveat makes your argument rather disconnected from the real world. People won't use passwords with a sufficient amount of entropy, and they couldn't even if they wanted to (because of memorizing difficulties, typos, lack of good text entry UI on mobile devices, etc).

As long as you're using a password entry field designed for manual entry, you can't credibly counter that with "people should use password managers and autogenerated long line-noise passwords". Because you can't base your security upon all your users taking the initiative and doing the power-user non-default thing.


It's also true that even with a "password hash" your short password is not secure. It makes the attack more expensive, maybe from $10 to $10000 today, or $1000 next year. But practically that isn't something you should rely on.


Now you flipped over to the other extreme :) I'll leave it to others to argue how there are parameters that make an acceptable tradeoff for proof-of-work hashes like scrypt for many applications.


I didn't flip - my point is "password hashes" and other secure hashes are similarly secure. You need a long password to trust it won't be brute-forced.


In the 2007-2012 era, SHA1 was common. Also, the salts will slow down cracking a little for passwords not already known.


Nobody uses rainbow tables or cares to mitigate them. People care that GPU rigs get hundreds of billions of hashes-per-second[1] against a single-iteration salted hash. So all 8-char case-sensitive alphanumeric combinations can be checked in 18 minutes[2].

1 - https://gist.github.com/epixoip/a83d38f412b4737e99bbef804a27...

2 - (pow(26+26+10, 8) / (2*pow(10, 11))) / 60
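The same estimate, spelled out in Python with explicit parentheses. The 2 * 10^11 hashes-per-second rate is the figure used in the footnote, and the names are only for illustration.

```python
# Time to exhaust all 8-character case-sensitive alphanumeric passwords
# against a single-iteration salted hash at the rate used above.
ALPHABET_SIZE = 26 + 26 + 10          # lower + upper + digits = 62
PASSWORD_LENGTH = 8
HASHES_PER_SECOND = 2 * pow(10, 11)   # rate assumed in the footnote

keyspace = pow(ALPHABET_SIZE, PASSWORD_LENGTH)
seconds = keyspace / HASHES_PER_SECOND
minutes = seconds / 60

print(f"{keyspace:,} candidates, about {minutes:.1f} minutes")
```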


If you are attacking just one password, that makes sense. But if you want to check all the compromised accounts for easy to guess passwords, a salt will increase the cost.
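A small sketch of that cost difference, counting hash operations for a made-up set of accounts and guesses. With no salt, each guess is hashed once and compared against every account; with per-account salts, each guess has to be re-hashed for every account.

```python
import hashlib
import os


def sha1_hex(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


# Hypothetical leaked accounts and a tiny guess list (illustration only).
account_passwords = [b"password", b"hunter2", b"qwerty"]
guesses = [b"123456", b"password", b"qwerty"]

# Unsalted: one hash per guess, then a lookup across all accounts.
unsalted_db = [sha1_hex(p) for p in account_passwords]
unsalted_ops = 0
unsalted_hits = 0
for guess in guesses:
    digest = sha1_hex(guess)
    unsalted_ops += 1
    unsalted_hits += unsalted_db.count(digest)

# Salted: every (account, guess) pair needs its own hash.
salted_db = []
for p in account_passwords:
    salt = os.urandom(16)
    salted_db.append((salt, sha1_hex(salt + p)))

salted_ops = 0
salted_hits = 0
for salt, stored in salted_db:
    for guess in guesses:
        salted_ops += 1
        if sha1_hex(salt + guess) == stored:
            salted_hits += 1

print(f"unsalted: {unsalted_ops} hashes, {unsalted_hits} cracked")
print(f"salted:   {salted_ops} hashes, {salted_hits} cracked")
```

The multiplier is the number of accounts, so whether it matters depends on how fast the underlying hash is.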


Salt won't save you. To check a stolen database for the most common passwords, you try the top one million most common passwords against each hash; at 200 billion guesses per second, that works out to about 200,000 hashes per second.

A dictionary-based attack that tries variants and inserts digits and spends one second per hash will catch the less common passwords.


No, they won't.


Was the salt leaked?


Another important question: Did they use a unique salt per password? Their announcement is ambiguous.


That's the general idea of a salt, yes. I'd argue that a salt which isn't unique per-user isn't really a salt at all.



