Pwned passwords, open source in the .NET foundation and working with the FBI (troyhunt.com)
323 points by jffry on May 28, 2021 | 72 comments

HIBP has really grown from kind of a cool toy to a serious public good. Kudos to Troy for his great stewardship over the years, you can tell he is passionate about this and is very careful about making sure this dataset is used for everyone's benefit. I'm glad he decided to head this direction after the discussions of acquisition a while back.

Yea, I feel like there should be some kind of internet nobel prize and that he should be one of the few to win it. He's definitely a badass.

The Swartz Prize? I like that namesake better than some others


"Criticism of Nobel focuses on his leading role in weapons manufacturing and sales, and some question his motives in creating his prizes, suggesting they are intended to improve his reputation."

Here in Scandinavia, the usual narrative about Alfred Nobel is that he wanted to offset the damage his inventions have contributed to the world. To acknowledge that, he probably felt that it was intertwined with his reputation and his own conscience.

Yea, people are fallible. It seems ok to me if he tried to make amends by trying to add a little incentive for people to better the world.

> some question his motives in creating his prizes, suggesting they are intended to improve his reputation.

That seems backwards to me. Part of the purpose of reputation is to incent people to do good things.

If his primary aim was to improve his reputation, wouldn't he have created the Nobel Prize before his death?

I think his brother died and newspapers thought it was Alfred and there were some very unkind reports about him that said stuff like he was a merchant of death and good riddance. That made him rethink his long-term legacy and he tried to improve it.

Yea, that name definitely gets my vote.

Tim Berners-Lee should be one of the first in the queue.

Tim Berners-Lee is in the World Wide Web Hall of Fame and the Internet Hall of Fame, and has received the Turing Award, among others: https://en.wikipedia.org/wiki/List_of_awards_and_honours_rec...

Yeah love the guy, saved me a few times !

Brief mention in the post, but FYI to the thread - .NET Foundation is an independent non-profit (501c6) foundation that supports the .NET open source community. It's run by a community-elected board and funded by member donations and a diverse group of corporate sponsors. This is a great example of the kind of work they do to support the community.

Disclaimer: I was on staff at .NET Foundation 2016-2019.

"Incorporated by Microsoft" feels like an important detail to me.

Periodic reminder that fast cryptographic hash functions like SHA-x and MD5, even with a non-secret salt (pepper?), are not designed to resist brute-force attacks on data as low-entropy as passwords.

Use scrypt [0], bcrypt [1], or argon2 [2], which are key derivation functions (KDF) built on top of pseudo-random functions (PRF) and designed to be slow.

In one interesting example, the Keybase founders devised an experimental scheme to generate Bitcoin wallet addresses from a passphrase and a salt using KDFs [3], the advantage here being that the wallet is then fully non-custodial (note, there are better ways to implement non-custodial wallets [4]).

[0] https://blog.filippo.io/scrypt-all-the-things/

[1] https://codahale.com/how-to-safely-store-a-password/

[2] https://signal.org/blog/secure-value-recovery/

[3] https://keybase.io/warp/

[4] https://github.com/novifinancial/opaque-ke

AFAIR the reason this database is stored as SHA-1 hashes is because that's what a large amount of the original data dumps contained. Moving to a harder hash would have required cracking all of them first and wouldn't do much more to ensure the database can't be directly used as a password list for attacks.

I think you misunderstood me. I meant that developers must use KDFs and not cryptographic hash functions to store user passwords (at least until WebAuthn takes center stage), so that in the event they are pwned and have their db stolen, the brute-force attacks wouldn't be as effective.
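To make the KDF suggestion concrete, here is a minimal sketch of storing and verifying a password with scrypt from Python's standard library. The function names and the cost parameters (`n`, `r`, `p`) are illustrative assumptions, not values anyone in the thread recommended; tune them for your own hardware.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Illustrative cost parameters (assumptions, not a recommendation).
SCRYPT_N = 2**14  # CPU/memory cost factor
SCRYPT_R = 8      # block size
SCRYPT_P = 1      # parallelization factor


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key) for storage; the salt is random per user."""
    salt = os.urandom(16)
    key = hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt, n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32,
    )
    return salt, key


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt, n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32,
    )
    return hmac.compare_digest(candidate, expected)


if __name__ == "__main__":
    salt, key = hash_password("correct horse battery staple")
    print(verify_password("correct horse battery staple", salt, key))
    print(verify_password("wrong guess", salt, key))
```

The point of the slow, memory-hard derivation is that every guess against a stolen database costs the attacker the same work, which a single SHA-x or MD5 call does not.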

This makes me wonder if the FBI is taking stolen credentials off these sites in order to use them against suspects under investigation. Do they have my passwords? Might they have yours?

It's written in the blog that the FBI is only providing breach datasets, not getting them. Though every dataset in HIBP is public already

IIRC every dataset in pwned passwords is public, not in all of HIBP

Haveibeenpwned gets their lists in the same manner anyone else can. FBI already has those lists, probably faster than haveibeenpwned does.

I would be willing to bet most intelligent FBI agents have their own stash of data breaches regardless of what agency policy is. Data breaches are truly invaluable in online investigations.

FBI agents don’t break into computers unless they are part of the forensic team.

The FBI forensic team / lab definitely has plenty of dictionaries.

Is it legal to collect lists of hacked accounts, emails, etc. from places like hacking forums? Wouldn't that count as stolen property, which could make the FBI go after you?

Coincidentally, just a few days ago a Russian owner of a platform for such exchanges got his sentence:


"DEER.IO sold not only stolen accounts, like the gamer accounts identified in the plea agreement, but also Americans’ personal information, to include names, current addresses, telephone numbers and at times Social Security numbers. On March 4, 2020, the FBI purchased 1,100 gamer accounts, and on March 5, 2020, the FBI purchased the personal information for over 3,600 Americans. On March 7, 2020, Firsov was arrested by the FBI in New York City when he flew into JFK Airport from Moscow."

Troy Hunt has blogged a few times about the legitimacy of his service, and it's clearly been a carefully walked path to where it is today.

I think ultimately it comes down to the fact that he's not redistributing the lists of emails, and he doesn't retain pairs of (email + password hash). He designed the site to provide two useful queries ("what breaches included my email" and "has this password been seen in any breaches you processed"), which strike a responsible balance between disclosure and privacy.
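For anyone curious what the second query looks like mechanically, here is a rough sketch under the assumption (mentioned elsewhere in the thread) that the corpus is a set of SHA-1 digests. The helper names and the tiny in-memory corpus are hypothetical; this is not HIBP's actual implementation or API.

```python
import hashlib
from typing import Iterable, Set


def sha1_hex(password: str) -> str:
    """Uppercase hex SHA-1 of the UTF-8 password (the casing is an arbitrary choice here)."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


def build_corpus(passwords: Iterable[str]) -> Set[str]:
    """Store only digests, never the plaintext passwords or any email pairing."""
    return {sha1_hex(p) for p in passwords}


def is_pwned(password: str, corpus: Set[str]) -> bool:
    """Answer 'has this password been seen?' without revealing anything else."""
    return sha1_hex(password) in corpus


if __name__ == "__main__":
    # Hypothetical stand-in for a breach corpus.
    breached = build_corpus(["password", "123456", "qwerty"])
    print(is_pwned("password", breached))
    print(is_pwned("a long random passphrase nobody leaked", breached))
```

Because only digests are kept and there is no link back to an email address, the lookup can answer the question without handing out a ready-made credential list.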

Moreover, he has written about his intentions and acted with a fair deal of transparency, which is a strong contrast to some of the shady behavior you'll see from people dancing in the gray areas of the law.

That was probably a stepping stone to partnering with other organizations, which has snowballed into having the cooperation of the FBI, as well as the endorsement of multiple countries' governments.

Not a lawyer, but my guess would be that whether it is legal or not depends on what (if anything) you intend to do with the information. This is based mainly on perusal of this page summarizing a bunch of state identity theft statutes: https://www.ncsl.org/research/financial-services-and-commerc...

Of the ones with details listed it seems most of them require intent to defraud or something similar in addition to possession of personal information. I think paying for the info could also be problematic, there are a few trafficking laws in that list.

In a similar vein, possession of the info with the intent to use it to hack into something would probably run afoul of the CFAA or other anti-hacking laws.

If you collect that info with the intent to submit it to haveibeenpwned I think you would probably be fine.

If you collect that info just for fun, but don't do anything with it, I suspect that's legal, but probably not well-advised, as I suspect cops/prosecutors/jurors would have trouble believing someone did that for fun, and would interpret it as evidence that you were up to something nefarious.

I don't think the stolen property angle is an issue. Digital information can't really be stolen: usually the problem is you've committed a copyright violation. In the US at least I doubt user names and passwords would qualify for copyright protection. (Generally, collections of facts do not qualify.)

> Digital information can't really be stolen

That's not completely true. Generally, most ways you can "steal" data are illegal in their own right. Examples include computer fraud and wire fraud.

The Computer Fraud and Abuse Act does actually detail punishment for the trafficking of passwords, but said trafficking must be done knowingly and with an intent to defraud, so this specific usage would probably be fine

I hazard it depends on the jurisdiction - one thing I remember from my vague parsing of Australia's cyber crime laws (IANAL etc) is that possession of these lists/hacking tools was only criminal where there was an intent to use them for another criminal act (Crimes Act 478.3, probably superseded [1])

[1]: https://www.legislation.gov.au/Details/C2004A00937

I imagine the FBI strongly considers scale and intent. Troy Hunt is very obviously not reselling accounts, for example.

Though keep in mind that the FBI can go after whoever they want, sometimes without reason at all.

Open-sourcing it is a big contribution. The work behind the data is huge and extremely significant. Thanks Troy!

An alternative service implementation that I wrote, with slightly different technical requirements, is here: https://github.com/janos/compromised. It is actively used behind the NewReleases.io service.

It focuses on extremely low memory usage and supporting very high request rates on commodity hardware, a cheap VPS, or cloud instances.

Am I the only one a bit surprised that the feed from the FBI will be NTLM and SHA1 hashes? (especially the NTLM)

Would it make more sense to break the NTLM hashes and then rehash with something more secure (even a better SHA, like SHA256)?

This is not quite as feasible for SHA1 (but it actually might be, even in bulk -- this was 9 years ago[0]!) as for NTLM, but I remember cracking NTLM hashes in bulk back in the late 90's on .3 Ghz servers, and I'm sure it would take a heartbeat to do it today.

0. https://arstechnica.com/information-technology/2012/12/oh-gr...

I suspect it's not that the FBI has NTLM/SHA1 hashes as their base data set, but that they don't want to give out the actual passwords and settled on these two hashes instead. I think HIBP was already giving out their dataset in this format.

Fwiw, sha1, 256 or even md5 have similar levels of security when it comes to password hashing. The security properties you want for password hashing are very different than normal hashing.

In what sense do you consider SHA-256 superior to SHA-1?

If you're looking for preimage resistance, unless your passwords contain more than 128 bits of entropy, I suspect brute forcing your password is still faster than a preimage attack on SHA-1, and will probably remain so for at least a decade. Collision attacks aren't useful for passwords (attacker chooses two passwords such that they have the same hash... I can't imagine a threat model under which this is useful to the attacker if that hash doesn't match any target's password hash. Maybe there is such a threat model, but it has to be a very outside-the-box attack.)

If you're worried about effort needed to brute-force the password, use Argon2 or another memory-hard password hash/KDF.
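To put the 128-bit figure in perspective, here is a back-of-the-envelope sketch of password entropy. It assumes every character is drawn uniformly and independently from a fixed alphabet, which only holds for machine-generated passwords; human-chosen ones have far less entropy than this formula suggests. The function names are made up for illustration.

```python
import math


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy of a uniformly random string: length * log2(alphabet size)."""
    return length * math.log2(alphabet_size)


def min_length_for_bits(alphabet_size: int, target_bits: float) -> int:
    """Smallest length whose uniform-random entropy reaches target_bits."""
    return math.ceil(target_bits / math.log2(alphabet_size))


if __name__ == "__main__":
    # 8 lowercase letters vs. 20 characters of [A-Za-z0-9].
    print(round(entropy_bits(26, 8), 1))
    print(round(entropy_bits(62, 20), 1))
    # Length needed to reach 128 bits with a 62-character alphabet.
    print(min_length_for_bits(62, 128))
```

Under these assumptions even a 20-character alphanumeric random password stays just below 128 bits, so for realistic passwords the guessing attack, not the hash function's preimage resistance, is what matters.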

I can understand Troy's rationale, but I would prefer someone else than the .NET Foundation. The foundation should focus on .NET and not be a kitchen sink of everything written in .NET.

I think this is a good partnership. One of the goals of open source software foundations is to eliminate risks due to dependency on a small developer team. In this case, HIBP is a fantastic resource being maintained, and paid for, by one developer. That's not good for the maintainer, it's not good for the community, and it's risky to the community. If the sole developer wins a spot on the next flight to Mars or time travels to A.D. 802,701, the code becomes unmaintained and the site hosting payment expires. Software foundations are governed by rotating teams and aren't dependent on a single individual. This is an example of something that's relatively low investment for an established software foundation - some legal fees and discounted cloud hosting from a sponsor - and benefits the whole community.

The issue is that the .NET Foundation is a language-oriented foundation, not a cyber security one. Sending the project there looks like an ad for a Microsoft initiative and not something done with the best interest of HIBP in mind. Just as an example, there is a foundation literally called the Open Source Security Foundation [1]. If I wrote a Python security tool that was useful for the community, I'd think first of transferring it to them, not to the Python foundation.

1. https://openssf.org/

That is exactly my train of thought sans the Microsoft worries :).

Or even the Linux Foundation

Hi Jon, ... it is definitely a good thing for the project to be guarded by more than Troy's private money and legal situation. I do not disagree on these benefits. It is more about the focus of the foundation.

PS: thanks for years of "Jon loves Community"

Honest question: an email I infrequently use is on the list at https://haveibeenpwned.com/

Is it safe if I simply add 2-factor authentication (edit: and change the password, of course), or do I need to add something else?

The only thing I'd add to the other comment (by babelfish) is: I'm not sure from your description whether your email account itself was compromised or merely an account on some site that is connected to your email (for example, a hacker news account which you used that email address to sign up with).

If the email account itself was compromised, then you should also check any account that you signed up for using that email address, to make sure that you still have access (because if someone had access to your email, they could have used it to reset the password on those other sites).

Change your password, change the password of any other site where you use a variation of that password, and enable 2FA on all your accounts. Use a password manager and change your passwords to longer randomly-generated ones over time (most password managers make this easy).

I did this on all of my accounts over the course of a month. Finally having an inventory of my accounts meant that I could change the email on all accounts over a weekend.

I'm shell shocked and now have a chemical dependency on locking things down. All of my machines now use ssh keys+passphrase and I no longer put any unencrypted traffic over LAN. Obviously there is a source of stress in my life.

2FA is such a hassle that IMHO it's only worth it for high-stake accounts. 20+ characters long random passwords are totally adequate security for most accounts and you don't get constantly harassed by 2FA prompts.
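If you want to see what a "20+ characters long random password" amounts to without a password manager, here is a small sketch using Python's `secrets` module. The alphabet and the default length are assumptions picked for illustration; many sites have their own character rules.

```python
import secrets
import string

# Assumed alphabet: letters and digits only, to avoid site-specific symbol rules.
ALPHABET = string.ascii_letters + string.digits


def generate_password(length: int = 20) -> str:
    """Build a password from a cryptographically secure random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(generate_password())
    print(generate_password(32))
```

The important part is using `secrets` rather than `random`, since the latter is not designed for security-sensitive values.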

WebAuthn prompts aren't a big hassle. On this desktop I reach over and touch the Security Key, on my phone I tap the fingerprint sensor. Because the phone is entitled to set UV since it knows that's my fingerprint not somebody else picking up the phone, they could replace the password step which is more annoying.

WebAuthn is good, easy to use, quick to complete, and more secure than "enter the number we send you", so I like it. Unfortunately most services (that I use, anyway) are stuck in the "let's make you wait 1-2 minutes for an SMS" or "use our/your authenticator app". I find this especially annoying in conjunction with services that seem to use "risk-based authentication", because using an adblocker and anti-fingerprinting = extreme maximum risk for those, i.e. let's force 2FA auth for every action even after five minutes (sometimes, seconds!).

And as far as RBA goes, if they don't go full-2FA, they'll often somehow go for password instead of second factor to verify. I tend to keep my password manager locked when not in active use, so that's more hassle for me on services that DO use WebAuthn (Github, Google) than if they'd just use WebAuthn for the "high risk action" verification.

2FA is more than adequate on its own in a lot of cases: Attackers tend to go for low-hanging fruit.

Also, how worried you need to be depends on what you use the account for... People go off the deep end about securing every single account like it's Fort Knox, but you need to consider what is at risk if a given account is compromised, and what damage could be done with it.

Obviously look out for targeted phishing from people who know that you have a registration with a particular site. But if you're on HN, that might go without saying.

I love how the term "pwn" came from Counterstrike culture back when I was playing it at the turn of the century, and it's still in use today.

lol, pwn did not come from counterstrike. It originated from "leetspeak", which were creative misspellings that script kiddies and anarchist hackers on bbs boards (or IRC chans) would create to bypass text filters or go undetected by mods or operators. It later became part of internet culture adopted to mock n00bs.

Nope. It 100% came from Counter-strike. There was a typo in one of the messages and that's how it got famous.

This is not true. You're declaring objective history from what seems to be your personal anecdotes (and you say you're guessing in another comment).

HIBP links to this article which shows the history of the term before any of these games: https://www.inverse.com/gaming/pwned-meaning-definition-orig...

That's an enjoyable article, but geez is it light on details.

The one tangible instance mentioned is the Spice Girls hack — but follow the link to discover they didn't even use pwn, but 0wned!

I tried searching a few pre-00s bbs/usenet archives but could only find a couple of gaming-related instances. [1][2]

It'd be pretty interesting to see some hacking groups usage in the 80-90s. Maybe someone with better access to archives could see some trends?

[1] https://groups.google.com/g/alt.games.everquest/c/GN9tm8Esrw...

[2] https://groups.google.com/g/rec.games.computer.ultima.online...

Funny, WoW nerds have this exact same origin myth about the term, and that came out 4 years after CS 1.0.

Ah, the youngsters... I started playing Counter-Strike at beta 6. You could be pwned years before that.


I'm not the person you're arguing with originally, but you may want to do some research before being quite so pompous.


>The term was created accidentally by the misspelling of "own" in video game design due to the keyboard proximity of the "O" and "P" keys. It implies domination or humiliation of a rival,[21]


>There are various theories about the etymology of pwned. One of the more popular accounts is that it originated in the online computer game World of Warcraft, where a map designer misspelt owned (where own was intended to be used in the sense of 'conquer' or 'dominate').

What was that about being "confidently incorrect" again?

The comment you replied to is the first and only one by that account, and not the original user either.

The term "pwn(ed)" was popular before any of these games existed. Here's a much more thorough history lesson, linked from HIBP itself: https://www.inverse.com/gaming/pwned-meaning-definition-orig...

"At this point, pwn allegedly meant to demote or dethrone someone, but the slang was quickly picked up by early computer-users that exchanged messages on FidoNet, a system created in the 1980s for exchanging emails or text on digital bulletin boards. This is where pwn slowly transformed into the insult we know today."

Is there something about HIBP that makes them more authoritative than anybody else other than having 'pwned' in the name? Genuine question, not being facetious. All I can find is conflicting reports all over the place so I'm trying to understand why, if this is the "true" etymology, this isn't the explanation given in Wikipedia for example? I guess I just don't understand who I should trust, or more importantly, why.

I'm not backing any particular account (in fact the two links I provided disagree with each other), my point was just that it seems silly to be so overly confident on ANY account. Unless, again, there's something special about HIBP in this regard? I'm definitely no internet historian so you'll have to fill me in a little further. :)

>One of the more popular accounts is that it originated in the online computer game World of Warcraft, where a map designer misspelt owned (where own was intended to be used in the sense of 'conquer' or 'dominate').

I'd say the fact that the author of the text obviously doesn't know the difference between World of Warcraft (an MMO) and the Warcraft series (a set of RTS games) casts some doubt on this "popular account". Especially since the games are some ten years apart.

Likewise, my good friend.

The origin of pwn in gaming context is in quake multiplayer which predates CS by some years. I also recall seeing people on irc using it before that in the same context as cracking a system.

Back in the day, I played Counter-strike, Quake, Warcraft, Doom (I played PvP against people in my dorm using modem using IPX protocol pre-Internet). The first place I saw it was definitely Counter-strike. My guess is that it originated in CS and then leaked out from there.

1989: http://phrack.org/issues/24/11.html

Unless Y2K comes before 1989 it most certainly did not come from Counter-strike. Pwning something had existed throughout the 1990's hacker culture and likely was in use in the 80's too. There are some of us old enough to remember it being used. It probably stemmed from QWERTY keyboards and o/p being close to one another. "Pwned" is an easy typo from "Owned" and "owned by [some hacker/hacker group]" was always a common way to deface hacked websites. It easily predates Counter-strike's release date by at least a decade so it could not have originated from CS.

“Phrack World News” and its acronym are a different usage to the word synonymous with “own”, which is not saying it wasn’t in use before Counter-strike, but the link isn’t good evidence for it.

I overlooked the acronym and that is a very fair point.

Do a bit better research. It stands for Phrack World News in this context. I also read Phrack back in the day as well.

How about trying to search through all the Phrack articles to find the use of the word "pwn" that doesn't refer to Phrack World News? Spoiler alert: you won't find any usage of "pwn" or "pwned" until well past 2000. If it was so much part of hacker culture you would have seen a reference in the earlier articles, but you don't.

I saw pwn being used as a synonym for own in the mid-90's so it definitely predates Counter Strike.

That being said, it probably originated as a typo and that could have happened on multiple occasions.

You are definitely correct.

Hack/phreak BBS culture and leetspeak went mainstream through IRC and video game chat.

Everything old is new again, nothing new under the sun, etc.

I'm waiting for !!1!!!1!11 to make a resurgence.

> The first place I saw it was definitely Counter-strike

Nobody ever made that hilarious typo in a shooter before CS, lol

You are the arbiter of truth.

Ah yes, experts in open source... Microsoft

Makes me wonder if they will try nefarious things like uploading the hash of passwords they don't have to try and get people to change them.

Unless the goal is to make Microsoft rich with storing all 2^256, anything they actually can guess would be worth changing

