
The utility of the hash for a user and some potential attacker is the same. Yes, passing 8 fewer bits means that there are 256 times as many possible sites you might have visited. But it also means that you get 256 times as many false positives of websites being labeled as phishing. This is what GP meant by it being a problem without a good solution.
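To put rough numbers on that tradeoff, here is a back-of-the-envelope sketch. It assumes hashes are uniformly distributed, and the list size of one million entries is a made-up figure for illustration, not a real count from any blocklist.

```python
def expected_matches(list_size: int, prefix_bits: int) -> float:
    """Expected number of blocklist entries sharing one given prefix,
    assuming uniformly distributed hashes."""
    return list_size / 2 ** prefix_bits


# Hypothetical blocklist size, chosen only for illustration.
LIST_SIZE = 1_000_000

with_32_bits = expected_matches(LIST_SIZE, 32)
with_24_bits = expected_matches(LIST_SIZE, 24)

# Dropping 8 bits multiplies the expected collisions by 2**8.
ratio = with_24_bits / with_32_bits
print(ratio)
```

The same factor applies in both directions: an observer has 2**8 times as many candidate sites per prefix, and an innocent URL is 2**8 times as likely to hit a listed prefix.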

One possible option is to do what haveibeenpwned does, where you send fewer bits and then check the results locally. That would be a good improvement to the system's privacy, but you probably want to avoid downloading, on every page load, the hashes of every malicious website whose hash starts with the given 3 bytes (I'd assume the list is quite large).




SafeBrowsing already does the thing you're describing as a "possible option".

Say I go to https://fakebank.example/security/login and Google has decided all of fakebank.example is a phishing site.

My browser computes [among other things] SHA256('fakebank.example') and then it snips off the first four bytes and compares that to a large dataset it got from Google. It fetches updates to this dataset every few hours. Sure enough the four byte prefix is present in the dataset.

So, we've got an alarm. My browser calls Google, but it doesn't tell them it's thinking about https://fakebank.example/security/login at all; it just tells them the 4 byte prefix. Google responds with a list of full SHA256 hashes beginning with that prefix that it considers _right now_ to be phishing. The list might be empty (maybe fakebank.example was actually a Greek yoghurt company subject to a PHP 4.x attack, and they upgraded PHP and removed the phishing site so now it's fine), but if it contains the entire SHA256 hash we calculated, then I get an alert telling me that my browser thinks this is a phishing site and I might want to not visit.
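A minimal sketch of that two-step check, as I understand it from the description above. The names `is_flagged` and `fetch_full_hashes` are hypothetical and not part of any real SafeBrowsing client API, and hashing the bare hostname is a simplification of the "among other things" the browser really computes.

```python
import hashlib
from typing import Callable, Iterable, Set

PREFIX_LEN = 4  # bytes of the SHA256 digest that are sent to the server


def full_hash(expression: str) -> bytes:
    """SHA256 digest of a host/URL expression (simplified: no canonicalization)."""
    return hashlib.sha256(expression.encode("utf-8")).digest()


def is_flagged(
    expression: str,
    local_prefixes: Set[bytes],
    fetch_full_hashes: Callable[[bytes], Iterable[bytes]],
) -> bool:
    """Return True only if the server currently lists the full hash.

    local_prefixes: the periodically updated prefix dataset (hypothetical shape).
    fetch_full_hashes: stand-in for the network call; it only sees the prefix.
    """
    digest = full_hash(expression)
    prefix = digest[:PREFIX_LEN]
    if prefix not in local_prefixes:
        return False  # no local hit, so nothing is sent to the server
    # Local hit: ask for the full hashes under this prefix and compare locally.
    return digest in set(fetch_full_hashes(prefix))


# Toy stand-ins for the local dataset and the server's answer.
bad = full_hash("fakebank.example")
local = {bad[:PREFIX_LEN]}


def server(prefix: bytes) -> list:
    return [bad] if prefix == bad[:PREFIX_LEN] else []


print(is_flagged("fakebank.example", local, server))
```

Note that the final comparison happens in the browser, so the server learns the 4 byte prefix but never the full hash or the URL.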



