
Also, from the description of the Safe Browsing feature (as linked on the above page), it seems that it doesn't actually send (and thus leak) URLs; rather, it downloads a blacklist from Google periodically (~30min), and checks URLs against it locally... https://support.mozilla.org/en-US/kb/how-does-phishing-and-m...

(Though, for file downloads, some meta information seems to be sent if I'm reading correctly.)




I've worked with safebrowsing (v2) data/api before. It downloads and maintains lists of hash prefixes, which it checks locally against hashes of various versions of the URL. When it gets a match, it downloads a chunk of all full hashes that start with that prefix, to see whether the full URL/variant hash is one of them. Usually that list is very short, just one or two full hashes for a particular prefix.
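A minimal sketch of that flow (names like `check` and `fetch_full_hashes` are illustrative, not the real API; the actual lists use SHA-256 hashes, most commonly with 4-byte prefixes):

```python
import hashlib

# Hypothetical local database of 4-byte hash prefixes from the
# periodically downloaded blacklist.
local_prefixes = set()

def full_hash(expression: str) -> bytes:
    # Safe Browsing hashes each canonicalized lookup expression with SHA-256
    return hashlib.sha256(expression.encode("utf-8")).digest()

def check(expression: str, fetch_full_hashes) -> bool:
    """Return True if the expression is blacklisted.
    fetch_full_hashes(prefix) stands in for the network round trip
    that returns every full hash sharing that 4-byte prefix."""
    h = full_hash(expression)
    prefix = h[:4]
    if prefix not in local_prefixes:
        return False  # the vast majority of lookups end here, locally
    # Prefix hit: fetch all full hashes with this prefix, then compare
    # locally -- only the 4-byte prefix ever leaves the machine.
    return h in fetch_full_hashes(prefix)

# Usage sketch with a hypothetical bad entry and a stand-in "server":
bad = "evil.example/attack/"
local_prefixes.add(full_hash(bad)[:4])
server = lambda prefix: {full_hash(bad)}
```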


Note that it does indeed recheck against the remote copy on a blacklist hit.

So Google could indeed easily track a URL: add its hash prefix to the periodically downloaded list, observe the resulting full-hash request, and then return no match so the user never sees a warning.


I'm quite late with this, but this is incorrect. At no time is the URL sent to Google; in fact, at no time is even the hash of the full URL sent to Google. I'd suggest you re-read the safebrowsing protocol.

As ploxiln notes, if a local hit is found for a prefix of the (canonicalized) URL's hash, a request is made for all full hashes beginning with that prefix. The hash of the current URL can then be checked against that list locally.


I was oversimplifying, sorry. You are correct that the URL isn't ever explicitly sent.

However.

Generally there are only one or two full hashes that start with any given prefix sent to Google. Which means in practice sending the prefix may as well be leaking the actual URL to Google.

Especially as multiple hashes are checked per URL (the spec allows up to 30 host-suffix/path-prefix combinations in the worst case).

If Google wants to track a URL, they can do so.
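To see why a prefix hit narrows things down so much, it helps to look at how many lookup expressions a single URL expands into. A rough sketch of the host-suffix/path-prefix expansion described in the Safe Browsing spec (simplified: assumes an already-canonicalized URL with a non-IP host; `lookup_expressions` is an illustrative name, not the spec's):

```python
from itertools import product
from urllib.parse import urlsplit

def lookup_expressions(url: str) -> list:
    """Generate the host-suffix / path-prefix combinations a client
    hashes and checks, per the Safe Browsing spec (simplified)."""
    parts = urlsplit(url)
    host, path, query = parts.hostname, parts.path or "/", parts.query

    # Exact host, plus suffixes keeping at least the last 2 labels,
    # starting from at most the last 5 labels.
    labels = host.split(".")
    hosts = [host]
    for i in range(max(len(labels) - 5, 1), len(labels) - 1):
        hosts.append(".".join(labels[i:]))

    # Exact path with and without query, the root, and up to 3 more
    # successively longer path prefixes (4 root-based paths in total).
    paths = [path]
    if query:
        paths.insert(0, path + "?" + query)
    if "/" not in paths:
        paths.append("/")
    prefix = ""
    for component in [c for c in path.split("/") if c][:-1][:3]:
        prefix += "/" + component
        if prefix + "/" not in paths:
            paths.append(prefix + "/")

    return sorted({h + p for h, p in product(hosts, paths)})
```

For the spec's own example, `http://a.b.c/1/2.html?param=1`, this yields 8 expressions, each of which gets hashed and checked against the prefix list.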



