
If it were me, I'd make third-party font sources require a SHA hash. In pseudocode:

    url("https://fonts.googleapis.com/comic-sans", sha="abcd1234")

This way:

- If my browser has comic-sans cached, no request is made

- Caching works even if the same resource is sourced from multiple places (e.g. I can host comic-sans locally, but if they got it from a CDN, they don't need to get it again)

- If a malicious site replaces a resource, that's flagged

I think the trick would be to make this optional at first (but bandwidth- and privacy-saving), and then gradually make it mandatory for more and more types of resources. AJAX responses obviously can't have SHA hashes, but JavaScript libraries can.
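To make the idea concrete, here is a minimal sketch of a cache keyed by content hash rather than by URL. Everything in it is an assumption for illustration: the `HashKeyedCache` class, the fake network dict, the example URLs, and the choice of SHA-256. It is not how any browser actually works.

```python
import hashlib
from typing import Callable, Dict


class HashKeyedCache:
    """Toy browser cache keyed by content hash instead of URL (hypothetical)."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch                  # stand-in for a real network request
        self._store: Dict[str, bytes] = {}   # sha256 hex digest -> resource bytes
        self.network_requests = 0

    def load(self, url: str, sha256_hex: str) -> bytes:
        cached = self._store.get(sha256_hex)
        if cached is not None:
            return cached                    # cache hit: no request, whatever the URL
        self.network_requests += 1
        body = self._fetch(url)
        if hashlib.sha256(body).hexdigest() != sha256_hex:
            # The resource was replaced or corrupted: flag it, don't cache it.
            raise ValueError(f"hash mismatch for {url}")
        self._store[sha256_hex] = body
        return body


# Made-up URLs and bytes, purely for illustration.
FONT = b"pretend font bytes"
FAKE_NETWORK = {
    "https://cdn.example/comic-sans": FONT,
    "https://self-hosted.example/comic-sans": FONT,
    "https://evil.example/comic-sans": b"tampered bytes",
}
FONT_HASH = hashlib.sha256(FONT).hexdigest()

cache = HashKeyedCache(FAKE_NETWORK.__getitem__)
cache.load("https://cdn.example/comic-sans", FONT_HASH)
cache.load("https://self-hosted.example/comic-sans", FONT_HASH)  # served from cache
print(cache.network_requests)
```

The second `load` uses a different URL but the same hash, so it never touches the network. Loading the tampered URL with `FONT_HASH` raises `ValueError` instead.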




Sounds like you're basically reinventing SRI: https://en.wikipedia.org/wiki/Subresource_Integrity

One issue with cross-site caching, though, is that it may enable timing-based attacks on privacy.


No, I'm not reinventing it, but extending it by:

1) Mandating it for certain types of resources

2) Extending caching to cover the cross-site case.

Can you please explain the proposed timing-based attack?


Websites can use whether or not a resource is cached (one way to measure that is how long it takes to load) to uniquely identify your browser and track you across the internet.

Another attack is to determine if you visited $popularWebsite by checking if resources it uses are cached (this could be useful to, for example, the Chinese government for surveillance on its citizens).
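A rough simulation of the second attack, assuming a shared cross-site cache. The latencies, the 20 ms threshold, the `SimulatedBrowser` class, and the resource names are all made up; a real attacker would time an actual resource load in the page.

```python
class SimulatedBrowser:
    """Toy model of a browser with one cache shared across all sites."""

    CACHE_HIT_MS = 1.0    # assumed latency when the resource is cached
    NETWORK_MS = 80.0     # assumed latency when it has to be fetched

    def __init__(self) -> None:
        self._cached = set()

    def load(self, resource_hash: str) -> float:
        """Load a resource and return how long it (pretend) took, in ms."""
        if resource_hash in self._cached:
            return self.CACHE_HIT_MS
        self._cached.add(resource_hash)   # a miss fills the cache
        return self.NETWORK_MS


def looks_cached(browser: SimulatedBrowser, resource_hash: str,
                 threshold_ms: float = 20.0) -> bool:
    """What the attacker's page does: time a load, compare to a threshold."""
    return browser.load(resource_hash) < threshold_ms


browser = SimulatedBrowser()
browser.load("hash-of-popular-site-logo")        # the user visits $popularWebsite

# Later, on the attacker's page:
visited = looks_cached(browser, "hash-of-popular-site-logo")
not_visited = looks_cached(browser, "hash-of-other-site-logo")
print(visited, not_visited)
```

Note that the probe itself loads the resource, so asking the same question twice returns "cached" the second time regardless of the true answer.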


Thank you. I've been thinking about your comment for 3 days, believe it or not.

It seems like:

- Only standard resources ought to be cached (e.g. D3, common fonts, etc.). Perhaps these could be a free registration with the browser maker (e.g. I can always get them from cdn.mozilla.org or something), with some constraints (e.g. minimum number of users, some delay, or similar). As a user, I ought to have the option to cache *all* of these (which is helpful in bandwidth-constrained settings), either on my machine or on a proxy. If I'm at Caltech, I can repoint my browser to grab these from localbox.caltech.edu.

- These shouldn't offer a unique fingerprint, since it only works once. If I needed to load comic-sans.ttf, I won't need to load it next time.

- I might be able to set a fingerprint (e.g. ask you to load 25 resources, and check if they're cached), but that's really for cross-site tracking (for which there are easier mechanisms), and it only works once. Once you've cached a resource, it's cached nearly forever. Your fingerprint changes each time, so it's not really traceable.
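Here is a small sketch of the "25 resources" fingerprint from the last bullet, to show why reading it back is a one-shot operation. Set membership stands in for the timing measurement, and the resource names and the `plant`/`read` helpers are hypothetical.

```python
from typing import Set

N_RESOURCES = 25
RESOURCES = [f"probe-{i}" for i in range(N_RESOURCES)]   # made-up resource names


def plant(cache: Set[str], user_id: int) -> None:
    """Encode user_id by loading only the resources whose bit is 1."""
    for i, name in enumerate(RESOURCES):
        if (user_id >> i) & 1:
            cache.add(name)


def read(cache: Set[str]) -> int:
    """Decode the ID by probing every resource. Probing is a load, so it caches."""
    user_id = 0
    for i, name in enumerate(RESOURCES):
        if name in cache:
            user_id |= 1 << i
        cache.add(name)   # after the probe, this resource is cached either way
    return user_id


shared_cache: Set[str] = set()
plant(shared_cache, 123456)

first_read = read(shared_cache)    # recovers the planted ID
second_read = read(shared_cache)   # every bit now reads as 1
print(first_read, second_read)
```

After the first read all 25 resources are cached, so the pattern is gone as long as nothing is evicted. That matches the "it only works once" argument, under the assumption that cached resources stay cached nearly forever.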

So the more I think about this:

(1) You raised a valid (and hard!) problem

(2) There seem to be reasonable solutions


I had a similar idea. In addition to caching and detecting if it has been unexpectedly changed, there are other benefits:

- The end user could have the option to enable/disable caching, and to clear the cache. Further configuration is also possible, e.g. to enable same-origin caching only.

- The end user could have the option to replace resources with their own regardless of where the files come from; there is one table keyed by hash and the value is the file to use instead, which might or might not be the same file (so the hash does not necessarily need to match the file that is being used instead).

- Features specific to the browser to make it more efficient could also be used when the user configures replacement of resources, e.g. if it can somehow implement jQuery in native code, or uses a different font format which is more efficient on the computer that it is running on.

- If archived copies of parts of web sites are being made, the archiver can efficiently check, by hash, whether it already has a file that a page uses.

However, requiring a hash probably should not be made mandatory.
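The user-side options in the list above (cache mode, plus a replacement table keyed by hash) could be sketched roughly like this. The `UserPolicy` class, the mode names, and both helper functions are hypothetical, just one way to lay it out.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class UserPolicy:
    """User-controlled settings (all names here are made up)."""
    cache_mode: str = "cross-site"   # "off", "same-origin", or "cross-site"
    # Replacement table: declared hash -> bytes to use instead.
    # The replacement does not have to match the hash it is filed under.
    overrides: Dict[str, bytes] = field(default_factory=dict)


def cache_key(policy: UserPolicy, page_origin: str,
              sha256_hex: str) -> Optional[Tuple[str, str]]:
    """Return the cache key for a resource, or None if caching is disabled."""
    if policy.cache_mode == "off":
        return None
    if policy.cache_mode == "same-origin":
        return (page_origin, sha256_hex)   # one cache partition per origin
    return ("*", sha256_hex)               # shared across all origins


def resolve(policy: UserPolicy, sha256_hex: str, verified_body: bytes) -> bytes:
    """Use the user's replacement if one is registered for this hash."""
    return policy.overrides.get(sha256_hex, verified_body)


policy = UserPolicy(cache_mode="same-origin",
                    overrides={"abcd1234": b"my preferred font"})
print(cache_key(policy, "https://a.example", "abcd1234"))
print(resolve(policy, "abcd1234", b"original font"))
print(resolve(policy, "ffff0000", b"some script"))
```

In "same-origin" mode two sites never share a cache entry, which gives up the cross-site bandwidth savings but also closes off the cross-site cache probing discussed earlier in the thread.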




