
Does Chrome send the unique identifier with Google Fonts API requests? If so, they don't need cookies.



Are you talking about the x-client-data header? It isn't unique, but it is relatively high-entropy, at up to 13 bits. [1] It is used for evaluating the effect of experiments that Chrome is running on other Google services, which does include ads. But it is not used for personalization (I wish they would say that publicly).

For example, when I look at a Google Fonts request in Chrome developer tools I see:

    x-client-data: CKe1yQEIkrbJAQiitskBCMS2yQEIqZ3KAQiVocsBCOeEzAEIhKvMAQjys8wBCL+1zAE=
    Decoded:
    message ClientVariations {
      // Active client experiment variation IDs.
      repeated int32 variation_id = [3300007, 3300114, 3300130, 3300164, 3313321, 3330197, 3342951, 3347844, 3348978, 3349183];
    }
Each of those numbers represents an experimental treatment that is currently active for my Chrome instance. (It looks like more entropy because it's multiple values, but they're all derived from a single 13-bit per-instance seed.)
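
For anyone who wants to reproduce the decoding, here is a minimal sketch in Python. It assumes the header value is a base64-encoded protobuf message in which each variation ID is a varint under field number 1, which is what the bytes of the example above look like; the function names are my own, and anything with a different wire type is rejected rather than guessed at.

```python
import base64
from typing import List, Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Read one base-128 varint starting at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if not byte & 0x80:              # high bit clear marks the last byte
            return value, pos
        shift += 7


def decode_variation_ids(header_value: str) -> List[int]:
    """Return the field-1 varints from a base64 x-client-data value.

    Assumption: every field in the message uses wire type 0 (varint),
    as in the example header from this thread.
    """
    raw = base64.b64decode(header_value)
    ids = []
    pos = 0
    while pos < len(raw):
        tag, pos = read_varint(raw, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type != 0:
            raise ValueError("unexpected wire type %d" % wire_type)
        value, pos = read_varint(raw, pos)
        if field_number == 1:
            ids.append(value)
    return ids


EXAMPLE = "CKe1yQEIkrbJAQiitskBCMS2yQEIqZ3KAQiVocsBCOeEzAEIhKvMAQjys8wBCL+1zAE="

if __name__ == "__main__":
    print(decode_variation_ids(EXAMPLE))
```

Run against the header quoted above, this should give the same ten variation IDs as the decoded listing.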

[1] https://www.google.com/chrome/privacy/whitepaper.html#variat...


> is relatively high entropy at <= 13-bits

That is true only if we pretend those 13 bits are the only identifying information being sent to Google when requesting a font. The HTTP request is almost certainly being sent to Google inside an IP packet. For most[1] requests, there are at least 24 additional bits (why 24? see [3]) of highly identifying data in the IPv4 Source Address field. More fingerprinting can probably be done on other protocol fields, and IPv6 obviously adds an additional 96 bits. Yes, IP addresses are not unique, but ~13 bits is easily sufficient to disambiguate most hosts on a private network behind a typical NAT. Correlating the tuple {IPv4 Src Addr, x-client-data} received on a font request with a user is trivial: it only requires the user to log in to any Google webpage that includes a font request.
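
To put a rough number on the NAT claim, here is a back-of-the-envelope sketch in Python. It assumes each Chrome installation draws its low-entropy seed uniformly and independently from 8000 values (the 0 to 7999 range in the whitepaper), and asks how likely it is that every installation behind one shared IPv4 address ends up with a different seed. The function names are mine, and this models only the seed, not any other fingerprinting signal.

```python
import math

# Assumption: low-entropy seeds are uniform over 0..7999, per the whitepaper.
SEED_SPACE = 8000


def seed_entropy_bits(space: int = SEED_SPACE) -> float:
    """Entropy in bits of a uniform draw from `space` values."""
    return math.log2(space)


def prob_all_distinct(hosts: int, space: int = SEED_SPACE) -> float:
    """Probability that `hosts` independent uniform seeds are all different.

    This is the usual birthday-problem product: (1 - 0/N)(1 - 1/N)...
    """
    p = 1.0
    for i in range(hosts):
        p *= (space - i) / space
    return p


if __name__ == "__main__":
    print("seed entropy: %.2f bits" % seed_entropy_bits())
    for n in (2, 5, 10, 50):
        print("%3d installs behind one address: P(all distinct) = %.4f"
              % (n, prob_all_distinct(n)))
```

Under those assumptions, ten installations behind one address all have distinct seeds roughly 99.4% of the time, so the address plus the seed separates the hosts of a typical household or small office in most cases.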

>> re: your [1]

    A given Chrome installation may be participating in a number
    of different variations (for different features) at the
    same time. These fall into two categories:

       Low entropy variations, which are randomized based
         on a number from 0 to 7999 (13 bits) that's randomly
         generated by each Chrome installation on the first run.

       High entropy variations, which are randomized using
         the usage statistics token for Chrome installations
         that have usage statistics reporting enabled.
How many users have 'usage statistics reporting' enabled, and are therefore in a "High entropy variation"? Is it enabled by default, and thus disabled only by the minority of people who know how to opt out?

[1] Google reports[2] they currently see about a 60%/40% ratio of IPv4/IPv6.

[2] https://www.google.com/intl/en/ipv6/statistics.html

[3] My previous posts on this topic: re: x-client-data https://news.ycombinator.com/item?id=23562285 and re: 24-bits-per-IPv4 https://news.ycombinator.com/item?id=15167059



