Google Fonts Analytics (fonts.google.com)
228 points by rememberlenny on Feb 19, 2020 | 185 comments



I was part of the team that founded this, about 10 years ago if memory serves. I'm still loosely involved, though no longer at Google. Inconsolata was one of the 20 fonts for the original launch, and we're now in the process of launching it as a variable font, with both width and weight axes. I'm also getting funding from them for Rust-based font tools.

Feel free to AMA, though no guarantees I'll have good answers.


> Inconsolata was one of the 20 fonts for the original launch

Am I understanding this correctly to mean that you were part of a group that invented this font? Independent of the great work you obviously did on everything else, if this is correct, I just wanted to give a specific thank you for that! It's been my go-to font for my terminal, text editor, monospace font in the browser, and even the font my Linux machines use when they boot up before launching the GUI for years now.


No, I drew Inconsolata myself, as well as inventing spiral-based tools for designing fonts, as an alternative to Béziers. I started work on it far in advance of working on the "Google Font API" as it was called at launch, so it was very natural to include it, along with other great fonts by other talented designers.

I'm glad you like the font!


> No, I drew Inconsolata myself, as well as inventing spiral-based tools for designing fonts, as an alternative to Béziers.

Man, I love your PhD thesis! I use it regularly to show my students an example of what a PhD is supposed to be; and, independently, to teach them some planar differential geometry. Your study of Euler's elastica is the best written account of it that exists (and I have read many more geometry books than I would like to admit).



> No, I drew Inconsolata myself, as well as inventing spiral-based tools for designing fonts, as an alternative to Béziers

Oh, wow! Keep up the amazing work!


Thanks for your work. As a programmer with minor dyslexia, Inconsolata was heaven-sent to me.


Curious, have your spiral based tools been documented/published somewhere?


Best place is probably my thesis: https://levien.com/phd/phd.html . All of the code is available under permissive open source license. I've been continuing to poke around with this and hope to have a newer set of tools before long.


Let me recommend that anyone interested in fonts, 2d graphics, CAD, etc. read Raph’s thesis. It’s great!


Was this created in order to track users across different sites or because google wanted to make the world a better place? :)


The latter.


Do you think it's still used for the latter rather than the former now?


Heh, that's a better question. I have no reason to believe that Google Fonts are being used to track users, but I don't have hard evidence on this either way. I am deeply concerned about the way the modern Internet is turning into a surveillance machine and its potential for abuse, but if I were evil, I'd probably go after much lower hanging fruit than fonts.


Thanks for this honest answer.


Either way, loading fonts from Google for every website is a stupid idea... it is why an extension like Decentraleyes is a good idea (because most websites don't think about the consequences).


I think Google might have better ways to track people. Fonts are also cached by CDNs and the local browser so I’m not sure if they provide good tracking data (they get loaded only for the first page visit for example and popular fonts like Open Sans are probably already in your cache when you open a page).

If you’re concerned about tracking as a website operator you can simply load the fonts from your own server, most of them can be found on Github and have liberal licenses (I think all fonts on fonts.google.com)


You can even install the fonts locally to your machine and your browser will favor the local copy. There are some tools like Skyfonts that let you install the top X from Google Fonts and it can be a fun way to speed up a lot of the web; with the tradeoff that not a lot of people have the top X installed for useful large versions of X and that becomes its own fingerprint for surveillance software (testing how long text takes to render in an off screen canvas, for instance).

It's been suggested that Browsers or Operating Systems choose a useful X value and install the top fonts everywhere like they used to install the classic "core web fonts".


Also there's an add-on called "Decentraleyes" that will cache fonts, amongst other things. It's one of 3 extensions I install in every browser I'm allowed to. The others are uBlock Origin and HTTPS Everywhere. All 3 have good privacy/security/performance benefits with effectively no breakage, in their default configurations.


Decentraleyes is an excellent add-on that improves browsing speed and privacy with no negative side effects. Currently, the only font cached by Decentraleyes is Noto Sans. It also caches Google's Web Font Loader script and many major JavaScript frameworks.

https://decentraleyes.org

https://git.synz.io/Synzvato/decentraleyes

Web developers who are concerned about privacy exposure from Google Fonts can use google-webfonts-helper to extract the font files and self-host them.

https://google-webfonts-helper.herokuapp.com/fonts

https://github.com/majodev/google-webfonts-helper


Correction: Decentraleyes doesn't actually cache any fonts. Noto Sans is just used for the add-on interface. Font caching would make an interesting feature if the add-on developer is willing to implement it, especially if it bypasses fingerprinting.


I stand corrected!


> You can even install the fonts locally to your machine and your browser will favor the local copy.

Note that some browsers are going in the opposite direction and actively hiding installed fonts from websites due to the potential for fingerprinting.


Delivery of fonts isn't what's used to track people. Rendering of fonts is.

Different machines render fonts in slightly different ways. Different GPUs (hardware differences), different drivers (kernel/driver differences), browsers (user-land differences), and sizes (user settings) can all contribute to a font being rendered differently per user, which increases the entropy of a font-based identifier.


I am slightly concerned, yes. This is why I chose to host the fonts for my 20-things.com sideproject for as long as I can afford that luxury.


I currently work on it as well, so happy to pitch in.


Any estimate when Inter can be expected?

https://github.com/google/fonts/issues/1455#event-2995287982


Does Google use this to fingerprint and track users across different websites?


The Google Font FAQ includes the question: What does using the Google Fonts API mean for the privacy of my users?

> "...your requests for fonts are separate from and do not contain any credentials you send to google.com while using other Google services that are authenticated, such as Gmail."

>"Google Fonts logs records of the CSS and the font file requests, and access to this data is kept secure."

https://developers.google.com/fonts/faq#what_does_using_the_...


So a lot of fluff and no actual reply. Users can be tracked without cookies being sent, while "access to the data is kept secure". Call me a cynic, but I lost my trust in google a long time ago.


To answer "track users across different websites", I think they pretty clearly say the opposite:

> The Google Fonts API is designed to limit the collection, storage, and use of end-user data to what is needed to serve fonts efficiently.

> When millions of websites all link to the same fonts, they are cached after visiting the first website and appear instantly on all other subsequently visited sites. [...] The result is that website visitors send very few requests to Google: We only see 1 CSS request per font family, per day, per browser.

I guess, what would you want to see that would assuage your concerns, beyond what is written in the FAQ?


They do not say the opposite.

Apparently they need to collect and store end-user data for serving fonts efficiently. Wonder what that could be...

And if that information happens to be enough for further tracking then it seems to be fair game!


Couldn’t it just be for edge server location determination?

Not a Google fan, but at their volume of traffic seems like it could be something they’ve optimized for.


Could be!

But they could have said so. And they could also have said that the information is not correlated with anything else.

All we know is that they have, very carefully, written something vague that they could do pretty much anything they wanted with.

And we are left with the question, why would they do that?


Probably to not have to consult the lawyers every single time someone creates a new analytics aggregation.


"Users can be tracked without cookies being sent"

How? Stylesheets can't use fingerprinting or Flash cookies or anything like that, only scripts can.


Stylesheets can fingerprint with the help of a server to track what resources get loaded or skipped, and a few clever media size queries etc.


The Referer header will leak what page you were on and you probably already have connections to Google from the same client IP address. Even if you have Referer blocked, the particular font requested could indicate information about what page you are on when combined with other data.


TLS Session resumption (tickets / caching)


Multiple people (I know as my upvote just now didn't even get you back in the black) downvoted you (and now me ;P), but you are absolutely correct that that quote didn't really have anything to do with the question.


Can you still be tracked if the fonts are delivered from your server?


No. A font file served from your own server is just like any other static asset. (I increasingly tend to do this for my sites, as I've found that using FontSquirrel to create WOFF/WOFF2 files that contain subsets of fonts and/or "collapse" font features -- e.g., stylistic alternates -- can make for very small, efficient files if the subsetting meets your needs.)


Thank you for your service!


Rust based font tools? Do you mean local font manipulation type scripts or some sort of web service?


Mostly local font manipulation tools, including a GUI editor, but I can imagine some of the work ultimately becoming useful inside web services.


Not font manipulation, but the Alacritty project has had an RFC [1] open for a while now to try and make font configuration better. It turns out I barely understand fontconfig every time I need to touch it. I'm just curious what (if any) your opinions about the de-facto Linux fontconfig framework are?

[1]: https://github.com/alacritty/alacritty/issues/957


I haven't dug too deep into it, but from what I've seen it seems complex, messy, and not up to solving real problems such as CJK Han unification. I wrote a bit about these difficulties in https://github.com/linebender/skribo/blob/master/docs/script... , and what I think was most disappointing is that I didn't get any actual feedback at the time. So it feels like about the same level of dysfunction as most Linux UI work.


Re: variable fonts.

What’s the rule of thumb on file size ... is 1 variable font smaller than 1 normal font? Or is it more like 1 variable font is smaller than 3 normal fonts combined (eg bold, italic, regular).


A coarse rule of thumb is that each axis doubles the font size. So one that has just a weight axis is twice a normal font. One with a weight and width axis is four times. So if you actually use multiple instances from the design space, it's a win.

This is only a rough guide, don't take it as gospel. I'm also researching ideas (radial basis function interpolation) to make it more sparse for larger numbers of dimensions.
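
To make the rule of thumb above concrete, here is a minimal Python sketch. It assumes sizes are measured in units of "one static font" and that every axis exactly doubles the size, which the comment itself flags as only a rough guide. The function names are made up for illustration.

```python
def variable_font_relative_size(num_axes: int) -> int:
    """Estimated size of a variable font, in units of one static font.

    Uses the coarse rule of thumb from the comment: each axis doubles the size.
    """
    if num_axes < 0:
        raise ValueError("num_axes must be non-negative")
    return 2 ** num_axes


def variable_font_wins(num_axes: int, static_instances_used: int) -> bool:
    """True when the variable font is estimated to be smaller than the
    static fonts it would replace on a given site."""
    return variable_font_relative_size(num_axes) < static_instances_used


if __name__ == "__main__":
    print(variable_font_relative_size(1))  # weight axis only
    print(variable_font_relative_size(2))  # weight and width axes
    print(variable_font_wins(1, 3))        # e.g. regular, semibold, bold
```

Under this estimate, a weight-only variable font pays off once a site uses three or more weights, while a two-axis font needs five or more instances before it is a win.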


For some anecdata, compare the file sizes in the latest Cantarell release tarball from the GNOME FTP server.

The compression ratio varies depending on how the variable font was made though. E.g. Noto Sans clocks in at 1.5 MB because it is made from 8 masters for two axes. Reducing it to 5 masters results in a 900 KB binary.


In anticipation of "ugh, designers are ruining the web with custom fonts", I'm glad that variable fonts are starting to percolate out.

https://developers.google.com/web/fundamentals/design-and-ux...

Smaller overall file size, fewer HTTP requests. Combine that with preloading and font-display, and we're getting to the point where webfonts aren't a giant bandwidth suck.

https://developers.google.com/web/fundamentals/performance/o...

(More room for ad tracking scripts instead!)


Oh wow, what's old is new again.

I remember when Adobe launched Multiple Master fonts [1] in 1992 (same concept)... and then killed them in 1999 because not enough people were using them.

I'll be fascinated to see if there's actually demand for it this time around. I personally have had tons of times I've wanted a semibold where none existed, or something halfway between regular and condensed.

But I'm not convinced they're going to save bandwidth for most sites. After all, they incorporate two sets of font outlines along each dimension instead of one, right? And you rarely see a webpage that uses three variations of a font along a single dimension (e.g. sans-serif regular, semibold and bold). A "third" font and beyond is more often a different style (italics) or typeface (serif) entirely. So I don't see the savings in most cases...

[1] https://en.wikipedia.org/wiki/Multiple_master_fonts


I do a fair bit of design in my free time, and am friends with people majoring in it. People are excited about variable fonts in academia and in design circles. I use them nearly exclusively now between Inter, Merriweather, Bodoni* by Indestructible Type, Hepta Slab, and the IBM Plexes. And this is all just for people using it in desktop design programs, since it's being embedded there at the source (designers make it before it gets to the frontend folks), it's looking promising.

I'm 99% sure it'll catch on this time around, in short :)


I agree variable fonts won't be some magic bullet for file size, but I suppose cached versions of variable fonts means two different sites could use different axis values and the user would only have to download it the first time round.

So different sites can more specifically tailor font rendering based on their own needs at no extra cost to the user.


As I understand it, Type 1 multiple master was simply superseded by OpenType variable fonts back in the 90s, which is all that Google link is describing. As a user it was a relief to have, afaict, all of the features of Type 1 with all of the convenience of TrueType.


Huh, I’m wrong. Apparently variable fonts were added to OpenType only recently: https://en.m.wikipedia.org/wiki/Variable_fonts

(I think the features I was thinking of were ligatures and lowercase numbers)


Wow, if you used Google Cloud CDN to serve up that data, it would cost you more than $1 Billion USD.


Got a source for that, or did you just make it up? Doing some very quick estimation I can't see it being more than $40-60 million, and that is using retail pricing.


I'm getting ~$130 million with these assumptions:

* each font load is 170 KB (I downloaded Roboto and that's the file size of one weight)

* 36.3 trillion font loads (per the source)

* 36.3 trillion * 170 KB/font load = ~6171 PB [0]

Plug 6171 PB into the GCP calculator [1] with all egress via GCP Cloud CDN to N. America and the bill comes out to just shy of $130 million.

OP is off by an order of magnitude by my napkin math, but it's closer than I was expecting.

[0]: https://www.google.com/search?q=36.3+trillion+KB+*+170+in+PB...

[1]: https://cloud.google.com/products/calculator#id=f75c7bdd-4c4...
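
For anyone who wants to redo the napkin math with different assumptions, here is a rough Python sketch. The inputs are the figures quoted above (36.3 trillion loads, 170 KB per load, a ~$130 million bill); decimal units (1 KB = 1000 bytes) are an assumption, and the "implied price" is simply the quoted bill divided by the volume, not an actual GCP price list entry.

```python
FONT_LOADS = 36.3e12      # 36.3 trillion font loads, as quoted in the comment
BYTES_PER_LOAD = 170e3    # 170 KB per load (commenter's assumption, decimal KB)
QUOTED_BILL_USD = 130e6   # ~$130 million, as reported from the calculator


def total_petabytes(loads: float, bytes_per_load: float) -> float:
    """Total data served, in decimal petabytes (1 PB = 1e15 bytes)."""
    return loads * bytes_per_load / 1e15


def implied_usd_per_gb(bill_usd: float, loads: float, bytes_per_load: float) -> float:
    """Blended price per decimal GB implied by a bill and a data volume."""
    return bill_usd / (loads * bytes_per_load / 1e9)


if __name__ == "__main__":
    print(round(total_petabytes(FONT_LOADS, BYTES_PER_LOAD)))
    print(implied_usd_per_gb(QUOTED_BILL_USD, FONT_LOADS, BYTES_PER_LOAD))
```

Swapping `BYTES_PER_LOAD` for a smaller per-load size (such as a compressed subset) scales the volume, and any volume-priced estimate, down proportionally.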


You’re not factoring in worldwide distribution - not everyone lives in the USA, and worldwide bandwidth costs vary a lot.

Also we don’t know the number of HTTP cache check requests. I’m assuming the number is way, way higher than the number of fonts served? That can cost you 160 million alone, with no bandwidth.


I assumed 22 kb (compressed) font size. That's the average of the top 10 on their list. (Roboto is 35kb uncompressed for me, not 170kb).


35kb sounds like the basic (mostly latin-1) subset. The size will vary a fair amount depending on which subset is selected, etc. In any case, fonts will almost always be served compressed, with WOFF2 being the vast majority of requests.


You are correct. I used the same math, except I used a 1 MB gzipped Roboto font family package. I now realize that what Google actually serves up is a subset of that.


I used the GCP pricing calculator. However, I used 1mb gzipped as the file size. But I now realize that is the entire Roboto font family in both otf and ttf, so I assume that what google fonts serves up is a smaller subset of those.


Curious, what would Cloudflare fronting OVH / Scaleway / Hetzner / S3 / Backblaze cost? I guess, apart from the major expense signing up for Cloudflare's enterprise plan, not much?


Hetzner bandwidth was 2 euro per TB last I checked :)

Edit. It's now apparently 1 euro /tb without vat


Get the raw JSON stats & font info with below links. Use httpie for color/formatted results (replace curl with http):

  curl https://fonts.google.com/metadata/stats
  curl https://fonts.google.com/metadata/fonts
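
The same fetch can be sketched in Python using only the standard library. This is a hedged sketch, not a documented client: the helper names are made up, the response schema is not assumed, and stripping a leading `)]}'` anti-XSSI prefix is a defensive guess about what the endpoint might return rather than something confirmed here.

```python
import json
from urllib.request import urlopen

STATS_URL = "https://fonts.google.com/metadata/stats"  # URL from the comment above
XSSI_PREFIX = ")]}'"  # assumption: some Google JSON endpoints prepend this guard


def parse_metadata(body: str):
    """Parse a JSON body, dropping an optional anti-XSSI prefix first."""
    body = body.lstrip()
    if body.startswith(XSSI_PREFIX):
        body = body[len(XSSI_PREFIX):]
    return json.loads(body)


def fetch_metadata(url: str = STATS_URL, timeout: float = 10.0):
    """Download and parse one of the metadata endpoints (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return parse_metadata(response.read().decode("utf-8"))
```

Calling `fetch_metadata()` returns whatever JSON structure the endpoint serves; inspect it before relying on any particular keys.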


The way Google tracks fonts is so annoying. It took me forever to track down this bug: https://github.com/google/fonts/issues/2345

Basically, one out of a thousand requests delivers a broken font that breaks document.fonts.load().


It looks like it's lumping Android in with desktop Linux. Since they are differentiating platforms by the user agent, that seems like an easy thing to fix?


The OS table has an entry for X11 with maybe a 2-3% share; I'd bet that's desktop Linux.


That's an impressive figure. Macintosh (desktop Mac?) is slightly less than double the X11 figure.

https://gs.statcounter.com/os-market-share has Linux usage at just under 1%.


I would hazard a guess that more Linux users are blocking the ad networks that StatCounter utilizes to collect their data than Linux users that block requests to fonts.google.com.


Lots of "browse the web from this phone box" and "the departures board in the train station that shows a webpage" will be Linux.

I bet many of them use Google Fonts, and with them auto-refreshing every 5 seconds all day long, they might get seriously overcounted.


I doubt that the fonts get reloaded when the page is refreshed. According to other posts in this thread, the cache TTL is 24 hours.


Most of these boards use kill -9 to restart the browser every half hour or so, and likely nuke the cache each time.
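
The disagreement in this sub-thread comes down to how long a cached copy survives. A simplified Python sketch of the three scenarios follows, assuming one request per font family whenever no valid cached copy exists and ignoring everything else (revalidation, multiple families, partial days). The function name is hypothetical.

```python
from typing import Optional

SECONDS_PER_DAY = 24 * 60 * 60


def requests_per_day(refresh_seconds: int, cache_lifetime_seconds: Optional[int] = None) -> int:
    """Estimated font CSS requests per day from one auto-refreshing display.

    cache_lifetime_seconds is how long a cached copy survives: the cache TTL,
    or the browser restart interval if the cache is wiped on restart.
    None means nothing is cached at all.
    """
    if cache_lifetime_seconds is None:
        interval = refresh_seconds
    else:
        # A cached copy cannot be re-requested more often than the page refreshes.
        interval = max(refresh_seconds, cache_lifetime_seconds)
    return SECONDS_PER_DAY // interval


if __name__ == "__main__":
    print(requests_per_day(5))                    # no caching at all
    print(requests_per_day(5, SECONDS_PER_DAY))   # 24-hour cache TTL
    print(requests_per_day(5, 30 * 60))           # cache wiped every half hour
```

In this model, a board whose cache is wiped every half hour is counted far more often than a normal browser with a 24-hour cache, but far less often than one request per 5-second refresh.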


Desktop Linux is a rounding error


Why are people still bothering with Google fonts? They were useful when font formats hadn't been standardized, and you needed WOFF, TTF, and SVG for some Apple products. Now, WOFF works everywhere, and you can offer WOFF2, which is compressed more. So why not just put the fonts on your own site and avoid a side trip to Google?


Google Fonts is free. Buying a licence to use a font on the web can be expensive.

Also, Google Fonts does more than just delivering a font to the browser - the text= and font subset features can reduce the download size a lot, which you couldn't do if you were serving the Woff2 yourself on a dynamic site (eg you didn't know what characters were needed before the page rendered on the server or client).

Lastly, for common fonts like Roboto and Open Sans, there's a non-zero chance it'll already be in the user's cache. That's a win.
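
As an illustration of the text= feature mentioned above, here is a small Python sketch that builds a stylesheet URL requesting only the characters a page needs. It assumes the `css2` endpoint with `family` and `text` query parameters; the helper name and the choice to deduplicate and sort the characters are illustrative, not anything the service requires.

```python
from urllib.parse import quote, urlencode

CSS_ENDPOINT = "https://fonts.googleapis.com/css2"  # assumed stylesheet endpoint


def subset_css_url(family: str, text: str) -> str:
    """Build a stylesheet URL asking for only the glyphs used in `text`."""
    # Deduplicate characters so the URL stays short for long strings.
    unique_chars = "".join(sorted(set(text)))
    # quote_via=quote percent-encodes spaces as %20 instead of '+'.
    query = urlencode({"family": family, "text": unique_chars}, quote_via=quote)
    return f"{CSS_ENDPOINT}?{query}"


if __name__ == "__main__":
    print(subset_css_url("Inconsolata", "Hello"))
```

A dynamic site could call something like this at render time with the exact heading text, which is the case where self-hosting a pre-built subset is harder.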


I'm surprised that Linux is the most popular client for this.


I'm guessing that's android.


That's probably because Android isn't listed separately.


Interestingly iPhone and iPad are listed separately.


Most crawler bots also mention Linux in their user agents.


I don't think many crawlers will be downloading fonts though?


The operating system stats are interesting.

Looks like Macs have roughly a 17% share of the windows + mac total requests. The overall market share per wikipedia (https://en.wikipedia.org/wiki/Market_share_of_personal_compu...) is something like 7% for Apple.

Fonts are probably used by most "normal" computer users in some way so I wonder where that difference comes from

- windows breaks faster and must be replaced more often, increasing the number of sold windows systems per year

- windows users use computers differently? Must be a pretty big thing to cause such a difference

- more windows systems not connected to the internet?

I kinda doubt that the typical enterprise windows setup with strict firewall rules etc. will have a meaningful impact on web fonts and IE 6+ seems to support google fonts (https://developers.google.com/fonts/faq#what_browsers_are_su...)

Any ideas?


The stats are interesting but do not warrant such a condescending/misrepresentative view; they could easily be twisted around, for example, to say

- macOS has broken browsers and doesn't cache, resulting in multiple font downloads per website browsing session.

- macOS users use computers differently? Do they refresh the page multiple times, racking up a download count?

The first odd thing about the stats is Linux's share; with a 2% market share it's racked up 10T downloads. A major part of that is Android based browsers, but a decent part of that will be Linux desktop users, developers, automation tests and so on. Overall we should expect users' browsing habits playing a part; there may be a large shift towards mobile/tablet web consumption and reduced desktop web consumption causing a stats skew.


Any time CI hits a site with these fonts integrated an ephemeral session (no cache) probably has to download them.


There are Windows machines for high end and low end, but macOS machines only for high end. Low end includes all your grandmas, unused workstations, and people apathetic about computers because they don't use them much. All your laptops kept folded under dorm beds and used only for assignments.

The same effect shows up to a greater degree on mobile, where Android has something like three times iOS’s units sold yet the same amount of mobile traffic and half as much app store revenue.


Webpages in non-latin alphabets probably use way less fonts from Google Fonts. And Macs are way more popular in English-speaking countries than anywhere else.


Windows computers cost half as much, more or less, for the same specs so the replacement rate is probably much higher. Like for the same price you can buy a PC laptop every 2 years vs 1 Mac laptop every 4 years.

When I worked at Apple, the plastic MacBooks outsold everything else because they were cheapest. It was weird that we did all our power/performance benchmarks on a $2000 15" MacBook Pro when they only made up 10-15% of Mac sales.

From that time, we also knew that Macs were extremely popular for home use, IIRC something like 30-40% penetration, but people were using Windows for work or being given a Windows laptop from work for personal use.

One of the reasons Apple never made a Netbook was that it was an inferior good. People, when asked, preferred a larger screen and trackpad but couldn't afford it. The bottom-tier 11" MacBook Air and the iPad were a response to upsell that demand.


Price has nothing to do with replacement rate. There are many bad laptop brands, though.

30% penetration in the USA. Here we are talking about global usage. In many other first-world countries Macs are nowhere to be seen, eg 3%.


> 30% penetration in the USA. Here we are talking about global usage.

Then later in this same thread you say:

> Webpages in non-latin alphabets probably use way less fonts from Google Fonts.

So uh, which is it?


I don't follow. Precisely because we are talking about the global usage you have to consider webpages in non-latin alphabets which increase the ratio of Macs for latin alphabet countries.


I think it's almost certainly due to average browsing habit differences between Mac and Windows users. The Mac userbase is likely wealthier on average and more likely to be in a design/product/publishing profession, which influences the type of website they would frequent.


> windows breaks faster and must be replaced more often

Or they are replaced more often just because they are cheaper.

There is a second-hand market for Apple stuff; in PC land that's unthinkable.


> More windows systems not connected to the internet?

More Windows devices are in corporate/enterprise environments and are less likely to be used for casual web browsing?


Anecdotally, people use their Macs a lot longer than Windows users. In part because of the price difference.


How to track everyone, everywhere.


Pretty much. Provide a basic service to the Internet and get the benefit of sucking in all that usage data. Here's to hoping we somehow figure out how to use something in the future without handing over our data at the same time. And don't tell me "we did this to ourselves because we didn't want to pay for it with money, so we pay for it with privacy"; you pay companies and they still do this. It's just too tempting for companies to miss it. We need regulatory protections for individuals' rights over corporations' rights, and they need strength behind them to be enforced.

Answers? Oh, I don't have them, I am just pointing out the problem.


decentraleyes plugin, although I wish there was an easy way to add local sources


Can someone provide insight into why Google is hosting fonts for free?

Anytime an advertisement company decides to do something for free, my cynicism goes on high alert. I am genuinely curious though - what's the business model? Are they collecting massive amounts of data while offering fonts?


The reason I self-host fonts (that I paid for) on my personal website is that I don't want third party tracking via fonts on my website. It's all first party and I want to keep it that way. So there is your answer: It's an easy way to track websites via another vector other than Google Analytics, which many people block or which is not installed everywhere.


raphlinus had this to say in response to a similar question: https://news.ycombinator.com/item?id=22370494


Does anyone know whether this service generates any revenue for Google? I can’t see selling fonts data as something desirable by 3rd party companies but maybe that’s just me?


Google doesn't directly sell data in the first place - they leverage the data they have to improve the value of their ad network (in the form of enhanced targeting and attribution capabilities). It's actually in their best interest to _protect_ the data they have on you, as it's a primary competitive advantage they have over other ad networks.

As far as the unsavory interpretation of how they could conceivably use Fonts data to further that end: calls to download their fonts are another touchpoint with the Google ecosystem, and a theoretical vector for further tracking the browsing behavior and device graph of an individual.

That said, traditional font licensing for commercial use is absolutely bonkers[1][2][3]. Even if Google is using the data they collect from serving font files to feed into their user tracking, they've done a service to the web[4].

[1] https://designshack.net/articles/typography/what-is-a-font-l...

[2] Working at a creative agency, I learned that we can't so much as legally use a client-dictated (and licensed) font in a mockup without paying thousands of dollars for a license ourselves.

[3] Webfonts tend to be licensed on a per-pageview basis. One client got hit with a temporary flood of scraping/bot traffic, and the biggest economic impact was the unexpected six-figure font bill that month. We convinced them to put the site behind Cloudflare and used a Worker to strip out the font include for suspected bot traffic and inadvertently lowered their licensing cost by more than our annual retainer.

[4] I can't say how it is everywhere, but working at one of the top three marketing agencies, Google Fonts are the only approved open-source fonts we're allowed to use (since they're primarily free for commercial use, as well). Not every client is able or willing to absorb a 5-6 figure font line item for every engagement, so without Google Fonts we (and, I presume, other major agencies) would be stuck with using system default fonts everywhere.


It does not, and they don't sell the data or use it for any tracking.


Their FAQ says that they don't (currently) use cookies, but it also includes the sentence "Google Fonts logs records of the CSS and the font file requests, and access to this data is kept secure", so they could still do some IP-based analytics.

More importantly though, it gives Google accurate insight into web traffic (many users block Google Analytics, but almost everyone loads web fonts), and it allows them to crawl websites more easily – before web fonts, many websites used pre-rendered PNGs to show web-unsafe fonts, which made crawling impossible.

Google has poured a lot of money into this, the fonts are free and open source, and the users aren't the product. Overall, I think it's a rare win-win story in this age of dystopian adtech.


I assume that Google uses Analytics and Fonts as part of its algorithm for determining if a site is popular when creating search engine results. Am I wrong? Does anyone know on or off the record?


> Google has poured a lot of money into this

Why, though? Not charity, I'd assume. The only other answers I can come up with are pretty damn nefarious.


I can speak to this a little, in terms of pitches we made to management to get resources for our little project. None of this is authoritative.

1. A lot of text was rendered into PNGs, and that made the web less searchable, as well as less accessible, slower to load, and less mobile-friendly. All 4 of these factors do have economic impact at Google scale.

2. Fonts were one of the few features that Flash had that HTML5 was lacking, and we wanted to accelerate that transition. Again, mobile was one of the major driving factors.

3. For a while, we were organizationally funded under Google Docs. Again, fonts were one of the major missing features compared with Microsoft Office, so filling that gap was strategic. Here, our open source approach really paid off, otherwise dealing with proprietary font licensing in the context of documents that can be shared and copied would have been nightmarish.

4. To the extent that you are able to make the case that fonts make ads better (or advertisers happier), getting modest amounts of funding ceases to be a problem. To be clear, when I was on the team this was more of a glimmer of future abundant resources than day-to-day reality.

Lastly, while "charity" isn't exactly the right word, the motivations of the people working on the team are/were basically that we love fonts and want to make the Internet better. At Google scale, we were able to sell the project using basically a combination of the above arguments.

Never once when I was on the team were we asked to implement any form of individual user tracking, nor did I hear a suggestion of such a thing. All our work on collecting analytics was to improve performance and quantify our impact. I have no reason to believe things have changed on that front since I was directly involved.


Wow, that makes a lot of sense. Clearly, my imagination was lacking. Thanks!


Are you the Raph who built Advogato?

I loved that site.


Thanks! Yes, I've had quite a varied career so far :)


The parent comment said:

> it allows them to crawl websites more easily – before web fonts, many websites used pre-rendered PNGs to show web-unsafe fonts, which made crawling impossible


I'd like your thoughts on this comment: https://news.ycombinator.com/item?id=22394730


They likely do get some data from it, their own analytics page has data they pull from it.


yet.


In addition to getting some data from websites which do not use Google Analytics, they may like having the power to push the "comic sans ms" font or a * { display: none !important;} to non-Chrome browsers.


How much data traffic could Google save if it just bundled all fonts with Chrome?


How much space would packing all those fonts take up?


With some extra metadata, about 428MB.

https://github.com/google/fonts

Since the browser hands font rendering off to the operating system, however, the most pertinent browser-specific adjustment would be updating the (configurable) defaults (Serif, Sans-serif, etc...) for each supported platform - Georgia instead of Times for a default, for example.


You bundle the top 10 most popular fonts and save about half the traffic: counter is ticking about 20M per minute or 201B per week, top 10 fonts were used 136B times this week, Chrome is about 3/4 of users. Bundling would save (136/201)*(3/4) = about 1/2. However, Google would lose about half of their tracking events so I'm sure they've already thought of this and figured the data is worth more than the traffic expenses.
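The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch using the rough figures quoted in the comment (about 20M requests per minute, 136B top-10 requests per week, a 3/4 Chrome share); the constant and function names are made up for illustration.

```python
# Rough figures quoted in the comment above; none of them are exact.
REQUESTS_PER_MINUTE = 20e6         # counter ticking at about 20M per minute
TOP10_REQUESTS_PER_WEEK = 136e9    # requests for the top 10 fonts this week
CHROME_SHARE = 3 / 4               # approximate share of users on Chrome

MINUTES_PER_WEEK = 60 * 24 * 7


def weekly_requests(per_minute: float) -> float:
    """Scale a per-minute request rate up to one week."""
    return per_minute * MINUTES_PER_WEEK


def bundling_savings(top10: float, total: float, chrome_share: float) -> float:
    """Fraction of all requests avoided if Chrome bundled the top 10 fonts."""
    return (top10 / total) * chrome_share


total = weekly_requests(REQUESTS_PER_MINUTE)
saved = bundling_savings(TOP10_REQUESTS_PER_WEEK, total, CHROME_SHARE)

print(f"total per week: {total / 1e9:.1f}B")   # about 201.6B
print(f"fraction saved: {saved:.2f}")          # about 0.51
```

The result is only as good as the inputs: it ignores client-side caching and assumes Chrome users request the top 10 fonts at the same rate as everyone else.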


That, to me, is the real reason behind this service. It might not make money, but the statistics and insight they get are worth way more than money.


Yeah, let's just install the Internet Archive on everyone's computer and be done with it. </snark>


This might be a bit of a dumb question, but when I load google.com and look at the requests, the one to fonts.google.com shows 200 and 'from memory cache'. Is that actually sending a request to Google in any way, or is it handled locally? I get this most of the time, so it seems I'm not making actual requests to them the majority of the time. Does it actually hit their servers for these tracking stats, or do loads like that not even make it into these stats?


A 200 marked 'from memory cache' means it's hitting your cache and not doing any network requests. The analytics are based on actual HTTP requests. At the time I was there, we were only able to make rough guesses about the fraction of queries that were successful in the client cache; we didn't have any systematic way to dig into that.


It depends on the cache control headers sent by the original asset. If you want to make sure you don't hit the CDN, install the Decentraleyes browser extension.


At the time I was there, we aggressively set the cache control headers to minimize the number of requests. Among other things, the font binaries were all versioned and served so each URL was immutable forever. (There's some subtlety around this: it's a 2-stage process with CSS first, then the font binaries, and the CSS was generally served with a 24-hour expiration.)
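A minimal sketch of that two-stage policy, for anyone who wants to replicate it on their own server. The header values, the one-year lifetime, the file extensions, and the example URLs are all assumptions for illustration; the comment above only says the CSS expired after 24 hours and the versioned font binaries never changed.

```python
from urllib.parse import urlparse

ONE_DAY = 24 * 60 * 60          # seconds
ONE_YEAR = 365 * ONE_DAY        # assumed stand-in for "immutable forever"

# Assumed set of font file extensions; adjust for whatever you serve.
FONT_EXTENSIONS = (".woff", ".woff2", ".ttf")


def cache_control_for(url: str) -> str:
    """Pick a Cache-Control value: long-lived for fonts, one day for CSS."""
    path = urlparse(url).path
    if path.endswith(FONT_EXTENSIONS):
        # Versioned binary: the URL never changes meaning, so cache it hard.
        return f"public, max-age={ONE_YEAR}, immutable"
    # Stage one (the CSS) must be able to point at new font versions,
    # so it gets a short lifetime.
    return f"public, max-age={ONE_DAY}"


# Hypothetical URLs, not real endpoints.
print(cache_control_for("https://fonts.example/css?family=Inconsolata"))
print(cache_control_for("https://static.example/s/inconsolata/v18/abc123.woff2"))
```

The design choice is that only the small CSS file ever needs revalidating; a new font version gets a new URL, which the next CSS fetch picks up.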

One motivation was sharing the cache among all sites that linked the fonts, which at the time I think was a big win, but as has been mentioned elsewhere, this is going away.


I'd like to view web pages as the authors intend them, but I also don't want to send detailed logs to Google for every page I visit.

Is there a way to somehow proxy a subset of popular fonts locally rather than block the CDN entirely?


I think this should be added to the Decentraleyes extension (https://decentraleyes.org) which does this for popular JS frameworks. There is already an issue for this: https://git.synz.io/Synzvato/decentraleyes/issues/387


I love Decentraleyes but how would they redistribute most web fonts without running afoul of their licenses?


Download batches on initial extension load, and periodically thereafter, and cache them locally?


You could download the fonts to your machine(s). That way they will be served locally rather than connecting to fonts.googleapis.com

https://fontsplugin.com/how-to-download-google-fonts/


For my business website, I downloaded the webfont files and host them with the other website assets. Google made it difficult to do this. Of course Google made it very easy to just use their hosted version and send my users' data to them.

See https://www.cozydate.com/style.css


If anyone else wants to do this, a very handy tool exists that makes the process quite easy: https://google-webfonts-helper.herokuapp.com/fonts


Nice! I added a comment with this link to my style.css file. I will use it next time I need to add/update a web font.


Cool. I'm curious about how they counted it. Do they have an opt-out setting?


Number of hits on the web server.


What is GSA? A little searching hasn't brought up anything I could see being a browser.

The closest I could find was Google Search Appliance but I wouldn't think that is applicable to these stats.


Google Search App. The app called "Google" on Android and iOS.


Amazing! And a great link to add to the resume of those designers!


The top OS for the year is Linux (15.3T), and I don't know how to interpret this. Is this due to the web server?


The vast majority of those clients are probably Android
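To illustrate why Android could end up in the Linux bucket: Android browsers typically put "Linux" in their user-agent string. The sketch below is not how Google Fonts Analytics classifies clients (that isn't documented in this thread); the function names and sample strings are only illustrative.

```python
def naive_os(user_agent: str) -> str:
    """Classify by the first platform token found; lumps Android in with Linux."""
    if "Windows" in user_agent:
        return "Windows"
    if "Linux" in user_agent:
        return "Linux"
    return "Other"


def refined_os(user_agent: str) -> str:
    """Same as naive_os, but checks for Android before falling back."""
    if "Android" in user_agent:
        return "Android"
    return naive_os(user_agent)


# Sample user-agent strings in the usual Chrome format (illustrative only).
ANDROID_UA = (
    "Mozilla/5.0 (Linux; Android 10; Pixel 3) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/80.0.3987.99 Mobile Safari/537.36"
)
DESKTOP_UA = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/80.0.3987.116 Safari/537.36"
)

print(naive_os(ANDROID_UA), refined_os(ANDROID_UA))   # Linux Android
print(naive_os(DESKTOP_UA), refined_os(DESKTOP_UA))   # Linux Linux
```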


In my "humble" opinion, only the three most significant (decimal) digits count if you are looking at growth.


Yeah, 36 trillion requests slowing down the loading of websites. Way, unnecessarily, overused.


The sterile world you desire is not one I'd like to live in.


It's like my desire for quality CLI apps and APIs before any sort of fancy schmancy GUI. The important part is reminding one's self that the other 99% of people want and need the GUI version of the app.


If Google embedded the top fonts in the browser this wouldn’t be an issue. I don’t understand why the font selection in browsers is so limited.


Practically there's no difference between a browser that lazily downloads builtin fonts, and a global URL for a font file across websites with an extremely long TTL. And you don't want the browsers to be even more bloated than they already are!


> font file across websites with an extremely long TTL.

Caches are no longer shared across domains because doing so leads to privacy leaks: https://www.jefftk.com/p/shared-cache-is-going-away




The parent comment's point is that this cache is partitioned by frame origin (which the link details).
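A toy model of the difference, as I understand it from the linked post. Real browsers use more detailed cache keys than this; the class names, the `top_level_site` parameter, and the example sites are assumptions made up for the sketch.

```python
class SharedCache:
    """Old behaviour: one entry per resource URL, shared by every site."""

    def __init__(self):
        self._store = set()

    def _key(self, top_level_site: str, url: str):
        return url

    def fetch(self, top_level_site: str, url: str) -> str:
        key = self._key(top_level_site, url)
        if key in self._store:
            return "hit"
        self._store.add(key)     # simulate downloading and caching
        return "miss"


class PartitionedCache(SharedCache):
    """New behaviour: the embedding site is part of the cache key."""

    def _key(self, top_level_site: str, url: str):
        return (top_level_site, url)


FONT = "https://fonts.example/inconsolata.woff2"   # hypothetical URL

shared = SharedCache()
print(shared.fetch("a.example", FONT))        # miss
print(shared.fetch("b.example", FONT))        # hit: b.example benefits from a.example

partitioned = PartitionedCache()
print(partitioned.fetch("a.example", FONT))   # miss
print(partitioned.fetch("b.example", FONT))   # miss: each site downloads its own copy
print(partitioned.fetch("a.example", FONT))   # hit: only within the same site
```

With partitioning, a long TTL still helps repeat visits to the same site, but a font fetched on one site no longer saves a download on another.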


Then why are there 36 trillion requests?


A lot of people using a lot of fonts from a lot of different browsers. I'm sure the caching techniques aren't perfect but they're probably quite good.


It's been a while now, but at one point there were some numbers that came in that showed that a much higher fraction of user agents arrive at a web site with a completely empty cache [than you might expect].

Enough so that if you rely on caching for user experience, your average user is not going to have a good time.


There is a HUGE difference:

- not all sites use CDN, even if using the same font

- not all sites use the same CDN

- users clear the cache (clearing history does that on Firefox)

- incognito mode is a thing

- first load matters a lot

- browsers have a limit on the disk space they use for cache, and they evict older entries when they reach it. Given sites are now bloated, this fills up fast.

- this repeats for each browser. I have 5 on my laptop, 3 on my mobile, 2 on my tablet.
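The eviction point in the list above can be sketched with a size-bounded least-recently-used cache. Actual browser eviction policies are more involved and differ between browsers; plain LRU, the class name, and the sizes below are assumptions for illustration.

```python
from collections import OrderedDict


class LruCache:
    """Size-bounded cache that evicts the least recently used entries first."""

    def __init__(self, capacity_bytes: int):
        self.capacity = capacity_bytes
        self.used = 0
        self._entries = OrderedDict()   # url -> size in bytes, oldest first

    def get(self, url: str) -> bool:
        """Return True on a hit and mark the entry as recently used."""
        if url in self._entries:
            self._entries.move_to_end(url)
            return True
        return False

    def put(self, url: str, size: int) -> None:
        """Store an entry, evicting the oldest ones if over capacity."""
        if url in self._entries:
            self.used -= self._entries.pop(url)
        self._entries[url] = size
        self.used += size
        while self.used > self.capacity and self._entries:
            _, evicted_size = self._entries.popitem(last=False)
            self.used -= evicted_size


cache = LruCache(capacity_bytes=100)
cache.put("font.woff2", 40)      # the cached font
cache.put("bundle-a.js", 40)     # other sites' assets arrive afterwards...
cache.put("bundle-b.js", 40)     # ...and push the total over the limit
print(cache.get("font.woff2"))   # False: the font was the oldest entry
```

A handful of heavy pages visited after the font was cached is enough to push it out, which is the "fills up fast" effect.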


Font selection in browsers isn't limited. Browsers don't have embedded fonts—they rely on whatever fonts are installed in the operating system.

The challenge is that to ensure consistency, you need to use the "lowest common denominator" of fonts that are installed by default across all operating systems. Which leaves you with (like) Arial, Times, Courier, Verdana, Georgia, Palatino, and (hahahaha) Comic Sans.

The real question here is why web designers use Google Fonts as opposed to embedding their own fonts. To which the answer is: it's so much easier. (Tech, licensing, formats, compatibility, etc.)


> Which leaves you with (like) Arial, Times, Courier, Verdana, Georgia, Palatino, and (hahahaha) Comic Sans.

Not Palatino.

And most of those fonts have licenses that are inconvenient at best. The only thing that allows them to be packaged for Linux distributions is Microsoft's '90s-era "Core Fonts for the Web" initiative. This initiative is long discontinued, and so the fonts cannot be downloaded from Microsoft anymore. Only the '90s versions of the fonts are free. Worse yet, the license forbids packaging the fonts in any way other than with their original 32-bit Windows installer, which means that hacks like cabextract are necessary to install them on any other system.


I think what they mean is why don't browsers just include the most popular fonts, so the underlying OS doesn't matter?

Instead we're restricted to only a few fonts that actually have decent cross platform support.


The only way to do that would be to include particular typefaces as part of the CSS specification, rather than just the generic font-family keywords like “sans-serif” and “monospace”. But who should pick them, which ones would they pick, and how would they be licensed?

It’s kind of the same problem as saying that the browser should include common images. Which images? Why? How many?


> It’s kind of the same problem as saying that the browser should include common images. Which images? Why? How many?

Images follow a different usage distribution than fonts. I'd say that the top 100 fonts are enough to render most web content. For images this is obviously different: the top 100 images might appear often, but not as often as the top 100 fonts.

Google is already distributing the equivalent for text, in the form of the brotli corpus which ships in every Chrome installation.


> why don't browsers just include the most popular fonts, so the underlying OS doesn't matter?

Well, pontificating here:

1. Because many applications don't ship with additional fonts, and it's an additional layer of complexity. Some do—Microsoft Word comes to mind, IIRC.

2. Because you're still left with the same problem: unless all browsers can agree on an additional set of standard fonts, you as a web developer will only want to use those installed by Chrome and Edge and Safari and Firefox.

3. Because licensing for desktop applications may (?) be more of a pain than licensing for web usage. Which may not matter for Google Fonts, since they may be the license holder for all anyway. I don't know.


Loading fonts is tracking and data collection. It's what Google does. If they build the fonts into the browser, then tracking/data-collection calls are not made.


They could embed it in Chrome. If someone is using Chrome, Google doesn't need any more points of data. They know everything you do.


Yeah, but you can sort of opt out of tracking by Chrome, so it's best to have fallbacks to be able to track all the things!


Last I checked, browsers don't come with fonts. Browsers use the OS's fonts if a webpage doesn't have them. I'd rather browsers not maintain their own fonts.


That sweet, sweet traffic data.


The top fonts aren't the issue for page loading; they'll be cached 99% of the time. It's the more obscure fonts that actually hit the server and cause the issue.


So many things about the web would be better if browsers shipped basic functionality.

I cringe when I think about how many petabytes of jQuery have been sent over the wire over the years.


But they do. JavaScript itself has evolved significantly over the past decade, obsoleting many features that used to be provided solely by add-on frameworks.

HTML5 as a living standard brings the majority of the needs that used to be served by plugins – interactivity, dynamic pages, two-way communications, multimedia.

WebSockets provide for realtime communications with the browser, something again not possible without 3rd-party plugins.

So, what you are saying is in fact happening. But that doesn't mean it's not an issue with sliding goalposts. It's just that it doesn't happen as quickly as we all may like it, but that's because "basic functionality" is a constantly moving target with different definitions depending on who asks.


Decentraleyes is a browser extension that locally stores the most popular CDN-hosted JS libraries, including Angular, Backbone, and jQuery. https://addons.mozilla.org/en-US/firefox/addon/decentraleyes...


CSS, HTML, images, text content all slow down websites. Oh wait they ARE the websites!


Well, you are free to block those and, unlike with JavaScript, you will miss no functionality.


What's a better solution if you want to use a non-standard font?

At least in this case a lot of the requests are cached across sites.


The problem is that the number of "standard" fonts is thinning. It sucks having different font widths on different platforms. You will have to load some webfont if you want consistency.


It's long past time for p2p networks to make this number meaningless / unknowable!


Anyone know what's up with "Slabo 27px"? Looks like its recent usage is much lower than other fonts with similar totals (it's at 800m views in the last 7 days while the fonts above and below it are all ~4b). Also the only font I haven't heard of in the top 20, though maybe I just happened to miss it every time.


Lobster is the new Comic Sans.


So much wasted bandwidth


Well, I have disabled web fonts on my computer when I initially configured it.


It should link to the fonts so we can see them.



The [ctrl]+[home] keyboard shortcut doesn't work on this page in Firefox. I also had some trouble with [ctrl]+[end]. It's quite disappointing to see basic webpage/document functionality broken on a site claiming to improve the web experience. Maybe I'm just an old-fashioned web user.


Highlighting text and scrolling is broken too.



