Writing a Mini-CDN to Learn Nginx/Prometheus/Grafana/Lua (github.com/leandromoreira)
312 points by dreampeppers99 on Dec 26, 2022 | 45 comments



The hard part of building a CDN is knowing when you need one. 99.9% of all websites with a CDN do not need it. Serving static files consumes so few resources that a single server can serve billions of users, as long as you don't run a script to serve each file. The most cost-effective solution, and also the lowest-latency one, is to never use a CDN. If your hosting provider charges you a lot for traffic, you are better off using another provider.


Static files can be relatively large: pictures, sounds, or even videos. At busier moments, the p99 latency can get pretty bad if you only have one virtual NIC serving them.

Also, geography: ping time from Singapore to the US can never be negligibly short, which is pretty noticeable during TLS handshakes.

But yes, likely these considerations do not matter for 90% of websites (not 99.9% though).


It's definitely 99.9% if the websites aren't filtered at all. It's a lot lower if you're only talking about the top 100 most-visited websites according to Alexa or Apple, however...


In OP’s premise, they were filtered by “websites who chose to be served by a CDN”.


> The most cost-effective with also the lowest latency solution is to never use CDN

The lowest-latency solution is to put the content near the user, and a CDN is probably the easiest way of doing that if someone needs to serve a geographically dispersed audience.


> The most cost-effective with also the lowest latency solution is to never use CDN.

CloudFlare is free at my tier and gives me the ability to have the lowest latency.


> The hard part of building a CDN is to know when you need it. 99.9% of all websites with CDN do not need it.

What exactly leads you to believe you can tell what 99.9% of all websites need?

Unless you believe 99.9% of websites are only accessed by your upstairs neighbors, CDNs provide a couple of important business and operational advantages.

Just to illustrate how wrong and misguided your personal assumption is: CDNs are primarily used to cut down latency, which for accesses from other regions can easily exceed 300ms. Unless you somehow think it's OK for your users to be subjected to a bad experience, a basic CDN service is all you need to lower those latencies by an order of magnitude.


That's interesting, because when I added a CDN to one of my projects to serve the assets (mostly images), my latency went up. I would have preferred not to serve them through the CDN, but I didn't have a choice because of the large number of files.

I accept that my images are being served a bit slower, but with the compromise that I can serve a lot more images.

I also think the person you are replying to is correct. The CDN makes the user experience worse but saves me money; that's an unfortunate sacrifice.


It would be nice to discuss the common approaches to global name resolution: anycast vs. geo-routing.


IIRC the industry standard is to serve your authoritative DNS with anycast, and have those servers do geo-based dns resolution to shift HTTP traffic to a nearby edge POP.


This is nicely written, and a lot of it mirrors my experience using nginx as a pseudo-CDN. Other areas worth exploring might be HTTP/3, TLS session caching, and general latency/TTFB optimizations.
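A minimal sketch of what those tweaks could look like in nginx, assuming a recent build with QUIC support; the certificate paths and document root are placeholders:

    # Hypothetical server block: HTTP/3 (QUIC) plus TLS session resumption.
    server {
        listen 443 ssl;            # TCP, for HTTP/1.1 and HTTP/2 clients
        listen 443 quic reuseport; # UDP, for HTTP/3

        ssl_certificate     /etc/nginx/certs/example.crt;  # placeholder
        ssl_certificate_key /etc/nginx/certs/example.key;  # placeholder

        # Cache TLS sessions in shared memory so returning clients can
        # resume instead of doing a full handshake (a big win on high-RTT links).
        ssl_session_cache   shared:SSL:10m;
        ssl_session_timeout 1h;

        # Advertise HTTP/3 to clients that first connected over TCP.
        add_header Alt-Svc 'h3=":443"; ma=86400' always;

        location / {
            root /var/www/static;  # placeholder document root
        }
    }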


This is very, very cool! One thing I would definitely like to see is domain name resolution. Shopify, Dukaan, and Vercel all make a big deal out of it, going all the way to BGP.

https://twitter.com/subhashchy/status/1536769406801309696


Is it possible for CDNs to cache per URL per user? I'm thinking of something like /favorites, where one URL would list something different for everyone. When I've set up caching on the backend, it was keyed off the user.

This was a very informative read!


I don't know why you want to hurt yourself.

If these are public, put them on /favorites/$USERNAME or something similar. If they are private, don't cache them.

You can cache with specific headers as cache keys, but I would advise against doing this too much or abusing it. It really makes caching complicated. And from a data-privacy standpoint, it's better to opt in to caching. I've witnessed incidents where visitors saw another user's private profile page because it was cached in the CDN.


In a lot of CDNs you can configure whether the cache key includes a particular header or query parameter. So as long as your user identity is transmitted in one of those, it would work.
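In nginx terms (since that is what the linked project builds on), this roughly means folding the identifying value into proxy_cache_key. A rough sketch, where the X-User-Id header and the origin address are made up for illustration:

    proxy_cache_path /var/cache/nginx keys_zone=per_user:10m max_size=1g;

    server {
        listen 8080;

        location /favorites {
            proxy_cache per_user;
            # Fold the (hypothetical) user-identifying header and the query
            # string into the cache key, so each user gets their own cached
            # copy of the same URL.
            proxy_cache_key "$scheme$host$uri$is_args$args$http_x_user_id";
            proxy_cache_valid 200 1m;
            proxy_pass http://127.0.0.1:9000;  # placeholder origin
        }
    }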


A user-aware CDN would require scripting of some kind to handle sessions. However, if the data is not sensitive, you could use random-string URIs for publicly available files; that way it is difficult to guess or brute-force the URL to the files. (Sensitive meaning personally identifiable data.)
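A small OpenResty-flavored sketch of generating such an unguessable URL; the secret, path layout, and helper name are all made up:

    -- Hypothetical helper, run inside an OpenResty handler: derive a
    -- hard-to-guess but stable public path for a user's favorites file.
    local function favorites_url(user_id)
        local secret = "replace-with-a-real-secret"   -- placeholder secret
        local digest = ngx.sha1_bin(secret .. ":" .. user_id)
        local token = ngx.encode_base64(digest):gsub("[+/=]", "")
        return "/public/favorites/" .. token .. ".json"
    end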


Many CDNs support caching based on a particular cookie value, incorporating it into the cache key. I'd just be extra careful: with most server settings the worst case is an inoperable service, but choosing the wrong cache key can easily result in a data leak (serving one user's response to another user).
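With plain nginx as the cache, that looks roughly like the snippet below; the cookie name is hypothetical, and as noted above, getting this key wrong is how one user's page ends up served to another:

    location /favorites {
        proxy_cache per_user;  # zone declared via proxy_cache_path elsewhere
        # $cookie_sessionid exposes a single cookie's value; "sessionid" is a
        # made-up name. If the cookie is absent, all anonymous users share
        # one cache entry, so decide whether that is acceptable.
        proxy_cache_key "$scheme$host$request_uri$cookie_sessionid";
        proxy_cache_valid 200 30s;
        proxy_pass http://127.0.0.1:9000;  # placeholder origin
    }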


You can use the `Vary` header.
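For completeness, a hedged sketch of that approach: the origin declares which request header differentiates responses, and a Vary-respecting cache keys on it (the header name is again hypothetical):

    # Origin-side nginx: responses for this URL vary per (made-up) X-User-Id header.
    location /favorites {
        add_header Vary "X-User-Id" always;
        proxy_pass http://127.0.0.1:9000;  # placeholder application server
    }

One caveat: some CDNs only honor Vary for a small set of headers (often just Accept-Encoding), and a high-cardinality Vary value can effectively disable caching, so check the provider's documentation first.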


The hard part of building a CDN is scaling it. The best approach, IMO, is to use Fly.io to host an anycast IP (with horizontal scaling) and store cache files on disk.

Fly.io also has a built-in Grafana dashboard for your machines.


Agreed, Fly.io is great for such use cases. Is there any CDN/proxy solution or guide available for Fly?



Beautifully written! Thanks for sharing, Leandro.


<3


I'm curious whether any HNers have opinions on Prometheus vs. other time-series databases like InfluxDB.

I periodically consider a Grafana-plus-backend setup for when Datadog becomes cost-prohibitive for metrics with several tags.


Go with Mimir. It is Prometheus-compatible and horizontally scalable, with the read and write paths scaling separately.

Mimir: https://github.com/grafana/mimir


You did not answer the OP's question, though: Prometheus vs. InfluxDB.


We have been using Prometheus at a client for a little over a year now. We need to keep metrics for years, and Prometheus does not seem to be able to deal with that well. One behavior we observed is that it crashes consistently in k8s. We couldn't pin down the root cause, but we suspect it's the amount of metrics we collect continuously and keep (archive).

Now we are considering switching to Thanos or Mimir.


At $dayjob we're considering replacing Datadog with Grafana and friends; we're already using them elsewhere to great effect.

I haven't used InfluxDB yet, so I can't offer a comparison, but from my usage I'm sold on Grafana, Loki, Prometheus, and friends over Datadog. Combined with OTel, they have been a real pleasure to use.


We've done that migration at $dayjob. It has taken a lot of getting used to. The data model is different (perhaps due to poor setting choices), which causes some wacky query requirements for our charts; devs often don't understand them, leading to misleading charts. We're slowly switching to OTel, which should solve that problem. There's also much to be desired in how our Grafana handles saving preferences (ours doesn't; stateless head? I forget why), which is annoying. Tracing/APM is also missing out of the box, though some teams are working on getting that going too.

So a classic build-vs-buy story, I guess. We'll probably pay less in the long run, but for a worse and rockier experience so far.


Great content, helpful and inspiring.

Thanks!


Good read. Is there something similar for building a DDoS-protection feature, like Cloudflare's?


Very good project. Thanks for sharing.


Thanks for this


my pleasure


Why didn't you use varnish for that?


I guess it's "...to Learn Nginx/Prometheus/Grafana/Lua".

Per the first line of the link: "The objective of this repo is to build a body of knowledge on how CDNs work by coding one from "scratch". "


Another example of a project duped into thinking Lua is “powerful”. It is small. That is it. Lua has near zero useful functionality and makes the developer repeatedly reinvent functionality over and over and over again.

https://media1.giphy.com/media/TFO2mwVPIFoOJcuTSC/giphy.gif


Would you like to expand on why you think Lua is a bad choice for this particular project and what you would have used instead? That would be much more helpful than a generic attack on the language itself.


I think they were as specific as they need to be: "Lua has near zero useful functionality and makes the developer repeatedly reinvent functionality over and over and over again."

That may feel uselessly broad if you're not familiar with the language or ecosystem but I've used lua professionally for years on projects of all different sizes and it's a truth I recognize.


It's small, fast, and doesn't have a GIL, so concurrent executions are trivial.
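In the OpenResty stack the linked project uses, each nginx worker runs its own Lua VM with no shared interpreter lock, and I/O concurrency within a worker comes from cheap cooperative "light threads". A hedged sketch, with placeholder upstream locations:

    -- Inside a content_by_lua_block: fetch two upstream resources
    -- concurrently with light threads, then emit both bodies.
    local function fetch(path)
        local res = ngx.location.capture(path)  -- non-blocking subrequest
        return res and res.body or ""
    end

    local t1 = ngx.thread.spawn(fetch, "/upstream/a")  -- placeholder paths
    local t2 = ngx.thread.spawn(fetch, "/upstream/b")

    local ok1, body_a = ngx.thread.wait(t1)
    local ok2, body_b = ngx.thread.wait(t2)

    ngx.say(body_a, body_b)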


Yes it's good if you're writing a C app and need to embed a concurrent scripting language with long-running tasks, and for some reason you can't write that part in C and just expose it to the script env, and also you can't use one of the three or four alternatives that are still a better choice for this specific scenario.


Which alternatives would you recommend?


So, it depends, and a lot of my hostility to lua is because creators of C/C++ apps will just drop lua in and consider that part done, leaving the end user/programmer to handle all of the exposed complexity.

In this situation, usually what you're embedding lua for is defining and exposing a DSL to handle complex configuration or mediate some automation.

So these days I use janet for this because I like lisp, its data structures and core functions are predictable in a way lua's are not, and it has a good standard library that is easy to select subsets of for embedding. Lisps are particularly well suited for making DSLs and it has macros if you need them though I never have.

TCL is another excellent choice. It is small and embeds very simply like lua, but is more suited for making DSLs and even GUI config if you need to go that far. And this is subjective but I think it's less hostile to the probably non-professional programmers that are likely the end users. It has a similar C-centric embedded history so is similarly optimized for that case, but with more focus on users not needing to learn the whole language to effectively use a part of it.

For some cases where what the end user will define is not configuration but processes or procedures over unknowable-at-build-time data, forth is an unusual but strong choice. There are some incredibly lean implementations built for embedding. The paradigm is most likely to be unfamiliar to the users but for some use cases it's such a good fit it's worth it.

The cases where I would use lua are where you expect broad and long-lived community development with high complexity and dedicated system builders involved. Lua's metatables and module system are a good foundation to build a powerful ruby-like OO/FP hybrid environment if it's worth adding all that weight and maintaining it over time. You see something like this in mud client scripting where lua is conventional and I think appropriate, and the clients themselves function more like platforms for lua development than apps that run embedded scripts.

The main thing imo is just to think about what the reason for embedding actually is, who is going to be using it, and for what. Lua isn't the worst default, but its flexibility makes developers think they don't have to make this choice at all if they include it, and that's an error.


The fact that your proposed alternatives to Lua are a Lisp and Forth says a lot about why people will, in fact, use Lua.


lua haters club rise up. I truly don't get why this language gets nothing but admiration on HN. I can only assume people have only used it for tiny personal scripting projects, or else just fucking loooove hand writing the for loops to do exciting niche things like parse csv.



