The slow page loads people mention here are the reason I think caching at the HTTP level (e.g. Varnish) is much more efficient than caching at the service level (e.g. Memcached), which sits much further down the stack and is bound to be latency-sensitive. HTTP-level caching is also much less entangled in your code and less deeply embedded in your infrastructure, which means less technical debt. A hybrid approach can work too, but only if it's light and unobtrusive.
By the way, and I'm going out on a limb with a shameless plug, I built a Varnish-as-a-Service kind of infrastructure called Cachoid ( https://www.cachoid.com ). In my own defense, I'm putting my energy, time, and money where my mouth is.
As I understand it, Reddit serves highly customized content to each logged-in user. Can you help me understand how HTTP-level caching would solve this problem more efficiently than their service-level caching does right now?
You cache pages for non-logged-in users to start with. Then you cache per session for logged-in users, because you don't really need to show fresh votes right away on every visit (they admit as much in the article). Plus there's lots of room for ESI.
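To make the "anonymous first, then per session" idea concrete, here is a minimal sketch of how a cache key might be derived from a request. It is only an illustration, not how Varnish or Reddit does it: the cookie name `session_id` and both function names are made up for this example.

```python
from http.cookies import SimpleCookie
from typing import Optional

# Hypothetical cookie name; the real one depends on the site.
SESSION_COOKIE = "session_id"


def session_from_cookie_header(cookie_header: Optional[str]) -> Optional[str]:
    """Return the session id from a raw Cookie header, or None if absent."""
    if not cookie_header:
        return None
    cookies = SimpleCookie()
    cookies.load(cookie_header)
    morsel = cookies.get(SESSION_COOKIE)
    if morsel is None:
        return None
    return morsel.value or None


def cache_key(path: str, cookie_header: Optional[str]) -> str:
    """One shared entry for anonymous visitors, one entry per session otherwise."""
    session = session_from_cookie_header(cookie_header)
    if session is None:
        # All logged-out visitors share the same cached copy of the page.
        return f"anon:{path}"
    # Logged-in visitors get a copy keyed on their session.
    return f"session:{session}:{path}"
```

Under these assumptions every logged-out request for a path maps to the same entry, while each session gets its own, so the hit rate for logged-in traffic depends on how often a single user revisits the same page.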
Logged out users see a "snapshot" of the page updated every so often.
And I don't think caching pages per session would help with their load all that much. Why not just use HTTP cache headers at that point?
Plus, while you don't really need to show votes ASAP, logged-in users will want up-to-date comments.
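For the cache-header route raised above, a rough sketch of what the response headers could look like. The function name and the 60-second default are assumptions for illustration; `Cache-Control` and `Vary` are standard HTTP headers.

```python
from typing import Dict


def cache_headers(logged_in: bool, max_age: int = 60) -> Dict[str, str]:
    """Pick response cache headers based on login state (illustrative only)."""
    if logged_in:
        # "private": only the user's own browser may store this response.
        cache_control = f"private, max-age={max_age}"
    else:
        # "public" + s-maxage: shared caches (a CDN, Varnish) may store it.
        cache_control = f"public, s-maxage={max_age}"
    # Vary on Cookie so a shared cache does not hand the anonymous copy to a
    # logged-in user. This lowers the shared hit rate if anonymous visitors
    # carry unrelated cookies, which is a real tradeoff of this approach.
    return {"Cache-Control": cache_control, "Vary": "Cookie"}
```

A short `max_age` keeps comments reasonably fresh for logged-in users while still saving repeat fetches within that window.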