Ask HN: How much does it cost to run Hacker News?
135 points by hgarg on Dec 29, 2022 | 85 comments
How much does it cost to run Hacker News monthly?

These costs could include cloud costs as well as any engineers assigned to upkeep.




Pretty sure the most expensive thing is @dang's time.


Deservedly so


Do they really only have a single moderator?

What happens when @dang has to sleep?


Legend has it that dang can moderate even while asleep.

On the other hand, I've read here previously that dang is not human but a lizard.

We may never know.


I believe it still runs on one box. Xeon E5-3567 (3.5GHz). Stack is FreeBSD with nginx as the front-end.

From a quick look at M5 Hosting's prices, the cost is probably $500-$1000/month (6M requests).


I wonder how much it would cost to run if it were designed and run with 2022 modern best practices of Kubernetes, Helm, Node + React, on a fleet of cattle servers hosted on AWS, paired with expensive add-on services because running PostgreSQL is too hard.

The cynic in me says it would cost at least 10x that, require a full time team of 5 devops, and we'd suffer at the very least the same amount of outages, if not worse.


I think your cynicism is probably true to the mark. Everything is f*king over engineered now.


Buuut... the AWS+Kubernetes approach is very scalable.

If overnight, HN became the new Facebook, the single server model wouldn't work, and the business opportunity would be lost.

However, the Kubernetes model probably would scale (with a few hiccups still, but doable), and HN would suddenly be a 100x business.

That is why business leaders pick the more expensive 'best practice' setup.


Reading your comment and the answers, I have a hard time telling if you're being ironic or serious here. I'm going to be charitable and assume yours is some very sharp wit that flew over the heads of some that replied to you :)

Indeed it is a terrible idea to optimise any piece of software that's put online for the rare occurrence that it becomes the next Facebook. No, it will not happen overnight. And no, you can run an autoscaling k8s-on-AWS setup and it will still break; you will have to completely re-architect and write a lengthy "what went wrong" post-mortem anyway.

I feel the "premature optimisation is the root of all evil" mantra is not yet being taught in DevOps school.


> If overnight, HN became the new Facebook

Begone, Satan!

> That is why business leaders pick the more expensive 'best practice' setup.

Realistically, how many companies reach the scale of FB? If successful, they'll reach a few thousand concurrent users, which can be serviced by the monolith model and even scaled traditionally. HN is a testament to how well this model can work.

Building an entire infrastructure around the 1e-9% chance it will have to service a FAANG-level scale of traffic is the definition of premature optimization.

Meanwhile, setting up and maintaining this HA infrastructure is a huge resource sink, which at an early stage of a startup can make or break the project.

It's insane that this model of building for burstable scale where everything is a microservice has become so prevalent in the industry. Cargo culting whatever tech is produced by FAANGs is doing immeasurable harm. These are not best practices.

EDIT: Ah, if you were being sarcastic... Touché.


FAANG is hyperbole. There are a TON of companies between HN and FB whose scale or complexity requires more than a single service to run.


I like it that HN has a built-in protection against an eternal September :)


No. Bored engineers pick it because it’s more fun to build complex systems being able to handle a wider array of scenarios.


I don't believe business opportunities are lost that quickly. Sure, if you're down for 2 months, yes. But use a solid DBMS and web framework and you can transition to multiple servers quite easily, on the order of hours. It doesn't even have to be k8s; call in an expert the next day if you have to.


> If overnight, HN became the new Facebook, the single server model wouldn't work, and the business opportunity would be lost.

I know you're probably being /s but just in case ... all they'd need to do as a stop-gap hack whilst they worked it through is spin up another couple of servers and stick a load-balancer in front. If it does have a filesystem back-end then move that into cloud or network storage. Not ideal, but it would probably survive long enough for a more thought-through solution to arrive.


I lol-ed at this. A few years ago, someone said it runs as a single process on 1 core and uses no database; everything is written to the filesystem.
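For anyone curious what "no database, everything written to the filesystem" can look like in practice, here's a rough sketch (in Python rather than HN's actual Arc, and with an invented file layout), storing each item as one JSON file keyed by its id:

    import json
    from pathlib import Path

    DATA_DIR = Path("items")   # hypothetical layout: one file per item
    DATA_DIR.mkdir(exist_ok=True)

    def save_item(item_id: int, item: dict) -> None:
        # The filesystem is the "database": write the whole item out as JSON.
        (DATA_DIR / f"{item_id}.json").write_text(json.dumps(item))

    def load_item(item_id: int) -> dict:
        return json.loads((DATA_DIR / f"{item_id}.json").read_text())

    save_item(1, {"by": "pg", "title": "Hello HN", "score": 1})
    print(load_item(1)["title"])   # -> Hello HN

With a single process touching the files, there's not much locking or consistency to worry about, which is a big part of why it stays simple.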


Two years ago I emailed HN to ask why favoriting posts is rate limited. Got the reply, "HN's main software runs on a single core."

They also said they were working on an overhaul of Arc and HN, which would massively improve performance.



Nothing I could find in that thread or the linked threads contradicts that statement.

Please be more descriptive so that people actually get your point.


NGINX does the heavy lifting, and can utilize many cores.


HN started as a project to experiment [1] with the Arc language [2], so it doesn't surprise me that it used static files as storage.

I wonder what the stack is these days, though.

[1] http://paulgraham.com/hackernews.html [2] http://arclanguage.org


Is that supposed to be a good thing?


Would it make your experience as a user better if the data was stored in a schema-less distributed database instead of saved to file? Would you browse it more often? If you had to pay for it, would you pay more money?

Because at the end of the day, none of those decisions matter, and sometimes the crappy, inelegant solution is cheaper and more understandable to the ones running it for almost two decades.


Oh, I'm not debating the impact such a decision has on users, but rather on development experience and ops.

> schema-less distributed database instead of saved to file

I was actually going to suggest SQLite.
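If anyone wants to picture it, here's a minimal sketch with Python's built-in sqlite3 module (table layout invented purely for illustration):

    import sqlite3

    # A single-file database: no server process to run, which fits the one-box model.
    conn = sqlite3.connect("hn.db")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS items (
            id    INTEGER PRIMARY KEY,
            by    TEXT,
            title TEXT,
            score INTEGER DEFAULT 0
        )
    """)
    conn.execute("INSERT INTO items (by, title) VALUES (?, ?)", ("pg", "Hello HN"))
    conn.commit()
    print(conn.execute("SELECT title, score FROM items ORDER BY score DESC").fetchall())

You keep the "it's just a file on disk" simplicity (a backup is a file copy) but get indexes and queries on top.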


If it provides the correct features and runs correctly, simpler is better. To me, simpler to run, understand, back up, deploy, and recover goes in the right direction.


In my opinion, yes. It's simple to run, simple to back up, simple to migrate to a new server, and the upgrade path is straightforward (find the fastest single-core CPU).


That is what I'd expect it to still be tbh, if the responsiveness is anything to go by.


It's curious how "best practices" would cost ~10x and require 5 devops.

HN is crazy simple. Even $500/m sounds a bit high imo.


Crazy simple and thus misses a few rather basic features - like proper scaling (the icons for upvote/downvote and buttons for actions are too big for mobile, and also too small on big screens), and some basic late 2000s AJAX to avoid page reloads for basic actions such as adding/deleting a comment.


Never underestimate the tendency of many SWEs to make things more complicated than they need to be.


Never underestimate the ability for a company to chase customers which turns into feature requests which turns into complexity which turns into revenue which turns into profit and nice salaries.


Minimalist design obfuscates the maximalist manifesto.


Then why do you refer to them as "best practice"? Can we use "common practice" instead?


In some circles of our buzzword-fuelled world, what is considered "best practice" amongst risk-averse management is "what everyone else is already doing". Anything else is too different: if it is such a good idea not to over-engineer, why is everyone else over-engineering?


With that pattern I always go with: I should be asking you that question.


How does one find the best tools for the job? I want to build something that is performant and maintainable, but I also don't want an over-engineered mess.


Read what others are doing, especially if it seems everyone is doing something similar, but (and this is the vital part) apply critical thinking and specific knowledge of your use case. Keep an eye on new ideas that are less common, keep in mind old ideas that are out of fashion, and apply the same critical thinking to them (or use them, and the lessons learned using them in the past, to judge against the other, newer ideas). Basically: avoid being part of the cargo cult, except on those occasions where it actually finds the right direction.


Thank you, this is good advice.


There's a Dilbert for that (rather than an xkcd).

When everything is described by a superlative, really, it's just 'average'.

Sites like Trip Advisor or Amazon, or anything with a rating system. The "average" person nearly always puts 4 or 5 stars for average products or establishments.


I heard that on Uber and AirBnB, 4 (out of 5) stars is understood as bad.

When I worked at a large non-American internet company, I know there were some internal discussions on how to deal with different cultural interpretations of numerical ratings (still unresolved AFAIK).


The actual rating matters less than the relative percentile. That being said, Amazon ratings (and those of most similar sites) are not a simple straight average.


Best practices align with scale. Technologists rush to implement the shiny shiny stuff you mentioned, but in most cases simplicity is the best solution until there is an actual scale problem to address.

That said, I bet at HN scale it could be cloud-hosted pretty cheaply with ECS/CloudFront. Maybe even just in S3 with a small RDS instance. Or on Cloudflare, as was discussed.


Not complex enough. Reckon you could easily fit in a few more steps there.


Needs more Redis, and clearly RabbitMQ would be best to pass the incoming comments to K8s.


Native mobile apps come to mind.


But look how big the team is! Must be pretty important.


“Best” practices.


Not to mention the site and policies would devolve into Reddit-with-admins.


OOH, I'm cheering along with everyone else at this comment.

OTOH, I wish there was a reference book or something that takes this to a usable level.

Everyone who works on software knows that similar tasks can be completed by 6 workers and a 6-figure budget, or by 600 workers and an 8-figure budget. A similar feature costs 6 hours, or 6,000 hours... in dissimilar circumstances. Some reasons for this are probably unavoidable. Deploying a moderation feature on Twitter is just different to deploying on a 10,000-user app. Some reasons are complex and mysterious. Overall, though, it is very difficult to fathom "why". How can efficiency possibly vary by so much?! Should we even believe our eyes? Maybe it doesn't vary by that much?

I think one reason this is a nasty problem is that there are so many unrelated ways to answer it. They all yield different insights, and the different insights barely intersect... neither negating nor supporting one another.

You could treat this as an engineering issue: how are HN's stack, code, and such engineered? That'll give you an answer.

You could go one step before engineering: how do decisions/specs work? For HN, the same few head(s) decide what the site does, how to code it, moderate it, host it, etc. It doesn't change often, or much. A "minor feature" like Christmas colours is probably conceived of, specced, approved, coded, released and operated by the same person. You don't suddenly discover that one of the steps is hard. An engineer doesn't have to explain to an idea person why X is harder than Y, or why A creates more technical debt than B. They don't need to negotiate, compromise, or communicate in a least-common-denominator language.

This sort of management/organisational explanation doesn't negate the engineering explanation, but it does trivialize it. When I'm in "organisational brain" mode, I think of engineering decisions (e.g. writing everything to the file system) as derivative. Those decisions happen because of organisational reality. This isn't strictly true, but it probably applies in most RL situations. A bank isn't going to engineer like this. Neither is a startup team of 50 newly hired ninjas. It's tempting to forget about engineering at this point... but that's obviously wrong.

Meanwhile, you can probably answer these questions 10 more ways. The team of 50 newly hired ninjas can be explained by funding. It's just what happens when VCs pour large sums into incubator startups en masse, in a tight labour market. We can go a step further: Maybe the VC stuff is derivative of monetary policies and macroeconomics... It's not untrue.

You might prefer a cultural explanation. IIRC, pg likes this kind of explanation. Academic culture, FOSS, or government consulting cultures will get you to different code, stacks, financial structures, etc. Hiring a LISP team will get you fundamentally different software than hiring a Java team. Much more different than is legibly explainable by the differences between Java and LISP. It's technically possible to do the LISP thing in Java or vice versa. It won't happen though.

It just goes on and on, and gets hairier and nastier as you go. I find that the deeper I go, the more I touch base with the aggravating starting point: the cost/efficiency of software varies by several orders of magnitude, far more than seems reasonable.


I just had an aneurysm.


That comment exactly.


6M requests per month can be served by a $20 VPS.
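Back-of-envelope: 6M requests spread over a ~2.6M-second month averages out to roughly 2.3 requests/second, which is well within what a small VPS can handle for mostly-text pages, even allowing for peaks well above the average.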


Sorry, typo. Processor is E5-2637.

Source: https://news.ycombinator.com/item?id=16076041


I don't think they rent an instance; it's more likely a physical machine over at YC, since they had that time-based simultaneous SSD failure thing recently, which likely wouldn't have happened in a datacenter.

So beyond the power it takes to run it, I doubt it's much of a monthly expense.


Can't find that processor model, probably wrong number?


>Xeon E5-3567

that processor does not exist


as the internet was intended


Last time dang talked about hardware, it was running on a pretty modest Xeon, so probably not all that much.

https://news.ycombinator.com/item?id=28478379

Unsure if anything has changed since


(Edit: probably not) I think they're also behind Cloudflare, and they use Algolia for search.

So not everything is running there


    dig a news.ycombinator.com +short
    209.216.230.240
Not Cloudflare's address.


They had a hardware failure this year and were, for a really short time, on a cloud instance and behind Cloudflare.


While most of your statement is true, it was direct-from-AWS (at least from my testing, so I'm not sure if some users were routed through Cloudflare).


Oh, you might be right; I could've sworn traceroute said Cloudflare when I probed it.


> Algolia

Are you referring to https://hn.algolia.com/?

pretty sure Algolia runs that for free (for marketing)


Hacker News' secret sauce: no fat image, video, or sound file serving; just the text, ma'am.

Something that would have made the Gopher protocol proud to coexist alongside it in this exclusive class.

Probably could save even more money in bandwidth cost with mostly unsecured HTTP.


The E5-2637 (32nm) is from 2012. I cannot encourage people enough to buy the last 1151 Xeons or the 8-core Atom on a Mini-ITX Supermicro motherboard (they might be hard to find new soon). In particular I have the E-2124 (2018), the E-2224 (2019) and the A2SDi-8C-HLN4F (2017), all 14nm; those you'll probably never have to replace for peak performance... the only question is whether you will be able to afford powering them without revenues (75W/25W)...


Why get those instead of modern Ryzens that are much faster at the same power levels?


Because I trust Intel, and I don't see Mini-ITX motherboards with server-grade quality for Ryzen... The only AMD I have ever had was a ThinkPad X100e, and that convinced me to never buy anything AMD again. That said, x86 and ARM are peaking together, so it does not matter what you buy; the M1 might save you some electricity, but at peak performance below what a Xeon can deliver. For parallelism on the same atomic memory, more than 4 cores will invalidate too much cache. And if you have an embarrassingly parallelizable problem, you might as well use a Raspberry Pi 4.


Making decisions based on opinions that are 12 years old is stupid in any niche, let alone in tech where things change quickly. AMD have been the best bang for buck for what now, 5-6 years? Brand loyalty here doesn't make much sense.


I don't know where you have been for the last 10 years, but NOTHING has improved; my Ivy Bridge CPU with DDR3 RAM has lower latency than anything released today.

SSDs peaked in 2011 with the X-25E at 65nm and 100,000 writes per bit.

From the C64 to today you have 1,000-10,000x per watt... if your Ryzen is 2x better than Intel, it's irrelevant in that context.

The hard thing now is longevity, since the hardware does not improve.

It's over, time to start writing better software!


In the microATX form factor, ASRock has had a number of AM4 motherboards with IPMI and ECC RAM available for a long time. For example:

https://www.asrockrack.com/general/productdetail.asp?Model=X...

AMD is now firmly ahead in performance/watt, and in peak performance on most server-side benchmarks (see Phoronix). Xeon superiority is a thing of the past. The only area where Intel is competitive is in AVX-512 workloads.


Micro-ATX is too big.

This is how you build a 75W server that lasts 100 years and can do AI too :D (just remove the GPU if you don't need AI): http://move.rupy.se/file/xeon_1030.png


Text-only, no media hosting/ops: less than $500 on a humble setup.


For the answer to be relevant, I think you first need to know how many pageviews per day it gets.


I would be surprised if it was more than $100 per month


Pretty sure dang can't afford rent on that


Of course he has a good enough salary to rent anything he wants.


Strictly for the hosting, I meant. Doesn't HN famously run on a single computer?


He must have made more than enough decisions by now to train a bot.


danGPT


> Pretty sure dang can't afford rent on that

#dangdeservesbetter


I think you might be off here by a factor of ~10.


Real cost is probably closer to $10/month, given HN is apparently hosted from an on-prem server built in 2012.


A request-optimized HN could cache pages and only load votes and comments dynamically. This would reduce the need for a powerful server.
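Purely as a sketch of the idea (not how HN actually works), rendered pages could sit behind a small TTL cache, with votes and comment counts fetched separately:

    import time

    CACHE_TTL = 30.0   # seconds: serve the same rendered page to everyone for a while
    _page_cache: dict[str, tuple[float, str]] = {}

    def render_page(path: str) -> str:
        # Stand-in for the real, comparatively slow page rendering.
        return f"<html>rendered {path} at {time.time():.0f}</html>"

    def get_page(path: str) -> str:
        now = time.time()
        hit = _page_cache.get(path)
        if hit and now - hit[0] < CACHE_TTL:
            return hit[1]              # cache hit: no re-render
        html = render_page(path)
        _page_cache[path] = (now, html)
        return html

    # Vote counts would come from a separate, tiny endpoint polled by the client,
    # so the cached page can stay slightly stale without anyone noticing.
    print(get_page("/news"))
    print(get_page("/news"))   # second call served from the cache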


I imagine it would be funny if you limited voting to accounts with a minimum of 2000 and a maximum of 3000 points.

It could be an interesting formula in all kinds of hierarchical systems: above some salary, you are banned from the bike-shedding discussion.



