That being said, is the actual production infra architecture for HN described somewhere? Curious how simple it can afford to be.
We laugh at people piling layers and layers and artifacts on their sites, all in the hope of adding redundancy, handling "webscale" load, and avoiding an outage (ironically increasing the chances that _something_ will break).
However, if a single hard drive crashing somewhere can cause your site to be down for minutes or hours, some non-tech people (managers, shareholders, customers) will wonder if the site is "professional" enough - and I can sympathize with them.
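To put "minutes or hours" in perspective, here is a rough sketch that converts total downtime in a year into an availability percentage. The example durations (10 minutes, 3 hours) are hypothetical figures chosen for illustration, not measurements of any real site.

```python
# Convert total yearly downtime into an availability percentage.
# The durations used below are hypothetical, for illustration only.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 in a non-leap year

def availability_percent(downtime_minutes: float) -> float:
    """Return yearly availability (0-100) for a given total downtime."""
    return 100.0 * (1.0 - downtime_minutes / MINUTES_PER_YEAR)

if __name__ == "__main__":
    for label, minutes in (("10 minutes", 10), ("3 hours", 180)):
        print(f"{label} down per year -> {availability_percent(minutes):.4f}% available")
```

Even a multi-hour outage once a year still leaves a site above "three nines", which is part of why a simple setup can look fine on paper and still worry the non-tech audience.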
> We’re recently running two machines (master and standby) at M5 Hosting. All of HN runs on a single box, nothing exotic:
CPU: Intel(R) Xeon(R) CPU E5-2637 v4 @ 3.50GHz (3500.07-MHz K8-class CPU)
FreeBSD/SMP: 2 package(s) x 4 core(s) x 2 hardware threads
Mirrored SSDs for data, mirrored magnetic for logs (UFS)
We get around 4M requests a day.
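For scale, a back-of-the-envelope conversion of "around 4M requests a day" into an average rate. This is only a sketch: it assumes traffic is spread evenly over 24 hours, which real traffic is not, so peaks will be higher.

```python
# Average request rate implied by "around 4M requests a day".
# Assumes an even spread over the day; real traffic has peaks.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def average_rps(requests_per_day: float) -> float:
    """Return the mean requests per second for a daily request count."""
    return requests_per_day / SECONDS_PER_DAY

if __name__ == "__main__":
    print(f"{average_rps(4_000_000):.1f} requests/second on average")
```

That works out to a few dozen requests per second on average, which helps explain why a single box can carry it.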
> If you had an auto scaling kubernetes cluster with multiple redundancies using rust and 3 JS frameworks outages like this wouldn't surprise your users anymore.
Irony aside, what's the point? In theory, yes, it could work better. In practice though, HN with its two bare-metal boxes has better uptime than 99.99% of the Web, including the biggest sites - just because complexity has its price.
There is a good chance that it is (or was!) an actual spinning hard drive. Whatever it is, it lives in one of our boxes at M5 and it's in their hands for the moment.
People guess the origin of our name often. Maybe this will give you even more of a chuckle. I was not aware of the name of this computer when I named the company. https://en.m.wikipedia.org/wiki/The_Ultimate_Computer
Our Diablo disk goes on the fritz, but who needs a disk when you can netboot? Ken demonstrates the Alto network capabilities, connects to Google, and has the Alto calculate and display a Mandelbrot set. Ken's in-depth blog entry including the fractal demo source code is found here:
We begin our very gentle and progressive power up of the seminal Xerox Alto. No magic smoke, but one power supply is faulty. Opening it up reveals that it had a tough life, having suffered a catastrophic short of some sort, hastily repaired, and some traces almost entirely corroded through. But the source of the malfunction seems to be a somewhat classic case of bad electrolytic capacitors, way too far gone for any hope of reforming. After replacing them and repairing the supply, we turn our attention to the Diablo disc drive and cartridge, and have a bit of a surprise.
Many thanks to my CHM restorer colleagues Ron Crane, Ken Shirriff, Carl Claunch and Luca Severini.
See previous video introducing this historically significant machine:
No apology necessary, but I'm curious how a hard drive failure caused an outage. No RAID or mirroring? No hot spares? No clustering or distributed systems?
It was part of a mirror of identical SSDs on an LSI MegaRAID RAID card. We see occasional "spectacular" drive failures that take the machine down with a single disk failure. Usually it's just a reboot to come back up, and a disk replacement, then some hours of time to rebuild the array and get back to situation nominal.
Today it has been raining heavily, and the only reason I went to my co-working place is because I checked HN in the morning, and figured my internet was down at home because HN didn't work.
I heard from an ISP owner that a common support call was when kids deleted the browser icon on the desktop, the parent thinks internet is down and calls up support. These days almost everyone is using a phone or tablet so it happens less.
Right, HN is the site I check when I'm not sure my connection is working, because it loads so fast. If HN doesn't load, then I'm more likely to walk away from the computer than to check another site.
For decades now pings to yahoo.com to check connectivity have been my only direct interaction with them. If they go away or stop responding to pings I could use news.ycombinator.com, but the URL is longer.
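A minimal sketch of the same "is my connection up?" check in script form. Note the swap: this opens a TCP connection rather than sending an ICMP ping (ping usually needs raw sockets or an external command). The function name `can_connect`, the default port 443, and the 3-second timeout are assumptions made for illustration.

```python
import socket

def can_connect(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout.

    This is a TCP reachability check, not an ICMP ping.
    """
    try:
        # create_connection resolves the name and connects; the context
        # manager closes the socket as soon as the check is done.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Calling `can_connect("news.ycombinator.com")` and `can_connect("yahoo.com")` side by side is one way to tell "my connection is down" apart from "that one site is down".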
I am at GMT-5. I cannot sleep, my wife is out of town, I lost my Kindle the day before yesterday and HN was down. If I turn on the lights to read a book I will not sleep for the rest of the night so I listened to our cats fighting over what I assume was some big moth as my only distraction.
Argh. And if you run it in Docker like I do, I have to relearn the lesson that taking down the container, losing internet, then trying to do a ‘docker pull’ is a bad idea.
Sometimes I feel the attraction of some spyware mesh system for home networking.
This absurd inaccuracy cannot stand. There is a single turtle holding up the disk, Great A'Tuin. Now, if you were to claim the bottom-most elephant had died, we could talk!
If we (foolishly) ignore the one true source and read the wiki:
Stephen Hawking incorporates the saying into the beginning of his 1988 book A Brief History of Time:
A well-known scientist (some say it was Bertrand Russell) once gave a public lecture on astronomy. He described how the earth orbits around the sun and how the sun, in turn, orbits around the centre of a vast collection of stars called our galaxy. At the end of the lecture, a little old lady at the back of the room got up and said: "What you have told us is rubbish. The world is really a flat plate supported on the back of a giant tortoise." The scientist gave a superior smile before replying, "What is the tortoise standing on?" "You're very clever, young man, very clever," said the old lady. "But it's turtles all the way down!"
This is what the internet was supposed to be! Correcting the others on how many turtles are holding up everything :D :D
Dammit, I miss Terry Pratchett :( - The man had a wicked crazy mind and the flair to focus it to produce the most amazing stories, scenarios and humour.
Like having a "Thieves Guild" - "If you're going to have crime you might as well have organised crime"
For those that don't know, in this scenario the "Thieves Guild" got a 'budget' of how much they could steal per year (?), and if you got robbed the thief would give you a receipt. Oh, and any 'unlicensed thieving' was dealt with and 'policed' by the guild itself :)
Same. Posted "Hmm, HN is down?" in an IRC session screened somewhere and then jumped to checking my inet connection... Such things shouldn't happen, it confuses people ;)
And yes, reloading, pinging, other subdomain checking too :>
The last time we had a hard drive failure, we were down for a couple days. That was early 2013.
After that, in the spirit of 'never again', Nick (kogir) set up the failover server system that we still use today. That's why you missed it tonight; you wouldn't have back then. Sorry, I guess?
Edit: by pure coincidence, Nick was in town tonight and we met up for the first time in years. Two hours later HN goes down. Almost as if the server was overcome by nostalgia.
Browsing to HN was the first thing I tried to do this morning after accidentally sleeping in. When it failed to load, my panicked thought was: "Is this how it all ends? Did Putin finally press the red button?"
Very high uptime overall. Credit to PG and his Lisp/Arc-inspired codebase! And to dang (who I assume does a bit of coding on it from time to time, going by previous comments I have read... sorry if I read wrong)
Sorry everyone!