Hacker News was down (onlineornot.com)
199 points by rozenmd on July 8, 2022 | 103 comments



Looks like a hard drive stopped working. We switched to the failover server.

Sorry everyone!


If you had an auto scaling kubernetes cluster with multiple redundancies using rust and 3 JS frameworks this wouldn't happen. ;)


That being said, is the actual production infrastructure architecture for HN described somewhere? Curious how simple it can afford to be.

We laugh at people piling layers and layers and artifacts on their sites, all in the hope of adding redundancy, handling "webscale" load, and avoiding an outage (ironically increasing the chances that _something_ will break).

However, if a single hard drive crashing somewhere can cause your site to be down for minutes or hours, some non-tech people (managers, shareholders, customers) will wonder if the site is "professional" enough - and I can sympathize with them.


From 2018: <https://news.ycombinator.com/item?id=16076041>

> We’re recently running two machines (master and standby) at M5 Hosting. All of HN runs on a single box, nothing exotic:
>
> CPU: Intel(R) Xeon(R) CPU E5-2637 v4 @ 3.50GHz (3500.07-MHz K8-class CPU)
> FreeBSD/SMP: 2 package(s) x 4 core(s) x 2 hardware threads
>
> Mirrored SSDs for data, mirrored magnetic for logs (UFS)
>
> We get around 4M requests a day.
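
For a sense of scale, the quoted 4M requests a day works out to a fairly modest average rate. A quick back-of-the-envelope sketch, assuming traffic is spread evenly over the day (real traffic is burstier, so peaks would be higher):

```python
# Average request rate implied by the "around 4M requests a day" figure
# quoted above. Even spread over the day is an assumption for illustration.
REQUESTS_PER_DAY = 4_000_000
SECONDS_PER_DAY = 24 * 60 * 60

avg_rps = REQUESTS_PER_DAY / SECONDS_PER_DAY
print(f"{avg_rps:.1f} requests/second on average")  # prints "46.3 requests/second on average"
```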


> FreeBSD/SMP

Good choice. :)


"Why are they using a freeware OS?" (: /s


> If you had an auto scaling kubernetes cluster with multiple redundancies using rust and 3 JS frameworks outages like this wouldn't surprise your users anymore.

FTFY


Irony aside, what's the point? In theory, yes, it could work better. In practice though, HN with its two baremetal boxes has better uptime than 99.99% of the Web, including the biggest sites - just because complexity has its price.


Imagine a Beowulf cluster of those!!111!!eleven!!1!


Or a simple RAID array (but of course the controller should keep working).


Personally I haven't seen a server without a RAID since time immemorial[0]. Of course HN has it, too, as explained here:

https://news.ycombinator.com/item?id=32024989

[0] early 90s, that is


... but many other horrible things might


Three.js on HN? Interesting thought.


It could be like SGI's fsn[0] file manager but for tech news.

"It's a Unix system, I know this."

[0] https://en.wikipedia.org/wiki/Fsn_(file_manager)


Raymarched SDF pyramids implemented with GLSL shaders for the voting arrows! They could be so shiny!

(It's a wonder that anyone lets me near the frontend of their websites really.)


It wouldn't surprise me if someone managed to implement a 3d voting arrow in less than the 407B transfer size of the existing .gif


huh, all this time I thought the arrow was just a unicode character


HN Dashboard in 2012: https://i.imgur.com/oymP2UW.jpg

Does this still exist?


Alas no.


Well, that's ok... thanks for being up to fix it.

It's not an actual spinning hard drive, is it?


There is a good chance that it is (or was!) an actual spinning hard drive. Whatever it is, it lives in one of our boxes at M5 and it's in their hands for the moment.


It was an SSD. A 1.6TB SAS3 SSD. (M5 CEO here)


Stop making stuff up guys, I just know that someone at the YCombinator HQ tripped over the power cable of the Raspberry Pi you're hosting this on.


> one of our boxes at M5

Read that as MI5 and it gave me a chuckle!


People guess the origin of our name often. Maybe this will give you even more of a chuckle. I was not aware of the name of this computer when I named the company. https://en.m.wikipedia.org/wiki/The_Ultimate_Computer


Probably a 2.5 MB one-platter Diablo hard disk drive cartridge running on a restored Xerox Alto.

https://en.wikipedia.org/wiki/Xerox_Alto

Diablo Systems Incorporated Series 30 Disk Drive Maintenance Manual

http://bitsavers.org/pdf/diablo/disk/model_30/81503-02_Serie...

Restoring Y Combinator's Xerox Alto, day 4: What's running on the system (righto.com):

https://news.ycombinator.com/item?id=12197591

http://www.righto.com/2016/07/restoring-y-combinators-xerox-...

Xerox Alto Restoration Part 16 - our disk goes down, the Alto connects to Google and draws fractals

https://www.youtube.com/watch?v=adEr2aRwHnI

Our Diablo disk goes on the fritz, but who needs a disk when you can netboot? Ken demonstrates the Alto network capabilities, connects to Google, and has the Alto calculate and display a Mandelbrot set. Ken's in-depth blog entry including the fractal demo source code is found here:

http://www.righto.com/2017/06/one-hour-mandelbrot-creating-f...

Xerox Alto Restoration Part 1 - power supply restoration, disk drive surprise

https://www.youtube.com/watch?v=xPyqQXFC2yw

We begin our very gentle and progressive power up of the seminal Xerox Alto. No magic smoke, but one power supply is faulty. Opening it up reveals that it had a tough life, having suffered a catastrophic short of some sort, hastily repaired, and some traces almost entirely corroded through. But the source of the malfunction seems to be a somewhat classic case of bad electrolytic capacitors, way too far gone for any hope of reforming. After replacing them and repairing the supply, we turn our attention to the Diablo disc drive and cartridge, and have a bit of a surprise.

Many thanks to my CHM restorers colleagues Ron Crane, Ken Shirriff, Carl Claunch and Luca Severini.

See previous video introducing this historically significant machine:

https://youtu.be/YupOC_6bfMI

For much more details and references, see Ken Shirriff's blog entry corresponding to this video here:

http://www.righto.com/2016/06/restoring-y-combinators-xerox-...

A 1970s disk drive that wouldn't seek: getting our Xerox Alto running again

http://www.righto.com/2018/03/a-1970s-disk-drive-that-wouldn...

Identify It Challenge for 7-26-2012 Answer

https://reinventingscience.wordpress.com/tag/diablo-systems-...

ARTIFACT DETAILS: Series 30 disk drive

https://www.computerhistory.org/collections/catalog/10266694...

Description: "Not working cards missing heads may be bad" is handwritten on black marker on a sticker attached to the top of the machine.


You need to rewrite Hacker News in Rust to prevent these sorts of things!


No apology necessary, but I'm curious how a hard drive failure caused an outage. No RAID or mirroring? No hot spares? No clustering or distributed systems?


It was part of a mirror of identical SSDs on an LSI MegaRAID RAID card. We see occasional "spectacular" drive failures that take the machine down with a single disk failure. Usually it's just a reboot to come back up, and a disk replacement, then some hours of time to rebuild the array and get back to situation nominal.


I’m curious as to what time zone you’re in. Or if there’s multiple people behind your account. It’s pretty impressive how omnipresent you are :).


For two hours I thought I was finally blocked.


And just as I was on the train to work. Worst. Website. Ever.


If this site being down was that impactful to your commute, it seems like you actually feel like it has a lot of value.


This is one of a few lightweight, reliable sites where any problem makes me assume that my internet connection must somehow have gone awry.


Today it has been raining heavily, and the only reason I went to my co-working place is because I checked HN in the morning, and figured my internet was down at home because HN didn't work.


I heard from an ISP owner that a common support call was when kids deleted the browser icon on the desktop, the parent thinks internet is down and calls up support. These days almost everyone is using a phone or tablet so it happens less.


Right, HN is the site I check when I'm not sure my connection is working, because it loads so fast. If HN doesn't load, then I'm more likely to walk away from the computer than to check another site.


Or there's some hot topic with more than 2 replies that's making the server pull its hair out...


I worked at a place that used to use ‘is HN up?’ as a proxy for ‘does the internet go?’


Time was I would use sun.com to check if the network was up. These days, like you, I use HN.


For decades now pings to yahoo.com to check connectivity have been my only direct interaction with them. If they go away or start responding to pings I could use news.ycombinator.com, but the URL is longer.


ping 1.1.1.1 is my go to for the command line
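
The "is it my connection or is it the site?" check that keeps coming up in this thread can be sketched in a few lines. This is only an illustration under some assumptions: it uses a TCP connect on port 443 instead of ICMP ping (raw ICMP usually needs elevated privileges), and the function names `can_connect` and `diagnose` are made up for this example.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def diagnose(site_host, baseline=("1.1.1.1", 443), probe=can_connect):
    """Tell "my network is down" apart from "the site is down".

    The baseline is an address assumed to be reachable whenever the local
    connection works; the probe is injectable so the logic can be tested
    without touching the network.
    """
    if not probe(baseline[0], baseline[1]):
        return "network down"
    if not probe(site_host, 443):
        return "site down"
    return "all good"
```

Calling `diagnose("news.ycombinator.com")` during the outage would presumably have returned "site down" rather than sending anyone off to reconfigure their DNS.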


It was pretty horrible. I wondered where I’d read about it and couldn’t think of any place that wasn’t down.


But the gates is up?


I am at GMT-5. I cannot sleep, my wife is out of town, I lost my Kindle the day before yesterday and HN was down. If I turn on the lights to read a book I will not sleep for the rest of the night so I listened to our cats fighting over what I assume was some big moth as my only distraction.


Reading a book will keep you awake for the night, but staring at a screen won't?


Probably in dark mode... Whether this really helps is another matter...


Sounds horrible, my condolences

Good that we are past the bump now


I went digging into what was wrong with my network. I'm deep into updating my pihole now, for better or for worse. Wish me luck!


Argh. And if you run it in Docker like I do, I have to relearn the lesson that taking down the container, losing internet, then trying to do a ‘docker pull’ is a bad idea.

Sometimes I feel the attraction of some spyware mesh system for home networking.


That's why you run two pihole instances on independent hardware :)


You’ve worked out how I ended up with 3, virtualised and on different hardware.


Good luck!


Asterix & Obelix:"The sky is falling on our heads"

Discworld:"The bottom most turtle died"

PlanetEarth:"HN is down"

These all seem like the same thing, it just depends where you are in the erm "multiverse" !


> Discworld:"The bottom most turtle died"

This absurd inaccuracy cannot stand. There is a single turtle holding up the disk, Great A'Tuin. Now, if you were to claim the bottom-most elephant had died, we could talk!


But but didn't the lady say it's "Turtles all the way down"? See, cosmology is tricky no matter where you are :D

PS. According to historic records (DISCWORLD 1-41) you are right :P


If we (foolishly) ignore the one true source and read the wiki:

Stephen Hawking incorporates the saying into the beginning of his 1988 book A Brief History of Time:[20]

A well-known scientist (some say it was Bertrand Russell) once gave a public lecture on astronomy. He described how the earth orbits around the sun and how the sun, in turn, orbits around the centre of a vast collection of stars called our galaxy. At the end of the lecture, a little old lady at the back of the room got up and said: "What you have told us is rubbish. The world is really a flat plate supported on the back of a giant tortoise." The scientist gave a superior smile before replying, "What is the tortoise standing on?" "You're very clever, young man, very clever," said the old lady. "But it's turtles all the way down!"

https://en.m.wikipedia.org/wiki/Turtles_all_the_way_down


This is what the internet was supposed to be! Correcting others on how many turtles are holding up everything :D :D

Dammit I missed Terry Pratchett :( - The man had a wicked crazy mind and flair to focus it to produce the most amazing stories, scenarios and humour.

Like having a "Thieves Guild" - "If you're going to have crime you might as well have organised crime"

For those that don't know: in this scenario, the "Thieves Guild" got a 'budget' of how much they can steal per year (?) and if you got robbed the thief would give you a receipt. Oh and any 'unlicensed thieving' was dealt with and 'policed' by the guild itself :)


Was Pratchett making a dig at taxation? The thieves guild sounds rather like government taxation.


Lol maybe !


> "Turtles all the way down" is an expression of the problem of infinite regress.

https://en.wikipedia.org/wiki/Turtles_all_the_way_down


I would love to see that phrase used in some serious math proof or paper :)


I was able to post an angry tweet about my ISP, then I realized Twitter was working.


As one of the people furiously hitting reload, I apologise.


I was playing with my DNS settings and Hackernews was one of the sites I used to test. I picked the wrong day...


Ditto! It took me a while to realise my DNS settings were ok.


What percentage of HN readers are doing recreational DNS configuration at any given time?


Probably close to 100%.


I definitely thought my network was down and was actually confused when other sites were available.


Same. Posted "Hmm, HN is down?" in an IRC session screened somewhere and then jumped to checking my inet connection... Such things shouldn't happen, it confuses people ;)

And yes, reloading, pinging, and checking other subdomains too :>


Will there be "I survived the Hacker News outage of July 2022" stickers available? :D


Look out for the merch posts for T-shirts, coffee mugs and stickers on HN :D


ᶦ ᶠᵉˡᵗ ˢᵒ ˡᵒⁿᵉˡʸ ᵇᵘᵗ ᶦᵗ'ˢ ᵇᵉᵗᵗᵉʳ ⁿᵒʷ


Looks like the 50GB hard drive in the 2008 Dell server at the colocation data center finally gave out.


I noticed! My internet felt slow so I pinged HN since it's my baseline. Bad luck lol.


It's like /r/nosleep, but for nerds.

I'm glad I slept through this horror!


I just automatically assumed I was having internet issues instead of checking other connections like I usually do when other sites have issues.


I read https://twitter.com/eastdakota/status/1545259828972584961, came to HN to see if there was any discussion, then saw HN was down… spooky


Time to set up an offline RSS reader. Thanks for the reminder, HN - you never fail to deliver.



I suspect dang was asleep and couldn't tweet about this: https://twitter.com/HNStatus/with_replies


In the heat of the moment, I forgot. Sorry!


dang doesn't sleep


Good news, it's up now!


The last few weeks I’m continuously getting white pages on dynamic pages: login, profile, comments. What is going on? (Safari, iOS)

Things seem to work on Firefox on mobile. Something with internet relay perhaps?


Ended up on TRUTH social, please don’t let this happen again =P


archived downforeveryoneorjustme link: https://archive.ph/NRnjs


I visited HN over a VPN; when the request failed, I thought my VPN connection was broken... Actually, VPN connections often do not work well in China.


This caused me a small but concerning amount of anxiety that I'll probably meditate on as I fall asleep now that HN is back up.


But why?


To keep us on our toes.


It was terrible, had to resort to reading CLRS in the meantime.


I had to resort to lobste.rs. Still, it gave me a chance to revisit all those stories I read on HN a fortnight ago.


I've wanted to re-visit StumbleUpon before it became an ad-ridden shell of its former self. Push button, get random website based on your interests.


Awww, I missed it.


The last time we had a hard drive failure, we were down for a couple days. That was early 2013.

After that, in the spirit of 'never again', Nick (kogir) set up the failover server system that we still use today. That's why you missed it tonight; you wouldn't have back then. Sorry, I guess?

Edit: by pure coincidence, Nick was in town tonight and we met up for the first time in years. Two hours later HN goes down. almost as if the server was overcome by nostalgia


A+ for Timing! - Quantum Entanglement? - The Mind and The Machine.


Not to worry; you just got a second chance ;)


> Awww, I missed it.

It was a day that will live in infamy.


Missed it


worst seconds of my life


Browsing to HN was the first thing I tried to do this morning after accidentally sleeping in. When it failed to load, my panicked thought was: "Is this how it all ends? Did Putin finally press the red button?"


Very high uptime overall. Credits to PG and his lisp/arc inspired codebase! And to dang (who I assume does a bit of coding on it from time to time from previous comments I have read..sorry if I read wrong)



