"... the networking congestion impaired our Service Health Dashboard tooling from appropriately failing over to our standby region. By 8:22 AM PST, we were successfully updating the Service Health Dashboard."
Sounds like they lost the ability to update the dashboard. HN comments at the time were theorizing it wasn't being updated due to bad policies (need CEO approval) etc. Didn't even occur to me that it might be stuck in green mode.
In the February 2017 S3 outage, AWS was unable to switch the status icons to red because those images happened to be stored on the servers that went down.
Hasn't this exact thing (something in us-east-1 goes down, AWS loses the ability to update the dashboard) happened before? I vaguely remember it was one of the S3 outages, but I might be wrong.
In any case, AWS not updating their dashboard is almost a meme by now. Even for global service outages the best you will get is a yellow.
Yeah, probably. I haven't watched it this closely before during an outage. I have no idea if this happens in good faith, bad faith, or (probably) a mix.