[flagged] Atlassian Exceeds 99.9999% of Availability Using Sidecars, Fault-Tolerant Design (atlassian.com)
17 points by stelliosk on Sept 29, 2022 | 26 comments



Interesting.

* Atlassian: We estimate the rebuilding effort to last for up to 2 more weeks: https://news.ycombinator.com/item?id=30990697

* Inside the longest Atlassian outage: https://news.ycombinator.com/item?id=31015813

* Atlassian products have been down for 4 days https://news.ycombinator.com/item?id=30973808

* Post-incident review on the Atlassian April 2022 outage https://news.ycombinator.com/item?id=31210469


Yes, but you see, that wasn't a full outage because not all customers were affected. So therefore it doesn't count as downtime according to the SLA... :')


Just checking the Atlassian status page and there is an active incident :-)


An availability of 99.9999% means a maximum of 31 seconds unavailable per year. The usual "five nines" is 5 minutes, and that's a tough target for anyone.

Given that their outage was from April 4 to April 19 this year, they should reach their target availability on average at the earliest in the year 45222. If they keep perfect uptime in the meantime, that is.
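For anyone who wants to check the arithmetic above, here is a minimal sketch. It assumes a 365-day year and takes the outage as a flat 15 days; the function names are made up for illustration. With the exact budget of about 31.5 seconds per year the payback period comes out near 41,000 years, and with a rounded 30 seconds per year it is 43,200 years, which is where the year 45222 comes from.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000, ignoring leap years


def downtime_budget_seconds(nines: int) -> float:
    """Allowed downtime per year for an availability with `nines` nines.

    Six nines (99.9999%) means an unavailability of 10^-6.
    """
    return SECONDS_PER_YEAR * 10.0 ** (-nines)


def years_to_amortize(outage_seconds: float, nines: int) -> float:
    """Years of otherwise perfect uptime needed to average out one outage."""
    return outage_seconds / downtime_budget_seconds(nines)


# Assumption: the outage is counted as a flat 15 days (April 4 to April 19).
outage = 15 * 24 * 60 * 60

print(downtime_budget_seconds(6))       # about 31.5 seconds per year
print(downtime_budget_seconds(5) / 60)  # a bit over 5 minutes per year
print(years_to_amortize(outage, 6))     # roughly 41,000 years
print(outage / 30)                      # 43200.0 with a rounded 30 s/year budget
```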


lol. They just had a multiple week outage this year. No, they cannot claim this level of availability until around May 2023. This is marketing nonsense trying to cover their massive April mistake.


"fault tolerant design" == "we knew the design was faulty, we tolerate that"


This part of the system didn't go down.

They just deleted a shit ton of customer data, and had to manually restore it. The system itself was still available if your data wasn't part of the deletion script.


Well technically their system was up and running. Except that they did not have data to work on. /s


> Atlassian Engineering recently published how it exceeded 99.9999% of availability with its Tenant Context Service (TCS).

What a misleading and cynical headline. Literally all Atlassian products I work with have some unexpected downtime every now and then.


The title is very misleading; it is just one of their microservices that has that uptime.


Yes, exactly. It's their Tenant Context Service. The headline is misleading, but really only for those who don't bother to read the article.


Headline: "Smoking does not cause cancer."

Article: "This study proves that smoking does not cause skin cancer."


Misleading Title.

Should be: "Besides that Mrs. Lincoln, how was the play?"


Exactly which part of their system has 6 9s? It certainly hasn’t been Jira.


It's mentioned in the article.


This is the worst attempt at corporate propaganda I've seen in a while.

https://www.atlassian.com/engineering/post-incident-review-a...



Can you also change the title to something that doesn't seem like they are trying to mislead people? The real title is "Here’s how one of Atlassian’s critical services consistently gets above 99.9999% of availability"


If someone can suggest an accurate, neutral title, preferably using representative language from the article itself, we'll happily change it.

(I'd do it myself but am just being pulled away)


Atlassian's status pages have had "active incidents" for the last two days straight: https://status.atlassian.com/

Six nines of availability means no more than 30 seconds downtime per year.

Maybe the fault tolerance of one system isn't such a big deal if you depend on 30 other systems?
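To put a rough number on the dependency point, here is a back-of-the-envelope sketch. It assumes a request needs every service in the chain to be up and that failures are independent, and it generously gives all 31 services (the one in the article plus a hypothetical 30 others) six nines each. None of these figures come from Atlassian.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def serial_availability(availabilities):
    """Availability of a chain where every component must be up.

    Assumes independent failures, so the availabilities simply multiply.
    """
    return math.prod(availabilities)


# Hypothetical: 31 services in series, each at 99.9999%.
combined = serial_availability([0.999999] * 31)
downtime = (1 - combined) * SECONDS_PER_YEAR

print(f"combined availability: {combined:.6%}")
print(f"expected downtime: {downtime / 60:.1f} minutes per year")
```

Even in this best case the chain lands at roughly 16 minutes of downtime per year, well outside a six-nines budget, and real dependencies are rarely all that reliable.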


Didn't Atlassian irreversibly lose the Confluence data of some of their clients this year after a weeks-long outage?


I think this is relevant regarding the very misleading availability percentage in the title: https://rachelbythebay.com/w/2019/07/15/giant/


Is JIRA not included in this calculation? They were down many times last year.


>achieved this high availability by implementing highly-autonomous client sidecars, able to proactively shield themselves from complete AWS region failures.

Complete region failure? How often does that happen?


Actual title: Here’s how one of Atlassian’s critical services consistently gets above 99.9999% of availability


Escaping confluence and transitioning to a competing service was the highlight of my summer.



