You're still going to need n+1 sysadmins to maintain the rest of your infrastructure. Maintaining a storage system like this isn't much different from the day-to-day operations they already handle. So, yes, it's one more system, but you're already going to need those admins anyway.
I don't believe maintaining a high-availability system with proper monitoring takes less than 3 hours a day of work, or roughly 10% of someone's time on duty if we're talking around-the-clock coverage.
So we're talking about $75,000-90,000 a year in salary to maintain the cluster if you want 24x365 coverage (which Amazon provides): you need at least 2 people per shift, 3 shifts per day, to have people in house, even if they're only spending 10% of their time on this particular system. And these are unrealistically small numbers. Each employee of that caliber will cost the corporation $125,000-150,000 a year fully loaded, so 6 people spending 10% of their time supervising a cluster works out to $75,000-90,000 a year. I'm amortizing the work across your 3 datacenters by assuming you already have the staff and counting only the hours needed for just this work.
So the reality is that you left something like $225,000-270,000 in actual cost out of your analysis ($75,000-90,000 per site across the 3 datacenters), because while running your own cluster is not "much different", it is a few hours a week from at least 6 employees per site if you're really talking about running managed, highly reliable storage. The sketch below runs the numbers.
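For concreteness, here's a minimal sketch of that arithmetic in Python. Every figure in it comes from the estimate above (the salaries, the 10% fraction, and the shift counts are the stated assumptions of this thread, not measured data):

```python
# Back-of-the-envelope staffing cost, using only the figures from this thread.
# Assumptions (from the post above, not measured data):
#   - fully loaded cost per employee: $125k-150k/year
#   - ~10% of each person's on-duty time goes to the storage cluster
#   - 2 people per shift x 3 shifts/day = 6 people for 24x365 coverage
#   - the same coverage is needed at each of 3 datacenters

cost_per_employee = (125_000, 150_000)  # $/year, fully loaded
fraction_on_cluster = 0.10              # ~10% of on-duty time
people_per_site = 2 * 3                 # 2 per shift, 3 shifts/day
sites = 3

for salary in cost_per_employee:
    per_site = salary * fraction_on_cluster * people_per_site
    total = per_site * sites
    print(f"salary ${salary:,}: ${per_site:,.0f}/site/year, ${total:,.0f} total")

# salary $125,000: $75,000/site/year, $225,000 total
# salary $150,000: $90,000/site/year, $270,000 total
```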
I feel like these comparisons oversell the level of support that actually comes with Amazon. Yes, AWS as a whole very rarely goes down, but individual instances have problems all the time, and who do you call?
When you have your own servers and staff, even if they are just on-call with a pager, you know they are going to work for you on your problem until it is fixed.
In comparison, the sentiment about AWS is this: you'd better build redundancy into the application, because at the server layer you get whatever you get.
This is simply one perspective on systems engineering. My argument would be that part of the system is always down; the real questions are which part, for how long, and how that impacts the performance of the system as a whole.
AWS services work well if you build a stateless system that is in some sense "embarrassingly parallel": you can reason about the capacity of such a system, and the impact of non-functioning components is easy to predict. This is how most standard engineering is done across disciplines.
Traditionally, computer systems haven't been built that way, but most modern techniques advocate such designs because they're MUCH more reliable; the sketch below shows why.
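As a minimal illustration of that predictability (assuming independent instance failures, which is a simplification, and note the 99% availability and pool sizes are my illustrative numbers, not figures from this thread):

```python
from math import comb

def p_at_least(n, k, p):
    """Probability that at least k of n independent instances are up,
    given per-instance availability p (simple binomial model)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# One carefully tended server at 99% availability:
print(f"1 of 1 up: {p_at_least(1, 1, 0.99):.6f}")    # 0.990000

# A stateless pool of 10 instances where any 8 can carry the full load:
print(f"8 of 10 up: {p_at_least(10, 8, 0.99):.6f}")  # ~0.999886
```

Because the pool is stateless, a down instance costs you only capacity, not correctness, which is exactly why the impact of failed components is easy to predict.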
You just sound like you want a security blanket for emotional reasons, not like you're making sound engineering points about the cheapest way to engineer a high-availability system.
I mean, do you really believe AWS engineers aren't working hard to keep their system fully functional?
The first post was a back-of-the-envelope estimate, excluding externalities on both sides, to show why someone would want to keep all that "old, obsolete hardware". For Amazon, I didn't include the costs of pushing data to Amazon in terms of API calls, bandwidth, etc. I also based the estimate on their slowest storage tier with the least flexibility. The cloud doesn't always save you time and money.
I'm just saying that you omitted a major component: no one would argue that engineering time isn't one of the main cost-benefit factors when considering AWS (as we can see here, where it weighed in at a substantial fraction of your estimate).
It's like forgetting to count the price of the hardware and only talking about the cost of the electricity.