Accurately calculating data center costs is of course complicated and rarely takes everything into account, but the conventional wisdom I've heard is that there comes a point where a company is so large that going on-prem is what actually saves it a lot of money.
If FedEx is not large enough to be one of those companies, how large do you actually have to be? Or has cloud pricing changed to the point where this is no longer true, and even huge corporations can save money in moving away from their own premises?
It's not only about size; there are also workloads to take into consideration.
Take FedEx: their operations are highly dynamic, both within a day and across the year. The number of packages being sent and transported varies greatly between a random Tuesday at 15:00 and the weeks before the holidays.
In such a scenario, with your own infrastructure, you need to overprovision, by a lot, to handle the heaviest possible load plus some margin on top. And that capacity sits wasted what, 90% of the year?
Meanwhile, if you subcontract that part to AWS, it's their problem; they can afford to handle it, and you only pay for what you actually use, when you use it.
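A back-of-the-envelope sketch of that trade-off, with every number made up for illustration (real costs depend on hardware, staffing, and negotiated rates):

```python
# Toy comparison: own peak capacity year-round vs. rent capacity as needed.
# All figures are invented for illustration.
peak_servers = 1000          # needed during the holiday rush (assumed)
baseline_servers = 200       # needed the rest of the year (assumed)
peak_weeks = 5               # weeks per year at peak load (assumed)

on_prem_per_server_week = 50   # amortized hardware + power + staff (assumed)
cloud_per_server_week = 150    # assumed cloud premium per server-week

# On-prem: you pay for peak capacity all 52 weeks, used or not.
on_prem = peak_servers * 52 * on_prem_per_server_week

# Cloud: you pay only for what runs, but at a higher unit price.
cloud = (baseline_servers * (52 - peak_weeks)
         + peak_servers * peak_weeks) * cloud_per_server_week

print(f"on-prem (provisioned for peak): ${on_prem:,}")   # $2,600,000
print(f"cloud (pay per use):            ${cloud:,}")     # $2,160,000
```

With these invented numbers the pay-per-use model wins; flatten the demand curve and on-prem wins instead, which is exactly why the "big enough to go on-prem" threshold is so hard to pin down.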
Someone, somewhere, has to actually buy the computers. "Only pay for what you use" isn't magic; whatever fraction actually gets used (across all customers) has to end up paying for the whole computer.
Taxis are "only pay for what you use", but owning your own car is often cheaper (even if you only use it, what, 1/12 of the day?) and more convenient.
Taxis are not scalable; you can't make one human drive 100 taxis. Meanwhile, you can add entire new data centers with barely any extra burden on the software teams that manage them.
Not just routing. When a plane or truck arrives and is unloaded, thousands of packages have to be scanned, which triggers a cascade of messages to different systems. In addition to the routing and whatnot, customs declarations might have to be sent, track and trace gets updates (from both the arrival scan and customs), there's billing, etc. All this flurry of activity happens within an hour or so after the plane lands, and then it quiets down until the next plane or truck.
I could easily see FedEx saving money by dynamically scaling capacity to track the arrivals of their planes and trucks.
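One plausible way to wire that up is with scheduled scaling, since the bursts are predictable from the flight schedule. A sketch using EC2 Auto Scaling scheduled actions (the group name, times, and sizes are hypothetical; real arrival times would come from FedEx's own ops systems):

```python
# Hypothetical sketch: scale a fleet up just before a known nightly hub
# arrival and back down after the post-arrival burst is processed.
import boto3

autoscaling = boto3.client("autoscaling")

# Scale up shortly before the nightly hub arrival (04:30 UTC, assumed).
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="package-scan-workers",   # hypothetical group
    ScheduledActionName="pre-arrival-scale-up",
    Recurrence="30 4 * * *",                       # cron schedule, UTC
    MinSize=10,
    MaxSize=200,
    DesiredCapacity=150,
)

# Scale back down once the flurry of scans, customs, and billing is done.
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="package-scan-workers",
    ScheduledActionName="post-arrival-scale-down",
    Recurrence="0 7 * * *",
    MinSize=10,
    MaxSize=200,
    DesiredCapacity=10,
)
```

Scheduled actions fit this kind of workload better than purely reactive scaling, because you know the spike is coming before the first scan ever happens.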
May I point you to the DynamoDB docs? Costs scale along the dimensions of storage and activity. A one-entry table is essentially free if you use on-demand (pay-per-request) capacity mode instead of provisioned RCUs/WCUs.
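For concreteness, a minimal sketch of such a table in on-demand mode (the table and key names are invented for illustration):

```python
# A DynamoDB table in on-demand (pay-per-request) mode: a near-empty,
# rarely-touched table costs almost nothing.
import boto3

dynamodb = boto3.client("dynamodb")

dynamodb.create_table(
    TableName="package-events",   # hypothetical table name
    AttributeDefinitions=[
        {"AttributeName": "package_id", "AttributeType": "S"},
    ],
    KeySchema=[
        {"AttributeName": "package_id", "KeyType": "HASH"},
    ],
    BillingMode="PAY_PER_REQUEST",  # billed per request; no provisioned RCU/WCU
)
```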
Ever heard of the traveling salesman problem? Brute-forcing that is worse than linear; it's factorial. Granted, that's probably overkill for most people, and my team (not FedEx) just does an N×N distance matrix, so it's only quadratic, but it's definitely not constant, nor even linear.
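A toy illustration of both growth rates, with made-up coordinates (real routing runs on road networks, and nobody brute-forces TSP at scale; this just puts the n² matrix next to the (n-1)! tour enumeration):

```python
# Quadratic distance matrix vs. factorial brute-force TSP, on toy data.
from itertools import permutations
from math import dist, factorial

points = [(0, 0), (1, 5), (4, 2), (6, 6), (3, 9)]  # made-up coordinates
n = len(points)

# Quadratic: every pairwise distance, n * n entries.
matrix = [[dist(a, b) for b in points] for a in points]

# Factorial: try every tour starting and ending at point 0.
best = min(
    permutations(range(1, n)),
    key=lambda tour: sum(
        matrix[a][b] for a, b in zip((0, *tour), (*tour, 0))
    ),
)
print("best tour:", (0, *best, 0))
print(f"tours examined: {factorial(n - 1):,} (for n={n})")
```

At n=5 that's 24 tours; at n=20 it's already about 1.2×10^17, which is why real systems use heuristics rather than enumeration.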
> The same routing algorithms have to run whether they have one package or 1 million packages.
Certainly you know that running a routing algorithm 1 million times, because you need to route 1 million packages, is going to need roughly 1 million times the CPU time of routing one package, right?
No, it will take as much time and as many CPUs as you have available, unless you manage to find the globally optimal solution earlier (practically never, for the problems that FedEx solves).
The number of variables is not a good predictor of solve time or average-case complexity for integer programming.
We can solve some problems with millions of binary variables in minutes. There are other problems with a couple hundred binary variables that we absolutely cannot solve to optimality.
I'm wondering if we're talking about different things.
If it takes 1 second of CPU time to find the path for 1 package, then it will take 1 million seconds of CPU time to path-find 1 million packages. You can add more CPUs to run multiple path-finds in parallel (e.g., use 4 CPUs to cut the wall time to 250,000 seconds), but there's still 1 million core-seconds being spent.
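The same arithmetic, spelled out, assuming perfectly parallel work and zero coordination overhead:

```python
# Adding workers shrinks wall-clock time, but the total core-seconds
# spent stay the same (assuming perfect parallelism, no overhead).
cpu_seconds_per_package = 1
packages = 1_000_000
total_core_seconds = packages * cpu_seconds_per_package

for workers in (1, 4, 64, 1024):
    wall = total_core_seconds / workers
    print(f"{workers:>5} CPUs: ~{wall:>9,.0f}s wall-clock, "
          f"{total_core_seconds:,} core-seconds total")
```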
The flip side is the approach Amazon took with AWS: maintain the server capacity yourself, but invent some mechanism to sell the excess capacity during the downtime.
Perhaps FedEx doesn't have the mindset or ability to actually execute on that, though.
That's a myth, and obviously false on its face: what are they supposed to do when they need that capacity back for Black Friday? Kick out all their customers for a day?
Maybe this is true for public cloud pricing, but it hasn't been true for some time when it comes to enterprises on cloud. Large enterprises come in and ask for a quote for X CPUs/RAM/storage over Y years. They sign a multi-year contract with a guaranteed minimum spend and a discount on every line item. Then they "lift and shift" their physical data center into a cloud region, in a very static way (no autoscaling, etc.).
This works great for everyone: the hyperscalers have much lower costs than even the biggest enterprise customers, so the company gets a good deal. And the cloud provider makes steady revenue off the minimum spend (which makes capacity planning a lot easier), knowing that once the compute and storage are there, it's very likely the company will start using extra managed services.
Running your own data centers is a form of vertical integration, which is a strategy. Specialization is the opposite strategy. A single company may strategically decide to specialize in some areas and vertically integrate in others.
The dollars-and-cents are details. Those matter, of course, but don't base your entire analysis there. Staying vertically integrated may make FedEx 4.713% more profitable today while causing it to miss out on important growth areas and become a dinosaur. Or maybe the important growth areas involve data center innovations, and staying vertically integrated allows them to exploit those opportunities. So it's not always an easy decision, and it involves more than present-day costs.
It's not just money; it's also how you spend a limited quantity of organisational focus. You cannot do it all, and that's actually even more important when you're that big a company.
Technology scales faster than FedEx's data needs. Maybe at some point in the future the trend reverses, as their needs can be met with very small servers. Though I've always been wary of handing your data over to someone else, because getting your data out and over to a competitor costs a lot of money.
I don't think it's the server costs they care about; it's stuff like DB maintenance, security, security, and security. Security professionals are expensive, and it's very hard to get security right in-house on bare metal.
Moving away from on-prem data centers does not reduce security costs. At best, it's a lateral move, and depending on the cloud setup, may actually be more expensive.
Ultimately, this is an opex vs. capex decision. Data centers are expensive and require a lot of up-front capital to get into the ground. FedEx is worldwide and moving more technology to the edge, so buying land, designing and constructing buildings, and then operating them for 10 years or more across a large number of locales requires a huge investment. Going to the cloud lets them take that money now and solve their problems immediately.
FedEx is also not saying what it means by "closing its data centers". FedEx announced a 10-year partnership with Switch last year to build and operate edge facilities. FedEx may be moving out of data centers it operates on its own and into facilities built and run by others, using "cloud-native" designs deployed on hardware in those facilities.
The transition to "cloud" is often more about the culture and capabilities of a company than the technology. If you have been on some shitty mainframe and a bunch of other technology from the '60s, you cannot bank on that carrying you for the next 20-30 years. The people who know that stuff are ancient, expensive, and retiring. "Cloud" is a very good opportunity to jettison the legacy burden and reinvigorate the technical competence of the org.