FedEx to close data centers, retire mainframes (datacenterdynamics.com)
492 points by taubek on July 5, 2022 | 559 comments



Ultimately companies that abdicate their informatics operations like this will give their profits to their data-center operators, who will be empowered to charge them whatever price they want. Because what's their BATNA? Migrating from Azure to AWS when Microsoft doesn't want to let them?

Renting your information infrastructure is a great way to reduce startup costs, but down the road, that information infrastructure runs your company. Trying to outsource it is like trying to outsource upper management.

To be clear, I'm not saying that the optimal amount of cloud services for an established company like FedEx to buy is 0. They bring in management consultants, too. But it sure isn't 100%.


Without a doubt, a future generation of executives will revisit and reverse the decision to rent all information infrastructure, but that will likely be many, many years down the road. In the meantime, the current generation of executives who made this decision will look very smart for saving the company lots of money for a good number of years. And they stand to benefit personally from it. They're doing the rational thing!


I used to work with a very smart man that I'm sure was some kind of secret genius. He was that sort of tech gofer. Hardware, software, didn't matter; if there was a problem he'd solve it. Sort of guy you'd see carrying a thick ass SQL book around because he 'needed to learn it' to solve just one little problem. He built entire solutions in his spare time for the company I worked at, which the company once tried to sell for 500k, and at a previous company I heard he figured out a way for the pain mixing machines to save on paint or recycle it or something, saving them 1.3 Mil a year. When Raspberry Pis first came out he was one of the first people I saw tinkering with them, and he was in his 50s doing it just for fun. I think he ended up using it to open and close his garage door from work or something, just to scare his wife.

That sort of guy. Well he once told me something about executives and upper managers working for corporations that I have never forgotten. He said to me, and of course I am paraphrasing:

"Change gives the illusion of progress". I asked him what he meant and he responded with something to the effect of "They have the habit of changing big things every 5-10 years on purpose to make it look like they are productive, and to justify their own roles, one guy will come in and 'cut costs', the new guy after him will 'invest'".


"A new CEO was hired to take over a struggling company. The CEO who was stepping down met with him privately and presented him with three numbered envelopes. “Open these if you run into serious trouble,” he said.

Well, six months later sales and profits were still way down and the new CEO was catching a lot of heat. He began to panic but then he remembered the envelopes. He went to his drawer and took out the first envelope. The message read, “Blame your predecessor.” The new CEO called a press conference and explained that the previous CEO had left him with a real mess and it was taking a bit longer to clean it up than expected, but everything was on the right track. Satisfied with his comments, the press – and Wall Street – responded positively.

Another year went by and the company continued to struggle. Having learned from his previous experience, the CEO quickly opened the second envelope. The message read, “Reorganize.” So he fired key people, consolidated divisions and cut costs everywhere he could. This he did and Wall Street, and the press, applauded his efforts.

Another year passed and the company was still short on sales and profits. The CEO would have to figure out how to get through another tough earnings call. The CEO went to his office, closed the door and opened the third envelope. The message began, “Prepare three envelopes...” "

https://duckduckgo.com/?t=ffab&q=the+three+envelopes&ia=web


Is this a quine where the processor is a CEO?


I think it's more like malware. A virus that spreads from CEO to CEO.


CEO is the virus that spreads from one organisation to another


This is how governments run too, but they add a few more envelopes as they have more avenues to pursue. War, plague, etc.


I'm 53, and I've worked for 3 Fortune 250's. Can confirm. I've seen this happen over and over again. Senior management makes some broad pronouncements, and the mid-level lieutenants have meetings with consultants, and then implement new, expensive projects that "will surely fix 'it' this time." Five to ten years later, after the dust settles, we figure out that we've strapped YET ANOTHER LAYER of technical debt on top of everything else, and things are worse than ever. But at the height of the project, when everything is still rosy, the managers in charge update their resumes, and hit the bricks.

IN PARTICULAR, at one Fortune 250 (which no longer exists), we implemented OneWorld to replace the mainframe. After 7 years, we still had the mainframe, AND a badly-implemented version of OneWorld. Then the company got bought by another Fortune 250, moves were made to "commonize" the IT systems, and then the parent company sold everything. But I'm positive that everyone involved in the project to retire the mainframe made the project look very successful on their resumes.


8-10 years is as long as senior executives last. If they don't make a big change then there is no way to take credit for their vision. Even if the company had a "perfect" org chart (as if such a thing is possible), they need to make changes, otherwise someone will say that the old org chart is the cause of success and they as a leader were not worth anything.

I don't know if the above fear would actually play out, nobody is willing to not make changes to find out.


I think we are giving these people too much credit. Being the head of the organization is like inheriting someone else's filing system. The only way for them to actually understand wtf is going on at the company is to reorganize some things in a way that makes sense to them.

High level management is fundamentally hard to stay on top of over time. It's about as easy as thinking chess is easy because you can theoretically know every move your opponent might make. There are so many moving parts to an organization that having visibility to them isn't enough to perform well. They have to influence most of the business pretty indirectly. If changing things gives you the confidence necessary to keep things running, that's what you're going to do. Everything is a gamble, so doing nothing is kind of unacceptable leadership behavior unless they are actively taking up the mantle left by previous management and understand it very well already.


I would compare this to asking a dev team to support a big code base without embarking on a major redesign or re-engineering.

It's hard to keep a good (confident, ambitious) team from re-engineering. All the same dynamics apply: In your mind the disadvantages of change are small because you don't know them, but the advantages are large because you planned them. Making change gives you more control over your fate because you are executing your own plan as opposed to staying the course. Finally, how do you keep people motivated to show up every morning if you don't have a vision for change in the future?

I don't think it's that different for managers and engineers. There's a lot pushing people to try something, even if the objective odds of success aren't great.


That’s not the only way but “come in and change things to establish dominance” is a commonly taught business school chant.

Management are contemporary clergy, spewing high minded ephemera, only to go home unable to point at anything net new left behind by their effort.

I grew up in farm land; we had no middle managers. Somehow food still got grown, harvested, and sold; somehow a Linux kernel and other wildly popular open source exists without them.

Post-WW industrialism needs to wind down. Militant minded people came home and forced their PTSD on workers. We spend a lot of resources equipping people to output nothing in deference to traditional economic memes. America of recent decades necessarily built itself into a production powerhouse to resupply a destroyed world. Such memes are outdated given automation and unsustainable given real material costs.


Yes, but they ultimately live materially better lives because of their position, so you can't hand-wave away criticism of the value, or lack thereof, of their actions, especially when those actions can have a negative effect on those below them and even to the external environment.


Criticism is fair game. It just usually doesn't account for the reality of what they are doing, and puts them to blame for not directly controlling things that in reality were outside their control. Taking responsibility for the failings of the company is part of the job description, so by all means hold them responsible. I am just saying all of that is tangential to the root of the problem, which is that no matter how much someone gets paid, they are still human.

It's an area where it is pretty tough to make meaningful criticism. It's kind of a show-don't-tell type situation imo. If upper management seems under-qualified and overpaid to you, then maybe you have a calling to go perform better and get paid more.

Company management is in underdeveloped game-theory territory. Sure, we can isolate one part of their job and describe how they are failing on it, but we don't know the trade-offs being made on a daily basis with their time and focus, a lot of which is going to be company secrets, if it even leaves their personal thoughts. Any criticism that comes down to saying "they should have done more" is likely out-of-touch, for example, unless you can prove they were actually being lazy, which is usually not the case, since they are often workaholics (ime). But we usually can't tell if something is a good move or not until it plays out on the market. So making criticisms based on hindsight is weak, as is making criticisms that lack the full picture of the organization's goals and the time / energy they actually have on hand to accomplish them.


What parts of these generalized arguments have anything to do with CEOs? As a mental experiment, put them into the mouth of someone making excuses for why a crew of expensive painters did a horrible job painting your apartment.


So true. Reading their over generalized screed left me thinking if CEOs are really that useless anyone could be a CEO and their position isn't special, which sort of torpedoes the entire point.


My point is that properly criticizing a CEO is exhausting so it is rarely done properly.

CEO is a very general job role. I don't understand your point with the painters. That would be an operations issue, so I actually wouldn't criticize the CEO of the painting company at all for it. Thanks to him/her, I was able to contact a painting company, they showed up, they painted the apartment, and left. I would think the operations team (the painters) deserve to be fired and held accountable for claiming they know how to paint a room when they clearly didn't. Nothing about their job is general or consists of trade-offs. If I said "you only have 10 minutes to paint the room and then leave" then yeah, it might come out like shit and the "excuses" would be valid. That is the kind of time pressure CEOs are often under with respect to things they are actually doing on a day-to-day basis.

I would hold the CEO responsible with respect to resolving the issue and refunding me, etc, since the CEO role is to be responsible for the outcomes of the company, but he's not the one painting the room. Just like with any company, the CEO does not literally run the company. If service is poor, it is usually because people are finding their way past hiring filters to get jobs they aren't qualified for. Let's not forget that people everywhere are often advised to lie their way into employment, fake it till you make it, baffle them with bullshit, reword your resume to sound more impressive, etc. These people line the mechanisms that CEOs use to accomplish anything at any company.

Of course, the CEO position is no exception to this and I am not saying it is literally impossible to build a case against a CEO. I am saying it needs to fully encompass the position or else you're likely assigning criticism to the CEO for some culmination of lower level operational incompetence that they simply failed to overcome. If a director over-promises to the CEO and the CEO signs off on the basis of trust with the director, then when the bar is not met later of course the CEO will be held responsible, but the reality of fault sits with the director, or maybe a subordinate to the director who convinced the director that the over-promise was doable. You then have to get into the weeds of whether or not there were signs the CEO should have seen that the director was not to be trusted, or if they had reason to overrule the director's approach, etc. You then have to do similar things across all areas of the company to derive a valid criticism that the CEO is the common denominator in it all.

Leadership is significantly harder to criticize appropriately than operations. Personally, I would like to stop reading meaningless criticisms from people who want to complain and be heard but don't want to do the work necessary to make a valid complaint.


I think you misunderstood the previous commenter a bit. They were not saying look at the painting example from the standpoint of the CEO, they were saying "Put the excuses" in the mouths of some painters. Also, there's quite a bit of depth and generalization in painting. There are many different types of paint that are better for certain tasks, eg eggshell vs satin finish. Painting walls is different than painting ceilings, then you add in moving furniture, some walls have edging, some walls will be multiple colors, plaster vs drywall vs wood, etc. That's just painting. Let's say the company they work for has a CEO/manager that is demanding more jobs completed; now they have to deal with someone telling them "Do it in one day" and all the compromises that must be made to do so. Almost all fields have a lot of depth and generalizations.

So if you had a horribly painted room that you just paid extremely well to have completed, and the painters came up to you and gave you a laundry list of reasons that they failed... Would you hold off on criticizing them until you had a complete understanding of what it takes to be a painter?


yes, if I claim that the painters did a bad job and they gave me a laundry list of professional reasons for it.. I would consider those reasons before criticizing. would you not?

Trying to use painting as an analogous situation like that isn't transferable to the point I am making, though. Putting the excuses in their mouth doesn't even make sense. We are presupposing that the painters did a horrible job while discussing how to decide whether or not a CEO did a horrible job. The only reason you know the painters did a bad job is because we are saying they did. The only reason we can use painting as an example is because most people can imagine a terrible paint job, i.e. we do have a full scope understanding of what it takes to be a painter. I am saying it is much harder to imagine the role of a CEO and what good results would look like than it is a painter.

Maybe my wording was fuzzy, but I am not saying you need a complete understanding of the CEO's role, but it does need to be of full scope. I see that reads near synonymous, so in other words it may be infeasible to account for the total depth of their role, but at the very least the entire breadth of the role should be looked at. If you default to "I gave you a lot of money to make it happen so it should be perfect" type logic, you're just being a "karen". The cost of something has nothing to do with the results, directly. Money needs to be converted into something that helps the work, and in that process we are all still limited by reality: diminishing returns, supply chains, quality of communication, availability of resources, etc. A CEO is at the focal point of all of this, and is human. Whether they get paid nothing or everything doesn't change how effective they can reasonably be.

But they do deserve criticism. It just needs a lot of work to do it right. You have to provide some sort of evidence that across all scopes of work the trade-offs do not make sense. Maybe the CEO sacrifices on every front in order to provide the fastest service in the business and is successful in that. If you leave speed of delivery out of your criticism it becomes a meaningless criticism. "They charge a lot for poor quality". "These painters did a terrible job.. (even though I called them this morning, and they were done by lunch which allowed me to do a walk through with a potential tenant)".

All I'm really trying to emphasize is that we absolutely can criticize a CEO, but if you don't do it properly it is very easily washed away by the many unknowns of the position. However, if it is done right, it would be very damning, as they can't default to company policy or directives from above as a scapegoat since they are the ones creating such things.


> they ultimately live materially better lives because of their position

This is a bullshit reason based on jealousy, not reason.

> when those actions can have a negative effect on those below them and even to the external environment

This is the real reason it’s fair to be very critical of their maneuvers.


>This is a bullshit reason based on jealousy, not reason.

Reality is not zero sum, but neither are resources infinite. Is it really bullshit to critique more thoroughly that to which more of the finite resources are dedicated?


>Is it really bullshit to critique more thoroughly that to which more of the finite resources are dedicated?

this kind of aligns with my point tbh. We give $10 critiques to million dollar positions as if they hold weight.


> This is a bullshit reason based on jealousy, not reason.

If this is truly BS, then allow lead developers to write checks for their whole team using the company's bank account.

They hold power in the organisation; they can increase their own remuneration in a way that rank-and-file staff cannot. Executive compensation has skyrocketed in the past 20 years.

If their management is ineffective, then they don't deserve top compensation.


I would expect it is just as much a hedge in case things go wrong. When your job is to steer the course of a company, it won’t look good if you crash and your hands weren’t even on the wheel


counter-examples: all of FANG/MAGA, even when you exclude founders

Satya: 1992, CEO in 2014
Tim: 1998, CEO in ~2009
Sundar: 2004, CEO in 2015
Andy: 1997, CEO in 2021
Ted: 2000, CEO in 2020

They were all senior executives well before assuming their CEO roles.


Oh man... Dell is terrible about this. Not sure what the policy is anymore (esp since they went private), but to get promoted you had to implement a significant cost savings project, which ironically led to multiple implementations and reversals of policies... they all showed a cost savings but it depended on your perspective.


I don't know anything about Dell's promotion criteria, but their consumer ordering process has to be one of the most hostile ever — I'm guessing anywhere between 50-75% of their orders get auto-cancelled by some overzealous anti-fraud and anti-reseller algorithm.


In SOFTWAR Ellison called it a sort of fashion cycle.


> "Change gives the illusion of progress". I asked him what he meant and he responded with something to the effect of "They have the habit of changing big things every 5-10 years on purpose to make it look like they are productive, and to justify their own roles, one guy will come in and 'cut costs', the new guy after him will 'invest'".

That's a very succinct way to put it. I think that observation also applies to consumer technology (e.g. regularly re-doing UIs to "improve" them when the changes are either in fact regressions or just different but not better). We've had it drilled in our heads throughout the modern era that new == better (e.g. the ubiquitous "new and improved!" marketing language), but that's not actually always true. Change for change's sake justifies itself through that misunderstanding.

In the recent past, we had a really high rate of genuine technological progress. But at some point we'll have picked most of the low-hanging fruit and will enter a period of slower progress, where faking progress will become more and more tempting for producers than the real thing.


Right now, they can't retain the people or afford the people to do this work. When you can go work elsewhere for more money, you move. And running a mainframe isn't just having it, it's keeping the people running it, plus paying for the electricity and space.

Depending where, the real-estate and energy prices are nuts in most places. And, the engineers are expensive right now, and the services are cheaper.

It's not about giving it up, or change for the sake of change, it's about seeing the writing on the wall. These managers see the larger trends, rising energy costs, maintenance costs rising, hiring difficult, and retention of existing engineers impossible. Once you see those trends and a department is underwater and it's getting worse, you have to move. At another point in the market, when engineers are less expensive and easier to hire, and technologies and space are less costly and easier to deploy, you move back.


> These managers see the larger trends, rising energy costs, maintenance costs rising, hiring difficult, and retention of existing engineers impossible.

But how will the cloud providers avoid these trends? They won't. They will have to do the same thing as anyone else: pay more. And therefore charge more. There are economies of scale, but those savings are logarithmic and a company like FedEx is already pretty far out on the X axis.


> a company like FedEx is already pretty far out on the X axis

AWS and Azure probably run hundreds of FedEx-sized clouds and it's their core business. I don't think FedEx is very far compared to them.


> But how will the cloud providers avoid these trends?

Innovations are more likely to happen if it is someone's priority to fix a certain thing. I think they hope that savings from innovation and better methods at whatever company they hire out to are passed on to them. It is naive, though, to expect to see these savings if they are unable to change clouds, and as long as one relies on vendor-specific features, one is in that position.


They aren't playing the same game. Facebook designed its own servers: they were chassis-less, didn't have a mains power input so no switch-mode power supplies, instead they had a 12V DC feed, they had no rack-wide large UPS, instead each server had a small battery in it, and they were not built for massive redundancy like a Dell server with dual PSUs and redundant networking, because they were disposable nodes in a larger software cluster, e.g. [1] [2]

Things that aren't FedEx's core competency.

Or, Microsoft's roofless datacenters[3], or locating datacenters in remote and colder climates: things the big players can do with economies of scale beyond buying things cheaply, because they can customise the entire datacenter. Microsoft has experimented with underwater datacenters[4] and modular containerised datacenter extensions[5], which could be datacenters no human needs to be near to work on, or which could be dropped off somewhere with cheap land and power and internet, and picked up three years later and retired from use, and so on. Ideas which are not FedEx's core competency and need large scale and software clustering on top.

While FedEx would be hiring ordinary IT employees to work in a standard datacenter in a cheap business park - not very enticing - Amazon could be hiring datacenter workers to work with Amazon's undersea cabling connecting their worldwide datacenters; more enticing work for skilled employees.

Google has been known/rumoured to migrate heavy batch processing workloads around the planet, following the day/night cycles to take advantage of regional cheaper night-rate electricity all the time. Something which reduces their energy costs but which FedEx may not be big enough to do.

[1] https://engineering.fb.com/2016/03/09/data-center-engineerin...

[2] https://engineering.fb.com/2019/03/14/data-center-engineerin...

[3] http://www.500eco.com/exhibits/microsoft-roofless-datacenter...

[4] https://news.microsoft.com/innovation-stories/project-natick...

[5] https://www.datacenterdynamics.com/en/news/microsoft-develop...


When The Facebook started designing their own servers (which, by the way, have lots of switch-mode power supplies in them, and always have) the game they were playing was "be a better MySpace". They were running a bunch of PHP pages. At the time you could have made the same argument about The Facebook vs. Rackspace: "While Facebook would be hiring ordinary IT employees to work in a standard datacenter in a cheap business park, Rackspace could be hiring datacenter workers to work with Rackspace's BGP peering connecting their worldwide datacenters."

But The Facebook decided to make informatics their core competency, to the point of building their own servers with 12 volts running to the rack, same as Google before them.

There were surely industrial companies in 01922 who decided that management wasn't their core competency (though they used different words), and if they needed help with management they'd contract out to management specialists like Taylor or Gilbreth. They met the same fate that will meet companies today that decide that informatics isn't their core competency.


it's called economies of scale


"retention of existing engineers impossible"

So where are all these mainframe engineers going to go work? Our company has 3 admins for our two mainframes, and these guys while expert level admins for a mainframe, have trouble with Linux and Windows. Same with the developers writing code for the systems. When all you've worked on is a mainframe, then everything looks like a batch cycle...


Retiring?


In this sense, Oscar Wilde was the executive's executive: "I have spent most of the day putting in a comma and the rest of the day taking it out."


""Change gives the illusion of progress". I asked him what he meant and he responded with something to the effect of "They have the habit of changing big things every 5-10 years on purpose to make it look like they are productive, and to justify their own roles, one guy will come in and 'cut costs', the new guy after him will 'invest'". "

That's deep and 100% makes sense, I can see how the cloud is both of these "things."


Meet the new boss, same as the old boss.


I work in a bank. This is literally what upper management does every 4-5 years. Always some new "initiative" that was preached to them at a conference somewhere, and will now presumably change everyone's lives.


That’s definitely how the IT department at my company works (I bet a lot of other roles too). Every few years a new exec comes in, sets a new strategy, claims tons of savings, creates “excellence” initiatives (everything that has “excellence” in its name triggers my BS detector). This lasts for a few years until the next guy comes in and goes through the same process but different direction.


Never confuse movement for action :-)


As everybody who saw people, and orgs, rotating in place at incredible speeds can confirm.


This isn't always easy to determine, especially if you are part of the movement. Often, we move by dead reckoning and only after some amount of time can we determine if what we did was "movement" or "progress".


Good anecdote!

In exchange I offer two relevant quotes from my quote file:

> The empire long united must divide, long divided must unite; this is how it has always been. (Luo Guanzhong)

> When cuffs disappeared from men’s trousers, fashion designers gave interviews explaining that the cuff was archaic and ill-suited to contemporary living. It collected dust, contributed nothing. When the trouser cuff returned, did it collect less dust and begin at last to make a contribution? Probably no fashion designer would argue the point; but the question never came up. Designers got rid of the cuff because there aren’t many options for making trousers different. They restored it for the same reason. (Ralph Caplan)


Other formulations of this I’ve heard are “movement doesn’t mean progress” and “fire and motion”, the latter offered by Joel Spolsky in one of his better blog entries.


A better analogy is a ship sailing against the wind: "To reach its target, sailors that intend to travel windward to a point in line with the exact wind direction will need to zig-zag in order to reach its destination. This technique is tacking." https://www.lifeofsailing.com/post/how-to-sail-against-the-w...


> the pain mixing machines

That sounds ... dystopian.


The pain is mixed, then applied in the pain booth.


Obviously he's been working with javascript recently.


definitely sounds like webpack to me


This is what grabbed me to keep on reading his story


So what's the moral of the story?

Do you think it's "don't change"? Cause that gives right up on progress anyway. So good luck on getting large shareholders to give you the reins with that message. They're looking for ROI, not the status quo - if you don't evolve, your competitor will, barring monopolies and such. And even there... Microsoft of today is much less dominant than the MS of 25 years ago, and arguably could be worth a lot more had they made better moves. If that's not a compelling example, maybe check out Sears...

Maybe most people in these positions know that change doesn't guarantee improvement, but they know that sitting still is the same as just waiting to be defeated. So maybe there's something less-than-stupid about these "short-sighted" "illusory" changes.

But maybe if you want to be a wise executive, the key is to recognize that change might not just fail to improve your position - it might actually actively harm it. So the good executive is the one who chooses to try to change things according to reasonable calculations about both potential upside benefit and downside risk...


>"Change gives the illusion of progress".

This is one of the biggest reasons many jobs are so miserable. You want to be in the job at the beginning of the "invest" decision. Unfortunately most people get little say in when they join because the information about this is kept internally to the company.


> "Change gives the illusion of progress".

That itself is based on the belief that (historical) progress is an improvement.

Mostly true over the last few centuries - aspirin is nice - except for those occasions where it was mass murder.


First "pain mixing machines," and now "asp(i)rin is nice."


So? People can't spell and typos are a common phenomenon.


Just noting humorously related typos, that's all. I'll add smilies here so you can comprehend. :-) :)


there used to be an allegory of how to identify a bad, new CIO... if your current system was massive, networked printers that were shared amongst floors.. the new CIO would come in and give everyone individual printers... or vice versa .. because 'change'


It sounds a bit like the bodybuilding method of bulking and cutting.


Bodybuilders make progress though :)


Isn't there a ceiling? Or if we lived as long as those turtles, would you go to the gym and see spheres of muscle?


Some of the extreme pictures I've seen look like people that can't move, let alone lift, but I guess it works.


The returns are diminishing each cycle.


That seems to be a common take on HN - cloud is too expensive.

I'm curious whether the folks claiming that have any data center ops experience.

Because, personally, I'd rather retire than deal with Dell, HP, Cisco, fibers, cooling issues, physical security, hardware failing... And that's just the hardware. Then you still need to pay VMware for a decent virtualization platform, monitoring tools, etc. Seriously, no amount of money would make me work in a DC again.

I believe companies selling bare metal as a service are a happy compromise of cost and convenience, though.


ML workloads definitely cost a lot of money. Even for a preemptible VM, A100 GPUs cost $0.88/hr/GPU. That's $624 a month for a single GPU and only the 40GB model. Want a dedicated 8 GPU machine in the cloud to do training with? That'll run you around 16 grand a month. Do that for 2 years and you may as well have bought the device. Want to do 16/24/40 GPU training? Good luck getting dedicated cloud machines with networking fast enough between them so that MPI works correctly, and be prepared to give up your wallet.

Also, that's just compute. What about data? Sure, cloud accepts your data cheaply, but they also charge you for egress of that data. Yes, you should have your data in more than one location, but if you depend on just cloud then you need it in a different AZ, which costs even more money to keep in sync and available for training runs.

I think for simple workloads and renting compute for a startup, cloud definitely makes sense. But the moment you try to do some serious compute for ML workloads, good luck and hope you have deep pockets.
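To put rough numbers on the rent-vs-buy point, here is a back-of-the-envelope sketch. The $0.88/hr and $16k/month figures are the ones quoted above; the 730 hours per month, the purchase price and the on-prem running cost are assumptions picked only for illustration, not real quotes.

```python
# Rough rent-vs-buy sketch for a GPU box. The rates come from the comment
# above; the purchase price and on-prem running cost are hypothetical inputs.

HOURS_PER_MONTH = 730  # assumption: 8760 hours per year / 12


def monthly_rental(hourly_rate_per_gpu, gpus, hours=HOURS_PER_MONTH):
    """Monthly bill for renting `gpus` GPUs around the clock."""
    return hourly_rate_per_gpu * gpus * hours


def break_even_months(purchase_price, monthly_rent, monthly_opex=0.0):
    """Months of renting after which buying would have been cheaper.

    monthly_opex is what the owned machine costs to run (power, colo,
    staff time); it eats into the monthly saving from not renting.
    """
    saving = monthly_rent - monthly_opex
    if saving <= 0:
        return float("inf")  # owning never pays off at these numbers
    return purchase_price / saving


if __name__ == "__main__":
    print(round(monthly_rental(0.88, 1)))      # one preemptible A100, 24/7
    print(break_even_months(384_000, 16_000))  # hypothetical purchase price
    print(break_even_months(384_000, 16_000, monthly_opex=4_000))
```

The "2 years" claim only holds if the box costs roughly 24 months of rent and running it yourself is close to free; any non-zero opex pushes the break-even point further out.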


The other thing is nVidia try and sell GPUs with similar performance at two very different prices. One price for data centres and a quite different price to kids. If you do the job yourself you can often get away with using the much cheaper gamer grade cards for AI work (unless you need a lot of VRAM), whereas the likes of AWS can't do that and are required by nVidia to use the considerably more expensive cards. If your workload will fit on a gamer grade card there's no contest on price between an on-prem system and the cloud.


That is a really good point, and the 3090s have a surprising amount of VRAM on them. For many smaller models this is sufficient. However, where I work without going into a lot of specifics, because of the size of the models, the amount of VRAM is crucial, as well as the infrastructure of the PCI lanes connected to it, the speed of the local storage, and the networking between both cards on the same node as well as between nodes.

The moment the model gets bigger than any one GPU's VRAM, training it becomes harder by orders of magnitude.


A lot of that is just good old fashioned marketing.

Here's the list of ingredients for Excedrin Migraine:

Active Ingredients: Acetaminophen - 250 mg (Pain reliever), Aspirin (NSAID) - 250 mg (Pain reliever), Caffeine - 65 mg (Pain reliever aid) Inactive Ingredients: benzoic acid, carnauba wax, FD&C blue #1, hydroxypropylcellulose, hypromellose, light mineral oil, microcrystalline cellulose, polysorbate 20, povidone, propylene glycol, simethicone emulsion, sorbitan monolaurate, stearic acid, titanium dioxide

This is the list of symptoms that Excedrin Migraine claims to treat:

- migraines

And now here's the ingredients for Excedrin Extra Strength:

Active Ingredients: Acetaminophen - 250 mg, Aspirin (NSAID) - 250 mg, Caffeine - 65 mg Inactive Ingredients: benzoic acid, carnauba wax, FD&C blue #1, hydroxypropylcellulose, hypromellose, light mineral oil, microcrystalline cellulose, polysorbate 20, povidone, propylene glycol, simethicone emulsion, sorbitan monolaurate, stearic acid, titanium dioxide

This is the list of symptoms that Excedrin Extra Strength claims to treat:

- headache - toothache - a cold - arthritis - premenstrual & menstrual cramps - muscular aches

And while SOME places have normalized the prices between the two, they can often be found on shelves at two different price points.


Re data, I think egress rates are going to start disappearing over the next few years.

The part that’s always missing with these rent vs buy analyses on HN for some reason is that it’s totally ignoring the opex cost of operating your own hardware which is going to be non 0. Sure, it won’t be quite as expensive (no profit margin) but it’s not an order of magnitude. Additionally, most companies don’t run the HW 24/7 and, if they do, it’s not a level of people they want to hire to support said operations. Not just running it, but you have to invest and grow something that’s not a core competency to get economies of multiple teams loading up the HW.

If the next revolution in cloud comes in to cause companies to onsite the HW again, it’ll look like making it super easy to take spare compute and spare storage from existing companies and resell it on an open market in an easy way. Even still, I think the operational challenges of keeping all that up and running and being utilized at as close to 100% as possible and not focusing on your core business problem will be difficult because you won’t be able to compete with engineering companies that have a core competency in that space.


> The part that’s always missing with these rent vs buy analyses on HN for some reason is that it’s totally ignoring the opex cost of operating your own hardware which is going to be non 0.

Effectively hiring, retaining, evaluating and rewarding competent staff is hard. Even at a big company the datacenter can be a really small world, which makes it hard for your best employees to grow. Things are especially hard when you don't have a tech brand to rely on for your recruiting, and the staff's expertise is far outside the company's core business, making it harder to evaluate who's good at anything.


> Re data, I think egress rates are going to start disappearing over the next few years.

I'm not sure why you think that. AWS hasn't budged on their egress pricing for a decade (except the recent free tier expansion), despite the underlying costs dropping dramatically. GCP and Azure have similar prices.

Fact is, egress pricing is a moat. Cloud providers want to incentivize bringing data in (ingress is always free) and incentivize using it (intra-DC networking is free), but disincentivize bringing it out. If your data is stuck in AWS, that means your computation is stuck in AWS too.


Disclosure: I work at Cloudflare on R2 so I’m a bit biased on this.

I think we’re going to put real pressure on traditional object storage rates to come down, since Cloudflare’s entire MO is running the network with zero egress. As we expand our cloud platform it seems inevitable that you will at least have a strong zero egress choice and if we do a good job Amazon et al. will inevitably be forced to get rid of egress. Matthew Prince laid out a strong case for why either scenario is good for us in a recent investor day presentation (either we cannibalize S3’s business and R2 becomes a massive profit machine for us because they refuse to budge on egress or Amazon drops egress which is an even larger opportunity for us).

Products like Cache Reserve help you migrate your data out of AWS transparently from any service (not just S3) - you just pay the egress penalty once per file.

Anyway. I’m not saying it’s going to disappear tomorrow but I find it hard to believe it’ll last another ten years.


> totally ignoring the opex cost of operating your own hardware which is going to be non 0

Early in my career I worked at a company and we had a DC onsite. I remember the months-long project to spec, purchase and migrate to a new, more powerful DB server. How much that cost in people-hours, I have no idea. I upgraded to a better DB a couple months ago by clicking a button...

Don't even get me started with ordering more SANs when we ran out of storage or the time a hurricane was coming and we had to prepare to fail over to another DC.


>> I think egress rates are going to start disappearing over the next few years.

Compute costs generally drop over time. Do you have any data points to confirm egress will soon go to zero?


Cloudflare Bandwidth Alliance and R2. S3 felt some pressure just because of our pre launch announcement. It’ll be interesting to see how they adjust over the next couple of years.


It's probably worth remembering that a company the size of FedEx isn't going to be paying the listed prices.


They'll actually probably be paying more than the average listed prices, as mainframe users (a mainframe is basically an on-premise PaaS, where IBM rents you a high performance, distributed and redundant hardware cluster on a pay-as-you-use basis) are often dependent on very high reliability, high uptime and low latency.


The ability to scale up experiments is really nice in cloud. In my experience you need to be quite large before you’re using your own GPUs at a utilization percentage that saves money while still having capacity for large one off experiments.


There are a few different ways to run a data center, a subset of which are much less expensive than the cloud but require a level of competency that some organizations will never have. It can also be relatively pain-free when done well. Some workloads are inherently inefficient in the cloud because of the architecture.

Data center ops is ultimately a supply chain management problem, but most people don't treat it as such. That was my primary learning from doing data center ops at a few different companies. If you get the supply chain management right, and are technically competent, there can be a lot to recommend running your own data centers.


To a company you need to pay those costs no matter what.

If the AC breaks at 3am it needs to be fixed. It doesn't matter if you have your own HVAC people on site 24x7, your own people on call to service it, a local HVAC service to come in, or you outsource the entire operations and so you have no idea how that is handled. In the end the important part of this story is that whatever you are doing with the AC continues to work. Different operations demand different levels of service (I doubt that anyone keeps HVAC techs on staff 24x7, but if the AC is that critical it is mandatory). The only case where the CEO is up at 3am is if the CEO is the owner of the local HVAC service company, not the CEO of the building with the problem.

Once you realize that to management the cost is outsourced no matter what, the only question is whether to do it with your own people and HR or outsource it. There are pros and cons to both approaches, but for most companies it isn't their business and so the only reason to do it in house is they can't trust any company they hire.


The thing is, the cost for the HVAC 24x7x365 support for a datacenter will be roughly the same for a given location... but it makes a difference if it is you paying the whole bill (=you're self-hosting in your own datacenter), you are splitting the bill with a bunch of other customers indirectly (=you're self-hosting in a colo DC), or if you're splitting the bill with a shitload of other customers (=you're using some service on one of the big public cloud providers).

The downside of saving the costs is that you're losing control with every step taken away: as soon as you go into a datacenter of any kind, you simply cannot call up an HVAC company and offer them 100k in hard cash if they show up in the next 60 minutes and fix the issue. With a colo DC you can usually go and show up there to see if the HVAC, UPS and other systems are appropriate to your needs, but with one of the big cloud providers you have to trust their word that they are doing stuff correctly.


> so the only reason to do it in house is they can't trust any company they hire

Now I'm deeply confused. Any company hired either has a profit margin (plus enough to fund an "Oh shit" fund in case times turn bad) or will not stick around longer than a few years. In which case why not just hire people directly and cut out the other company's profit margin? Assuming you hire similar people at the same rate, using your own existing and already-paid-for HR, how is that not cheaper?


You need to deal with overhead. Nobody does their own HVAC in house because you rarely need them, and would have to pay to train people on that despite them not using it.

In some cases you can even get a discount. Utilities are a big customer of tree trimming; the companies doing that work can give a great deal because the utility doesn't care that they take a week off after a storm for high profit margin consumer trimming.


Lots of places have their own HVAC techs in house, if they have enough HVAC work to justify it. Even if it's not their core line of business. They will do whatever costs less, +/- some amount of subjective "hassle factor."


Especially when it's "line critical" to their business, or if the person can do other things as well.

Larger hotels often have dedicated staff for things like HVAC, etc, because the importance of getting things fixed quick if possible is worth the cost of having someone onsite/available.

And you see similar things with colleges, etc; they often have a maintenance department that can be pretty large (though no doubt they've spun it off and brought it back in-house for the same "change is progress" reasons).


I have dealt with a large number of retail colo providers, wholesale data center providers and corporate owned data centers across the US over the last 20 years and all of them used contractors for HVAC and electrical. I'm not saying dedicated staff never happens but it is definitely not the norm.


Doesn't this logic apply to pretty much everything? Why hire external anything then? Why not do your own deliveries, hire your own trucks to transport goods etc?

There is a cost to taking on things that aren't part of your core business too.


All I can come up with is "Because economies of scale". I work for a transportation company, but we employ plumbers, carpenters, electricians, elevator repairmen, and many more that I'm not aware of, because we have enough locations / work to justify them. The pizza place has enough work to justify hiring a fleet of drivers, Amazon ships enough crap to justify having their own trucks (when they can't sucker another company into taking the unprofitable routes).

Similarly, Google doesn't ship enough stuff worldwide to justify drivers, insurance, trucks, jets, etc. - FedEx has the size and scale to make every package a couple cents cheaper, so it's just not worth it for Google.

The only other argument I can think of is the challenge of keeping every plate spinning, in good times and in bad. This is where your point of having a cost to take on something outside your core business comes in, but we seem to be in an era of mega-corporations - I'd expect lots of companies to snake tendrils into whatever will save them a fraction of a cent every time they have to do something.


Not really. Time to service depends on SLA and redundancy. If you have no redundancy your time to service must be less than or equal to your SLA. If you have redundancy it can be longer.
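A rough way to see the redundancy point is the textbook availability formula, sketched below. It assumes failures of the redundant units are independent (a shared power feed or a heat wave breaks that assumption), and the MTBF/MTTR numbers are made up for illustration.

```python
# Textbook availability sketch. Assumes redundant units fail independently,
# which real facilities often violate; all numbers are illustrative only.


def unit_availability(mtbf_hours, mttr_hours):
    """Fraction of time a single unit is up, from mean time between
    failures and mean time to repair."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def system_availability(unit_avail, redundant_units):
    """Availability when any one of `redundant_units` identical,
    independent units is enough to keep the service running."""
    return 1 - (1 - unit_avail) ** redundant_units


if __name__ == "__main__":
    fast_fix = unit_availability(mtbf_hours=990, mttr_hours=10)
    slow_fix = unit_availability(mtbf_hours=990, mttr_hours=110)
    print(system_availability(fast_fix, 1))  # one unit, quick repair
    print(system_availability(slow_fix, 2))  # two units, slow repair
```

With these inputs a single unit repaired within 10 hours and a redundant pair repaired within 110 hours both land at about 99% availability, so the slower (and presumably cheaper) repair contract can be enough once there is a spare.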


I've got some experience with a big academic data center - >1 acre floor space, >10MW, ~$100M construction cost. I've also worked for commercial companies of various sizes.

If your compute installation is big enough that payroll is a small fraction of the operating cost, then it's way cheaper than cloud. (that payroll has to include people who actually know how to build and run a huge compute installation)

The problem is that people come in integer units, you need a bunch of them to cover a bunch of different areas of expertise, and the particular ones you need are expensive. If you've got $1M worth of computers, you're almost certainly better off scrapping them and going to cloud, although the folks you're currently paying to run them might disagree. If you have $100M+ worth of machines it's a whole different ballgame; I'm not sure where the exact crossover is.

Note - that's assuming a single data center, and that you're big enough to build your own data center instead of renting colo space. If you need your machines to be geographically dispersed, you'll need to be even bigger before it's cheaper than cloud, and I'm not sure whether you'll ever hit crossover if you're renting colo space.
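The "people come in integer units" argument can be sketched as a fixed payroll cost against a per-dollar-of-hardware premium for cloud. Everything numeric here is hypothetical: the team size, cost per person, and the yearly run rates are placeholders, not figures from the data center described above.

```python
# Toy crossover model: fixed payroll vs. a cloud premium that scales with
# the amount of hardware. All rates and headcounts are hypothetical.


def on_prem_yearly(hardware_value, team_size, cost_per_person, run_rate):
    """Yearly on-prem cost: a fixed team plus a running cost (depreciation,
    power, space) expressed as a fraction of hardware value."""
    return team_size * cost_per_person + hardware_value * run_rate


def cloud_yearly(hardware_value, cloud_rate):
    """Yearly cloud bill for equivalent capacity, as a fraction of what
    the hardware would cost to buy."""
    return hardware_value * cloud_rate


def crossover_hardware_value(team_size, cost_per_person, run_rate, cloud_rate):
    """Hardware value above which on-prem is cheaper than cloud."""
    if cloud_rate <= run_rate:
        return float("inf")  # cloud is never more expensive in this model
    return team_size * cost_per_person / (cloud_rate - run_rate)


if __name__ == "__main__":
    # Hypothetical: 10 people at $200k each, on-prem run rate 25%/yr of
    # hardware value, cloud costing 75%/yr of hardware value.
    print(crossover_hardware_value(10, 200_000, 0.25, 0.75))
```

The model ignores geographic dispersion and colo rent, both of which would push the crossover higher, in line with the note above.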


1000% this. HN loves to talk about Dropbox. I spent most of my (short, praise God) career at Dropbox diagnosing a fleet of dodgy database servers we bought from HPE. Turns out they were full of little flecks of metal inside. Thousands of em, full of iron filings. You think that kind of thing happens when you are an AWS customer?

If you are sophisticated enough to engage an ODM, build your own facilities, and put hvac and electricians on 24-hour payroll, go on-prem. Otherwise, cloud all the way.


That's not quite where I would draw the line, I don't think. I used to work for an ISP and we were kind of split between AWS and on-prem. Obviously, things like terminating our customers' fiber feeds had to be on-prem, so there was no way to not have a data center (fortunately in the same building as our office). Moving our website to some server in there wouldn't have been much of a stretch to me, at the end of the day, it's just a backend for cloudflare anyway.

Like most startups, our management of the data center was pretty scrappy. Our CEO liked that kind of stuff, and we had a couple of network engineers that could be on call to fix overnight issues. It definitely wasn't a burden at the 50 employees size of company (and that includes field techs that actually installed fiber, dragged cable under the street, etc.)

We actually had some Linux servers in the datacenter. I don't know why, to be completely honest.

So overall my thought is that maybe use the cloud for your 1 person startup, but sometimes you need a datacenter only and it's not really rocket science. You're going to have downtime while someone drives to the datacenter. You're going to have downtime when us-east-1 explodes, too. To me, it's a wash.


I mean, you did want to manage bare metal servers, right?

AWS almost certainly gets batches of bad hardware too. And if your services are running on the bad hardware, you can't have a peek inside and find the iron filings. For servers, this is probably not too bad; there have long been articles about dealing with less enthusiastic EC2 VMs, and if you experience that, you'd find a way. AWS has enough capacity that you can probably get VMs running on a different batch of hardware somehow. With owned hardware, if it was your first order of important database servers and they're all dodgy, that's a pickle; HPE probably has quick support? once you realize it's their hardware.

If your cloud provider's network is dodgy though, you get to diagnose that as a blackbox which is lots of fun. Would have loved to have access to router statistics.

There's a lot of stuff in between AWS and on-prem/owned datacenter, too.


> If you are sophisticated enough to engage an ODM, build your own facilities, and put hvac and electricians on 24-hour payroll, go on-prem. Otherwise, cloud all the way.

I imagine the entire sentiment of the comments is because FedEx is one that really should be sophisticated enough.


Not really a meaningful dichotomy.

There is a smooth curve between cloud and dedicated DCs, which has various levels of managed servers, co-location, and managed DCs. (A managed DC can be a secure room in a DC "complex" that shares all the heavy infrastructure of DCs.)

Primarily, the FedEx managers are committing the company long-term to Oracle/Microsoft platforms. Probably mostly to benefit their own careers.

Outsourcing hosting and management of DCs would have been something different, and probably healthier for FedEx and the industry.


> You think that kind of thing happens when you are an AWS customer?

You bet it does! But as the AWS customer you'd never notice because some poor ops dude in AWS-land gets to call up the vendor and bitch at them instead of you. It ain't your problem!


Why do you buy servers with metal flakes in them? No quality control on your side?


Are you saying that part of the expected savings from going on-prem is that you will have to disassemble equipment bought from major OEMs and examine it for microscopic metal dust?

That doesn't sound like it will save much money, honestly.


They’re saying it’s a surprise to hear that Dropbox doesn’t know what QC and order acceptance means. And it is, I agree. That you spent the time investigating it, implying those servers were in production, is a shibboleth to those of us that know what we’re doing when designing hardware usage that Dropbox doesn’t. It is, however, your self sourced report and we don’t have an idea of scale, so maybe they do and you’re just unlucky.

And no, operators don’t disassemble to perform QC. And no, I could hire an entire division of people buying servers at Best Buy, and disassembling them, and stress testing them, and all of that overhead including the fuel to drive to the store would still clock in under cloud’s profit margin depending on what you’re doing.

You’re of course entitled to develop your cloud opinion from that experience. That’s like finding a stain in a new car and swearing off internal combustion as a useful technology, though, without any awareness of how often new cars are defective.


Many hardware problems do not surface at burn-in. Even at Google, the infamous "Platform A" from the paper "DRAM Errors in the Wild" was in large-scale production before they realized it was garbage.


Filings from the chassis stamper, which yours certainly were given the combination of circumstances and vendor, are present when the machine is installed. If you’re buying racks, your integrator inspects them. If you’re buying U, you do. It’s a five minute job to catch your thank-God-my-career-was-short story before the machine is even energized, which I know because I’ve caught the same thing from the same vendor twice. (It’s common; notice several comments point to it.) Why do you think QC benches have magnifiers and loupes? It’s a capital expenditure and an asset, so of course it’s rigorously inspected before the company accepts it, right? That’s not strange, is it?

You can point at Google and speak in abstracts but it doesn’t address the point being made, nor that your rationale for your extreme position on cloud isn’t as firm as you thought it was. Is Dropbox the only time you’ve worked with hardware? I’m genuinely asking because manufacturing defects can top 5% of incoming kit depending on who you’re dealing with. Google knew that when they built Platform A. The lie of cloud is that dismissing those problems is worth the margin (it ain’t; you send it back, make them refire the omelette, and eat the toast you baked into your capacity plan while you wait).


Are you saying you just buy some servers, unpack them and throw them into production... oh man... the lost art of the sysadmin. If your system is not stable (in testing), you for sure disassemble it or send it back. How much money have you lost playing around with your unstable database? Was it more than testing your servers for some weeks would have cost? Do you buy/build software and throw it into production without testing?

You can test your stuff and still be profitable; Hetzner, AWS etc. would make no money otherwise... you know they test their servers much more (sometimes for weeks/months).


Did they pass typical memory/reliability tests and so on?


Maybe in the first days they survive it, but the flakes are 99% from the fans/bearings; that's why you test servers at max load for at least 1 week and HDs for 2-4 weeks.

But I don't think they even did an initial load/stress test.

Unpack it, throw it into the rack, no checking of internal plugs, just nothing... pretty sure about that.


Metal chips are squarely in the long tail of failure modes that you can't really anticipate (but of course are really easy to be smug about in hindsight). It is also extremely unlikely to be the bearings; most likely these are from a chassis frame assembly that was not cleaned up properly.


I had some metal dust and it was from bearings, but OP said something about flakes and then microscopic particles. Particles = bearings, flakes = chassis or even stickers, but anyway, if only because of transport, you don't throw a server into production without testing and inspection.

I am being smug about not testing your hardware the way you do with software... shitty testing is shitty testing; that counts for software, hardware, firmware and everything in between. Even for your diesel generator ;-)


I heard tale of a banking centre that had a diesel generator installed by a local company.

Load and simulated power failure tests all passed.

Then some time later there was a total power cut and that's when they realised the generator had an electric start wired to the mains supply.


And there is also the true story when "someone" forgot to fill the tank after 5 years of regular monthly tests, then the real thing happened.

> had an electric start wired to the mains supply.

But that's a good one, humans being humans...but it worked every time before today ;))


Wait, are you saying that an org needs expertise to QC all of the hardware they procure? How expensive is that? How easy is it to hire that type of QC?

Do you see how these costs all start to add up?


Well, are you saying that an org needs expertise to inspect faulty cars, like, by calling a mechanic?

Is that too much these days for companies that own fleets of cars? Is opening a server harder than checking what's wrong with a car? Like, a cable comes loose and that's game over?


If I procure a fleet of cars I expect none of them to be faulty...how about you?


>I expect none of them to be faulty

So you don't even test the cars, you just expect that the tire pressure is correct, tank is full?

Expecting that something "just" works is exactly why pilots have checklists.

Expectations are the main source of disappointment; you would never do that with software, right?


The point, which you seem so dedicated to avoiding, is that "in the cloud" these steps are not my problem. Inspecting a literal shipload of computers for subtle defects is a pain in the ass. Amazon does it for me. When I get on an airplane I do not personally have to run the checklists. The airline does it for me.


>The point, which you seem so dedicated to avoiding

Not true, the point was you pay for it (cloud), or you do it yourself (but then do it right, and not like an amateur who builds his first "gaming-pc").

And if you do it yourself you can still be very much competitive vs cloud.


> (but then do it right, and not like an amateur who builds his first "gaming-pc").

Again, still avoiding the point, but oddly enough proving the point. You assume everyone isn't an amateur and knows how to build and maintain server hardware. Furthermore, because the market doesn't have enough talent to support all of the companies that exist, consolidating this to a few vendors who do have the expertise is what makes sense (economies of scale) and is what the market already decided.


>Again, still avoiding the point, but oddly enough proving the point.

Please read, that was my comment:

>>Not true the point was you pay for it (cloud), or you do it yourself

>You assume everyone isn't an amateur and knows how to build and maintain server hardware.

Yes, that I assume, correct. Otherwise I would not call it "maintaining". Is an amateur maintaining your car? Your software? If you have just amateurs handling your hardware, it's probably better to pay a cloud provider or pay an integrator to do that.


> you would never do that with software right?

Hilarious you used this as an analogy since software development shops are notorious for cutting corners when it comes to QA.


And that's why you have to test the software before production right? ...Hilarious indeed.


> you would never do that with software right?

You facetiously implied that every company fully tests software before it gets to production. Oh boy, do I have news for you...

Note the word "fully" as the variations of what gets tested is so broad, I don't even know where to start to explain this to you.


I never wrote "fully", but you test your software (I hope). You're just trying to justify a bad work ethic.

>Oh boy, do I have news for you...

Nah, it's ok, I'm just happy that I have colleagues with a much better mindset and understanding of risk management.

And I'll stop here, since you're trying to change what I actually wrote.


>"I believe companies selling bare metal as a service are a happy compromise of cost and convenience, though."

This is what I do. I rent bare metal from Hetzner and OVH. I also have some hosting hardware right at my place. It saves me a ton of money and no I do not spend any meaningful time to administer. All done by a couple of shell scripts. I can re-create fresh service from the backup on a clean rented machine in no time.

As for cloud - if I need to run some simulation once a month on some bazillion core computer then sure. Cloud makes much sense in this particular case. I am sure there are other cases that can be cost effective. But for the average business I believe cloud is a waste of resources and money.


If you don't enjoy it then what were you doing working at a datacenter?

I enjoyed server admin, "back in the day", when your servers were pets and not cattle. But of course we have to make tech just as expendable as our workers, business school demands it! What if your pet server gets hit by a digital bus?!


Pet server classes is a much nicer concept anyway. I never liked the instance based personalization. Creating a machine, defining its class, and seeing it become a machine of its class is magical.

Of course, the newest idea is for creating and destroying machines automatically... that outside of the cloud is quite pointless but people want it anyway. I imagine seeing all that orchestration working must be even nicer than a machine autoconfiguring, but I am yet to see a place where it just works.


One argument for "cattle" servers on bare-metal is security. Being able to reset the machine to a clean, known-good state would clear any leftovers including potential malware. Having machines provisioned from images that include everything they need to run also means you don't even need to grant anyone root access (which you'd otherwise need to be able to audit so they don't leave anything malicious in there).


I, too, enjoy pet servers over cattle. But when important parts of modern life depends on servers, I can definitely see the rationale for cattle.


Hetzner is about 10x cheaper than AWS, give or take.


I love Hetzner and send them money every month, but you do get what you pay for. I don't think I'd like to run FedEx off of Hetzner.


I was actually commenting on the margins that "cloud providers" charge.

If I was a manager at FedEx I'd definitely spend some resources to DIY and send some of those millions in my direction instead.


> Because, personally, I'd rather retire than deal with Dell, HP, Cisco, fibers, cooling issues, physical security, hardware failing...

This isn’t really a meaningful analysis though. It’s just “when you do things in house there are things you have to do”.

It’s like saying, “I would rather retire than clean the toilets, restock the toilet paper, etc” in a discussion about whether to outsource your bathroom maintenance. Doesn’t tell you what’s cost effective.


I'll be really curious how much change Oxide will bring to the status quo.

The promise is to be able to pay up front for a rack that will function as a highly capable VM, storage, and/or compute host, without any of the overhead that Dell, HP, and IBM bring. Just plug it in and start giving it workloads to do. All config can be done through the web-based management console or via the API, just like AWS.


> fibers, cooling issues, physical security

All of that can be handled by your colocation facility. In most cases you won't ever reach the scale where building your own DC makes sense.

> Then you still need to pay VMWare for a decent virtualization platform

Should still be cheaper than paying the AWS premium including for bandwidth, not to mention that you don't always need virtualization. If all you need the bare-metal for is a handful of machines to do a very specific task that's too expensive on AWS then running directly on the metal is an option (and leave on AWS the stuff that does require the convenience of virtualization).

> I believe companies selling bare metal as a service are a happy compromise of cost and convenience, though.

Agreed. Most companies shouldn't ever deal with hardware directly - just rent it from a provider and let them do the maintenance.


I'm more or less the sole decider for all tech decisions in my org (I don't have full budget authority, but I tell the budget holders what things cost). I'm 100% on board with cloud and even going further up the value chain to PaaS and SaaS. Cloud is expensive, but predictable. DevOps is very expensive and unpredictable. I can't even keep staff retained these days. Having a fixed dollar cost, even if it's high, saves not only the operations cost, but also the accounting cost and recruiting cost. And not just cost, but risk! Managed services are generally lower risk, and even if they aren't you can buy some indemnity that they'll cover some of the cost of failures.


that part can be outsourced to eg. Hetzner


You are right about this.

One of the visible signs of an unrecognized (and therefore, unresolved) dilemma is oscillation between two poles.

I got this from the late Eli Goldratt and the problem-solving tools he created. One of those tools is the "Evaporating Cloud," which is a way to visualize a dilemma of some kind.

I'd suggest that there is an unresolved dilemma here: Should we rent or own data centers?

Over a period of years or decades, you can watch these things flip-flop between two extremes.

It's interesting to me that this dilemma is kind of like a specialization (in OO terminology) of a more generic dilemma, which might be something like: "In general, do we want to own or rent the things we need to run our business?"

If you turn your head a bit, close your eyes a bit and squint, you can see how a dilemma like this can apply to things like "What should be our policy regarding employees vs. contractors? Should we try to hire and retain over a long period of time, or should we rent the people we need to get the job done and release them as soon as we are done with them?"

The overall point is that whenever you see flip-flopping between two poles, think "Unrecognized dilemma."

Sorry if this is vague.


Often it’s simply the principle of the excluded middle.

If not for the rent seeking behavior of AWS and its cohort, the answer to ”own or rent” would be clear. The answer is “yes”.

Running a data center means you have your own sheep, which means you need shepherds. Shepherds are very useful to have when you have a question about sheep, especially when those questions are about how to manage or use sheep to best effect.

An approachable shepherd can save the rest of the company a lot of money on missteps and bad assumptions, but if you start getting rid of all the sheep, the shepherd will leave too.

We should be using cloud providers for DR, and for regional load balancing. But the company should be maintaining at least one data center of their own, in the same time zones as most of their developers.

I mostly blame Dell and IBM for this. IBM experimented with making server rooms easier to maintain 15-20 years ago and didn’t make it stick. Others ran with some of those ideas. Dell… I don’t know what Dell has done but I know nobody has been writing about it, so from a visibility standpoint they have done nothing.

If/when someone makes it easier (reduced labor) to manage your own servers, the pendulum will swing back.


You're right, I mean, this was the vision for Multics. They would rent computing as a utility to all the businesses, big and small, backed by their computers. At the time it didn't pan out but it was definitely a desire.


I think you are absolutely right, however when they do reverse the decision they will publish a great article about how bringing their datacentres in-house is going to save them $800m a year and a bunch of execs will grant themselves a bigger bonus for saving the company so much money. It's a win-win!


It's quite possible that at today's prices, they will save $400M/yr, and in the future, as cloud vendors raise prices, bringing it in-house will save them $800M/yr.


Not only that, it's possible that cloud vendors don't charge an extra penny but that their software grows to consume the virtually limitless computing resources available to them, costing them significantly more than expected.

It wouldn't take the most creative exec in the world to make a plausible looking case for changing in either direction even based on the same numbers. A bit of wishful thinking about which costs are included or excluded is probably more than enough.


> rent all information infrastructure

Most of them were renting most of the infra anyway:

- Co-location (physical space rented)

- Managed service to keep the lights on (power, back-up generators)

- Leased hardware that gets replaced every 3 years

- Managed service for switch, firewall, etc. monitoring

- ISP, back-up generators, etc.

> but that will likely be many, many years down the road

They're going to explain to their future bosses that buying land, building a huge building, buying servers, figuring out cooling, setting up redundant ISPs, etc. etc. is somehow going to be smarter? Explain that one to me?


You're paying for all of that whether it's the cloud or on-prem. The only real difference if you're a large company is whether you're paying another company's profit margin as well.

If you are big enough to have your own datacenter, you are paying Amazon enough to buy that much physical space, power, bandwidth, IT staff, etc. plus funding Bezos' trips to space.

There's a justifiable niche where you're too small to justify running your own server/below the need of one full-time IT staff where the cloud makes sense, and startups temporarily benefit from the ability to rapidly scale on cloud platforms. But any Fortune 500 transitioning to the cloud is literally just taking their money and burning it, because the alleged cost savings of efficiencies of scale are being completely absorbed by the cloud provider's profit margins. And they're still going to end up paying their own IT staff to handle the cloud management in addition to (by proxy) paying the cloud provider's IT staff to manage the hardware.


> You're paying for all of that whether it's the cloud or on-prem. The only real difference if you're a large company is whether you're paying another company's profit margin as well.

So, then why even provide the distinction of rent vs own if that's the case?

> you are paying Amazon enough to buy that much physical space, power, bandwidth, IT staff, etc.

And you're also sharing the costs with other people. Clearly FedEx is saving $400M/year while also funding Bezos' trips to space, no?

> But any Fortune 500 transitioning to the cloud is literally just taking their money and burning it, because the alleged cost savings of efficiencies of scale are being completely absorbed by the cloud provider's profit margins.

This is literally not the case without the details so stop speculating. Every F500 should be (and probably is) doing a rent vs own calculation and determining the TCO, and then making the decision from there. It's not unilaterally the case, it has to be assessed.


> Clearly FedEx is saving $400M/year while also funding Bezo's trips to space, no?

I wonder how much they'd save if they rebuilt their infrastructure with their own DCs. There are huge savings in decommissioning mainframes, switching to free software and automating more. At the scale of Fedex, I don't think they'll spend much more on hardware and ops than AWS.


The profits by cloud providers are still limited by a competitive environment. Multiple cloud providers have a strong incentive to compete and keep prices low. It's much more difficult to switch from in-house data infra so they aren't as focused on efficiencies.


Today I learned that economies of scale and specialization are not things.


Please, whenever you invoke any model, have a sense of the scale or region of applicability of the model; no model is true in all cases unless you're religious. The point the GP is making is at the scale of a F500 it starts to make less sense especially if you already have the infrastructure, the expertise, the experience, and so on. AWS is going to manage your VM better than you at your 3 person start up, but the economy of scale / specialization arguments work less well when you have an experienced workforce and developed system in place already and you're one of the world's largest logistics companies.


You're ignoring the massive ecosystem of vendors, managed service providers, colocation facilities, consulting firms, etc. whose profit margins are paid to manage most on prem deployments. Very few entities are doing it all with in-house staff.


There are a couple comments pointing this out, but that's not unique to on-prem: Cloud providers are also buying hardware from vendors, employing contractors, buying or renting facilities, hiring consultants, etc. All of that you are going to pay for either way.

The cloud just offers an additional middleman that also gets a profit margin.


Yeah, but you pay one invoice instead of 100 invoices; you have one sales representative (OK, maybe a couple) to deal with instead of 100s.

You get your support in one place instead of a hodgepodge of ... yeah, we rent the server from X, but the software that runs on it is from Y, and in reality provider Z is support, so now C has to agree to physical access to the server, and now the stars must align to get all 4 working together instead of shifting blame around to fix anything.


Sure, you pay an Amazon employee to manage that for you instead of paying an employee of your own. Either way you're paying for that, and with the cloud you also pay for Bezos' spacecraft fantasies.

And you hope the Amazon employee makes decisions that are favorable to your business, even though they do not care about it.


I feel like everyone in this discussion is missing the forest for the trees, especially given the way most public company executives think and operate, which is quarterly.

The problem with building out a new state-of-the-art datacenter - or several of them - is the enormous capital expenditure you've just put on your company's books, not to mention the operating expenditure of all the people that will be required to run it. Yes, it's true that as time goes on, you can claim some tax advantages in the form of depreciation, etc. on some components of this new huge outlay, but at the end of the day, when the company misses quarterly or yearly expectations from Wall Street and the stock price takes a 10% hit, the "blame" gets squarely laid at your feet - you are, after all, "the problem". You spent a shitload of money provisioning future resources for the company's (expected) growth.

Meanwhile, in Cloud Cuckoo-Cuckoo Land, your rival has migrated all or nearly all existing infrastructure to the cloud, thus including a significant op-ex, but not nearly as large as the enormous cap-ex you've just incurred. They're hailed as a hero. A goddamned visionary! Look at all that money they're going to save the company! Never mind the fact that they explicitly instructed the IT department's head to provision only the necessary resources for current operations, after all, the "promise" of the cloud is that you can just spin up whatever you need in a few minutes, anyway. And besides, the cloud deployment of all the company's servers and the necessary expansion won't incur a significant turning of heads until long after our visionary executive has jumped ship to another company for more pay, a corner office, and a better stock compensation package. Best of all, he can say that he saved the company $XX millions of dollars over your plan, and it can be said legitimately, even though it is, of course, inaccurate.

If you're huge, it's never cheaper to farm out the administration of your critical infrastructure to qualified experts. But because of the dominance of the Quarterly Report Cycle in modern business, this gets swept off by the wayside as "outdated thinking".

I say this, by the way, as an IT professional building out cloud solutions for companies. For a lot of small-to-medium sized businesses, and startups especially, the cloud makes sense. If someone at Federal Express thinks the cloud makes sense, they're looking out for themselves, not the company.


> benefit from the ability to rapidly scale on cloud platforms

Big companies need this too. Some team wants to spin up some new service to try out some idea? In cloud-land they push a button. In "rack & stack" land they have to wait for Ops to purchase the hardware and provision it.

Cloud makes it cheap and easy to try out new stuff.


Assuming you're doing dedicated machines for services. My company runs on-prem with a big cluster scheduler & maintains headroom in it; small deployments of new services and modest scale-ups of existing services don't require explicit capacity requests. Only if you're going to provision a huge number of instances do you need to wait for infra to buy machines. Which also requires advance planning with the "elastic" cloud anyway.


As with any cloud, you should always have spare capacity on-prem. Spinning up a VM on-prem doesn't have to be slower than in the Cloud. At this scale, you're just your own cloud provider.


“Paying for another company’s profit margin as well”

All transactions, no matter if in the cloud or on prem, go towards another company’s profit margin.


I don't think it will be a future generation of executives; I think it's the current generation of executives at companies and agencies like Google, Amazon, Microsoft, and the NSA who will not rent all their information infrastructure. I agree that the executives who thus turn their companies into sharecroppers on land owned by Microsoft et al. will be richly rewarded despite the misfortune that befalls their stockholders; like Stephen Elop, if things go badly, they have an impeccable excuse.


Thanks. I think it will have to be a future generation of executives whose egos won't be tied up in old debates and who therefore will be more willing to sacrifice old decisions. If things go badly, that future generation of executives will take over sooner.


Some might return to on-premise data centers, but most will not.

It's really just like the utilities these days: 98% of people will not dig their own well, install a sewage system, and buy a generator, but big companies can still opt to have them installed on-premise, even though some of those are just backup systems.

There is no return from the cloud for 98% of us.


You generally can not go from city water/sewage to on-prem for legal reasons. However, many are going on-prem with solar. I am going almost completely off-grid for cost and SLA requirements.


What's interesting is people seem really excited about installing rooftop solar, but I see fewer companies building collectors in places with cheap land and ample sun. That tells me part of the selling point of rooftop solar is it feels good, and the economics might not actually work.


> but I see fewer companies building collectors in places with cheap land and ample sun

You mean there are fewer companies making large investments than smaller ones?


> part of the selling point of rooftop solar is it feels good

As someone who's been building an off-grid solar system this is largely the case. I've spent thousands on panels and batteries, my electricity bill is only a few hundred a month but we have frequent outages during storm season. It's definitely more of a feel good and control thing. If I consider the cost of running the electricity to our remote location, I'd probably barely break even in 10 years if I'm lucky.


If you spend US$6000 on panels and batteries, your electricity bill is US$300/month, and the outages and power lines cost you nothing, then your payback time for the off-grid solar system would seem to be 20 months, under two years, not 10 years. The outages and power lines would seem to shorten the payback time further, not lengthen it. Without disparaging the feel-good control thing, which I think is reasonable and important, I feel like I must be misunderstanding something about your explanation.
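
As a rough check of that arithmetic, here is a minimal sketch. The function name is made up for illustration, and it deliberately ignores financing, maintenance, and battery replacement:

```python
def payback_months(upfront_cost: float, monthly_savings: float) -> float:
    """Simple payback period: months until cumulative savings equal the upfront cost."""
    if monthly_savings <= 0:
        raise ValueError("monthly_savings must be positive")
    return upfront_cost / monthly_savings


# Figures from the paragraph above: US$6000 of hardware, US$300/month bill avoided.
months = payback_months(6000, 300)
print(f"{months:.0f} months ({months / 12:.1f} years)")

# Conversely, a 10-year break-even at US$300/month implies this total cost:
implied_cost = 300 * 12 * 10
print(f"implied total cost for a 10-year payback: US${implied_cost}")
```

So a 10-year break-even would only follow if the real all-in cost were several times the US$6000 assumed here, or the monthly bill much lower.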


The economics favors distributed storage rather than generation.

I worked in a building during rolling blackouts in California. The Tesla Powerwall on the building was "screaming" (ultrasonic harmonics from piezo components) 24/7. The security guy actually came to get me to ask if the thing was going to explode.

Clearly, the building owner was load shifting and making quite a bit of money from it.


> The economics favors distributed storage rather than generation.

They shouldn't, though. If powerwalls are that great, it should be more cost effective to install giant banks of them in cheap warehouses outside major metros since you'll have economies of scale on installation and maintenance. Utilities are sorta getting into this game, but it's more common at the individual level, despite being more expensive.


> If powerwalls are that great, it should be more cost effective to install giant banks of them in cheap warehouses outside major metros since you'll have economies of scale on installation and maintenance.

This works only if the electrical grid has the ability to consume back your stored power. Most electrical grids in the US would have problems doing that.

For example, UCSD has their own generating facility; however, SDG&E was sufficiently backward that UCSD was unwilling to go through the grief necessary to put energy back into the grid. Therefore, UCSD only does load shifting or disconnects the campus from the grid during rolling blackouts.

Because of the poor energy grid transport, storage batteries make the most sense when you can consume the stored energy locally to minimize your grid consumption during high energy prices--ie. you have a high-rise office with lots of air conditioning or a manufacturing facility.


There are definitely countries and US States where the economics of solar don't work.

Luckily, it does work for most of the US, especially desert/arid areas.

> but I see fewer companies building collectors in places with cheap land and ample sun.

They are plentiful in Turkey and some southern EU countries like Greece/Croatia, I believe. Not sure about deployment in the US.


Captured Solar (electricity in general) doesn't travel well over long distances.

Generating solar for local use saves the cost of transport.


> electricity in general doesn't travel well over long distances.

High-voltage line losses can be as low as 3% per 1,000 km. That should be easily offset by the better location and better angle/tracking.
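
To put that figure in perspective, a quick sketch, assuming the quoted best-case 3% loss per 1,000 km compounds evenly with distance. That is a simplification: converter stations and real line conditions are ignored, and the function name is just illustrative.

```python
def delivered_fraction(distance_km: float, loss_per_1000_km: float = 0.03) -> float:
    """Fraction of power left after distance_km, compounding the loss per 1,000 km."""
    return (1.0 - loss_per_1000_km) ** (distance_km / 1000.0)


for km in (500, 1000, 2000, 3000):
    print(f"{km:>5} km: {delivered_fraction(km):.1%} delivered")
```

Under that assumption, even a 3,000 km line would still deliver a bit over 90% of the power, so a site with meaningfully better sun could come out ahead.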


It's amazing there are no other voices in the room pushing back against this. It really makes you wonder how companies even function at all, much less make money. Being so shortsighted never ends well.


Actually, I don't think the decision is that shortsighted in this case. It could work out pretty well for a decade or longer. IMHO FedEx is likely to have a good run before the costs and risks of this decision start outweighing its benefits.


However, when the situation starts to become unbearable, how long will it take to rebuild the infrastructure and source talent? Another decade? Adding the startup costs for any possible rebuild, you're apt for a substantial net loss. So you're probably going to shy away from such a reintegration, while losses accumulate further, until this becomes unsustainable, putting the entire corporation at risk.

(The wise thing to do to mitigate financial impact may actually be starting the reintegration process right now.)


Will it? If they do this right it won't. There is a large cost to running a data center. So long as there is competition, they are better off with experts doing computers while they focus on logistics, which is what they do well.

Of course they should make some effort to ensure there is competition. That means they are careful to ensure more than one provider exists even if it means taking a slightly higher priced option. Also, contracts end early if the company is bought out (that is, if AWS buys Google Cloud; see above about ensuring there are competitors in the market).


> Of course they should make some effort to ensure there is competition.

My guess is, we'll rather see some consolidation, like in every other segment. Which also means considerable expenses for the remaining players, who will experience increasing need to regain some by pricing. As these players are sharing roughly the same boat, this will show quite naturally some characteristics of an oligopoly. At the same time, self-managed infrastructure will have become a rarity and costs of reintegration are prohibitive, which should allow for some elasticity in market prices. Also, any investments towards reintegration won't show results during the tenure of current management, minimizing chances for this to happen, yet again. (This is not a level playing field anymore.)


$400m saved today in exchange for $500m 10 years from now sounds like a good deal financially!

Sometimes tech debt can be used instead of financial debt!
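
A back-of-the-envelope way to check that trade, assuming both figures are one-time amounts and using hypothetical discount rates (none of these rates come from FedEx):

```python
def present_value(amount: float, years: float, annual_rate: float) -> float:
    """Discount a future amount back to today's money."""
    return amount / (1.0 + annual_rate) ** years


def break_even_rate(saved_now: float, cost_later: float, years: float) -> float:
    """Discount rate at which the future cost is worth exactly the saving today."""
    return (cost_later / saved_now) ** (1.0 / years) - 1.0


saved_now = 400.0   # $M saved today (figure from the comment)
cost_later = 500.0  # $M owed in 10 years (figure from the comment)

for rate in (0.00, 0.03, 0.08):  # hypothetical discount rates
    pv = present_value(cost_later, 10, rate)
    print(f"rate {rate:.0%}: future cost is worth ${pv:.0f}M today")

print(f"break-even rate: {break_even_rate(saved_now, cost_later, 10):.2%}")
```

With these numbers the trade pays off at any discount rate above roughly 2.3% a year, which is why it looks attractive on paper.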


Mind that if you ever wanted to integrate again, this means buying land, planning infrastructure, building it, planning, obtaining and setting up hardware and software, hiring, training, defining and testing procedures, etc., as you're starting from zero again, and all this while the costs, which forced you to consider this move, are piling up. Odds are, you'll never do this and cloud providers will know this. The comparative costs are now not those of running your own infrastructure, but those of setting it up (again).


FedEx had very few on site people managing the hardware, it seemed like most work on hardware was done by the suppliers. There were also very few servers there and actually running, I think the software powering the enterprise took up a lot less than they expected.

Either way, I think you're spot on with the "training" part. FedEx was one of the first companies to raise significant capital and pursue a business dependent on technology. It also treated its people very well so relatively few left since the 70's. I think they just didn't train the next generation enough and they realized that those skills are disappearing rapidly because no college grads want to learn old & boring stuff and everyone who does know it costs a lot to keep (FedEx still heavily relied on COBOL when I left ~5yrs ago).


I deeply appreciate you sharing this context. What do you think about my wild predictions of disaster?


> how long will it take to rebuild the infrastructure and source talent?

How costly is it to maintain that level of talent? Do you think top server admin talent is screaming to work at FedEx or Amazon?


There are always going to be short sighted decisions. But a good design that would abstract away the fact that you are running on AWS or on-prem would pay dividends.

With a good design, there is always that implicit threat that they could move back to on-prem with little effort.
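
A minimal sketch of what such an abstraction could look like. Every name here (`BlobStore`, `InMemoryStore`, `LocalDiskStore`, `archive_manifest`) is hypothetical and no vendor SDK is involved; a cloud-backed class would simply implement the same two methods:

```python
import tempfile
from pathlib import Path
from typing import Protocol


class BlobStore(Protocol):
    """The only storage surface the application code is allowed to see."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Stand-in backend, handy for tests."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


class LocalDiskStore:
    """On-prem style backend: one file per key under a root directory."""

    def __init__(self, root: Path) -> None:
        self._root = root
        self._root.mkdir(parents=True, exist_ok=True)

    def put(self, key: str, data: bytes) -> None:
        (self._root / key).write_bytes(data)

    def get(self, key: str) -> bytes:
        return (self._root / key).read_bytes()


def archive_manifest(store: BlobStore, shipment_id: str, manifest: bytes) -> bytes:
    """Business logic depends only on BlobStore, not on where the bytes live."""
    key = f"manifest-{shipment_id}"
    store.put(key, manifest)
    return store.get(key)


# The same call works against either backend.
print(archive_manifest(InMemoryStore(), "42", b"2 parcels"))
with tempfile.TemporaryDirectory() as tmp:
    print(archive_manifest(LocalDiskStore(Path(tmp)), "42", b"2 parcels"))
```

The catch is that this only stays cheap to swap if the application avoids provider-specific features that leak past the interface.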


Convincing people future risks outweigh the short term gains is very hard. That goes for all things.


There are very likely voices, but they didn’t win this battle.


They optimize for the near term, and after a while some other companies, optimizing for their own near term as startups, stumble upon an awesome new product-market fit and upend the previous companies.

There is no grand plan, there's just evolution.


Some are starting to question the almighty Cloud™: https://www.economist.com/business/2021/07/03/do-the-costs-o...

One of the things that bother me about AWS is that I still need to manage their risk of a data center going down by using multiple availability zones and managing the complexity of extra resilience. There is a conflict of interest: AWS has a perverse incentive in keeping a single AZ robust and unless there is a natural calamity, it should not go down.

I thought one of the major reasons for going to the cloud was that you need not manage this risk yourself, offloading it from on-prem.


I disagree. This isn't FedEx's core competency or a differentiator for them. They're also not at a scale where it could make sense to colocate or build their own DCs. Fiddling with low-level tech infra would just be a distraction.

Now, they might find that software is a differentiator for them, but that's different.


For a logistics company, the IT system is the differentiator. Reliably routing and tracking packages to maximize utilization of trucks and planes allows them to be cheaper and faster. That all depends on both hardware and software. So I'd definitely think that running their own systems should be a given.


They are doing the rational thing and there is no going back.

HNers need to get their heads around this: the economies of the cloud represent a fundamental, secular shift.

Imagine if Fedex designed and made their own delivery vehicles. Then they 'outsourced' that to Ford/GM. You might say 'look at Ford's amazing margins, look at the money being left on the table' - but in reality, it's still more cost effective to 'buy vehicles' than to 'DIY' them.

In the 'long run' those Azure/AWS margins will erode - and/or - the long tail will creep up on them. Basically data centers offering less, for less, and companies realizing they don't need the complexity of AWS to do basic things.

What that will look like is hard to say.

With hardware it meant 'pivot to Asia'.

Will this happen again? Chinese companies creating large data centres in the US with minimal staffing, monitored and operated out of China?

If we were not in a giant geostrategic kerfuffle with them, then yes, I would say that is the future. But given security issues etc. it's probably not.

Can India do that? Maybe.


Ford and GM and Freightliner are not in a position to use price discrimination to confiscate FedEx's entire profits; if they try to sell trucks to FedEx at an inflated price, FedEx can just buy the same trucks from dealers, or buy them used, or buy GM trucks instead of Freightliner trucks.

By contrast, generally speaking, cloud providers are in an excellent position to use price discrimination to extract the entire surplus value of every transaction. And the flexibility and reliability provided by FedEx will to a significant extent simply be that of their computers.

(As chrisseaton pointed out, Boeing is in a cloudier position.)


So 100% yes, AWS has much more power in the value chain than Ford. But not really.

Especially to the extent that special services are not leveraged, then there are cheaper alternatives. A lot of IT is just 'instances' not 'fancy cloud queues'.

AWS pricing is very transparent and people can make the decisions they want. Generally speaking, AWS does not price discriminate on the basis of 'business model' and their margins are meaningless with respect to the overall business efforts of a Big Co.

I mean - unless you are doing big AI crunching, or 'free hosting content' - then frankly the AWS bill is not going to be a huge line item relative to overall operating costs.


Fedex isn't building their own vehicles, same as they wouldn't build their own servers. But going cloud is more like outsourcing all transport activities to other logistics companies. No one says they have to do every part of DC ops themselves, but outsourcing everything and using proprietary products makes them very vulnerable.


Sounds a lot like farming out NOC-Ops to Wipro. Somebody had to think it was a good idea & managed to sell it to the policy makers for a nice promotion. So, when reality hits, does the same person get demoted further down than their original ladder rung?


They are gone by that point, with a padded resume to boot :-)


We have our own datacentres. Lucky enough to have got them up and running pre cloud era. If we were any later we would be full cloud but the reality is we outcompete everyone on price in our space because of it.


Tragedy of the executives


> that will likely be many, many years down the road

And this is where people in my age range (20-30) will get to swoop in if we invest the next 10-20 years learning about and tinkering with server hardware.

I highly recommend it for everyone.


This is one of the things bad about Wall Street: Short-term decision-making. Everybody cares what happens to stock prices in the near-term, rather than 10 years out.


Great concrete example of the principal-agent problem here.


At least in Oil and Gas. Servers moved back into the personal towers for the first time in a decade. Latency issues.


AWS Outposts ?


Ah did not know about AWS outposts. They use Azure so hybrid setup and moved the latency sensitive servers back into the tower everything else lives on Azure.


And they will gently depart under their beautiful golden parachutes. What a racket.


reminds me a bit of:

https://www.reddit.com/r/AskHistorians/comments/vqr30e/jack_...

"Jack Welch extracted record profits from GE for 20 years, but left it a hollowed-out "pile of shit," according to his successor. What exactly did Welch do that was so damaging, and how did he get away with it for so long?"

from the post reply, in case it's not available from the URL:

alecsliu · 2 days ago · edited 2 days ago

Welch took over a GE that was at the time, a major company. At the time, he viewed GE as bloated and needing to change. While he might've been right about that, the approach he took was perhaps less ideal.

One of the worst things he helped make commonplace among American companies is the concept of stack ranking. The way it worked was like this: people were divided into three groups: A, B and C. A's were the top performers who needed to be rewarded generously, B were adequate performers who should be allowed to stay, and then C were those who needed to be fired.

So far so good right? Well, not exactly. For Welch, these three buckets could be separated into the top 20%, the middle 70%, and the bottom 10% (20/70/10). Based on the above points, the bottom 10% would thus need to be fired annually.

In the short term, this helped in making the company lean and look more productive, increasing the bottom-line and portraying an image of success. However, the long term consequences of such a change were cultural degradation and the introduction of new bloat and waste (running all of those performance reviews and firing and rehiring so many people takes a lot of money.) Consider the case of a perfectly adequate team: the entire team has achieved their targets and has contributed to the company as per their job description. The issue? In stack ranking, 10% of this team would still have to be fired even though everyone did their job. Unsurprisingly, the introduction of this competitive atmosphere where it isn't enough to succeed, one must be better than their peers, results in backstabbing, competition, and a host of issues which eventually weaken a company's competitive edge.

This was only a part of Welch's general treatment of workers as numbers rather than humans. Aggressive cost-cutting, offshoring, etc. were the norm under Welch's regime and he would destroy entire divisions. This again, was great in the short-term but bad in the long term. Welch clearly had a dim view of company culture and believed it to be unimportant.

The other major issue of Welch's was the acquisition of hundreds and hundreds of businesses, as part of Welch's goal of acquiring his way to the top. On the surface, this is what Welch did; on paper, he didn't destroy the main profit makers of GE so much as create new ones, primarily in the form of its financial arm.

The result was that GE could play with the numbers in a way that let him make sure GE was always meeting the targets set by Wall Street: in order to generate the right earnings, simply buy or sell certain assets, write off others, etc., and when your company is an acquisition machine, it's not too hard to find the right numbers. This process helped expand GE into a mega-conglomerate but, again, ultimately left the company in a weaker state than expected. In particular, on paper the core business of GE became its financial arm, as that was where all the funny business with the numbers was happening (nothing explicitly illegal though). Even ignoring the damage that Welch did to GE's profit centers through his horrible business practices, this was a huge part of why GE declined so rapidly in the years that followed. GE's valuation was based on an inaccurate picture of its profits and value, so when the truth started coming out (especially with the Great Recession collapsing financial services profits), GE was quick to follow.

As for why Welch was able to get away with it all? Well, the answer is that he was delivering. He hit the earnings targets, he made the board and shareholders very happy, and by all accounts GE was the paragon of success and everything was going right. What Welch did was unprecedented, and it's hard to really overstate that. For all of his faults, he had America tricked into believing that what he could do was unique and that he could avoid the realities of the economy and cyclical markets, that no matter what was going on, GE was special. Welch died a rich, rich man and many, many people profited greatly off of GE during its 20-year bull run.

Edit: My off-the-cuff writing is always horrible wrt to grammar and structure so I'll probably edit it later haha


You can say the same thing for hundreds of parts of a company like FedEx.

For example FedEx flies Boeing jets. It's completely reliant on Boeing for parts for those jets. Presumably Boeing can charge whatever it wants.

Your company is always going to be reliant on other suppliers and contracts. Unless you're planning on building your own independent country in some location that has all the natural resources you need.

Non-argument.


> For example FedEx flies Boeing jets. It's completely reliant on Boeing for parts for those jets. Presumably Boeing can charge whatever it wants.

Nobody said that FedEx should build their own server hardware. And that's what you are comparing it to.

FedEx could just buy server appliances from e.g. Dell (buying the Boeing jet) and operate them. Paying some other air cargo company would eat a lot of their margin, and the same goes for cloud infrastructure. They are not a startup that would do better to invest its time in development; they can hire people to administer their server fleet without any problems. When they switch to a cloud, they will likewise hire engineers just to administer e.g. AWS.


I'm simplifying here, but it feels to me like you are making the assumption that FedEx is a mostly static business, whose IT needs should be all about minimizing the cost per IO or compute operation. In the real world, business needs are rarely static, and moving fast and innovating is extremely valuable, even for a company as large as FedEx. They are choosing to move resources from managing their IT infra to AWS, but what they're really gaining is not a reduction in labor costs or CAPEX, but rather the ability to move faster. Sure - given some headcount and sufficient CAPEX, a good engineering+SRE team can create and maintain a nice bit of infrastructure, but it is a significantly harder goal to truly deliver the benefit most leading cloud providers can provide an engineering org.


Probably the right amount of rented data center capacity for FedEx is not zero, yes. But it's not 100%, either, because what they're giving up is the ability to move faster when their outsourced system administration vendor isn't meeting their needs.


Most large companies do at least some elements of multicloud. This may be as simple as doing ETL in AWS and then pushing the data over to BigQuery for dashboarding, it might be using all AWS and also Office365, it might be building applications which part run against Dynamo and part run against Cloud Spanner. It's also the case that the applications you run each year have some turnover rate. Maybe you're running Jira this year, maybe you're switching to Asana in a couple of years. Maybe over a 5 year period you're moving from Teradata to Snowflake. Which cloud env do you deploy Snowflake to?

Your negotiating power is then in the momentum of change. If GCP is working better for you than AWS right now, send your new spend to GCP where possible and where stuff is rotating out, move the new stuff to GCP. If you just got a good deal from Azure, start moving in that direction.

This is one of the 'big company vs small company' things, by the way - it doesn't make as much sense for a 100 person startup. But FedEx are a 300k employee juggernaut who spend $75B each year servicing $100B of revenue. They'll have a lot of tech in use.


The optimum is very likely near one of the extrema. Either 0% or 100%.

Anything in between will just mix all the problems of a cloud with all the problems of running a datacenter.


I host a lot on AWS but you will need as many people in IT support as before. You can use consultants for a time but after a while your infrastructure will disintegrate because nobody knows about the intrinsic properties of your company's infrastructure.

I believe the self hosted options generally offer vastly more flexibility and can innovate at a faster pace. But only FedEx will know if it is a good decision, perhaps their infrastructure was indeed not competitive. But I heavily doubt they will save 400m in the long run.

AWS support is actually awesome and you usually get a response within a day, even as a smaller business. But cloud hosting is only useful for certain scenarios. In this case it is Azure, and I believe Microsoft currently innovates far slower than Amazon. Of course Amazon could be a direct competitor to FedEx in a few years too, given they do more and more logistics. If they aren't a competitor already. In this context I would understand the decision to close their data centers because it is hard to compete with Amazon in the cloud space... a computing alliance with Microsoft would make sense.


>I host a lot on AWS but you will need as many people in IT support as before

This doesn't pass the smell test at all. Datacenter ops are way more complex than AWS ops; why would someone on cloud need IT support for VMWare, BMC Ops, Power management and generation, supply chain management and land leasing/renting? These are the few roles, off the top of my head, that just go away when moving to the cloud at FedEx's scale.


This is because most people insist on not acknowledging that running your own datacenter will involve "mundane" tasks such as "sourcing bolts on a Sunday before Thanksgiving". For a great deal of the people I spoke to, "running your own datacenter" means "we hired 3 servers from a thing in London and we had 1 guy installing linux on them".


Well, I didn't really mean data centers, just on-premise infrastructure in general. There is a difference of course. Our administration is almost completely in the cloud, but for production and some adapted applications we still need local data. There is simply no technical solution to run everything in the cloud and we also want to retain some data locally. VMWare is an example. UPS and backups are another. Sure, we use service providers for provisioning of new VMs, but IT still needs to maintain the software that runs on them and needs to keep an inventory of the servers running in the company. And if you already have one VM running, additional provisioning is not too much work.

But sure, at the scale of FedEx this is true; they also probably have a very high percentage of administration only. Although I still think the cooperation with MS was a strategic decision made after paying attention to Amazon.


Probably not AWS, given that Amazon is a major competitor to FedEx in the logistics sector.


Boeing jets are a lot more like IBM mainframes than they are like Dell servers, which are still mostly commodity.


It's true, of course, that every company is deeply reliant on its web of suppliers and customers, and it is common for a company to have a single supplier or customer that has enormous power over it. But I think you may not appreciate the impact of that situation.

FedEx might have a viable alternative to Boeing for repair parts, but not for planes, and more broadly I think you could make that argument about US airlines in general in the late 20th century: despite things like the MD-80 (before the merger) they really had no alternative to Boeing. It is perhaps not a coincidence that the net profits of US airlines in the late 20th century were almost exactly zero, while Boeing was and is one of the most profitable companies in the world. So far Boeing isn't squeezing FedEx that way, but not because it can't.

(I do note, though, that FedEx owns its own jets rather than renting them.)

But I think there's a different sense in which informatics are more core to FedEx than even flying. FedEx is a corporation, which in its essence is a set of business processes: relationships, practices, and information; unlike a 19th-century railroad, its physical assets like airplanes are relatively expendable by comparison. Historically, those processes happened mostly in people's heads, especially the heads of its managers, but today the vast majority of FedEx's business processes are automated, which means they happen in FedEx's data centers.

And whether those automated business processes happen at all, and how well they happen, is a matter of competence in informatics. Dan Luu argues convincingly that Twitter's kernel team has been key to their ability to execute in https://danluu.com/in-house/; FedEx needs not only data centers but a kernel team and a convex optimization algorithms team.

It's extremely common for companies that are deeply dependent on a single supplier or customer to end up either merged into that other company or bankrupted by the situation.

As for the country, when I set it up I'll be needing yeomanry regiments. You in?


The better analogy would be that Fedex, since they need to transport things from city to city, could choose to buy jets to transport with instead of renting them or contracting a company to fly for them, which they already do. To remove datacenters is more like removing the jets.

Edit: removed rude phrasing.


> The comparison you meant

No that’s not what I wrote.

You don’t buy a jet outright and forget about the supplier - you’re eternally reliant on their service, certification, parts, etc, as regulation is so high.


Yes but you are comparing to having a datacenter vs removing the datacenter completely. In this example FedEx has removed the datacenter completely, instead relying on someone else’s servers. So the correct comparison is to remove the jets and let someone else fly for you.


> So the correct comparison

That may be your comparison but it’s not the one I meant.


I should have said the more appropriate comparison rather than phrasing it rudely as I did; sorry for that.


It is discourteous to claim that someone who says something you disagree with actually intended to say something else.

And believe me, I know a thing or two about being discourteous.


That’s true, I should not have phrased it like that.


Your airplane analogy is apt, but it cuts the other way.

In fact, FedEx likes to buy out Boeing facilities so that they're not "completely reliant" on Boeing for anything.[1]

As for the more general cloud v. on-prem debate, I usually don't like analogies, but the plane analogy is good because jets and logistics-management systems are two things that FedEx needs to handle extremely well to compete.

So, what does FedEx actually do wrt jets?

FedEx both owns and leases them. It mostly owns them outright. Its owned jets are a mix of new-ish to very old (think DC-9s). It also does some leasing and financing magic (such as some big lease agreements for new 777s a couple years ago).

The reason is that FedEx generally gets excess value from owning rather than leasing, but there are some circumstances where leasing makes sense.

Same goes for "cloud" deployments. The correct answers to cloud "versus" on prem for large organizations like FedEx - "it depends," "that's a false dichotomy," and "it's almost certainly a mix of both" - are neither interesting nor simple, and so those answers don't get execs' attention...or headlines.

For FedEx to brag about taking an extreme position on cloud deployment is breathtakingly foolish. If I were an investor, I'd want to hear something like, "Based on a careful analysis, we've decided to shift certain specific operations to cloud-based systems. We plan to maintain control over the infrastructure and operations of mission-critical systems that have demonstrated resilience." (I'm assuming the latter are systems like the financial institutions rely on, where purpose-built stuff like mainframes using IMS have demonstrated near-zero downtime and darn-near-bug-free software for decades, because FedEx, like banks, needs to handle lots of simultaneous transactions in real time without error or deadlock.)

[1] Example: https://www.ch-aviation.com/portal/news/102874-fedex-to-take...


This is astounding; I had no idea. Thank you for explaining the details. Do you have publicly citable sources for the information on the owning vs. leasing?


> For example FedEx flies Boeing jets

This, in particular, is a bad argument since jets are replaceable and interchangeable.

Possibly, the only thing that is not replaceable in FedEx is their information system.


>Renting your information infrastructure is a great way to reduce startup costs, but down the road, that information infrastructure runs your company. Trying to outsource it is like trying to outsource upper management.

This isn't a one size fits all thing. For a company like FedEx I don't see the advantage of owning their own data centers. They just don't have the scaling challenges that would require that. Data centers, or information infrastructure as you put it, don't fall within the core competency of FedEx, so I don't think it's a fair comparison to say it's like outsourcing upper management. I think it's more fair to say that FedEx building their own datacenters is like building their own roads to deliver packages on.

For a company like Facebook or Google, the difference is clear. They need to handle such a high scale of traffic and volume of data that they need custom infrastructure to be able to scale efficiently at significantly reduced costs. The same reason it made sense for them to invest in building their own databases, because the existing options didn't meet the scaling requirements. FedEx won't realistically be needing their own database or infrastructure any time soon.


> They just don't have the scaling challenges that would require that.

I don’t know about that. They are a global logistics organization that rivals Amazon in scale. They are not parsing web scale data like Google, but as far as meatspace data goes, they seem to be about as big as it gets?


> I think it's more fair to say that FedEx building their own datacenters is like building their own roads to deliver packages on.

Your metaphor is a little bit off. This is more comparable to FedEx renting vs owning their fleet vehicles.

Obviously they aren't going to be setting up foundries to scratch-build engines and network switches. But it probably makes as much sense for them to own the buildings & computers running their logistics software as it does for them to own the buildings and vehicles running their logistics hardware.

As an aside, I'm sure FedEx would LOVE to get customers locked in to private FedEx roads with private FedEx addresses that no one else is allowed to deliver to. Thankfully, our system of public infrastructure is robust enough to make this infeasible.


Fun Fact: They don't technically own their planes, since it would have made them a nice target for a hostile takeover if someone wanted them


This seems... incorrect? http://www.fedex.com/us/about/overview/companies/express/Exp... indicates that they own 696 aircraft, and https://www.reuters.com/article/us-airline-fedex-orders/fede... says that they just bought 50 more in 2015. No mentions of a lease agreement here. In fact, according to Wikipedia, for smaller aircraft FedEx operates a dry lease business where they lease their fleet out to contractors to deliver certain routes:

    In the United States (along with Morningstar Air Express in Canada), FedEx Express operates FedEx Feeder (and for Morningstar, mainline FedEx service) on a dry lease program where the contractor will lease the aircraft from the FedEx fleet and provide a crew to operate the aircraft solely for FedEx


Please elaborate.


Oh really. They lease their fleet of planes? I had no idea!


>>They just don't have the scaling challenges that would require that

Scaling is one of the big cloud advantages. Static, known workloads will almost always be cheaper on-prem; it is dynamic workloads that need automated scaling where the cloud wins

One of the big ways an organization can save money is if they can scale up and down their loads on demand

But if you have a static 24/7 workload, almost universally I can build an on-prem solution that is 50% or more cheaper than cloud
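
As a back-of-the-envelope sketch of that trade-off in Python: every number below is a made-up assumption (a $1.00/hour cloud instance, and an on-prem box whose all-in cost works out to $365/month), chosen only so that on-prem comes out 50% cheaper at full load as claimed above.

```python
HOURS_PER_MONTH = 730  # average hours in a month


def cloud_monthly_cost(hourly_rate: float, utilization: float) -> float:
    """Pay-per-use cost: you only pay for the fraction of hours you run."""
    return hourly_rate * HOURS_PER_MONTH * utilization


def break_even_utilization(onprem_monthly: float, hourly_rate: float) -> float:
    """Utilization above which the fixed on-prem cost beats pay-per-use."""
    return onprem_monthly / (hourly_rate * HOURS_PER_MONTH)


# Hypothetical prices, for illustration only.
ONPREM_MONTHLY = 365.0   # fixed, paid whether the box is busy or idle
CLOUD_HOURLY = 1.00

for utilization in (0.2, 0.5, 1.0):
    cloud = cloud_monthly_cost(CLOUD_HOURLY, utilization)
    cheaper = "cloud" if cloud < ONPREM_MONTHLY else "on-prem"
    print(f"utilization {utilization:.0%}: cloud ${cloud:.0f} vs on-prem ${ONPREM_MONTHLY:.0f} -> {cheaper}")

print("break-even utilization:", break_even_utilization(ONPREM_MONTHLY, CLOUD_HOURLY))
```

Under these assumed prices the crossover sits at 50% utilization: a bursty load that runs a fifth of the time is cheaper rented, and a flat 24/7 load is cheaper owned.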


>For a company like FedEx I don't see the advantage of owning their own data centers. They just don't have the scaling challenges that would require that.

Depends how seriously they're locked onto the postman/route inspection problem. Which is a factor for delivery companies and, depending on how hard you chase it, a driver for almost infinite compute. Once you're playing that game, the economics look very different to most 'not an IT company' IT needs.


Difference being that FedEx pays 0 $/km for the roads, but the $/GB and $/GHz will be at the mercy of their cloud provider.


> pays 0 $/km for the roads,

That is really not true at least in the USA. There are all kinds of taxes and fees for using the roads with large trucks. Have you ever seen the weigh stations on the sides of the highway in the US? Those are there to fine the trucks for what they are carrying if it is beyond what is allowed among other things. You said km though so you probably are not in the US.


Every state charges their own fee per mile to commercial trucks. Filing taxes as a long distance trucking company that operates in multiple states is pretty annoying.

There's also tax built in to every gallon of fuel.

There are also flat annual registration fees you have to pay as well.


Don’t forget the tolls for various highways, bridges, and tunnels as they’re almost always priced higher for vehicles with more than two axles.


Side point here, but that's a very justifiable cost. The damage to a road is proportional to the fourth power of the axle weight [1]. That's something that amuses me when drivers complain that bike riders don't pay a "road tax". If they did, it would be a laughably low amount compared to what a car driver would have to pay in proportion.

[1] https://www.insidescience.org/news/how-much-damage-do-heavy-...
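
To put rough numbers on the fourth-power rule, here is a small Python sketch. The axle loads are illustrative assumptions (a 100 kg bike plus rider, a 1500 kg car, and a loaded truck axle at 8000 kg), not measured figures, and the rule itself is only a rule of thumb:

```python
def relative_road_damage(axle_load_kg: float, reference_axle_load_kg: float) -> float:
    """Damage per axle pass relative to a reference axle, per the fourth-power rule."""
    return (axle_load_kg / reference_axle_load_kg) ** 4


# Assumed per-axle loads, for illustration only.
BIKE_AXLE_KG = 50.0     # 100 kg bike + rider over two wheels
CAR_AXLE_KG = 750.0     # 1500 kg car over two axles
TRUCK_AXLE_KG = 8000.0  # one heavily loaded truck axle

print("car vs bike:", relative_road_damage(CAR_AXLE_KG, BIKE_AXLE_KG))
print("truck vs car:", relative_road_damage(TRUCK_AXLE_KG, CAR_AXLE_KG))
```

With these assumed loads a car axle does tens of thousands of times the damage of a bike axle, and a truck axle does roughly ten thousand times the damage of a car axle.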


It's not true in the EU either, plenty of countries have truck and trailer specific tolls. The average truck on the German autobahn, for example, pays €0.15 per kilometre.


Which roughly covers the actual damage large trucks do per km traveled and in no way covers construction costs.


More or less all EU members should charge freight road transport fees on highways and sometimes other roads too.


Fedex does not create their own electricity or fuel either, yet without those they have no going concern. Yet they choose to acquire those services on the open market because it is cheaper, even though they do make themselves more dependent on the supplier. In the end, full self-sufficiency is just not viable for large companies in a highly specialized economy.

In this case, if a cloud provider gets too monopolistic then the BATNA for fedex is to re-hire (or train) enough sysadmins to run their own systems again. That puts an upper limit on how much a cloud provider can charge for their services; in addition, the competition between Azure/AWS/GCP will also keep prices somewhat down. The cloud providers know this and will price accordingly.


I'm not arguing for autarky, I'm arguing for maintaining a strong negotiating position to keep from being taken advantage of. Fuel is fungible, and electricity is provided by regulated monopolies who are forbidden to price-discriminate, so the same issues don't arise as when you're outsourcing your core business processes to Microsoft.

I don't think FedEx re-hiring a competent sysadmin team and rebuilding data centers once they've lost that knowledge is merely a question of spending enough money on the problem; it's the kind of high-risk IT project that often sinks companies.


> Because what's their BATNA? Migrating from Azure to AWS when Microsoft doesn't want to let them?

Uh yes, exactly?

Clouds give small customers the first hit free as a loss leader. With very large customers, clouds have to be price competitive because if you're at a large scale, it is totally worth spending millions of dollars for a 5 year plan to change clouds. The people who get screwed are the medium to large customers for whom the cost of changing clouds is too high to recoup in a reasonable timeframe. The moral of the story is be very small or very large, but avoid being medium sized.


>With very large customers, clouds have to be price competitive because if you're at a large scale, it is totally worth spending millions of dollars for a 5 year plan to change clouds.

You've literally just explained why a large company would be perfectly willing to be overcharged by a cloud provider in order of millions of dollars. They would have to pay that for migration anyway and there is always a risk of creating disruptions in the process.


My model is that changing clouds has a taxi-like fee structure: there's a base cost and a percentage cost. Larger firms can just swallow the base cost. Your model seems to be that the percentage cost goes up as the firm gets larger, which I don't think is correct.
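
A minimal Python sketch of that taxi-fare model, to show why it squeezes mid-sized customers. All figures are hypothetical: a $2M base cost, a migration cost of 10% of annual cloud spend, and a competing cloud that is 10% cheaper.

```python
def migration_cost(annual_spend: float, base_cost: float, pct_cost: float) -> float:
    """Taxi-style fee: a flat base cost plus a percentage of annual spend."""
    return base_cost + pct_cost * annual_spend


def payback_years(annual_spend: float, base_cost: float, pct_cost: float, discount: float) -> float:
    """Years of savings on the cheaper cloud needed to recoup the migration."""
    yearly_savings = discount * annual_spend
    return migration_cost(annual_spend, base_cost, pct_cost) / yearly_savings


# Hypothetical parameters, for illustration only.
BASE_COST = 2_000_000
PCT_COST = 0.10
DISCOUNT = 0.10

for spend in (1_000_000, 10_000_000, 100_000_000):
    years = payback_years(spend, BASE_COST, PCT_COST, DISCOUNT)
    print(f"annual spend ${spend:,}: payback in about {years:.1f} years")
```

Under these assumptions the base cost dominates for the smaller spender, so the payback period stretches to decades, while the largest spender recoups the move in a little over a year. That is the sense in which a large firm can "just swallow" the base cost.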


And if all you use is object storage, k8s, and some kind of SQL data lake, migrating between clouds is... actually not that terrible?

It's a huge project but there's not a lot of actual redesigning.


Every one of these large enterprises, including the cloud providers themselves, like stability. So they sign multi-year multi-million dollar contracts that expect a minimum spending and offers a negotiated discounted price.

It’s also not just the cloud, enterprises know they get the best pricing when they commit to a vendor and the vendor also bends over backwards to accommodate these large companies. As long as there are 2-3 cloud providers big enterprises will be just fine.


As a startup, the greatest benefit of cloud compute isn't the fact that I can scale up at the push of a button, but that I can scale DOWN at the push of a button. I want to test things on beefy systems, but I don't want to have to pay for those systems any longer than I need to use them. To me, this is the single best thing about cloud compute.

Once your startup has become an established company and is so busy that you are no longer able to scale down regularly, it seems that's the time to start having your own compute vs continuing to line the pockets of the cloud providers.


Yeah, that's my thinking. Also as a small startup they don't benefit from confiscating your profits through price discrimination; if anything, they benefit from "investing" in you as a customer, to help you get to profitability, while remaining as addicted as possible to their services.


Dangling modifier here: I meant, "Also, when you are a small startup, they don't benefit from confiscating your profits through price discrimination."


But that's only really a change when you come from commodity hardware. For those still on mainframes, what's the difference? Technically the hardware might be charged per replacement cycle instead of monthly, but vendors are just as empowered to demand whatever they want. It's certainly hard to avoid lock-in running stuff on a cloud provider, but I suspect that it is even harder when depending on ye olde mainframe.


I agree, isn't it way easier to switch vendors when running on the cloud? I'd say that encourages competition, since we can literally move the infrastructure using keystrokes instead of muscles.

This makes me think computer infra could potentially become a commodity.


As always, it depends on how you use it.

Most cloud providers do let you run your own software on a managed substrate of infrastructure. But they also let you, nay, encourage you to build on top of their abstractions instead.


I guess you can waste a lot of effort and money on meticulously substituting "batteries included" lock-in features of your cloud provider when the easy path would in fact have been much cheaper (substitute when needed, not prematurely), just as much as you can fall victim to lock-in traps that would have been super cheap and easy to avoid. I imagine that telling one from the other is a form of art in its own right.


I agree, mainframes are almost as bad.


I do get your point, but if they have a fair amount of IBM mainframes, they are already renting their infrastructure. The mainframes have a culture of making you pay for operating system and other pieces on a basis of how much compute power you have "varied on". You don't really "own" a z-series box.


> Ultimately companies that abdicate their informatics operations like this will give their profits to their data-center operators, who will be empowered to charge them whatever price they want.

This "ultimate" endgame has been predicted since at least 2006 and we've yet to see anything but price decreases on cloud services. Tons of labor has been invested into deliberate cloud agnosticism with no apparent results. I am fully on board with economic arguments that favor data centers over cloud for some organizations. I don't think the capture-and-increase argument holds evidential water or ever will in a competitive environment, and the environment is more competitive than ever.


To add to your point I think the competitiveness will increase over time with further technology innovation. Cloud providers are constantly adding features and trying to catch up with each other. New entrants are adding novel competitive dynamics as well (e.g. Cloudflare or Deno Deploy).

The cloud market is so large ($ TAM; also growing rapidly) I think there will always be a tailwind of investment and innovation that prevents monopolistic stagnation.


it was exactly this kind of thinking that led to Japanese automakers crushing American ones

vendor relationships do not have to be hostile. AWS has been around for a while and has never raised prices, some services have gotten 99% cheaper.

yes there's some hypothetical day where they say "tricked you, all that stuff about margin being opportunity was a lie" and they try to extract value short term while burning their reputation

but every day that goes by where this doesn't happen is $$$ wasted on trying to avoid it


Hum... US businesses have all already converted from the GP's mindset, and the US automakers still need to be rescued from time to time.

I'm betting it is not this kind of thinking exactly that causes the problem.

(Oh, and do take a look at the vertical consolidation of the Japanese industry.)


> that information infrastructure runs your company

This is silly. There is a lot more to FedEx than hosting physical compute infrastructure. FedEx doesn’t own all its own airports. A CSP is just another vendor.


At almost every airport that FedEx has an operation at, there is another airport within 100 miles that could be used instead, managed by a different company. The barrier to changing is not controlled by the airports.

That's a competitive environment.

How many major cloud providers exist, and what could they do to make it difficult for FedEx to leave for another one?


>How many major cloud providers exist

Plenty! AWS is HUGE but hardly a monopoly.

>what could they do to make it difficult for FedEx to leave for another one

If the guys running this at FedEx are smart, not much. There are ways to deploy all of this in a platform-agnostic way.


AWS, Azure, and GCP are all operating at large scale and compete with each other directly. There are other providers too. It’s definitely a competitive environment.


The "real world" counterpoint to this somewhat abstract notion is that even if FedEx wanted to build on-prem, there is not enough cloud ai talent in the universe to meet future demand. It's not just their pref for academic sanctuary, it's that there simply aren't enough cloud architects & AI PhD's with experience at AWS scale. It's hard. Choice is an illusion. GCP AutoML, Tableau, Bubble. Way simpler to train fresh grads via cloud native tooling, than from the cli (the way we did things) ;)

I think the digital transformation to watch is the US Government's IRS Tax Cloud. Never been into taxation law & policy wonkery myself, but Tax Cloud is purported to Save Trillions in TCO!


https://twitter.com/cynicalsecurity/status/15407428422381281...

> Clown Computing is an elegant, distributed, breach of trust.

> Take customers, as many as you like, convince them to hand you all their data, the management of said data, the authentication and, while you are at it, the DNS. Oh, and they pay for it too!

> If this was done in real life it would be called a robbery, possibly at gunpoint, definitely in broad daylight. It used to be the case that Oracle licensing was deemed the pinnacle of IT robbery ('90s) but it looks positively quaint now.


The transfer of various liabilities is omitted from the above tweets, which is clearly worth something to some people.


What liabilities? Cloud operators are breached on a monthly basis and there are no repercussions.


What makes you think F500 companies don't get breached all the time?


Exactly. I know of a large bank that went to the cloud after one of its infrastructure employees walked away with data on a USB stick.


Could be as simple as the decision maker being able to point finger at the vendor. Could be PR related, where media does not report your company’s equipment was compromised. Could be reduced exposure to getting locked out of your own system and having to pay a ransom.


Absolutely, companies outsourcing their data infrastructure to the big 3 will end up having "supply chain" and Intellectual Property issues with their data in the same way companies that outsourced too much of their supply chain to SE Asia / China did.

But by the time this becomes a problem the executive team will have cashed out nice bonuses from the short-term gains from cost savings...


That’s an incredibly silly take. Our whole economy is based on specialization. FedEx will never be as good at running datacenters as Amazon or Google. There’s room in the system for the cloud operator to earn a profit while still offering the product at a lower cost than anyone else can achieve in-house.

Apple fairly notoriously hosts iCloud in other company’s clouds, and they’re a computer hardware and software company! If you’re less sophisticated than Apple, it’s a no-brainer to go with clouds.


The part about Apple is not accurate. Apple operates several dozen datacenters, has spent north of $30 billion building them and expanding them in recent years, operates their own Kubernetes (was Mesos) cloud platform that looks a lot like Heroku internally, and leverages public cloud for a couple parts of iCloud as a redundancy play, and no more. Maps helped prompt the expansion from single-homed iTunes legacy datacenter strategy, but even that iTunes legacy has been online since the early 2000s. Note that my context ends almost a decade ago, so, there’s that. The fleet dedicated to Maps alone when I worked there was large enough to be its own FAANG/MAMAA.

They’re in the datacenter game long term. In no universe does Apple “host iCloud in other company’s clouds”. Apple is notoriously a control freak and would never place their strategic services at the whim of cloud AZs. That idea was proposed and quickly eliminated. (Believe it or not, it is possible to spend that much on Azure/GCP and still only use it for basically blobs. How do you think they moved so easily in 2018ish?)


While my take may be incredibly silly, jwb, it is a different take than the one you're arguing against, and in some sense diametrically opposed.

My argument is not that specialization in informatics operations is unprofitable; rather, it's that specialization in informatics operations is indispensable for maintaining the negotiating position required to earn a profit. I'm not arguing that there isn't room in the system for the cloud operator to earn a profit; I'm arguing that after committing themselves completely and irrevocably to cloud operators, FedEx will no longer be allowed to earn a profit, or rather, just enough profit to keep their stock price from collapsing.

Now, it may be that the advantage of specializing in informatics is so powerful that this situation is unavoidable, and the only viable option for companies like FedEx is to become, in a sense, franchises of Amazon, Microsoft, Google, or Baidu. You remember when this happened to companies like Intergraph in the 01990s: they were reduced to undifferentiated Microsoft resellers. And it's not clear they really had another option. I'm not confident that the situation is currently so extreme, but maybe it is.

As other commenters have noted, you were pretty comprehensively wrong about iCloud, but that's a minor supporting point.


I think it depends. There is certainly a long-term cost risk, but future costs are unpredictable and it is not likely a core competency of FedEx anyway. FedEx likely will never invest enough to be great at managing their own data centers so why not remove the complication.


> companies that abdicate their informatics operations like this will give their profits to their data-center operators

Is the expectation data centre operators will see margin expansion? (Genuine question.) I had thought data centres were increasingly becoming commoditised.


Yes, a cloud vendor is a company that has literally all of your company's data and a plethora of cloud services that are subtly incompatible with their competitor's offerings, and their machines make the millisecond-by-millisecond decisions about what your company does. They are in a position to calculate precisely how much you can afford to pay them without going bankrupt, and charge you $1 less. Oracle has aspired to this business model for decades, but as you can see from the fact that many of the other Fortune 500 companies still have profits, hasn't been entirely successful.


The exact same thing can be said about the IBM mainframes they are replacing. Which is a much older business model.


Yes, running your business on IBM mainframes is a very expensive mistake, for the same reasons; but even IBM mainframes don't have the degree of price-discrimination power that a cloud provider does.


> cloud vendor is a company that has literally all of your company's data and a plethora of cloud services that are subtly incompatible with their competitor's offerings

Balancing vendor lock-in against efficiency is the remit of a half-competent CIO or equivalent. Going cloud doesn’t require using every niche AWS feature.


In a competitive market, margins are dictated by your and your competitors' marginal cost of production.

Data center/SaaS is somewhat naturally monopolistic due to cost of switching though. Legislation should target "high cost of switching" areas to make them more competitive. Translates to lower prices, better for society in the end


On mainframes, they were already giving up profits and flexibility to IBM and that small ecosystem of related vendors.


You could make the same argument for companies that don't own their own buildings, their own land, or have vendors of any shape or form.

That is, all companies ever.

The opposite of what you're proposing is "do your core business".

Why would FedEx be better at running computers than Azure is?

Why does FedEx use vans built by a car company? They should build their own vans. The tires should clearly be made from rubber that FedEx vulcanized themselves.

I've seen huge insurance companies outsource not just their hardware, but their ops too.

McDonald's may be in the real estate business (and not the burger business), but FedEx is not.


My company built a data center that was supposed to lead us into the future, and it started running out of capacity within a few years.

The cloud is the only reasonable way to keep up with exploding compute demands.


Exactly, they literally design their own hardware. To me it's like trying to roll your own word processor. You can but unless it's core to your business, it's cheaper to buy.


I feel like I could probably roll my own word processor this weekend.


This seems to imply that if processing loads for a given line of business start to level off, it may lead some to move away from the cloud.


I helped with the closure of Intuit’s data centers and it was absolutely the right move for them. The amount of resources needed to maintain data centers world wide is significant.


Theoretically cloud providers should be able to provide both the infrastructure and managed services much cheaper (due to economies of scale) than a single company running their own.

The question is how much of the savings end up in whose pockets, or whether the price at which they actually sell exceeds the TCO of running it yourself.

Outsourcing at least the commodity part (dumb hardware) seems reasonable, and migrating to a competitor doesn't seem like too bad of a BATNA if you aren't locked in.


They don't run their own electric companies, and often don't even own their own buildings, both of those are essential. I suspect they lease their trucks and planes too.

If they're building it on AWS-only kit and locking in, then they open themselves to risk. If they build it on standard VMs running on AWS, Azure, Digital Ocean, then they can shop around at contract renewal time, just like they can with Mercedes vs Ford vs VW trucks.



Eh, as long as one designs for portability, migrating isn't that hard, which enables them to leverage their own size for volume discounts.

As long as they stick to services that overlap between providers (postgres, mongodb, nfs/cifs, etc) or transformers (s3, azure blob storage) and package their code in a somewhat agnostic format (docker images, wars, what have you) they'll be fine.
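
A minimal sketch of what "designing for portability" can look like in code, assuming a hypothetical `BlobStore` interface (not any provider's real API): application code depends only on the interface, and each provider would get its own thin adapter. The in-memory backend here is just a stand-in so the example runs.

```python
from typing import Protocol


class BlobStore(Protocol):
    """Hypothetical minimal interface that application code depends on."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...


class InMemoryBlobStore:
    """Stand-in backend; a real adapter would wrap a provider SDK instead."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


def archive_scan_event(store: BlobStore, package_id: str, payload: bytes) -> str:
    """Application code: knows only the interface, not the provider."""
    key = f"scans/{package_id}"
    store.put(key, payload)
    return key
```

Swapping providers then means writing one new adapter rather than touching every call site. It does nothing about data egress costs, which are a separate problem.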


Yes but the round trip to/from the cloud can be a forcing function in a large bureaucratic company to dump a lot of ossified infrastructure and IT organizational baggage. When you re-localize you'll be building a new IT organization to do it and using new hardware.

Big organizations often do things that seem wasteful as a way of dumping organizational cruft.


Even in CS we call it "loose coupling". There is no no-coupling.

I agree with you, that the hype train "cloud = total freedom" (whatever that means) is rubbish. More and more it seems, that a cloud model is just yet another option between Mainframe, AWS, Google Cloud etc.


I agree that long-term, outsourcing of compute capability is probably going to be a negative for everyone. That said, when your organization's point of difference isn't rooted in computing, and you're not a producer of compute platforms yourself, why wouldn't you source that from a third party? The arguments don't look that different to buying pizza boxes from HP, Sun etc. If they could have gouged your profits away I'm sure they would, but as long as the space is commodified competition should be assured, and that sets pricing down at competitive levels, not monopolistic levels.

Far as I can see the only arguments here are that either cloud compute is naturally monopolistic, which seems a stretch, or there's another factor that will lead to ruin here (e.g. long-term connectivity loss between you and your compute in a particularly nasty disaster scenario). Like, I don't care* if my baker's goals are orthogonal to my own. They just provide crullers for the morning stand up and I can get them from other places despite the insurrection that would occur if staff ever went without. I can likewise not care if my cloud provider's goals are not aligned, because if they screw me I can just shop at another bakery, so to speak.

All of this said, outsourcing to cloud isn't free. You have to have an exit, or at least transition, plan. You have to architect into the configuration, not just transpose what you have. You have to maintain platform-agnostic restorable backups of the information that's critical to your organization, and have an operable proven plan to bring it online. Failure to do these things, which I think is the case for a scary number of organizations that have 'gone cloud', could indeed lead to ruin.

I know this comment is weird. I'm in an odd space here of agreeing with your conclusion but mostly disagreeing with the premises on which you reached it.

*until we see a cloud provider that also owns a major fab, i.e. if Intel or AMD went into direct cloud provision and took it seriously for about a decade, I'm pretty sure it's game over and the world itself would implode from the sheer monopoly that would ensue. Until then, however...


> Because what's their BATNA? Migrating from Azure to AWS when Microsoft doesn't want to let them?

As long as they have backups, yep, that's a quite possible alternative. And backups cost very little, and don't need to run flawlessly 24x7.

They will probably pay through their nose either way (backups or not, migrating or not), because that's how the cloud works. But well, that's for their bean-counters to count. As long as they have backups, it's not strategy defining or an existential risk.

(Anyway, nobody goes around talking about BATNA of the proprietary, rented mainframes. I wonder why.)


Probably for the same reason nobody goes around talking about the BATNA of having employees. This doesn’t make any sense.


Not only do people talk about the BATNA of hiring people all the time, but I fail to see how this is relevant to a machine rental contract.


> empowered to charge them whatever price they want

That's correct assuming monopolistic behavior. I believe pricing in information infra will be a race to the bottom. We've not seen a lack of competition in this space.


I am pretty sure part of it is also some kind of greenwashing move to abstract away your carbon footprint. Like exporting pollution to the Far East by producing there. Sure, there are tools to calculate your energy/carbon footprint from cloud workloads, but if you don't operate a datacenter yourself anymore you can't be held as accountable as the cloud vendor is.

But there will surely be a backlash and an - at least partial - return to operating your own datacenter for some companies in the future.


I used to agree with your opinion. Then I worked as an engineering manager at Capital One (deep learning, not infrastructure) and I saw all of the good reasons they had to move to AWS.


That sounds interesting! Can you elaborate?


Fedex is not in the business of making their own cars, they're not going to be in the business of fairly complex cloud ops either.

This is a permanent secular shift.

The advantages of the cloud are just starting to be realized.

That said, for some kinds of operations, they may want a lower cost cloud provider with a more basic stack.

Secondary operators of the world need to get together and start offering a standard stack for basic things so that people can wean off of Azure and AWS.


I think it's fine to be honest.

Over time competition between the 3 major clouds combined with hardware getting cheaper will mean there is not going to be a massive jump in prices, if any at all.

It is not impossible to design relatively cloud-portable server software.

Also I think FedEx, and companies in general, would prefer to increase revenue rather than reduce costs (by running their own data center) in order to increase their net profit.


Yes, just like leasing office buildings instead of buying them causes companies to give all of their profits to real estate companies. Or using dell windows laptops. /s

If AWS et al. get too greedy, competitors will pop up saying 'your margin is my opportunity' and market competition happens for something that is a commodity, and becoming more so as time goes on.


I think this is one side of an argument. It's valid, and it needs to be heeded, but still incomplete.

Cloud setups are useful, as are many of their associated architecture/infrastructure elements. Saying no to these is problematic too.

It is true that cloud computing has become a monopolistic, categorically non-neutral sector. They'll have you over a barrel.

So, dilemmas. Difficult strategic choices.


Disagree, companies cannot focus on too many things at once and get them right. Better for them to focus on logistics, fleet, cargo, etc than investing in data centers. Staying in your area of competence usually yields the best results. Btw, as I was using FedEx for an international shipment, their IT services are abysmal and could use a shake-up.


You're forgetting the most important part of this entire thing, the massive bonuses the current executives are going to get for saving the company over $400m. Whatever happens to the company afterwards is not their problem, they've already moved on to the next company to spread the legend of them saving FedEx over $400m.


This line of reasoning has been around as long as the cloud has been a thing, and yet companies are still doing it and seeing better results than running their own data centers. At what point do we put this argument to bed? You want to talk about fear of switching costs? Fedex was still on mainframes, for crying out loud.


No, they could use several proprietary solutions that aren't "pay by the hour", that come with their own consultants as well as the physical infrastructure. Also, FedEx is not a customer that a big profit sucking outlet wants to lose by gouging rent. They have more leverage than you're giving them credit for.


Why be an information infrastructure company on top of a delivery company? Why would they want to do both if they didn't need to? And pretty much all companies are going multi-vendor to avoid lock-in and provide leverage for negotiations. Their information is the part that runs the company, not the stack.


Why be a management company on top of a delivery company? Or a finance company on top of a delivery company? Or an employer on top of a delivery company? Because if you outsource any of those functions 100%, your vendor owns the actual company.


If your company is already public (like FedEx) then you don't own your company anyways. So who cares who owns it. All that matters is that it profits the shareholders.


Yeah, the shareholders really don't appreciate it when the management of the company sells it out from under them to a vendor, cutting off the shareholders' future stream of profits. The cloud is a new enough grift that they maybe aren't aware that that's what's happening.


Yeah but cloud infra is a competitive market with multiple players, it's not like they're going to just buy a monthly AWS contract. They will probably have a cloud migration strategy, multiple year fixed contract, and multi-cloud support.


Most public companies already outsource their executive management to Wall St analysts.

I think the reality in this case is that the cloud providers are little different than the incumbent (ie IBM or some other dead platform). Mainframe is a sole source solution, and IBM is always trying to extract their own pound of flesh as they manage that business down to zero.

At scale, you’re better off managing a handful of suppliers (probably Microsoft, Oracle, Azure, IBM) than dealing with one, and figuring out how to sustain the business as the workforce literally dies off. Nobody with half a brain is getting mainframe skills.

I’m not a fan of FedEx, but they seem to be a company well aware of its limitations and able to take action to correct. For example, they went to “outsourcing” the last mile delivery to independent contractors first when they figured out (almost too late) that e-commerce needed ground shipping. They sort of did Uber, except they didn’t have the ability to just break the law.


> Renting your information infrastructure is a great way to reduce startup costs

They can rent hardware only, and still run their own infra (Kubernetes and all the stuff on top of it), then they can negotiate reasonable prices and migrate away if they want to.


It is 100%.

Cloud services are about to evolve in ways that non tech-specialized companies can't replicate on their own. Cybersecurity requirements will grow and bandwidths will be pushed to the limits. Just leave it to the professionals.


Was FedEx really running its own informatics operation or did they mostly have layers upon layers of IBM consultants? Yes, they "owned" the hardware in that case, but it would likely have been no easier to switch.


Similar to the decision to own a building or rent. Moving is a real pain and can kill companies. Better get a long lease with lots of extension options because at the end of the lease your rent is going to go way up.


Yes, it sounds like they are about to head down the same road as many large firms who sold all their real estate because short-term investors said it was a good idea.


The network traffic price is absolutely vital. This is essentially the ransom that the cloud provider holds on you for leaving.


> Trying to outsource it is like trying to outsource upper management.

You say that like it's bad.


And what do you think ibm could charge them for mainframe repairs?


Outsourcing hosting doesn't seem that different from outsourcing electricity. If it's not a core competency, or competitive advantage, let somebody else deal with it.


It's like oracle


Your reasoning is faulty and very probably self-serving. More and more companies can and will outsource their infrastructure. IT guys hate this because it threatens their "paychecks."

The optimal amount of cloud services for an established company like FedEx is 100%, not merely with a "disaster recovery plan" but with live, 99.99% redundancy by which I mean two almost exact systems running nearly simultaneously (within a second or two of one another) on two completely different networks.

FedEx enjoys almost all of its competitive advantage from its physical network. IT is not core to its business.

Here's the problem... almost no company actually does disaster recovery and "parallel redundancy" properly because most C-level executives only pay lip service to it.

Therefore, the whole notion that disaster recovery and "parallel redundancy" don't work is predicated on the false notion that companies actually have proper disaster recovery and "parallel redundancy" in the first place.


It is surely true that I have a cognitive bias in the direction you say, and that many companies operate at lower levels of informatics infrastructure reliability than Azure.


Making accurate calculations of data center cost is of course complicated and rarely manages to take everything into account, but the common knowledge I’ve heard is that there comes a point when a company is so large that going on-prem is what actually saves them a lot of money.

If FedEx is not large enough to be one of those companies, how large do you actually have to be? Or has cloud pricing changed to the point where this is no longer true, and even huge corporations can save money in moving away from their own premises?


It's not only about size, there's also workloads to take into consideration.

Take FedEx, their operations are highly dynamic - per day but also per period. The number of packages sent, and being transported, vary greatly between a random Tuesday 15:00 and the weeks before holidays.

In such a scenario, with your own infrastructure, you need to overprovision, by a lot, to be able to handle the heaviest load possible, and then some margin on top. And that capacity is wasted what, 90% of the year?

Meanwhile if you subcontract that part on AWS, it's their problem, and they can afford to handle it, and you only pay for what you actually use when you use it.


Someone, somewhere, has to actually buy the computers. "Only pay for what you use" isn't magic, whatever fraction actually gets used (across all customers) has to end up paying for the whole computer.

Taxis are "only pay for what you use", but owning your own car is often cheaper (even if you only use it, what, 1/12 of the day?) and more convenient.


This is all true, except that the markup for a rental can be so high that owning something, even for fairly low utilization, can be less expensive.
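
The rent-vs-own trade-off in this subthread boils down to a break-even utilization. A back-of-the-envelope sketch, where every number is made up and the two cost inputs are assumed to already include everything (power, staff, depreciation on one side; the provider's markup on the other):

```python
def break_even_utilization(owned_cost_per_hour, rental_price_per_hour):
    """Fraction of hours a machine must be busy for owning to cost the same as renting.

    owned_cost_per_hour: all-in cost of an owned machine per wall-clock hour,
        paid whether it is busy or idle.
    rental_price_per_hour: price of an equivalent rented machine, paid only
        for the hours it is actually used.
    """
    if rental_price_per_hour <= 0:
        raise ValueError("rental price must be positive")
    return owned_cost_per_hour / rental_price_per_hour


def cheaper_option(utilization, owned_cost_per_hour, rental_price_per_hour):
    """Compare cost per wall-clock hour at a given utilization (0.0 to 1.0)."""
    rented_cost = utilization * rental_price_per_hour
    return "own" if owned_cost_per_hour < rented_cost else "rent"


# Hypothetical prices: owning costs 1.0/hour, renting costs 4.0/hour (a 4x markup)
print(break_even_utilization(1.0, 4.0))  # → 0.25
print(cheaper_option(0.10, 1.0, 4.0))    # → rent (capacity idle 90% of the time)
print(cheaper_option(0.50, 1.0, 4.0))    # → own
```

So both sides can be right: with a 4x markup, owning wins above 25% utilization, and renting wins for capacity that sits idle 90% of the year.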


>>Someone, somewhere, has to actually buy the computers.

Exactly, so as long as it's not you, you should be fine.

Let AWS figure out how they wish to do it for you.


Until AWS runs out of computers.

It's rare, but there are enough strange instance types that it's possible.


> Taxis are "only pay for what you use"

Taxis are not scalable, you can't make one human drive 100 taxis. Meanwhile you can add entire new data centers with barely any extra burden on the software teams that manage them.


I feel that Zipcar would be a fairer comparison to owning your own car. You could compare a taxi to your own chauffeured car.

We own our own car out of convenience not cost savings.


The cost does not scale linearly with the number of packages. A table with 1 entry is not cheaper than a table with 1 million entries.

The same routing algorithms have to run whether they have one package or 1 million packages.

Plus you need to keep the history of the operations for months if not years (meaning that you will store both holiday and random Tuesday data anyway).

So where is this flexible cost exactly?


Not just routing. When a plane or truck arrives and is unloaded, thousands of packages have to be scanned, which triggers a cascade of messages to different systems. In addition to the routing and whatnot, customs declarations might have to be sent, track and trace gets updates (both from arrival scan and customs), there's billing etc. All this flurry of activity happens within an hour or so after the plane lands, and then it quiets down until the next plane or truck.

I could easily see FedEx can save money by dynamically scaling capacity to track the arrival of their planes or trucks.


May I point you to the DynamoDB docs? Costs scale along the dimensions of storage and activity. A 1 entry table is essentially free if you use on-demand RCU/WCU provisioning.


Ever heard of the traveling salesman problem? That is worse than linear; it is factorial. Granted, that is probably overkill for most people, and my team (not FedEx) just does NxN so it is only quadratic, but it is definitely not constant nor even linear.


Exactly because TSP variants are exponentially combinatorial, we solve them with a strict time limit and use the best solution found.

So for practical applications they are constant time.
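
To make the "strict time limit, best found solution" idea concrete, here is a rough sketch of an anytime heuristic. It uses plain 2-opt local search on a Euclidean tour, which is an assumption for illustration only; production routing solvers (integer programming, branch and bound) work quite differently. The function names are made up.

```python
import math
import time


def tour_length(points, tour):
    """Total length of the closed tour visiting points in the given order."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def solve_with_deadline(points, time_limit_s):
    """Return the best tour found before the deadline (not provably optimal)."""
    deadline = time.monotonic() + time_limit_s
    best = list(range(len(points)))  # trivial starting tour: visit in input order
    best_len = tour_length(points, best)
    improved = True
    while improved and time.monotonic() < deadline:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                if time.monotonic() >= deadline:
                    return best, best_len  # out of time: keep what we have
                # 2-opt move: reverse the segment between positions i and j
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                cand_len = tour_length(points, candidate)
                if cand_len < best_len - 1e-12:
                    best, best_len = candidate, cand_len
                    improved = True
    return best, best_len
```

Wall-clock time is capped by `time_limit_s` (plus one candidate evaluation) no matter how many stops there are, which is the sense in which it is "constant time". What varies with problem size is how good the answer is when the clock runs out.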


> So for practical applications they are constant time.

So they should be able to handle all of this on a raspberry pi, if the size of the input has no effect on processing time.


Yep a raspberry pi can give you A solution.

It is not difficult to find a solution to routing problems; the difficult part is to get a provably near-optimal solution (<1% optimality gap).


So it scales with optimality but not size?


It scales with the number of feasible routes for the particular problem instance you are looking at.

The issue is that there are exponentially many of them and you cannot easily rule out that there is a better solution than the one you have on hand.

That is why we solve these problems for as long as possible with as many resources as possible.


> The same routing algorithms have to run either they have one package or 1 million packages.

Certainly you know that running a routing algorithm 1 million times because you need to route 1 million packages is going to need 1 million times more CPU time than one package, right?


No, it will take as much time and as many cpus as you have available, unless you manage to find the globally optimal solution earlier (practically never, for the problems that FedEx solves).

The number of variables is not a predictor of time and average complexity for integer programming.

We can solve some problems with millions of binaries in minutes. There are other problems with a couple of hundred binaries that we absolutely cannot solve to optimality.

Sorry, but this is not your typical sorting problem.


I'm wondering if we're talking about different things.

If it takes 1 second of CPU time to process the path for 1 package, then it will take 1 million seconds of CPU time to path find 1 million packages. You can add more CPUs to run multiple path finds in parallel (ie, use 4 CPUs to cut the wall time to 250,000 seconds), but there's still 1 million core-seconds being spent.
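
The arithmetic above, as a tiny sketch. It assumes each package is routed independently and that the work parallelizes perfectly, which is the premise of this comment rather than a claim about FedEx's actual workload; the function name is made up.

```python
def core_seconds_and_wall_time(packages, cpu_seconds_per_package, cpus):
    """Return (total core-seconds, wall-clock seconds) under perfect parallelism."""
    total_core_seconds = packages * cpu_seconds_per_package
    wall_seconds = total_core_seconds / cpus
    return total_core_seconds, wall_seconds


# 1 million packages at 1 CPU-second each, spread over 4 CPUs
print(core_seconds_and_wall_time(1_000_000, 1.0, 4))  # → (1000000.0, 250000.0)
```

Adding CPUs shrinks only the second number; the first is what you get billed for.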


Seasonal demand


The flip side is the approach that Amazon took with AWS. Maintain the server capacity but invent some sort of mechanism to sell any extra server capacity during the down time.

Perhaps there isn't the mindset/ability to actually execute this though.


That's a myth, and obviously false on the surface: What are they supposed to do when they need that capacity back for Black Friday? Kick out all their customers for a day?


Wasn't this confirmed to be a marketing myth?


Maybe this is true for public cloud pricing, but it hasn't been true for some time when it comes to enterprises on cloud. Large enterprises come in and ask for a quote for X cpus/ram/storage over X years. They sign a multi year contract with a guaranteed minimum spend, and a discount on every line item. Then they "lift and shift" their physical data center into a cloud region, in a very static way (no autoscaling etc).

This works great for everyone: the hyperscalers have much lower costs than even the biggest enterprise customers, so the company gets a good deal. And the cloud company makes some revenue off the minimum spend (making capacity planning a lot easier) and they know that once the compute and storage are there, the company will very likely start using extra managed services.


Running your own data centers is a form of vertical consolidation, which is a strategy. Specialization is the opposite strategy. A single company may strategically decide to specialize some things and vertically consolidate others.

The dollars-and-cents are details. Those matter, of course, but don't base your entire analysis there. Staying vertically integrated may help FedEx be 4.713% more profitable today, but might miss out on important growth areas and become a dinosaur. Or maybe the important growth areas involve data center innovations, and staying vertically integrated allows them to exploit those opportunities. So it's not always an easy decision, but it involves more than the present-day costs.


It's not just money, it is also how you spend a limited quantity of organisational focus. You cannot do it all, this is actually even more important when you are that big of a company.


Technology scales faster than fedex's data needs. Maybe somewhere in the future the reverse trend happens as their needs can be met with very small servers. Though I've always been wary of putting your data somewhere because getting your data to a competitor costs a lot of money.


I don't think it's the server costs they care about; it's stuff like db maintenance, security, security, and security. Security professionals are expensive, and it's very hard to get right in house on metal.


Moving away from on-prem data centers does not reduce security costs. At best, it's a lateral move, and depending on the cloud setup, may actually be more expensive.

Ultimately, this is an opex vs. capex decision. Data centers are expensive, and require a lot of up front capital to go into the ground. FedEx is worldwide and moving more technology to the edge, so buying land, designing and constructing buildings, and then operating them for 10 years or more in a large number of locales requires a huge investment. They can take that money now and solve their problems immediately.

They are also not saying what they mean by "closing its data centers". FedEx announced a 10-year partnership with Switch last year to build and operate edge facilities. FedEx may be moving out of data centers it operates on its own into facilities built and run by others, and using "cloud-native" designs deployed on hardware in those facilities.


> Technology scales faster than fedex's data needs.

I don't think this sentence has meaning


I mean it is like an executive statement, e.g. Real time analytics in Cloud IoT on Edge.


The transition to “cloud” is often more about the culture and capabilities of a company than the technology. If you have been on some shitty mainframe and a bunch of other technology from the 60s you cannot bank on that carrying you for the next 20-30 years. The people that know that stuff are ancient, expensive, and retiring. “Cloud” is a very good opportunity to jettison the legacy burden and reinvigorate the technical competence of the org.


Or maybe they are large enough, they just don't realize it right now.


It's weird to have companies say they're moving from Mainframes to the Cloud.

The Mainframe is a sort of cloud, each one has usually double the spare capacity and you call IBM to unlock it.

The cost of running a Mainframe is measured in terms of MIPS based on amount of compute used and there's compute reserved for offloading things like JITs (zAAPS). Essentially a version of cloud compute costs.

I can understand if you're locked into COBOL with EBCDIC and can't find talent, but Z runs freaking Ubuntu now! I know IBM works furiously on modern ecosystem support.

Not to shill for IBM, but what exactly is it that they're expecting to be so different from their cloud migrations? Or is everyone here caught up in the "Mainframe old" falsehood?


I doubt they have too much COBOL lock-in, since the article notes their datacenter opened in 2008.

Here's the thing: A mainframe may have spare capacity, but you still have to call IBM to unlock it. In order to make a Z run in a cost-effective manner, you need to run at 90%+ utilization at all times - which is excellent for batch jobs that can be scheduled, but is difficult to achieve with on-demand loads.

You're paying based on compute available, not just compute used (unless I'm misremembering our contracts back in the day). Sure, the amount of available capacity can be changed, but a phone call to big blue is not automated. Cloud autoscaling is.

You're right that IBM mainframes are far more modern and cost effective than we assume. Sadly, they are still behind the curve of cloud hosting.
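
To make the utilization argument concrete, here is a minimal sketch of the trade-off between paying for provisioned capacity and paying per unit used. Every price and capacity figure is a made-up placeholder, not an IBM or cloud list price, and the function names are hypothetical.

```python
def cost_per_used_unit_fixed(capacity_units: float,
                             price_per_capacity_unit: float,
                             utilization: float) -> float:
    """Fixed-capacity pricing: the whole provisioned capacity is billed,
    so the effective price of each unit actually used rises as utilization falls."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    total_cost = capacity_units * price_per_capacity_unit
    used_units = capacity_units * utilization
    return total_cost / used_units


def break_even_utilization(price_per_capacity_unit: float,
                           price_per_used_unit: float) -> float:
    """Utilization above which fixed capacity beats pay-per-use pricing."""
    return price_per_capacity_unit / price_per_used_unit


# Hypothetical prices: 1.0 per provisioned unit vs 1.5 per on-demand unit used.
FIXED_PRICE = 1.0
ON_DEMAND_PRICE = 1.5

for utilization in (0.3, 0.6, 0.9):
    effective = cost_per_used_unit_fixed(100, FIXED_PRICE, utilization)
    print(f"utilization {utilization:.0%}: fixed capacity costs {effective:.2f} "
          f"per used unit vs {ON_DEMAND_PRICE:.2f} on demand")

print(f"break-even utilization: {break_even_utilization(FIXED_PRICE, ON_DEMAND_PRICE):.0%}")
```

Under these assumed prices the fixed-capacity machine only wins once it stays busy about two thirds of the time, which is the shape of the argument above: schedulable batch work can hit that, spiky on-demand load often cannot.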


> a phone call to big blue is not automated. Cloud autoscaling is.

Respectfully, the hell you say. Cloud 'autoscaling' is something that has to be tenderly maintained by software engineers, who expect to be paid salaries and benefits and so on. It's not like FedEx can just rsync their data into the cloud and have all their software run forever. Instead, they need the engineering team that manages their existing data workflows, and now they also need cloud engineers to translate that into something that won't bankrupt the company, since they're moving from the mainframe world (where you keep your system loaded to an efficient price point) to the cloud world (where you are billed by the second).

Moving your compute spend from capex to opex is a perfectly valid move, but pretending the reason is that someone has to make a phone call once in a while is kind of bizarre, when the alternative is having to hire a whole new cadre of technical talent.


Why would one have to hire a whole new cadre of technical talent? Wouldn't it just be taking existing talent and/or ops teams and having them work with the autoscaling APIs as opposed to working with the data center teams? If the skillsets don't match then you can either train existing talent or, yes, hire new folks.

I agree the occasional phone call removed from your workflow isn't reason enough to justify a giant cloud migration... but it is one of many reasons.


I'll acknowledge that "have to" is maybe an exaggeration, but it's definitely going to be cheaper to hire new talent than reskill existing engineers (who may not even be interested in reskilling), and profit margins generally dictate the cheapest path forward in an ultracompetitive industry like logistics. And on top of all those issues, the existing workflows must be maintained while the cloud-adapted ones come online, and in a complex enough enterprise environment we're talking about a project that can take years to execute.


This sounds like something they will solve in cloud 2.0


I'm confused about the first data center in 2008. Surely they had data centers before that, or were they outsourced before? Automatic routing of deliveries has been the standard for many decades, I'm sure the code on their mainframes is much older than just 14 years.


> It's weird to have companies say they're moving from Mainframes to the Cloud.

Ever work for a company with their own datacenters that were NOT cutting edge? The work to provision resources can be so painful that it severely inhibits prototyping. Being able to spin up resources in seconds with some API calls is very valuable.


No one says you can't have both. An on-prem data center for reliable workloads and cloud to try out new stuff and products or to get short-term capacity. But in the end, 95%+ of servers in your data center will run 24/7 and for years on the same software. No need to outsource that.


> Or is everyone here caught up in the "Mainframe old" falsehood?

It's probably this. I've spent time trying to advocate for the benefits of these systems, but it's like shooting a super soaker into the sun.


I think it’s also that both the people making the tech decisions and the people implementing them simply don’t want to work on mainframes. I would 100% of the time prefer to work with an on-prem DC or colo using commodity hardware and OSS to manage it.

I get my pick of a billion different hardware vendors, they’re all interchangeable and interoperable, every problem I can possibly dream up has been solved a hundred times before. The skills are commonplace and transferable so hiring doesn’t mean convincing some poor desperate grad to get trained in the company script, hires will actually want the skills, and you can find senior people.


Can you elaborate a little? I'll admit I know close to nothing about mainframes.


Mainframes are still the best option for systems that cannot fail. Even a business like FedEx is not as critical as some types of business I've seen run through IBM.

A good example of where these systems come into their own is payment/ACH/wire processing. The consequences of these networks going down are so severe that it is worth it to construct an entire facility with the computer systems and business in mind from the very beginning.

Today, mainframes are more for the type of business that are finding the need to pour the literal foundations for their own datacenters. If you are even considering cloud as a viable path, then this kind of stuff is certainly not for you.


A lot of the cool tech that Linux and parts of cloud tech are crowing about are just new versions of stuff IBM was doing in the 1990’s. Sometimes I wish I knew some of those people, because I’m sure it would be fascinating to wind them up over a few beers and see what comes out of their mouths.


Mainframe = locked in to IBM, is probably the logic.


How is it any different than being locked to AWS/Azure/Oracle/Google?


Well, you're also hardware locked. The last time I worked on a mainframe (about a decade ago), 8 MB of RAM was $1000 and a network card was $800. For commodity hardware around the same time, 8 MB was probably $50, and a network card, even a good one, was at most $100.

You're paying a premium for both the hardware and the lack of developers knowing EBCDIC, JCL, COBOL, etc.

Actually, from what I remember of Oracle, that might be very similar. I remember having to pay license fees per core.


You mean GB of memory, right? Or you might have been working with it several decades ago.


It's not, but most C-level execs at these companies can present it as IT evolution and progress. Balance sheets look different because accounting has costs in different places. It all looks good.


The lock-in is less severe on AWS/Azure/Oracle/Google.


I'm assuming FedEx wants to be in the package delivery business, not the owning compute hardware, datacenter, cooling and power business.


The fundamental issue with mainframes is that IBM made ramping up talent on mainframes extremely painful. Sure they run Ubuntu but how does one actually get a mainframe environment to learn on?

Mainframes are still king when it comes to transaction processing but having such a closed off ecosystem has screwed them.

Even FPGAs are more accessible to learn. On AWS you can spin up, e.g., an F1 instance and get a dev environment.


> you call IBM

You don't need to call AWS. Negotiate a blanket deal and do whatever you need. Run whatever you want. Unsure? Check the billing page.


Are they really going to save 400 m$, or was it the figure "promised" by the shop that will handle the migration?


There's not a company in the world that spends more than $1m annually on cloud costs that has saved money by doing so. You don't go to the cloud to save money, you go to the cloud to reduce technological risk.

If you go to the cloud you don't need to fire anyone for choosing IBM, you're not getting strangled by any Oracle contracts, you're not gonna lose all your data because of security holes in your use of Microsoft products.

You're also not going to be dealing with downtime because you couldn't find staff talented enough to properly configure your Cisco networking equipment.

Your system administrators department is not gonna block any innovative employee initiatives because of the strain maintaining more projects puts on a deployment stack carrying 150 different projects but which was architected to solve a single business goal 15 years ago.

Imagine being a 67 yr old manager who just wants to be in the business of getting letters printed on tree pulp from Bumfuck, Idaho delivered to Louisville, Alabama reasonably effectively. Knowing nothing about computers or information technology, imagine the insane stack of perfect decisions that have to be made to get the IT infrastructure of a company like FedEx running.

Not saying going to the cloud is the best decision. Not even saying it's the decision I would make. But it does sound very enticing to just have all of those problems go away by throwing a couple hundred million dollars a year at Google, Microsoft or Amazon (btw lol if they go with Oracle or IBM instead).


>You're also not going to be dealing with downtime because you couldn't find staff talented enough to properly configure your Cisco networking equipment.

>Your system administrators department is not gonna block any innovative employee initiatives because of the strain maintaining more projects puts on a deployment stack carrying 150 different projects but which was architected to solve a single business goal 15 years ago.

As someone in the “cloud” team at a legacy enterprise I strongly disagree with both of these points.

Cloud networking is as complicated as anything the Cisco people ever did, but instead of CCNA you have certifications that barely scratch the surface of the complexity. So you get cloud people who barely understand the platform they’re administering flying by the seats of their pants. And instead of having a networking team to focus on networking, the same people trying to figure out static routing across regions in AWS are also the people responsible for migrating EC2 instances from GP2 to GP3, but they were deployed with CloudFormation, which will replace the instance if you change the disk type.

So, getting to the second point, this team will be totally overwhelmed and likely inexperienced, so good luck getting them to do anything to help your “innovative” project, because right now they’re too busy trying to figure out how EKS is using up all the IP addresses in us-west-2.

Executives who think they’ll save on money by moving to the cloud are delusional. They’re also delusional if they think it’ll increase stability or resilience. And that’s not even getting into the EMR clusters the “data science” team spun up and left running at $30k/day.


God so much this. I see entire teams of developers who are, theoretically, software engineers. Yet their entire day is just configuring AWS. Endless meetings with endless acronyms and endless complexity to solve problems that have been solved for 20 years, but instead of focusing on the metal and first principles, the entire architecture is lost in a sea of "cloud services" that require multiple certifications to even begin to understand. All of this in service of an application load that could easily be handled with a few big servers.


> There's not a company in the world that spends more than $1m annually on cloud costs that has saved money by doing so. You don't go to the cloud to save money, you go to the cloud to reduce technological risk.

I don't know why I have to explain this every time "data center vs cloud" discussions come up, but if you reduce risks, then you are in effect saving money.


There are other ways of reducing risks. You only save money if it's the most efficient way of reducing risks and your risks are purely convertible into cash.

I'm quite sure if you take managing your IT infrastructure as seriously as you take your core business, you can definitely save tons of money by handrolling your infrastructure.

There's also a big difference between doing so 10 years ago versus now, with all the enterprise grade open source solutions to infrastructure challenges.


> I'm quite sure if you take managing your IT infrastructure as seriously as you take your core business

Try telling a company like Caterpillar, which manufactures excavating tools, to "take IT as seriously as you do making tools"?

> with all the enterprise grade open source solutions to infrastructure challenges

You mean like OpenStack? Have you ever been in a large IT org (non tech company...like a distributing/manufacturing company) that has tried to implement it? and then maintain it? Oof...


It seems you're trying to nail down an absolute; I'm just saying that there are options sometimes. In my opinion AirBnB is setting money on fire by running on AWS. They've got huge talent pools of great engineers they could activate to in-house their infrastructure, and they'd save hundreds of millions of dollars. At the same time there are companies that have no business running their own web applications, let alone their own infrastructure. A company like Caterpillar I think should be run almost entirely on no/low code platforms. Their research department might run some code, and they might have teams doing embedded dev for their devices. Beyond that it should just all be SaaS. And between those two extremes there is a whole spectrum a business could be on.


> In my opinion AirBnB is setting money on fire by running on AWS.

And I'm saying you're completely armchairing this analysis because you literally don't know any of these details.

> A company like Caterpillar I think should be run almost entirely on no/low code platforms.

Are you suggesting a global manufacturer like Caterpillar run its global financial ledger on a no-code platform? Which implies building the code for it and then maintaining it?

> It seems you're trying to nail down an absolute

In fact, I'm not. The only absolute I'm trying to nail down is "you need to do a buy vs build, rent vs own assessment and make the decision there; neither one is unilaterally true without that assessment". Any speculation about budgets in the $100M+ space is just hearsay.


> you go to the cloud to reduce technological risk

And exchange it with dependability risks

> you're not getting strangled by any Oracle contracts

Unless you go to the Oracle cloud

> Your system administrators department is not gonna block any innovative employee initiatives

They're still there, aren't they?


Article says they're going Azure and Oracle. I actually use Oracle myself, but only their free tier because it's very generous. It probably works as marketing because I'd be inclined to throw them in the mix if I was looking for a real provider.


>Are they really going to save 400 m$

I'm sure they'll save many millions.

Earlier this year I left FedEx after 15 1/2 years of service. Every single day I used an IBM AS/400 terminal to interact with several systems to do my job, the bulk of my job was done via that terminal. Yeah, that's how old most of FedEx's tech is. A few months before I left they had just migrated one of our in-house systems over to Oracle likely as part of this. That said, the mission-critical system was having almost daily errors/downtime once migrated to Oracle soooo...

I imagine some of this is the simple fact that they need to replace these severely aging, no longer supported, IBM AS/400 servers throughout the company. They lost hardware support around 2020 and, if I'm not mistaken, haven't been made since like 2008 or something. That alone is going to save a pretty penny in new hardware and energy costs as well as free up physical space at often already crowded areas.

It'll also save time-lost costs. Any time power would go out at our building, the servers would usually be down for tens of minutes even with the generator kicking on. We'd usually lose a couple of hours a year to the servers in our building coming back online. A couple of hours, times ~100 employees at one site, equals a LOT of backlog being created, which ripples through the company. While the few thousand dollars they're paying employees during that downtime isn't much, it would cascade and disrupt the freight handling. If a single package didn't clear customs in time, then it might end up as an overage and have to go to a bonded cage. That's now 2 extra movements added, and that's freight planning for possibly multiple trucks that will be part of the delivery once the package has landed in the United States. You might see a thousand or more shipments (that may or may not be single package shipments) now needing extra handling at a half dozen ports, even more sort facilities, and even more local facilities. That's just from 1 office losing power for, say, a half hour.

Moving those servers to a cloud provider should provide a much better uptime which should translate to a notable savings in the above situations.


IBM i on modern Power machines is fine as far as performance goes.

The hideous frontend of greenscreen RPG programs is optional, you could replace them with Java if anyone cared enough about UX for internal tools (they don't).

The hard part with that stack is getting RPG devs - most of them are 50+ and expensive.


Save money by replacing 14 year old machines? More Oracle in the cloud? IBMi is available in the cloud now.

Now they'll need how many engineers to move the code stack (that probably no one knows) to some groovy stack with 12 layers.


They were probably promised a very low price for their first few years, before their cloud costs get re-negotiated. Once they are locked in, the cloud cartel can jack up their price to make sure they get as much money as possible.

But also, mainframes are extremely expensive, so they can probably do better on commodity hardware. The cloud is a way to rent a lot of commodity hardware.

The big-brained move that they are almost certainly not doing is to use cloud services to bridge the retirement of their mainframes and move everything back on-prem with Linux boxes in 5 years.


> They were probably promised a very low price for their first few years, before their cloud costs get re-negotiated. Once they are locked in, the cloud cartel can jack up their price to make sure they get as much money as possible

People often say that, but that literally has happened once, with GCP, and has no relation to how AWS and Azure do things. Please go and find an example in the 10+ years that AWS have been a big serious contender for the world's IT workloads of them jacking up prices.


It certainly seems unlikely. Maybe it's the current operational cost, without subtracting the new operational cost. Maybe it's due to already adopting cloud and thus paying double. Maybe it's due to the use of mainframes, rather than conventional servers.

Either way, it's definitely not the cost difference between on-premise and cloud. Cloud providers are not charities, and their buildings and staff are not cheaper than yours.


So do you feel FedEx should make their own trucks, ships, and planes?


FedEx essentially does make their own trucks - they buy from white box suppliers who have very narrow margins. Ships are similar. Planes probably have slightly higher margins, but they still will buy used.

Getting computers from the cloud companies is very different: most cloud products (aside from server rentals) are not in competitive markets.


I would be surprised if FedEx or UPS or any last mile delivery company owned any ocean going ships.


UPS only ships by ground and air, so yes it would be rather surprising.

Fast delivery couriers do not ship by sea.


False equivalence. FedEx isn't building the servers here, either. But they do have their own maintenance for their fleet. Thus the better analogy would be if FedEx outsourced fleet management & maintenance to Hertz or similar.


There is a huge difference between transport vehicles and data centers in terms of the amount of lock-in to particular vendors and the cost of switching.

It looks to me like the cloud providers will soon have FedEx nicely tied over a barrel, and those 'savings' will prove illusory.


With trucks etc. there are still independent companies that can fix those and transitioning to a new provider is straightforward. With IT in cloud there is a strong vendor lock-in.


I think that's a false equivalence. That's more like FedEx buying their own server hardware.


> Either way, it's definitely not the cost difference between on-premise and cloud. Cloud providers are not charities, and their buildings and staff are not cheaper than yours.

Exactly, which is why it’s quite a bit cheaper to not have a building and staff right? The premium you pay for the cloud provider having them must be less than having them yourself, or nobody would ever move to a cloud provider.


No, if the premium was less than the cost of operating a datacenter, no cloud provider would be able to stay in business, much less be interested in entering the market.

In order for the cloud business to make sense, turn a profit and sponsor the kind of development into new products done by these companies, we can assume that the cloud services sold by these providers must have a very healthy margin compared to the cost of operating a datacenter full of resources.

The primary thing a cloud business can do to drive cost back down past what you could do with your own datacenter is to have better utilization of their resources by dealing with an average across a much wider range of workloads, or by selling spot instances that get killed if capacity is needed, but based on how things are priced it appears that this mainly just pads their margins.

For the customer, it only really ends up cheaper than on-premise if: A) you need very few resources and do not already have anywhere to put a server or lack a good uplink; B) your use-case is extremely well-optimized for cloud (e.g. extremely bursty serverless where your average load is a tiny fraction of a single machine); or C) you are Netflix and can make a deal with AWS.


Either that or it’s an accurate figure for the first year or so while the cloud provider gives a ton of credits. By the time that all runs out the senior exec behind the migration will have moved on and someone else can do the maths.


It's likely the 400 m$ is inflated, "by 2024" will be many years late and there will be significant increased costs during the interim period, also likely kept artificially low.


$400M sounds huge, but why do I think it’s trivial next to FedEx’s operating budget? A couple years ago they were making 18 Billion* a quarter.

* edited from million (meant to type billion)


I think you mean $90 *b*illion a quarter.



Is there any reasonable answer to this question?


Interesting story as I recall Bank of America saving 5x that amount per year by doing the exact opposite move.

https://www.businessinsider.com/bank-of-americas-350-million...


The difference may be that they're sticking with their own data center but they don't have a mainframe setup. Mainframes come with expensive, locked out hardware, expensive maintenance, expensive experts, you name it.

With modern open source tooling, ridiculously powerful graphics cards, easily accessible FPGAs and free data center hosting software like OpenStack, Kubernetes, and glusterfs you can replicate many of the features that made mainframes enticing many years ago.

It'll require a heck of a lot of work to get the same performance out of a custom built solution, but long term it's probably cheaper than relying on IBM.

I doubt that they'll be saving any money by moving to the cloud unless they're horribly inefficient with their contracts right now. Just moving away from mainframes alone should save a significant buck, but if they were planning on restructuring their digital infrastructure anyway then now is probably the best time.


That's how you do it. You don't stick with on-prem but just build your own cloud. At that size, AWS won't really buy hardware much cheaper than you. And while you can't build every cloud product that AWS has, the most popular ones are open source anyway.


This is 100% reasonable and plausible. It's perfectly acceptable that one F500 company can see savings by using cloud and others see increased costs.


It doesn’t matter what you actually do; you get the benefit of shaking off all the barnacles by doing any rewrite or rearchitecture.


Guessing they're not counting stuff at their flight sim training center (12 bays) as a 'data center'. There is zero % chance that the workloads supporting simulators are gonna run in the cloud. They're incredibly latency sensitive, extremely finicky, and just getting the stuff virtualized continues to be a struggle for aviation OEMs and flight training centers (though FedEx was a pioneer in this regard).

Granted, the footprint per sim bay isn't huge (1-2 racks, usually, sometimes 3), but it can add up at the larger sim facilities. Several of the major airlines are running 30+ bays simultaneously, with United poised to come in with the most at 40+ in the near future.


They have up to three racks for a flight simulator? I would've never thought they're that complex. Do they need that much computing power or is it other devices that are needed?


Roughly 2/3 compute, 1/3 other stuff from what I've seen. A good chunk of the compute is appliance-type boxes supporting simulation of particular embedded systems in a traditional environment. Most of the complexity comes from integration of these misc accessory systems.

In general aside from the boxes handling the graphics projection, most of the 'compute' is idle a lot of the time, even when the simulator is running. There's usually a big burst of work to prep and load the simulation into the simulator, then minimal (but very latency sensitive) chatter back and forth between the various simulator components and the backend compute as the simulation runs.


I left FedEx (Trade Networks) after 15 and a half years earlier this year. The entirety of that 15 1/2 years was spent using an IBM AS/400 terminal to interact with various systems, and it was also used to submit paperwork to Customs and any other applicable government agencies to clear international freight.

It... was an experience.

I imagine that is going to take quite a bit of work to migrate over to someone else's servers, as I doubt they're going to be like "sure, we can accommodate your decade and a half old, mission-critical, systems"


I've seen two mainframe "replacements" fail to either retire the mainframe, OR produce an alternative system which people like. Has any Fortune 500 company successfully "retired" their mainframe?


I've seen it. It was essentially four steps:

- Build middleware (REST API endpoint) in front of the mainframe that maps 1:1 to mainframe functions. It initially does nothing except auth and routing.

- Re-write/migrate all other software from using the mainframe directly to using the new middleware.

- Work to further restrict anything directly connecting to the mainframe aside from the middleware, including staff's terminal access by building management solutions (e.g. new UIs, new management middleware).

- Once the mainframe is completely isolated, start building drop-in replacements for business segments and then switch the middleware(s) to point at the new solution from the mainframe. This should be invisible to all external consumers. You can "hot test" it by having the API execute on the mainframe + new solution, and checking for 1:1 responses.

The hardest part about the above isn't the tech, it is the internal norms that you're fighting against (e.g. staff have had terminal access for tens of years, and know the commands to get their work done by memory). So you have to create very compelling alternatives using the new API middleware/web/apps. "Export to Microsoft Excel," graphing, mobile support, and similar features that a terminal cannot directly offer are clutch here.
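
The routing and "hot test" steps can be sketched roughly as below. This is a toy in-process illustration, not the system described: `StranglerRouter`, `rate_quote`, and the handler functions are hypothetical names, and a real deployment would put this behind a REST API with auth. Note that shadow execution like this is only safe for read-only or idempotent functions.

```python
from typing import Any, Callable, Dict, List, Set, Tuple

Handler = Callable[[dict], Any]


class StranglerRouter:
    """Middleware that fronts the legacy backend and migrates one function at a time."""

    def __init__(self) -> None:
        self.legacy: Dict[str, Handler] = {}
        self.replacement: Dict[str, Handler] = {}
        self.cut_over: Set[str] = set()
        # Each entry: (function name, request, legacy result, replacement result)
        self.mismatches: List[Tuple[str, dict, Any, Any]] = []

    def register_legacy(self, name: str, handler: Handler) -> None:
        self.legacy[name] = handler

    def register_replacement(self, name: str, handler: Handler) -> None:
        self.replacement[name] = handler

    def cut_over_function(self, name: str) -> None:
        """Switch a function to the replacement once it has proven itself."""
        if name not in self.replacement:
            raise KeyError(f"no replacement registered for {name!r}")
        self.cut_over.add(name)

    def call(self, name: str, request: dict) -> Any:
        if name in self.cut_over:
            return self.replacement[name](request)

        legacy_result = self.legacy[name](request)

        # "Hot test": run the replacement in the shadow and compare responses.
        # Callers always get the legacy answer until the function is cut over.
        shadow = self.replacement.get(name)
        if shadow is not None:
            try:
                new_result = shadow(dict(request))
            except Exception as exc:  # a crash counts as a mismatch
                new_result = exc
            if new_result != legacy_result:
                self.mismatches.append((name, request, legacy_result, new_result))
        return legacy_result


# Hypothetical example: the same pricing rule implemented on both sides.
router = StranglerRouter()
router.register_legacy("rate_quote", lambda req: {"price_cents": req["weight_kg"] * 250})
router.register_replacement("rate_quote", lambda req: {"price_cents": req["weight_kg"] * 250})

result = router.call("rate_quote", {"weight_kg": 4})
print(result, "mismatches:", len(router.mismatches))
```

Once the mismatch list stays empty for long enough, `cut_over_function("rate_quote")` flips that one function to the new backend without any consumer noticing.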


This is exactly what Hilton did in the 2010s. We stood up an ESB (wso2) between the modern front ends and the legacy mainframe backend, and then broke down the mainframe's functions into microservice domains and wrote them. Had the ESB route traffic to the microservices until the mainframe wasn't doing any work anymore and was able to discontinue it.

There was a lot more complexity than that, but that's the gist of it. By the time I left, my team had carved out the bulk of the mainframe's services and had stood up over 50 microservices.


Off topic but I can't tell you how reassuring this is. This approach was essentially what we had come up with as a proposal for a Navy HR migration away from mainframes, though we never even got close to being able to actually attempt it. But at least it sounds like it could have worked had we tried it.


This is (essentially) how I've seen it done as well, the very few times I've seen it done successfully. But did you guys do it in 2 years, like these folks think they are?

> The hardest part about the above isn't the tech, it is the internal norms that you're fighting against

This is so true it's almost painful.


> The hardest part about the above isn't the tech, it is the internal norms that you're fighting against (e.g. staff have had terminal access for tens of years, and know the commands to get their work done by memory).

I have sympathy for those people. Not a fan of AS/400, but working in those terminals is incredibly fast once you know all shortcuts. Modern GUIs, especially using all those fancy frameworks, are much slower. If the alternative to the terminal is a new web application, it's extremely likely that it'll be harder to work with.


There's so much wisdom in this comment. This is also how you tackle and replace very large legacy systems correctly.


Not sure about 'mainframes' specifically.

But one of the easiest ways to retire anything is to stop doing new work on the older systems. Any new project should go on the new tech.

Soon enough you will see the older systems gone. It won't happen in weeks or months, but 1 - 2 years is enough.

At one point the whole banking industry used to de-facto run on Java. Eventually people just started doing new work on Python. These days Java is just another tech there.


I've seen that with a core banking system. A decade later it's still running, with many patches in other systems to avoid its shortcomings. There's still no viable plan for migration.

Just because you don't invest in a system doesn't mean it becomes obsolete.


Cue 10 years down the line, when everyone realizes cloud comes with tradeoffs in flexibility and reliability and they all move back to on-premise for business critical stuff.


More like 20+ years -- after a new generation of executives and managers has taken over. Their egos won't be tied up in old debates and they will be more willing to sacrifice old decisions.


It would be interesting to see the actual numbers on how much such a transition will cost.

Can their existing mainframe software and databases simply migrate into a cloud offering? If not, then if their planned transition has delays or technical hurdles, how much of that cost savings is affected?

If their cloud providers change pricing structures in future contract negotiations or if the amount of data they transfer in/out of each cloud provider location changes dramatically in order to better serve their customers, how much of that cost savings is affected?

It's clear that you can run a business with minimal amount of physical plant for servers. It's still not entirely clear to me that large established businesses can actually save money this way.


From an executive viewpoint, this would be a reasonable forcing function to overcome long, deep rooted stasis in your IT department.

Let's assume for a moment that modernization is good. It probably is wrt mainframes. For every team that refused to modernize, you can now drive them to the 2024 date, absolutely no exceptions.


Is there any gain here aside from cost saving? Will this saving be passed on to the customer? (laughs). What happens when the inevitable AWS outage then causes a global shipping bottleneck? Have we still not learnt that having sovereignty is better than saving money? Has COVID taught us nothing? sigh


What’s to gain other than $400M in savings every year indeed?

If it drops all the way to the bottom line, that would boost their 2022 profits by 35% and 2021 profits by about 45%. Seems like a decision you have to consider, even with possible failure modes.


https://en.wikipedia.org/wiki/FedEx

Wikipedia says their net profit for 2021 was $75 billion, so your math seems off.


You are correct that I was wrong. (I searched "fedex profit 2021" and incorrectly reported based on a quarter's profit, which is sloppy as hell on my part.)

That said, it seems like $75B is wildly too high as well. Page 43 of their most recent annual report shows consolidated net income for the year ending May 31, 2021 of $5.2B, making $400M still a roughly 7.7% boost (if it all passes through).

https://investors.fedex.com/financial-information/annual-rep...

https://s21.q4cdn.com/665674268/files/doc_financials/annual/...
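For anyone who wants to check the arithmetic, here is a rough sketch using only the two figures quoted above ($400M in annual savings, roughly $5.2B in FY2021 net income). It assumes the entire saving drops to the bottom line, which is the optimistic case.

```python
# Figures quoted in the thread; both are rounded.
annual_savings = 400e6        # claimed yearly savings, in dollars
net_income_fy2021 = 5.2e9     # consolidated net income, year ending May 31, 2021

# Share of net income the savings would represent if fully passed through.
boost = annual_savings / net_income_fy2021
print(f"{boost:.1%}")  # prints 7.7%
```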


That Wikipedia figure is vandalism from a couple weeks ago that I reverted.


Wikipedia is wrong, see page 43:

https://www.sec.gov/Archives/edgar/data/1048911/000156459021...

I am not even sure where the erroneous $75.231B net income figure comes from; it is nowhere in the 2021 report:

https://investors.fedex.com/home/default.aspx


In a security business, if you drop all precautions your profit margin becomes 100%! Great business decision!


The immediate gains I can see are:

* In-house development teams have more agility in standing up new services, since they can just use EC2/ECS/one of the other 1500 container runtimes AWS has, rather than having to wait for someone to provision new physical servers for them.

* An extensive suite of complementary products is available from cloud providers. It's not just about being able to start a VM; in many cases you don't even need to, because you've got Lambda and the various managed services.

* Less training/hiring overhead, since people who can use AWS are pretty much a commodity at this point, whereas people who can manage a mainframe are an increasingly rare breed.

* Redundancy becomes easier. So does data locality: let's say Singapore declares that all services delivered in Singapore must be hosted there. That's a lot easier to handle if you don't have to go and build/acquire a data centre first.

* The aforementioned 400 million dollars per year.


> In-house development teams have more agility in standing up new services, since they can just use EC2/ECS/one of the other 1500 container runtimes AWS has, rather than having to wait for someone to provision new physical servers for them.

How many people run physical servers 'raw' anymore? I would hazard their current stack involves VMware, Hyper-V, OpenStack, or a combination of the three.

You can have an API system spin-up just as effectively in a private cloud as a public one.


I worked somewhere with a private cloud. It was permanently over-subscribed and the VMs were nearly unusable because of it.

Not saying it can't be done well but there's less incentive for a company whose main business isn't providing infrastructure to ensure they've provisioned enough resources to meet demand.


That's like running against cost limits in AWS. If no money is spent, you won't have the flexibility. But it's not the fault of private clouds, they can be as flexible as a public cloud.


That handles IaaS, now do all databases (relational, NoSQL, NewSQL, KV, etc.), message brokers, object storage and a hundred other services. OpenStack handles some of those, with varying degrees of success, but with VMware and Hyper-V you have to DIY from scratch with IMHO the wrong level of abstraction.


It's fortunate if "private cloud" is actually like a cloud.


What makes you think that a cloud provider outage is more likely than an existing-Fedex-datacenter outage?


Mainframes themselves are extremely reliable. When configured as a cluster they get five nines. I'd say a power or communications failure is much more likely than mainframe downtime.

With cloud you can rely on more datacenters and, if your application is built right, it may end up being more resilient than what a mainframe setup would give you.

Downtime is a fuzzy thing. Whenever possible, I design my apps for graceful degradation - if the database becomes read-only, we can still operate in degraded mode. If we lose queues, some things will not work, but others will continue normally and many users will be completely oblivious to the alarm bells at the NOC (just kidding, there's no such thing). My SLA for full operation is much more relaxed than for degraded operation.
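A minimal sketch of that read-only fallback, for illustration only. The `Store` class, `ReadOnlyError`, and `update_profile` are hypothetical names made up for this example, not any real library's API.

```python
class ReadOnlyError(Exception):
    """Raised by the hypothetical store when the database is read-only."""


class Store:
    """Toy in-memory stand-in for a database that can flip to read-only."""

    def __init__(self):
        self.read_only = False
        self.rows = {}

    def write(self, key, value):
        if self.read_only:
            raise ReadOnlyError(key)
        self.rows[key] = value

    def read(self, key):
        return self.rows.get(key)


def update_profile(store, user, name):
    """Attempt the write; if the store is read-only, degrade instead of failing."""
    try:
        store.write(user, name)
        return {"status": "ok", "name": name}
    except ReadOnlyError:
        # Degraded mode: keep serving the last known value and say so.
        return {"status": "degraded", "name": store.read(user)}


store = Store()
first = update_profile(store, "u1", "Ann")    # normal operation
store.read_only = True                        # simulate the database going read-only
second = update_profile(store, "u1", "Bob")   # write is refused, reads still work
```

The point is that the caller gets a usable answer plus a status flag in both cases, so the UI can decide how loudly to complain.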


I think a lot of sites would benefit from being offline-first and caching a bit of information, perhaps encrypted with the same password you used. There is a lot you can still do if you have a functioning site (served from a service worker) with some cached data.

You could read HN articles you read before, for example!

Not to rely on this for the nines, but as a nice way to keep things useful when there is an outage or just slowness.


I'm not sure having mainframes counts as sovereignty. Yearly savings suggests it doesn't.


These hyper-ideological takes on finance are so tiring. If you think they are bound to make more profits, I suggest you invest in them; their ticker is FDX on the New York Stock Exchange.

EDIT: I used to be like that, too.


FedEx will continue to charge as much as they can to boost their profits. This move will increase their profits, making $1.50 on each of 100 million deliveries instead of $1, boosting profits from $100m to $150m.

However, if they were to undercut UPS and others they might be able to snap up UPS business, dropping the profit per delivery back to $1, but on 200 million deliveries, and thus they'll make $200m instead. You the customer will have a cheaper delivery, and everybody wins.
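The two scenarios side by side, as a quick sketch. The per-delivery margins and volumes are the made-up round numbers from this comment, and the doubling of volume after undercutting is an assumption, not a forecast.

```python
def profit(margin_per_delivery, deliveries):
    """Total profit for a given per-delivery margin and delivery count."""
    return margin_per_delivery * deliveries


baseline = profit(1.00, 100_000_000)      # before the savings: $100m
keep_prices = profit(1.50, 100_000_000)   # pocket the savings: $150m
undercut = profit(1.00, 200_000_000)      # pass savings on, double volume: $200m

print(baseline, keep_prices, undercut)
```

Undercutting only wins here if volume grows by more than 50%; anything less and keeping prices where they are is the more profitable choice.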


Until it becomes a race to the bottom and UPS drop their prices as well. Then both companies suffer


> becomes a race to the bottom and UPS drop their prices

This is competition.


They're not really competing on price but on quality. They make the most money with super urgent deliveries. If they offer later pick up times or earlier guaranteed deliveries, customers will be happy to pay more. And their IT system is crucial for that.


This is how the "passing the savings on to their customers" happens.


And all consumers win. This is the beauty of capitalism, it's how it's supposed to work.

It fails when you get high barriers to entry and companies colluding.


You can take part of the savings by buying NYSE:FDX stock.

I'm not being facetious.


I'm in corp land and this to me reads as the CIO using external PR to push internal agendas.

Are they really going to close all their on-prem datacenters in the next 2 years? Maybe.

More likely, this is a way to build momentum around the transition to cloud. Not a bad strategy, and not uncommon from what I've seen in the enterprise.


So are they moving to AWS? Are we going to have a complete breakdown of package delivery next time there is an outage at AWS?


The article says: The company is a known Oracle Cloud and Microsoft Azure customer.

With Amazon being a formidable logistics competitor, I can see why Fedex might be loath to fund a major competitor.


> The company is a known Oracle Cloud and Microsoft Azure customer.

I guess I'll prepare for the inevitable apocalypse then...


> The company is a known Oracle Cloud and Microsoft Azure customer.


I read the parent comment as questioning the wisdom of relying on a cloud provider (but not specifically AWS, as you indicated) for a global operation like FedEx. Even with service zoning, in extreme cases that provider constitutes a potential single point of failure. Unless FedEx are able to run their core systems on Azure and Oracle‡ then I think the point stands.

It would be interesting to know how FedEx sees the risk/reward equation, and how Azure (for example) compares to an (IBM?) mainframe data centre in terms of reliability and uptime.

‡ I'm not sure if there are any cloud neutrality solutions that permit this, but it is possible that FedEx have rolled their own.


Hard to say for certain. Companies that size often have many very independent operations run by separate internal departments. I have worked places with internal, AWS and Azure operations underway.


Where I am various parts of the business run workloads on AWS, Google Cloud, and Azure. To some extent I think this sort of behaviour is encouraged by large corporates because it makes contract negotiations easier if you're able to wave in the direction of the other cloud providers you're using and suggest it wouldn't be too hard to migrate over to them.



Do you trust FedEx to operate datacenters better than AWS or Microsoft?


I don't even trust FedEx to deliver my packages and that's supposedly their core competence.


Not especially, but that hardly matters; what matters is that FedEx's CIO doesn't trust FedEx to operate data centers in a way that is better for FedEx than AWS or Microsoft.


If AWS is charging a 1000% markup on a bunch of stuff, you don't have to be as good as them to save money doing it yourself


Which services do you believe AWS is charging 10x markup on?


Egress


How did Dropbox end up saving tens of millions by making their own cloud infrastructure but FedEx loses money?

Surely FedEx would have a metric shit ton of data?


I'd bet Dropbox has way more raw bytes to store than FedEx. Plus, the Dropbox business model is based on data egress to clients, which is really expensive on cloud servers. FedEx needs data centers more for internal logistics, which is probably why they're going with Oracle.


Ask yourself what probably happened a few years later when they had to negotiate new leases from their datacenter landlords. Funny how they never put out a follow up press release about the enduring value of on-prem. Also a company that basically can’t grow and hardly turns a profit.

Really, not the first example I’d reach for.


I wonder how much of that savings comes from closing data centers and how much from getting off IBM mainframes. It's hard for me to see why a company like FedEx would need to be spending anything like $400m a year on data centers unless a big chunk of that is paying IBM to rent their mainframes.


It would be interesting if those down-voting my comment would explain where my intuition is wrong. FedEx delivers 12 million packages per day. If each package requires 10 transactions that works out to 120 million transactions per day, or roughly 1,400 per second on average. You should be able to handle that with 1 IBM mainframe or much less than 1000 servers and I don't believe either of those would approach $400 million per year to operate.
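A back-of-the-envelope check of the average rate, as a sketch. The 12 million packages per day and 10 transactions per package are the assumptions stated above; spreading the load evenly over 24 hours is an extra simplification, and the `peak_factor` knob is hypothetical, there only because real traffic is bursty.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def transactions_per_second(packages_per_day, tx_per_package, peak_factor=1.0):
    """Average transactions per second, optionally scaled by a peak multiplier."""
    return packages_per_day * tx_per_package / SECONDS_PER_DAY * peak_factor


avg_tps = transactions_per_second(12_000_000, 10)
peak_tps = transactions_per_second(12_000_000, 10, peak_factor=10)  # assumed 10x burst

print(round(avg_tps), round(peak_tps))
```

Even with a generous burst multiplier the load stays in the tens of thousands of transactions per second under these assumptions.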


I never understood the logic behind companies abandoning their own data centers and moving their gold (data is the new gold) into somebody else's hands. I see that many here are also thinking along the same lines.

But when I spoke to the CTO of an F500 company, this was the take he had: he feels one of the major driving forces is to reduce the carbon footprint of the company. His company is designing the system to keep the crucial data on their own infrastructure but keep the other data and do the processing on cloud services. Instead of relying on one provider, they are using multiple, and are also supporting some small local cloud providers.

This seems very relevant for a logistics company with a huge fleet of aircraft, especially considering the upcoming carbon tax.


I'm skeptical of the carbon footprint thing for a few reasons, but the other point is interesting and something I think about once in awhile. As we build out more and more bandwidth, compute, storage, and APIs to tie them all together, it seems that the world more resembles some sort of large computer and most of the work is just getting the data and compute closer together when needed and minimizing costs when it's not. I know I'm far from the first to have this realization.


I wonder how much of this is motivated by the lack of staff. The market of available mainframe operators has been drying up for decades. When your entire business depends on a technology nobody is willing to learn or go to school to understand, that creates a problem.

Also, from my understanding, it can take a tremendous number of distributed x86 systems to match the scale, speed and reliability of a mainframe. It's possible that unless FedEx gets away from managing storage arrays and hypervisor farms, they wouldn't have enough staff to operate the datacenters once they're full of enough servers to replace the mainframes.


> The market of available mainframe operators has been drying up for decades.

Just like many other industries, the market for mainframe operators that want to earn less than technologists in modern stacks has been drying up. A lot of mainframe operators refuse to bring their pay scales in line with what developers in modern stacks are earning. Why take a pay cut to go work on a 40 year-old system when you can work on modern stuff with more money?

They keep the pay low and complain about a labor shortage to deflect from the fact that they're simply choosing to ignore market forces.


Would be curious to learn the specs of current day mainframes, and cost.

What sort of work is typically done on such machines these days?


Disclaimer: I have not worked directly on mainframes themselves.

They've been doing the same type of work for decades. Batch processing of records and transactions (payroll, insurance, banking, airline booking, etc.) IBM releases new mainframe models all the time, and typically companies will lease them and upgrade every few years.

When I worked at an insurance company, the mainframes were the core of their financials. They had around 8 of them in an active/standby DR site. The machines could do instruction-level error checking, meaning even a weird processor failure/error wouldn't impact the end transaction. Basically if you want to guarantee accurate handling of your money, you use a mainframe.

One common thing to see nowadays is the mainframe being abstracted by some other layer of API, such as a REST API. This has reduced the need somewhat for mainframe operators, but in the end you're still interacting with some COBOL that was written 30 years ago and somebody needs to understand it and the hardware. And the hardware is so different from traditional servers, you can't just pick it up and wing it. You really need someone who knows what they're doing.


The thing I see missing in the comments is that sourcing hardware over the past few years has been incredibly difficult. It will get better, yes, but that may have been the last straw to convince them to switch.


Past few years? Maybe the past 18 months it's been slightly more difficult but even at the beginning of COVID it wasn't hard to source most anything. And even during these more recent supply-constrained times, you can still source hardware it just might take a little longer for delivery or you might pay a bit more than you did before.


Another way of looking at it: where in-house data-center costs get too high, it might be due to a lack of responsibility to act or an inability to manage.

A company then has two options: (a) replace or pressure top/middle management to get costs down and processes streamlined, or (b) move to the cloud. It's easier (and possibly good for your resume) to move to the cloud and then, a few years later, decide again.

So moving to the cloud is sometimes an indicator of an inflexible/stale/mature company.


Surprising to see a company claim massive cost savings for moving _off_ the cloud.

But perhaps they're talking about engineering velocity, reliability, etc. Of course everyone knows Dropbox's story with this: https://www.datacenterknowledge.com/manage/dropbox-s-reverse...


Here we are, meticulously crafting open-source programming languages, libraries, operating and distributed systems, only for people to squander the knowledge and freedom on closed ecosystems like Apple's and Google's, and, in the data-center world, the top three cloud behemoths. Assuming typical lock-in, this won't save anything in the long run, nor help with resilience and freedom of choice in the market.


'Within the next two years we’ll close the last few remaining data centers that we have, we’ll eliminate the final 20 percent of the mainframe footprint'

Does that truly sound achievable, when they have the most difficult to migrate part of their mainframe applications still on their mainframe? The easy stuff they have already moved off.


I think they will regret this in the long run when they have disagreements with their providers or their providers go bankrupt and fold overnight because of corporate mismanagement/thievery (ala Enron). Even Amazon and Google will go under eventually...


Or the amazon cloud becomes the new IBM mainframe and in 20 years they all work on getting off it, but can't because they are so deeply integrated.


It would be hilarious if IBM is the immortal one and eventually buys out a limping AWS and keeps the software running for the next N decades.


Maybe, just maybe, FedEx had a bunch of employees making a lot of calculations before going this route. Just maybe they made an educated decision, based on their inside knowledge of the company and its needs. Or maybe a bunch of people on HN just know better.


This makes a lot of sense if you think through the angle that their core value prop does not come from the network/data latency optimization. It's not Netflix and Dropbox - this is FedEx. Their core business value proposition comes from logistics and being able to react to the information quickly, but extra millisecond latency is unlikely to make or break the business.

In light of this, it's easier to outsource the network and data infrastructure to a provider that actually does it day in and day out and put the limited engineering resources to something that is more impactful.

Just reaffirms that quite a few companies do not want to run their own servers unless they absolutely have to.


> through the angle that their core value prop

Surely this is just one ingredient of a successful company and there are myriad factors involved. The core business of Boeing isn't to make planes; it's to sell them.


That's the wrong analogy. In your example, the former is required for the latter. FedEx wants network stability and a data store, most likely, but the success of their business does not depend on them hosting their own infrastructure.


Where are they using all of these data centers and mainframes? That is, they are spending at least 400 million to do what exactly? To me that seems like a rather high figure in any case for their sites and logistics...


With (most likely) 0 knowledge about the internal details of their business, but with the confidence of the average programmer: "they spend too much".

I imagine Twitter could also be built over a weekend, right? :-)


The more ignorant you are about the reality on the ground, apparently the easier it is for some people to tell others "obviously, you're doing it wrong".


Then again, maybe these savings come from a wholesale clean-up of services and other stuff they are running... After all, there are likely decades of cruft around...


Literally every one of their vehicles is a knapsack problem and a traveling salesman problem needing to be solved every day. I'm sure they go through plenty of compute power.
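To give a feel for the knapsack half of that, here is the textbook 0/1 knapsack dynamic program as a toy sketch. It is not a claim about how FedEx actually loads vehicles; the `(weight, value)` items and the capacity are invented for illustration, and real loading problems add volume, routing, and deadline constraints on top.

```python
def knapsack(items, capacity):
    """Best total value of items that fit within capacity.

    items: list of (weight, value) pairs with integer weights.
    Each item can be taken at most once (0/1 knapsack).
    """
    best = [0] * (capacity + 1)  # best[c] = best value using capacity c
    for weight, value in items:
        # Walk capacities downward so each item is counted at most once.
        for cap in range(capacity, weight - 1, -1):
            best[cap] = max(best[cap], best[cap - weight] + value)
    return best[capacity]


parcels = [(3, 4), (4, 5), (2, 3)]  # hypothetical (weight, value) pairs
result = knapsack(parcels, 5)
print(result)  # prints 7 (take the parcels weighing 3 and 2)
```

This runs in time proportional to items times capacity, which is fine for one van but gets expensive fast once you add the routing side of the problem.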


With airplanes especially, optimal utilization is key. Routing millions of packages a day through such a complex network around weather and other incidents while keeping SLAs is simply astounding.


Yep... A couple of million packages daily + tracking a fleet of vehicles, a client-facing site, APIs for their tracking and courier services/devices, plus some internal services (billing etc.)... this seems like it would fit on a few racks, not nearly $400 million, or whatever the full number is (since $400m is just the savings).

But yeah... old company, they might still have stuff written in COBOL virtualized somewhere.


Their Memphis air hub processes 2 million packages by itself during the season. You also forgot to include FedEx Freight in your estimate.


edit - Autocorrect was terrible on my phone today!... The Fedex Memphis air hub processes more than 2 million packages a day itself during the winter holiday peak season.


With 300k employees you can expect them to have thousands of disparate services that will likely get rewritten as part of such a migration.


But even virtualized Cobol should fit in a few racks.


“FedEx first opened an on-premise data center at 250 Spectrum Loop in Colorado Springs, Colorado, in 2008”

That seems implausible: where did they keep their mainframes before 2008?


Colocated in another company's data center.


They could get back to colocation. I think even AWS & Co don't own all physical buildings, they also rent parts of data centers in some locations. You don't need to own real estate or a physical building to run servers.


What's the 8MW in this context?

Is that power, or something else?


"We presently own and operate four primary campus locations (“Primes”), encompassing 12 active multi-tenant data center facilities with an aggregate of up to 14 million gross square feet (GSF) of space. Our existing facilities are equipped to provide up to 490 megawatts (MW) of power, with the potential to scale to more than 1,300 MW upon full build out of our existing footprint."[0]

It sounds like Switch Inc. advertise their datacentres by Wattage. I am not sure how common that is.

[0]https://investors.switch.com/company-profile/default.aspx


Very common.


Do you buy by the watt though?

It also advertises the square footage but you don't buy by the sq ft either


> Do you buy by the watt though?

Usually in larger DC deployments, yes, you are quoted, metered, and billed by total power. The amount of space is (usually) an afterthought. There is more space than you can fill while remaining under the power commit, which also correlates with total cooling load.


Along with the data,

They. Own. The. Data center(s).

Well, in some sense or another. They might not be completely paid for but you get the idea, there's got to be a certain amount of worthwhile assets that could be leveraged.

Too bad there aren't any business executives who have any experience at making money from a data center itself.


This is because companies like AWS are able to perform Undifferentiated Heavy Lifting. There is no need for companies with undifferentiated workloads to keep investing in Capex that will depreciate as soon as it is stacked.


From a managerial accounting perspective it makes sense. Reduce cost centers and focus on core competencies. But many things that are actually bad ideas make sense from a short-term, myopic, improve-this-quarter accounting perspective.


Not saving, just giving those $400m to AWS or Azure, over some amount of time =)


Long enough for this exec team to "prove" on paper they saved money during their tenure. Then they hit the road and the next exec team moves it back because 'ooops'.


https://www.switch.com/colocation/ The Switch colos all appear to be rendered concepts, but impressively so.


I always wondered what latency requirements they must have in a distribution center that has to route packages dynamically and make those routing decisions within milliseconds. Surely there must be a lot of logic at the distribution center, they probably are their own data centers. Or can you really have 10+ms latency for each decision?


Is there a website where I can bet on the outcome of this?

I've seen lots of companies announce their intentions to move off of their mainframes, but I'm not sure I've ever actually seen one pull it off.


I think the thing you're looking for is a "prediction market". https://en.wikipedia.org/wiki/Prediction_market

Related to that, one funny thing I remember from Robert Heinlein's "The Moon Is A Harsh Mistress" is that the Lunie protagonist didn't understand why Terrans bothered with insurance companies. Are there no bookies?


There's not really a market for three big air shipping companies. Amazon Air + UPS are tag teaming them in a bad way. This isn't about IT, it's about FedEx not being in great shape.


Everything will merge to one BIG company because of cost-effectiveness.


"FedEx first opened an on-premise data center at 250 Spectrum Loop in Colorado Springs, Colorado, in 2008" - what were they doing before that?


This is an ideal project for managers whose end of year bonus is linked to immediate cost savings. Who cares if it costs FedEx more in the long run?


Extremely shortsighted. But the cloud is still the big trend. It’s a “brilliant” move until it isn’t. And then the pain is monstrous.


What price is acceptable for ownership of your data?


"save"

In the very short term. Data centers are expensive as hell to operate, but so is the cloud. I think colo would have been the way to go.


Most financial analyses at companies are short-term.

But labor costs keep going up and cloud computing costs keep going down.

So I don't see how this is short-termism.


> cloud computing costs keep going down.

For now.

I also have to keep reminding my coworkers that AWS gives us faster cores every few years but so far they always have the exact same amount of memory. So caching things (especially in process, but also on loop back) to save computation isn’t really a winning long term strategy. It’s never going to get better than what it does for you in the beginning. It will only decline in value.


They want you to use other services for caching, right?


I see most places raising their prices as seed funding goes out.


Lol “saving $400m” really equates to “moving onto AWS or Azure and paying hundreds of millions to them”


Too bad it's not sooner, could use a flood of decent servers on the second hand market.


Reverend Mother Mohiam: Many men have tried

Paul: They tried and failed?

Reverend Mother Mohiam: They tried and died.


context is everything, reading that I thought it would be hilariously funny as an exchange in Father Ted.


Wow, this is both a fantastic joke and could be used as a textbook example of the importance of context.

Given of course that everyone would have watched Father Ted. But the world would be a better place if everyone did.


Indeed. Hadn't heard of it before moving to Ireland, I credit my loving neighbors for introducing us to this national tradition.


It’s kind of shocking to see it take so long to move away from mainframes. I understand it a bit with regard to banking specifically, but I’m shocked to learn FedEx is such a dinosaur.


And then AWS will charge $399M…


I'm surprised at the amount of vitriol being spewed at FedEx in this post for their decision, mostly about the doom-and-gloom that awaits FedEx.

Either there are a lot of Fortune 500 CTOs on Hacker News that know better than FedEx, or there is a deep-seated fear of the cloud, specifically PaaS.



