Entirely unsurprising - we know Amazon uses their own ARM CPUs internally for large swathes of AWS infra and services.
None of the other cloud providers are using ARM at this scale yet. They’re using Ampere chips, whereas Amazon invested in building their own - it tells you the level of commitment Amazon have here.
My employer is moving workloads to ARM on AWS as it’s much better for price/perf, and our shiny new M1/M2 Macs run the same binaries. We’ve had very little issue doing the shift - Amazon Linux and Debian both support ARM well.
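To give a sense of how small the change was for a plain EC2 workload, it's roughly just the AMI architecture and the instance type (a sketch; the AMI ID below is a placeholder and the SSM parameter name is from memory):

    # 1. look up an arm64 Amazon Linux 2023 AMI via the public SSM parameter
    aws ssm get-parameters \
      --names /aws/service/ami-amazon-linux-latest/al2023-ami-kernel-default-arm64
    # 2. launch on a Graviton3 instance type (previously e.g. m6i.large on x86)
    aws ec2 run-instances --image-id ami-0123456789abcdef0 --instance-type m7g.large

Everything else (user data, security groups, etc.) stays the same, as long as your own binaries are built for arm64.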
Must be really putting a dent in Intel’s data centre sales though.
> None of the other cloud providers are using ARM at this scale yet. They’re using Ampere chips, whereas Amazon invested in building their own - it tells you the level of commitment Amazon have here.
James Hamilton was adamant AWS build ARM expertise in-house [0] (one which they accelerated by acquiring Annapurna Labs), and he's been proven right [1] despite the obstacles he himself foresaw [2].
> and he's been proven right despite the obstacles he himself foresaw
I wouldn't trust anybody to be right about the benefits of something this big if they don't also foresee the obstacles you'll face to realise those benefits.
You still need a VM to run Linux ELF binaries on macOS, but you don't need full-blown emulation like QEMU if the ELF is built for AArch64, unlike when you try to run an x86 ELF. It's much faster this way.
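You can see the difference directly on an Apple Silicon Mac with Docker Desktop (a quick sketch; the image choice is arbitrary):

    docker run --rm --platform linux/arm64 alpine uname -m   # aarch64 - runs natively inside the VM
    docker run --rm --platform linux/amd64 alpine uname -m   # x86_64 - goes through emulation, noticeably slower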
Just as an aside, the same process works painlessly with podman, too, if your employer is allergic to Docker. Maybe once or twice a year I get reminded that I’m not using x86 - and it sure is nice to work all day and finish with 60% battery.
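For reference, the whole setup on an Apple Silicon Mac is just a few commands, since podman machine spins up an aarch64 Linux VM for you (a minimal sketch):

    podman machine init
    podman machine start
    podman run --rm alpine uname -m   # prints aarch64 - no emulation involved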
Running on ARM itself is a big power savings. Docker Desktop also had issues with CPU load which they might have improved since I stopped using it a couple of years ago. There were a bunch of GitHub issues starting around 2018 which were closed, moved, reopened, and it looks like it’s still a concern whereas with Podman I’ve been able to use it without thinking about it.
"Does that mean you're all running asahi linux or are you somehow running linux elf binaries on macOS?" So you're running whatever linux districtuion you want or some particular flavor through Docker on Mac? And you don't have ELF binaries because you're using precompiled packages for ARM and don't need weirdo prebuilt ELFs, or you have a workaround?
Docker for Mac runs a full-blown Linux kernel in a VM, last time I checked (Docker for Windows can also use Windows containers). So whatever OS they run in the cloud (with OCI), these containers are going to work on the (very fast) M1/M2. The OS the Docker images are based on is irrelevant. The OS running in the cloud is irrelevant. The only relevant factor is that Docker for Mac means Linux-kernel-in-a-VM overhead. But this is offset by the M1/M2. Clever setup!
Plus as the Apple virtualisation framework now implements pretty standard virtual hardware - it uses virtio - we are finding things work vastly better in Docker for Mac - particularly as we can use virtiofs etc to pass files through from the host.
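A typical case where this shows up is just bind-mounting a project directory into a container (a sketch; the image and command are arbitrary examples, not anything specific to our setup):

    # host files passed through via virtiofs instead of the old gRPC-FUSE/osxfs path
    docker run --rm -v "$(pwd)":/src -w /src node:20 npm test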
The interesting thing is, how is Docker for Mac able to run x86 containers at speed when Apple's Developer Documentation for Rosetta says it can't be used for virtual machines?
Or have they just embedded the QEMU on-demand translator with binfmt_misc in the ARM virtual machine and the M1/M2 is just powerful enough to make users not notice what's going on?
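One way to peek would be to look at what's registered with binfmt_misc inside the Docker Desktop VM (a sketch, assuming a privileged container can mount and read it; entry names may differ by version):

    docker run --privileged --rm alpine sh -c \
      'mount -t binfmt_misc none /proc/sys/fs/binfmt_misc 2>/dev/null; ls /proc/sys/fs/binfmt_misc'
    # an entry like rosetta vs qemu-x86_64 would answer the question either way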
Yes, but pretty much in the same sad state as they were a few years ago. We're running a few in production and next time we touch it we're replacing them with Windows Server Core EC2 instances built from baked images.
Note on Linux containers:
Running Linux containers on Windows requires the use of LinuxKit or WSL. Docker Desktop for Windows requires you to switch between Linux container mode and Windows container mode, as you can't run both simultaneously. A workaround is to install an additional Docker daemon inside the WSL environment. Most people are going to install Docker Desktop and use WSL in Linux mode. It's fine. Hopefully my facts are up to date.
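The WSL workaround mentioned above looks roughly like this (assuming an Ubuntu distro; package names and init details vary):

    wsl -d Ubuntu
    sudo apt-get update && sudo apt-get install -y docker.io
    sudo service docker start
    sudo docker run --rm hello-world

That gives you a native Linux dockerd inside WSL2, independent of whichever mode Docker Desktop is switched to.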
I think ARM and AMD are both starting to cut deep into Intel's market share. I do think Ampere is a bit higher priced than I'd be comfortable with if I were setting up servers. The AMD 128-core Zen 4c server chip is very impressive in terms of per-core power use, and even on par with ARM on power usage.
Incredible amount of computing capacity per U in the racks at this point on both fronts. Intel is betting on custom accelerators, but I don't know how well that fits a rental (cloud) model unless you get a full server/CPU.
>My employer is moving workloads to ARM on AWS as it’s much better for price/perf
It's a fool's errand to try to reason about Amazon's pricing. We don't have figures about how much a Graviton costs to build or to run, but we do know AWS has staggeringly high margins across the board. They have huge headroom to make price cuts if they want their homegrown solution to win.
And perhaps it is better! I have no doubt it's a good CPU. But we can't infer that from pricing.
It could also be simultaneously true that it's cheaper to rent ARM servers from AWS than it is to rent Intel servers from them, and that AWS makes more money renting you the ARM servers. In fact, given they're developed in house, it might even be likely that the ARM margins are higher for them than their Intel margins.
Quite right - you can't infer performance from pricing.
We did some assessments of how well our applications ran on ARM and x86 based AWS instances, and quite simply it costs us less to serve the same quantities of end user traffic using Amazon's ARM instances compared to the x86 ones.
I'm sure they are. But Amazon has a multi year lead.
Amazon took a gamble on server ARM chips. Not a bad or blind gamble, but they had no idea if their customers would take to them at all. I'm sure if they hadn't, we'd all be sitting here saying what an obviously dumb waste of money.
AWS has a massive economic hand in ARM support. By offering cheaper compute on the ARM platform, they gave everybody enough reason to make things work on ARM. I think they had a pretty good idea that customers would make it work for the price.
> The real question is, why are other cloud providers not working on similar chips?
I suspect the answer is simple; they are nowhere near as flush with cash to buy up a CPU company and invest years to make their own chips. It's no small endeavor that AWS has gone through to get here.
Microsoft and Google are far more flush with cash than Amazon is. Out of the smaller cloud providers, IBM and Oracle wouldn't have needed to buy a CPU company, they'd both been designing top tier CPUs for decades.
Here's an alternative equally simple explanation: ARM doesn't actually have anywhere near the level of advantage in perf/$ that Amazon's pricing would imply. The price differential on AWS isn't coming from any kind of intrinsic advantage, but from Amazon subsidizing their ARM-based compute for strategic reasons.
If ARM wasn't more efficient for Amazon then they wouldn't be aggressively moving their entire internal AWS infrastructure to gravitons. I've heard from folks in the know there that every single new service big or small internally has to work on graviton and by default is expected to launch on it unless the product goes through hoops to justify x86 and its expense.
Sure, when AWS was created Amazon.com didn’t use it.
However today, Amazon.com is built entirely on AWS. It has been using EC2 and S3 since at least around 2010 (by 2013 “legacy hardware” was almost non-existent). Until 2020, it was all “hidden” behind Amazon’s internal tooling. Around 2019 there started to be a big push to just use AWS directly ie teams get AWS accounts owned by the Amazon.com org. There is still internal tooling to hook the code repo, build systems, and “pipelines” to directly deploy to AWS resources.
Meanwhile, the “serverless” I was talking about in the previous post was an internal service that hosts a Lisp dialect called Datapath. As a developer, you write some Datapath and give them the money and it executes as much as you need it to. This service is the core of Amazon.com and was actively being migrated to Graviton.
>I've heard Amazon.com doesn't even use AWS — was this ever accurate?
What? AWS spun out of Amazon.com having extra servers/infra after the holiday season rush and pushing for an internal way to monetize that extra compute/storage capacity.
I've read that they did use it for some time then stopped. People then took AWS to be "Amazon Scale" despite not being true/misleading.[0] If I'm reading it right, even your own link only says they are using s3 for backing up their database.
But is ARM necessarily that more efficient or is it just cheaper than x86 because Amazon designed it inhouse and doesn't need to pay AMD/Intel (who have high margins)?
Web workloads aren't super CPU intensive on their own. You take some bytes from the network, parse them, make some calls to services on an internal network, then serialize and send network bytes. Most of that workload is waiting for the network bytes to come and go.
For that kind of work it's all about power efficiency and Amazon is able to specifically design their gravitons to have just enough cache, CPU speed, management components, etc. to get the job done and nothing more. A bog standard Intel CPU is designed for any and every workload so it's going to be idling and wasting power on components and transistors you never use or care about.
There's a reason it took over the smartphone industry a decade before AWS started using them.
Arm on the server isn't a radical idea and has been proposed for a long time. Smartphones and servers share a lot of similar requirements, especially performance per watt.
> Out of the smaller cloud providers, IBM and Oracle wouldn't have needed to buy a CPU company, they'd both been designing top tier CPUs for decades.
CPUs that need an insane amount of power and have barely any applications in mainstream computing. Not exactly a good fit to provide to customers.
> The price differential on AWS isn't coming from any kind of intrinsic advantage, but from Amazon subsidizing their ARM-based compute for strategic reasons.
By running their own CPUs, AWS doesn't have to rely on Intel's ability and willingness to ship stuff and they don't have to pay Intel's margin either.
> CPUs that need an insane amount of power and have barely any applications in mainstream computing. Not exactly a good fit to provide to customers.
Obviously there's no real market for Sparc or POWER in the public cloud, and there probably never was a time window for creating a viable market either.
But if they'd wanted to get into the server ARM market, they would have had the expertise in-house; there was no need to buy out some startup for the team. Sure, their expertise was on CPUs with different architectures, but that shouldn't matter all that much (it didn't matter when Apple bought PA Semi to work on ARM instead of PowerPC).
> By running their own CPUs, AWS doesn't have to rely on Intel's ability and willingness to ship stuff and they don't have to pay Intel's margin either.
Right, being in control of your own destiny rather than being tied to a single supplier is a big deal, and that's one example of what I meant by the strategic reasons. Having more leverage over Intel and AMD on pricing is another. Making it harder for some customers to move their workloads away (because they're bought into ARM and other companies don't have compelling ARM offerings) is a potential third.
Not having to pay Intel's margins is a potential advantage, but requires reaching sufficient scale to offset the costs of running your own CPU team.
But for any of this to make sense, they need to successfully move a large proportion of their customers over to ARM. Just having the servers available and nobody using them does nothing. That's exactly the kind of situation where you'd want to subsidize the prices for the ARM servers and build up the customer base. That's the case even if ARM has some intrinsic perf/$ advantage; you'd still want to subsidize it to the point where you're able to saturate your ability to manufacture and install ARM servers.
Not just cash but also willingness to make investments which don't show returns within a year. Amazon has a history of very lean profits due to reinvestment, and something like this might not be possible with Google's promotion culture, especially since adoption takes time, so the initial numbers won't be compelling.
Do they? My experience is that they cost somewhat less, and are somewhat slower than the Intel competition. Their price/perf advantage is mostly in synthetic benchmarks, IRL the difference is much less pronounced.
So far they seem to be betting on Ampere which seems to be developing their own cores (which is much more complex/expensive than what Amazon is doing) so it's interesting how they will do. If Ampere manages to get a moat AWS might end up behind everyone else in a couple of years.
I got an ARM Windows 11 laptop (Lenovo X13s) because I wanted a lightweight travel laptop with long battery life. I was expecting a big compromise, but it's actually wonderful!
I'm a giant Intel fanboy, but I think they have a real challenge here.
Intel definitely has a challenge – having gone through both of Apple’s PowerPC migrations, this was the smoothest yet because we’re in a much better place with open source software dominating and internet distribution being the norm. I remember things being held up by proprietary compilers or needing to port hand rolled assembly but this time around we’re using higher level languages & frameworks a lot more and it’s been mostly transparent. Similarly, back then getting hardware to build and test was a chore; now it’s a checkbox in the EC2 launch options.
Thinking about it, I knew that Intel not landing the first iPhone contract was a big loss on mobile but I don’t think anyone expected it to savage their traditional strongholds like this because that paved the way for so many tools not just being ported but heavily optimized for ARM. They’re certainly still doing a ton of business but now there’s a cap on how much profit they can take even if AMD completely founders.
I don’t want to say it’s all the iPhone but I do think it’s not a coincidence that things are very different now that we have robust tools like Clang and over a decade of compatibility patches throughout the open source world.
In 2008 even Java developers tended to be antsy about running on non-x86 hardware. A decade later it’s close to a non-issue for a staggering amount of stuff.
x86 lock-in benefits AMD greatly, just as it does Intel. It is my belief that they started looking at ARM cores when they thought they would never be able to be competitive with Intel in x86 (pre-Zen was a bad time for AMD). And when it became clear that actually their new cores would do very well against Intel, they dropped ARM plans. They don't want to move off x86 if they don't have to.
Intel's XScale was a different beast. They got that through StrongARM from DEC as part of some lawsuit wheeling and dealing. They never seemed to really want to push it, preferring instead to try competing in mobile with a string of failed x86 cores.
> I knew that Intel not landing the first iPhone contract was a big loss on mobile
Funnily enough, Intel also had the best/most competitive ARM cores with XScale in the early 2000s, which seemed like the default option for most high-end devices at the time. I wouldn't have been surprised if Apple had just gone with that (even if not x86) had Intel not decided to can it without actually having any viable x86 SoC to replace it...
Why would Amazon develop a custom Arm chip and not RISC-V?
Don't they have to pay a fee to Arm for this, even though it is being developed by them?
And would Google (which is allegedly developing their own Arm Chip) and Apple (which also has their own chip) not also benefit from backing Risc-V as well?
I have not been able to get clear in my head the relationship between Risc-V and Arm, and even more so exactly what responsibility down stream each user has.
Would appreciate some ELI5 from knowledgeable HNers.
---
Slightly related: is it more or less a given that eventually no one will use x64 chips (since Arm/Risc-V is more efficient)?
If not for servers, is it at least a given for desktop computers, as more software is written for ARM and recompilers like Rosetta improve?
If so, I know that Intel is investing in RISC-V, but has AMD given up the space entirely?
Because by paying a fee to Arm, they get decades of architecture experience, support, peripherals (including a well-thought-out internal bus), etc.
RISC-V has a good ISA, but there currently isn't a lot of silicon experience in the real world (relative to Arm and Intel, for example). And Amazon isn't going to roll their own to that degree. It's not worth the meager licensing costs.
It's like the difference between deploying Red Hat Enterprise Linux and Arch Linux. Each has its place, but right now their advantages are in very different areas.
The most common use case is where ARM provide fully functional CPU core IP that you can put into your own SoC - in AWS Graviton's case that seems to be the ARM Neoverse line of cores.
You can also get architecture licenses where you can implement your own CPU core (from scratch, or by making tweaks to an ARM design) - Apple has one of these, but Apple is a special case in that they were one of the original partners when ARM was originally founded in 1990.
RISC-V on the other hand is an architecture that is open for anyone to implement, but the RISC-V foundation doesn't sell you a core design - you need to create your own, or find a partner to sell you one. It's also much newer, so there's less of an ecosystem around it compared to ARM.
Because it's way cheaper than to do everything ARM provides on your own?
Graviton3 is just using ARM Neoverse V1 cores (which AFAIK are years ahead of any RISC-V cores). So they aren't doing anything different from what Qualcomm/Ampere have been doing so far.
I think Ampere is planning to release a CPU with their own custom cores in the near future though. It's interesting whether Amazon will follow suit or not. It would be very bizarre if they tried switching to RISC-V in the near future though (I do find this semi-fanatical obsession with RISC-V a bit fascinating, but I don't think it's shared by many large corporations outside China...)
> Why would Amazon develop a custom Arm chip and not RISC-V?
The first Graviton was launched in 2018, which means it was in development and manufacturing even earlier. You couldn't build a high-performance RISC-V chip back then. Even in 2023, RISC-V is significantly behind ARM/x86 in performance, and software support isn't great either (for example, Debian has supported Arm for over a decade, but RISC-V support only arrived a few weeks ago).
Article says Amazon accounts for >50% while China accounts for 40%, so Amazon actually has >80% outside China. I wonder why Microsoft and Google aren’t as eager to stop paying the x86 tax.
They are but it takes 2 to 3 years to develop a chip like this. Someone already posted a link to the Google chip. Microsoft got a lot of CPU designers from Qualcomm's team after Qualcomm got rid of them a few years ago. Now Microsoft has laid off their custom ARM core team so anything they do will be a licensed ARM core.
Three. Two operated by other companies (in Beijing and Ningxia) that are only available in China, and Hong Kong which appears to be just another AWS region available globally.
Microsoft has been abysmally slow to move towards ARM. They hitched their wagon to Qualcomm and Snapdragon... which does not seem like a smart decision now (especially as Qualcomm seems to be moving to RISC-V, lol). Windows on ARM for servers has been a bit of a pipe dream and a joke - I think they finally have it available, but I stopped looking or caring about it years ago. Amazon is the de facto ARM-in-the-cloud/server leader right now and I don't see that changing anytime soon.
>which does not seem like a smart decision now (especially as Qualcomm seems to be moving to RISC-V lol).
I politely disagree. We know for a fact, from December's RISC-V Summit, that Microsoft is influencing RISC-V to ease the burden on their own Windows for RISC-V effort.
Qualcomm was a partner for ARM, and they are already used to working with them. They will likely be partners again, with RISC-V devices this time around.
And Windows for ARM will go the same way as Windows for Alpha. Just an historical footnote.
Why? Some people here seem to be weirdly obsessed with RISC-V when it's likely to end up being even more closed/proprietary than ARM (at least on the high-end). Why would you give away your competitive advantage to everyone else by licensing your core designs?
ARM at least provides a more or less even playing field to everyone. It's much cheaper for competitors to catch up with Graviton since they can just license the exact same core it's built on. If it were a RISC-V CPU and as far ahead, it would be way easier for AWS to maintain its moat.
The big hyperscale datacenter companies either design it themselves or partner with another large semiconductor company to help them design it for their particular needs.
These companies are not interested in selling to small companies that just want 1 to 100 machines. The only company that might is Ampere.
Oracle is using the ARM server chip from Ampere but they would be buying tens of thousands or more and getting volume discount pricing. I've heard rumors that Ampere customized their chip for Oracle's needs.
On a mix of Graviton2 and 3 based instances, we’ve found no significant difference between ARM, Intel, and AMD instances for weird problems/instance replacement, etc.
I tried Ampere Altra on Hetzner; it was unparalleled price/performance-wise. But it kept falling into a low-frequency state (1 GHz on all 80 cores). That's either something in the motherboard or in the processor - I couldn't figure it out. Guess it's not mature enough yet.
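For anyone who wants to chase the same symptom, the standard cpufreq sysfs files are where I'd start looking (assuming the kernel exposes cpufreq on that platform):

    cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
    cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
    cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq   # is the platform itself capping it?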
Wouldn't designing your own cores be on a whole other level compared to what Amazon is doing now (licensing Neoverse)? If they don't feel confident they could build a competitive ARM core on their own (unlike Ampere which seems to be trying that) why would it be different with RISC-V which is still years behind ARM (on the high-end)?
Like what? Are there any high-end RISC-V cores that have proven to be competitive with Neoverse that you could license?
It all just seems hypothetical to me at this point, and I really struggle to understand who would design cores to license to others, and why. Looking at ARM, it just doesn't seem like a great business model compared to making them yourself...
Why would anyone share their own high-end cores when the margins from producing and selling them yourself are much higher than from licensing (looking at ARM)?
> That's gonna be fun when Amazon switches to RISC-V for their next (or next+n) iteration of Graviton.
> ARM presence in the server market will go away like poof. More than half of it.
This is a borderline fetishistic delusion to think that this could happen immediately in the first place. The changeover from x86/64 to ARM for AWS customers didn't even go this fast, and that's WITH the help & stability provided by the extensive tooling & ecosystem around ARM.
RISC-V is nowhere near to being able to provide the same sort of tooling for years to come, and this comes from someone that loves the architecture's openness.
Is there any scientific comparison between AWS amd64 and arm instances for performance and cost? I tried running some stuff on an arm instance but it seemed too similar to amd64 in both price and performance. Just with added complexity to make it work nicely with GitHub Actions-built container images.
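If the GitHub Actions complexity is the usual "the image is amd64-only" problem, a multi-arch build is normally the fix (a sketch; the registry/tag below is a made-up placeholder):

    docker buildx build \
      --platform linux/amd64,linux/arm64 \
      -t registry.example.com/myapp:latest \
      --push .

The same tag then works on both the amd64 runners and the arm instances.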
If Amazon's heralding something more general, I'm curious what a move to ARM will do to server-side async and the sync/async trade-off for waiting tasks/IO. Is there less point to async if 20 threads/cores can be had for the price of one?
Or, what they've changed to take advantage of the difference.
The Register does not report any absolute numbers and the source appears to be a proprietary market research report. Is the actual number of ARM or x86 CPUs at Amazon publicly available?
As far as I understand it, electronics are recycled by crushing them and extracting a handful of minerals - e.g. gold; there's more gold in circuit boards than in the rocks they dig out of gold mines, I've heard.
The rest - which by mass, of course, is something approaching "all of it" - is mostly toxic fibre glass, epoxy and plastic that we don't really know what to do with. It's bad, even before you start talking about all the electricity and heat that was used to turn it into electronics in the first place.
My significant other worked in a recycling plant. To elaborate on the "rest" and how it is done:
PCBs get delivered, sometimes with the big components (coolers, screws/brackets, fans) removed, but sometimes with those still attached. They are shredded whole, including all soldered components, to a particle size of 2 mm. Iron and aluminium are separated magnetically (aluminium with an eddy-current separator, iron with a static field). All three fractions - mostly-iron, mostly-aluminium and the rest - are then melted down separately. Adherent plastics, electrolytes, epoxy, etc. burn off during this process. Gas and ash are led through separators that filter out the fly ash as much as possible, which is buried somewhere as toxic waste; the rest is released into the atmosphere after cleaning (AFAIK various liquids to bubble through to wash out soluble chemicals, which are then dried and buried). The three molten streams are metal alloys of various purities, which are then cleaned and separated further by skimming off the slag, adding flux, electrolysis and other chemical separation methods. Most of the "other" fraction is copper, which is the main product overall; other rare metals such as gold occur in far smaller amounts and seemingly add only a little to the overall earnings of the process.
Please note that this is the process used in a relatively new plant in a Western European nation. I guess the "advantage" over the rest of the world is that at least there is a fly-ash separator and exhaust cleaning.
Actually, there are companies working on PCBs based on organic, water-soluble substrates [1] - to recycle a board, you boil it in some hot water and get a liquid stream containing the substrate and a solid stream containing the electronic components and copper traces.
Which is a research-stage idea with an obvious drawback: moisture in the air will also slowly decompose the PCBs, making them age much faster than traditional epoxy-based PCBs. So then you either limit the maximum age of all your products severely (not very green) or you need additional materials to moisture-proof all your casings (also not very green).
Do you really think that they haven’t spent at least as much time thinking about this? The question would be whether those devices are replaced in less than the time it takes for natural degradation to cause problems, and both of your “not very green” dismissals sound entirely too pat. The entire reason we’re talking about this is that the industry has a huge problem with things having a service life measured in single digit years and a landfill life measured in centuries – making those easier to recycle would be a huge win because there’s no way that the current model is going to get the service life significantly closer to that kind of timeframe.
> The research will also provide Infineon with a fundamental understanding of the design and reliability challenges customers face with the new material in their core applications.
It sounds more like they know that this is a hard problem requiring foundational research, and suggests that rather more thought has gone into it than a quick HN comment dismissing their work.
While there are probably more modern servers around now than ever before, from browsing serverhunter.com there seem to be a couple of cliffs where older CPUs get rarer, one at 2017 and one at 2014.
If by recyclable you meant buying old racks of datacenter servers to reuse elsewhere, even if possible, it's not very practical. Servers are super noisy and may have exotic power requirements.
One prefers lock-in and complicated instructions; the other is designed around openness and simplicity.
At one point Intel seemed dominant, and proven right with their proprietary tech, whereas ARM chugged along keeping a low profile and kept doing the good work. Now reality has finally caught up with them, and it turns out openness and simplicity are taking the lead.
How is ARM open? And simplicity is also subjective… x86 goes way back, that's why it seems more complicated or "crufty".
ARM is a huge corp like anyone else.
ARM got a huge boost since everything is switching to mobile and their chips are more energy efficient than Intel's. Added to that, Intel was also asleep at the wheel and missed the mobile boat…