Evaluating Graviton 2 for data-intensive applications: Arm vs. Intel comparison (redpanda.com)
62 points by Aissen on April 5, 2022 | 33 comments


I wrote this, happy for any feedback or to answer any question about the methodology or otherwise.

One thing to note: while the headline difference between these instance types is the CPU architecture, the value offered by each family is as much about how many IOPS and how much SSD space EC2 chooses to offer, especially for this IO-bound benchmark.

It is entirely plausible that EC2 will offer more hardware bang-for-buck on Graviton instances in an effort to encourage adoption, so we can't necessarily draw a strong conclusion about the future of the Arm vs x86 battle on the server CPU front, since Amazon is not an arm's-length entity here (pun intended).


One bit of feedback, it’s a good idea to put “higher is better” or “lower is better” on graph axes.


Thanks, this is great feedback that I'll implement next time. You are right that it is not always obvious especially when talking about both throughput (higher is better) and latency (lower is better) in the same post.


I might be missing the context, but I think the units on the graphs might be off? Currently "Bytes per second". Guessing there should be a multiplier on that?


Indeed, the multiplier is 1e9.

It was there in the original matplotlib charts, but was dropped in the "chart beautification" process, probably due to the way mpl likes to hide this detail off in a corner somewhere.
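
For anyone who runs into the same thing, here's a minimal sketch of one workaround (the throughput numbers and instance labels below are placeholders, not figures from the post): scale the data up front so the unit lives in the axis label rather than in matplotlib's easy-to-lose offset text.

    # Minimal sketch: matplotlib puts the 1e9 multiplier in a small "offset text"
    # at the corner of the axis, which is easy to lose when restyling the chart.
    # Scaling the data yourself keeps the unit in the axis label instead.
    import matplotlib.pyplot as plt
    import numpy as np

    labels = ["i3en", "im4gn", "is4gen"]              # placeholder instance types
    throughput_bps = np.array([1.2e9, 1.6e9, 2.1e9])  # placeholder bytes/second

    fig, ax = plt.subplots()
    ax.bar(labels, throughput_bps / 1e9)              # convert to GB/s up front
    ax.set_ylabel("Throughput (GB/s, higher is better)")
    fig.savefig("throughput.png")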


Recently worked on migrating a number of services (largely CPU bound) to Graviton 2. It was a bit hit and miss. For some services it worked great: a negligible hit to performance, and since equivalent Graviton2 instances cost 20% less, it was a no-brainer. Others needed a little bit of a rewrite. There were also a couple that just didn't see good performance on ARM (very branchy code).


Interesting. Other than branchy-ness, was there anything else you think correlated with poorer performance on Graviton 2?


Branchy-ness usually also correlates with cache unfriendliness (e.g. branchy code is usually a server app calling lots of different functions, with lots of virtual function calls), and Graviton2 has an abnormally small L3 cache relative to its x86 competitors. If I had to hazard a guess, I'd say heavily vectorized code will also perform worse, especially compared to Intel instances where you have AVX-512.


Haven't read into this much yet, but the smaller L3 cache makes me sad. Benchmarking will show its true colors, but I'm hoping to move some DB clusters to Graviton2.


You're comparing fourth-generation storage to third-generation. Unfortunately, the I4i instances are not publicly available, despite being announced 4 months ago :-(

https://press.aboutamazon.com/news-releases/news-release-det...

https://aws.amazon.com/ec2/instance-types/i4i/


Interesting to see that it has higher throughput per $. That is the most important fact.


Yeah, these instance types look strong, and it seems that Amazon is also innovating in the local storage space, as I don't find many of the usual gotchas with SSDs there (long GC pauses, etc.).

We may be moving to a future where cloud hardware stops being "just OK", a low-maintenance version of what you could build yourself, and starts offering hardware and firmware that is unobtainable by mere mortals (we are already seeing this, e.g., with quantum computing options in the cloud, and arguably even GPUs and ML accelerators).


That's actually one reason why ARM has failed so hard in the server space. Nobody is going out of their way to get their code to run on unobtanium unless the cloud provider is an absolute giant and has a proven track record.

It's far more likely that someone will make the switch to ARM if their laptop or desktop is already ARM because then you aren't at the mercy of the server vendor deciding to only cooperate with a handful of cloud vendors or even dropping the project completely.


I agree, there's definitely some friction there and for random personal projects I for sure choose to host on the same architecture I develop on.

For large deployments, the calculation might be very different. Beyond that, plenty of people are developing on Arm Macs now.

> unless the cloud provider is an absolute giant and has a proven track record.

Right, well, this is Amazon AWS we are talking about?


It is still vastly cheaper to buy your own GPUs and ML accelerators than to rent from cloud providers for any reasonable level of utilization (i.e. not a hobby).


A lot of legitimate uses out there for occasional, important-when-it's-needed, on-demand compute that works out to very low utilization overall.


I'm excited on behalf of the scientists I know for cloud HPC.

I see a lot of "develop program locally on tiny subset of data" followed by "how the hell am I ever going to run the full model for the final run on all the data!?"

A couple hours of machine time leased on a behemoth is really the ticket, and not that expensive.


The cost of data egress from the cloud really limits its potential for most HPC applications, though. The "free" or reasonable-cost tiers are okay for small-scale toy use, but not much more than that.


For academia, all three major cloud providers have a data egress waiver, meaning that as long as data egress is less than 15% of the total bill, it is free.

Combine this with the fact that S3/Glacier make much more sense than local HDDs for long-term archiving and egress turns out not to be a significant factor in my experience.
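
To put a rough number on that 15% threshold (purely illustrative; this assumes a flat $0.09/GB egress rate, whereas real pricing is tiered and region-dependent):

    # Back-of-the-envelope sketch of the academic egress waiver described above.
    # Assumes a flat $0.09/GB egress rate (illustrative only; real pricing is tiered).
    EGRESS_RATE_PER_GB = 0.09

    def waived_egress_gb(monthly_bill_usd, waiver_fraction=0.15):
        # GB of egress that stays free while egress remains under the waiver fraction
        return (monthly_bill_usd * waiver_fraction) / EGRESS_RATE_PER_GB

    print(f"{waived_egress_gb(10_000):,.0f} GB")  # a $10k/month bill covers ~16,667 GB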


There's some truth to this, but for many sporadic jobs the cloud can be quite useful. My favorite applications are the streaming video services that have most of their traffic on Friday and Saturday night. You're right that the prices are too high for long, constant workloads, but not about this only being hobbies.


My point was that there are accelerators in the cloud which are not available to purchase by the general public and for a time the same applied also to GPUs in a practical sense (i.e., GPUs were being sold in principle but were out of stock for months on end).

Depending on your individual time value of money, it can make sense to rent even at a high rate so you can access this hardware now.


Of course, more perf/$ is the whole selling point of Graviton, so it would be a pretty big failure somewhere if that were not the case.


Yes, but I think one thing that is interesting here is that these Graviton 2 instance types also offer, for example, better IO throughput or storage/$ even though the CPU architecture doesn't obviously affect that (i.e., it's not like you get access to a new source of quality NAND chips to build SSDs when you switch your CPU).

Another thing I found interesting was that the porting process was painless for this native application. There are lots of examples of quick wins on Graviton for interpreted or bytecode/JIT'd languages where the runtime exists on Arm and the software just works, but porting a complex native application would seem to pose more problems. In our case, however, it was quite straightforward (more than I would have expected).


Perf is important, but predictability of the SSD perf is honestly just as (if not more) valuable, especially when you have to build SLAs for your products.


Does anyone have any insight into the Graviton 3? I'm fascinated by their single-precision floating point, which can be useful for some ML models.


It looks like a strong chip. They are pretty much jumping one or two x86 generations with every release, sort of like what Apple did for many generations in a row starting with the A6 or whatever.

One difference here is that Amazon isn't actually designing the cores: they are using Arm reference designs (in this case the Neoverse N1) and building around that.


Amazon bought Annapurna though - why aren’t they building from scratch?


They may do it at some point, but perhaps the path of least resistance so far is to use the existing core designs, which are not bad at all and are getting better, and to spend their effort on the uncore and off-core components, which don't have off-the-shelf designs just sitting there and which are very important for server parts?


Oh gosh, we are in April already and Graviton 3 was announced 4-5 months ago? I was hoping we'd have pricing and more performance figures by now.


cliff notes?

Site seems to have been hugged to death


Link works for me...

TL;DR, Arm instances give better bang for the buck...


Probably hosted on Graviton instances


Got to feel bad for Intel



