Has Amazon EC2 become over subscribed? (2010)

trotsky · on April 22, 2013

I was trying to figure out how this guy could be so behind the times - just discovered internal network latency at AWS?

Then I saw:

Published: 9:01 AM GMT, Tuesday, 12 January 2010

ultimoo · on April 22, 2013

Yes, the year of the blog should definitely be present in the HN title.

That said, what is the deal with the internal network latency and the other points the OP touched upon today? Did Amazon get around to resolving things, or have things worsened?

seldo · on April 22, 2013

My non-rigorous impression is that latency is roughly the same now as it was in 2010, i.e. not great, but at least it hasn't been getting worse.

redact207 · on April 22, 2013

Have there been any improvements since then?

codewright · on April 22, 2013

No, it's still common to have in-house "find a good node" code that fires up and takes down machines until one that will be reliable is found.
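A minimal sketch of what such "find a good node" code might look like, purely for illustration. The `launch`, `benchmark`, and `terminate` callables and the `min_score` threshold are hypothetical stand-ins for whatever provider calls and benchmark a team actually uses; none of them are real AWS APIs.

```python
from typing import Callable, Optional, TypeVar

Node = TypeVar("Node")


def find_good_node(
    launch: Callable[[], Node],
    benchmark: Callable[[Node], float],
    terminate: Callable[[Node], None],
    min_score: float,
    max_attempts: int = 10,
) -> Optional[Node]:
    """Launch nodes until one benchmarks at or above min_score.

    Rejected nodes are terminated. Returns None if every attempt fails.
    """
    for _ in range(max_attempts):
        node = launch()
        if benchmark(node) >= min_score:
            return node
        terminate(node)  # hand the slow instance back and try again
    return None
```

The `max_attempts` cap is a design choice in this sketch: without it, the loop could keep launching (and paying for) instances indefinitely when no node clears the bar.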

wmf · on April 22, 2013

Previous HN discussion: https://news.ycombinator.com/item?id=1048694

t0 · on April 22, 2013

Can you enlighten us? Why is it so slow? Are some people abusing all of that free bandwidth? Is their LAN simply not built to handle much traffic?

trotsky · on April 22, 2013

I don't have any inside information, but I would guess it's because their SAN is so sketchy that everyone rolls their own DFS and uses other chatty replication strategies, leaving the network running data and disk on the same topology. These days a lot of that traffic will take 4 hops within an AZ.

dotBen · on April 22, 2013

I'm loath to perpetuate a 3-year-old article, but...

One of the key contributing factors to this kind of network degradation at AWS (or any other cloud vendor) is the abundance of the "bad neighbor test" - where a client performs tests to see if they can achieve a 'preferred' amount of CPU/IO on the host of their new instance.

Resource sharing rules at the host level actually mean that if everyone is trying to max out their instance, you would still get the equal share you are entitled to and guaranteed with your instance, so what the bad neighbor test really checks is whether you can actually go into your neighbor's CPU allocation due to their under-use.

Well, if everyone does that then the system degrades, as someone has to be the party using less than their allocation, and the number of instance spots that don't fail the 'bad neighbor test' becomes non-existent.

The overall health of the entire network would actually be better if folks didn't do this practice and instead everyone simply evened out their use across their instances that enjoy additional resources and stuck it out with the %age of their instances that only achieved their guaranteed minimum resource use and no more.

My company uses another "cloud-like" vendor and although we don't perform 'bad neighbor' tests upon new instances, it is fair to say our application benefits from the fact that the majority of the instances on their network are under-utilized and we can push into the max CPU of the host beyond the limits we pay for. Where instances do share 'bad neighbors' (ie we can only get what we paid for and no more - boo hoo, etc) we still keep the instance but simply route around that and distribute less load than on other nodes in our network.

That doesn't become the most cost effective mechanism, but the "savings" of the 'bad neighbor test' are probably negligible and ironically by not doing this we become the "good neighbors".
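The "route around it and distribute less load" approach described above could be sketched roughly as follows. This is only an illustration under assumptions: the per-node capacity numbers are hypothetical measurements, and the node names and helper functions are made up for the example rather than taken from any real load balancer.

```python
import random
from typing import Dict


def routing_weights(capacity: Dict[str, float]) -> Dict[str, float]:
    """Turn per-node measured capacity into routing weights that sum to 1."""
    total = sum(capacity.values())
    if total <= 0:
        raise ValueError("at least one node needs positive capacity")
    return {node: cap / total for node, cap in capacity.items()}


def pick_node(weights: Dict[str, float], rng: random.Random) -> str:
    """Pick a node at random, in proportion to its weight."""
    nodes = list(weights)
    return rng.choices(nodes, weights=[weights[n] for n in nodes], k=1)[0]


# Hypothetical measurements: "b" sits on a quiet host and can burst to 3x
# what "a" (stuck at its guaranteed minimum) can handle.
weights = routing_weights({"a": 1.0, "b": 3.0})
```

Here the constrained node keeps serving a quarter of the traffic instead of being terminated and relaunched.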

jamesaguilar · on April 22, 2013

Actually, if the system is well-isolated and correctly subscribed, at worst you'll get exactly your reservation. It is not necessarily the case that someone has to lose.

dotBen · on April 22, 2013

Right. The problem is that a 'bad neighbor' test is often defined as when one can only get that guaranteed minimum amount of resource.

That's the part that doesn't scale.

michaelt · on April 22, 2013

The amazon documentation doesn't let you pin down exactly what that guaranteed minimum is, as half the measures are things like 'IO Performance: Moderate' or 'Compute Units: 4' or aren't specified at all (like EBS performance)

Makes sense from Amazon's perspective, of course - fewer promises to keep, more flexibility.

Can't blame people for measuring the performance empirically, in the absence of hard guarantees. Just that produces results that happen to be wrong.

nucleardog · on April 22, 2013

Compute units are actually a quantitative unit. One compute unit is, to the best of my recollection as I'm on mobile, the equivalent work of a specific class of 1.7GHz CPU.

Other than the "I/O Performance", I think all of their specs are pretty well defined if you're willing to dig up the appropriate docs.

michaelt · on April 22, 2013

I suppose it depends on whether your application depends on ephemeral disk random or sequential I/O, EBS I/O, I/O to the internet at large, cpu cache, ram bandwidth, support for AVX instructions and so on.

To be fair it's understandable why Amazon doesn't promise these features will or won't be present - it would make their already-complicated product offering even more complicated. And for a great many applications, customers won't be sensitive to details like CPU cache and disk performance.

jamesaguilar · on April 22, 2013

Seems like the simplest solution is simply to pay for what you need. How many businesses are really compute-cost-bound these days?

ultimoo · on April 22, 2013

Since you are well-versed in this issue -- I know that Amazon offers tenancy options while creating virtual machines. Is this option not utilized because the price of single-tenancy is higher than the headache of the "bad neighbor test"?

dotBen · on April 22, 2013

So as I mentioned we don't use EC2, we use another cloud service, but the economics and technical issues are the same.

But on EC2 Dedicated Instances (i.e. single tenancy), my guess is that if your application (or business model) relies on each of your nodes being able to utilize more than your equal share on the host, then in fact you would NOT want more than one of your instances to exist on the same physical host, in order to maximize the chances that each instance can grab all of the resources on its given host.

If this is your model, having all your instances on the same physical host would be disastrous. In fact, there's an (economic) argument for Amazon offering customers the complete opposite - pay to guarantee that no two instances are ever instantiated on the same physical host.

niggler · on April 22, 2013

Is there any scale (like xlarge) where you are effectively the only VM on a particular physical machine (thereby obviating the issue with the small instances)?

staunch · on April 22, 2013

...and here's part of why I created Uptano.

https://uptano.com

The flexibility of usage-based billing and instant provisioning is awesome, but it's really not worth giving up dedicated performance IMHO.

jamesaguilar · on April 22, 2013

I upvoted because I love HN comments that offer solutions. That said . . . how can you charge half as much as Amazon for the same service and make a profit? I assume their margins are not that fat, so what are you cutting that they offer? Or else what secret have you discovered that no one knows? I guess since this is your business there's a chance that you won't answer, but I'm curious.

RyanZAG · on April 22, 2013

AWS uses high end, expensive enterprise grade parts (I believe), meanwhile uptano is likely using standard off the shelf parts, likely with bulk account discounts. Each of those servers could likely be put together for $500 - at $100/month, he should be making very good margins after a year or so. I have no idea on rent/electricity/etc, but with enough 1U servers the total cost per server may be only $10/month or so.

The big costs you pay for on AWS are the engineering, networking and UI development. Dedicated servers should be easier to provision and manage and he probably has a much smaller team.

So theoretically, it is possible that his prices are half as much as Amazon and he still makes a decent profit.
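As a rough check of that estimate, here is a small sketch using only the guessed figures from this comment ($500 of hardware, $100/month price, about $10/month in running costs). These are the commenter's assumptions, not real numbers from either provider, and the helper name is made up for the example.

```python
def payback_months(hardware_cost: float, monthly_price: float, monthly_opex: float) -> float:
    """Months of rental needed before a server's hardware cost is recovered."""
    monthly_margin = monthly_price - monthly_opex
    if monthly_margin <= 0:
        raise ValueError("price must exceed operating cost to ever pay back")
    return hardware_cost / monthly_margin


months = payback_months(500.0, 100.0, 10.0)
print(f"{months:.1f} months")  # 5.6 months
```

Under those guesses the hardware pays for itself in about half a year, ignoring staff, networking, and financing costs.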

mwfj · on April 22, 2013

More like giant margins...

http://blog.carlmercier.com/2012/01/05/ec2-is-basically-one-...

fooyc · on April 22, 2013

Amazon reserved instances cost 2 to 10 times the price of renting a private dedicated server at your random dedicated server hosting service.

Compare yourself: https://www.ovh.com/us/dedicated-servers/

oellegaard · on April 22, 2013

That service looks absolutely awesome. Just signed up and had a look - but I miss a major thing: API.

staunch · on April 22, 2013

The system is built on an internal API. We just need to expose a version!

coolsunglasses · on April 22, 2013

I've been looking for this for...ages...

Thank you...so much for posting this.

eurleif · on April 22, 2013

This still uses virtualization though, correct? So you don't get full dedicated performance?

staunch · on April 22, 2013

We're using OpenVZ for this reason. It's very close to bare metal performance, with the advantages of virtualization.

ceejayoz · on April 22, 2013

You can pay a (hefty) fee for dedicated tenancy, where your EC2 instances only share servers with other instances in your account.

The largest instances (the quad- and octuple-extra-large) likely are on their own servers but I've never seen that explicitly confirmed anywhere.

DrStalker · on April 22, 2013

You can pay extra for dedicated Instances: http://aws.amazon.com/dedicated-instances/

> Dedicated Instances are Amazon EC2 instances launched within your Amazon Virtual Private Cloud (Amazon VPC) that run hardware dedicated to a single customer.

eurleif · on April 22, 2013

$10/hour per region? Wow.

niggler · on April 22, 2013

To be fair, though:

"An additional fee is charged once per hour in which at least one Dedicated Instance of any type is running in a Region."

So it's not like they are charging $10/hr/instance (so it amortizes over all of the dedicated instances)

jamesaguilar · on April 22, 2013

Still, that's 100k a year, not including the cost of the actual instances. Not an option for smaller services unless your stuff simply can't be made to work without dedicated instances.
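For reference, the arithmetic behind that figure can be sketched as below, assuming the $10/hour per-region fee quoted above is charged for every hour of a non-leap year. The function name is just for illustration.

```python
HOURS_PER_YEAR = 24 * 365  # non-leap year


def annual_region_fee(fee_per_hour: float, hours: int = HOURS_PER_YEAR) -> float:
    """Yearly cost of a flat hourly fee that is charged around the clock."""
    return fee_per_hour * hours


print(annual_region_fee(10.0))  # 87600.0
```

So the flat fee alone comes to a bit under $90k per region per year, which is in the same ballpark as the round "100k" figure.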

eurleif · on April 22, 2013

I doubt it would ever make sense to use this over dedicated servers, except at a large scale.

Xorlev · on April 22, 2013

c1.xlarges are thought to be effectively the only VM on a machine. I can't dig up the article that proposed the methodology for determining offhand.

c1.xlarges still perform far below what I'd expect.

sliverstorm · on April 22, 2013

Are VMs ever really performant?

Xorlev · on April 22, 2013

Compared to the 26 ECU m3.2xl, the 20 ECU c1.xl underperforms it by 3x (basic web, JSON struct) in my tests.

corresation · on April 22, 2013

Modern hypervisors have an overhead as low as 2%, so yes, they absolutely are. It is often worth virtualizing simply for the flexibility/management that it brings, even if it's a single virtual machine on a very large box.

LogicX · on April 22, 2013

Somewhere between uptano and EC2 is internap (formerly Voxel.net) agileservers: http://www.internap.com/agile/

rattray · on April 22, 2013

Is there a cloud provider that doesn't (yet) have these issues?

Is it possible to run any cloud service at Amazon's scale without these issues?

(genuine questions speaking from a point of ignorance)

rdl · on April 22, 2013

Some of the OpenStack providers run their own storage networks using conventional SAN tech. Super expensive but more consistently performant.

dmpk2k · on April 22, 2013

My experience with SANs is that they are anything but consistent. Local storage is a better idea: fewer moving pieces to go wrong, fewer moving pieces to understand and debug, fewer possible sources of contention, and the latency is low.

SANs in a cloud environment optimize for the wrong thing. Servers by and large have a high uptime -- since their falling over is comparatively rare, this is simply a problem I've never had difficulty with. What I have had in spades, before I learned better, were database problems due to wild fluctuations in latency to the SAN.

It doesn't help that when SANs kick the bucket, they tend to affect a lot of things.

rdl · on April 22, 2013

The context where SANs make sense, IMO, is when you've got a few servers which need to share stuff (VMs, or whatever). So, essentially everything can fit on one $10k 10GE switch. I've personally never screwed with anything >800TB, too.

Rather than "strictly local storage", I'd say "keep storage as local as possible", but there are absolutely times where keeping it in-chassis isn't optimal.

stonith · on April 22, 2013

There are some using Ceph for the volume service who shouldn't be terribly expensive. Dreamhost for example.

joefarish · on April 24, 2013

Which providers are they? Do you have any experience of using them?

rbc · on April 22, 2013

One thing that I think may be oversubscribed at EC2 is the API layer for controlling things like instances, ELBs and autoscaling. This seems to be most obvious in Virginia. During the lightning storm in Virginia last summer, it seemed like API access fell off a cliff. I'm guessing that was because everyone was trying to move their services from the impacted availability zone.

JoshGlazebrook · on April 22, 2013

I've had these issues with the micro instances before. One deployment will be so sluggish it's almost unusable, but starting up another one results in one that is just fine (micro wise).

rabbidruster · on April 22, 2013

While I understand computer scientists are not always the best writers, phrases like "Amazon do have a breaking point" make it hard to continue reading this article.

pcl · on April 22, 2013

In British English, a company is plural, not singular. So this construct is correct in some dialects. And he also uses 'armour', which is again a British construct, so I'd guess he's just not using American English.

(I wonder if British law considers a corporation to be a person to the degree that US law does, or if this plural view of a corporation is pervasive in law as well as in grammar.)

mpclark · on April 22, 2013

This is oft-repeated, but not actually true. Most people in the UK who care about these things (for instance, sub editors in the printed press) hold that companies are singular entities. The confusion comes with sports teams, which are commonly referred to in the plural, so "Microsoft is" but "Manchester United are".

Edit: For example, here's what the Guardian's style guide says on the matter:

http://www.guardian.co.uk/styleguide/c#id-3022716

Of course, with something as flexible and constantly evolving (and as used and abused) as the English language, it is usually possible to find examples and counter-examples for just about anything. There are also edge cases; the Guardian refers to police forces as plural entities, I believe. Suffice to say, when I was running a back bench, singular was the order of the day when it came to company names.

omaranto · on April 22, 2013

I don't think sub-editors are a compelling example: people in those positions are likely to ignore the language they learned growing up and stick instead to some made up rules they believe are 'more proper'. I have definitely heard British people say 'my bank are' and 'ASDA are'.

mpclark · on April 22, 2013

Yes, of course you have heard people commit these errors. Others say "haitch" for the letter 'H' and still others talk about "nu-cu-lar power". We ain't all educated proper, that's for sure.

I'm intrigued that you are ready to rule out the contributions of a class of people who manipulate the written word for a living (and debate usage among themselves to the point of distraction!)

ASDA? Well, I'd use "Asda" since it is pronounced as a word, not four initials. But the company itself still seems to be struggling for consistency on that...

diroussel · on April 22, 2013

I'm British, and I always think of groups of people as groups of people.

But then I never won any awards for grammar.

mpclark · on April 22, 2013

Are you trying to say you disagree but you think you're probably wrong?

diroussel · on April 24, 2013

I don't think it's a case of right or wrong. It's a writing style.

But for me, when speaking and writing, I consider companies to be groups of people, and so I use plural. And I think that is common over here in Britain.

codeulike · on April 22, 2013

I'm British ...

That should be explanation enough.

sophiebits · on April 22, 2013

If it's the plural conjugation you're complaining about, know that that's correct in British English: http://english.stackexchange.com/a/1339/50.

rabbidruster · on April 22, 2013

Thanks for the link. It sounded extremely weird to me. I didn't know corporations were often plural nouns in British English.

eru · on April 22, 2013

For some more fun with American English, try http://fine.me.uk/Emonds/

asveikau · on April 22, 2013

Interesting link.

I'm not a linguist but parsing these sentences as an American reader, I feel a bit like this page's examples are playing games with word order and omission. The cited "prestige" grammar sounds less intuitive to me not because of the pronoun used, but because of word order and words that are left out.

Example cited as correct prestige grammar:

> They didn't give anyone that worked less than she a raise.

That sounds a little weird to my American ears, but "worked less than she did" sounds totally correct.

"Worked less than her" (cited as correct non-prestige) sounds a bit casual and informal, not sounding too jarring but not what I'd expect in decent writing. Similar to the other example of "us commuters". If I'm talking to someone I wouldn't blink if they said this, but I wouldn't see it in the New York Times. (Though this also reminds me of phrases like "me too" or "it's me", which despite being inconsistent with distinctions between subject and object in other phrases, you'd hear a lot more than "I as well" or "it is I".)

> Mary and him are late.

Sounds very wrong to me.

Thinking back to my childhood it was pretty common for kids to be a bit "confused" about using pronouns this way before 10 years old or so, so maybe there is something to the author's statement that kids learn the non-prestige form and then the educated ones are "corrected" later.

> Mary and he are late.

This still sounds weird. I'd say "he and Mary are late".

> her and us

> she and we

These sound pretty clumsy regardless of which is supposed to be used.

johndonsu · on April 22, 2013

Your arrogance is astounding. This is why people hate Americans.

An English person, writing English in an English way, and you say you're not going to continue reading. Do you require him to write like an American? Why should an English person, writing their own language, have to follow your conventions?

"I didn't know"

Well then don't start shooting your mouth off! If you don't know, keep quiet.

rabbidruster · on April 22, 2013

I had a hard time reading that phrase, and I thought it was because he didn't spend enough time editing the article. I like to think I am a cultured person, but I truly didn't know it was grammatically correct in British English. Don't judge all Americans just because I am naive.

detst · on April 22, 2013

Your misunderstanding and dramatic response is astounding. It's ignorance, not arrogance. Say what you want about his reaction to the perceived poor grammar but it's nothing more than that.

johndonsu · on April 22, 2013

If he didn't understand why it was written like that, how about trying to find out why, before posting a snarky message?

mpyne · on April 22, 2013

For the same reason you didn't bother to figure out whether it was ignorance or malice on his part before jumping to your own convenient conclusion, I would imagine.

rabbidruster · on April 22, 2013

Again, sorry I offended you. I won't make this mistake in the future.

dasil003 · on April 22, 2013

Wow you sound extremely angry. There must be a deeper issue at play here because this is way out of proportion to the comment you are responding to.

pinneycolton · on April 22, 2013

It's common in the UK to use "do" when referring to a collective. The author is from the UK.

philwelch · on April 22, 2013

I'm curious what you find wrong with that sentence.

icedchai · on April 22, 2013

We're from America. We speak Murican.

mattdeboard · on April 22, 2013

Might have been using it like "The cattle do graze", or "Black Sabbath do have a new album coming out".

mynameishere · on April 22, 2013

Come on. Quit downmodding people for a common misunderstanding. To any American "Amazon do" really does sound off.