Please follow suit with SSD EBS pricing - 60TB should not be costing me 6k/mo.

kondro · on Jan 6, 2016

Why? It's got 3+ redundancy and is simple & fast to snapshot to geographically redundant backups.

mikeyouse · on Jan 6, 2016

Not OP, but you could literally buy 3 sets of 60TB hard drives every single month for less money. There should be some economy of scale here, no?

dangrossman · on Jan 6, 2016

I don't know what kind of hardware Amazon uses, but it looks like 60TB worth of any kind of SSDs starts at $24K on Newegg. If they're provisioned 3X, that's 12 months to pay off the drives.
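For anyone checking the arithmetic, here is a minimal sketch of that payoff estimate. It uses only the figures quoted in this thread ($24K for 60TB of SSDs, 3x provisioning, $6k/mo of revenue); the function name `payoff_months` is made up for illustration, and the calculation counts drives only, nothing else.

```python
def payoff_months(drive_cost_usd, replication_factor, monthly_revenue_usd):
    """Months of revenue needed to cover the raw drive cost.

    Counts drive hardware only: no servers, power, staff, or software.
    """
    return drive_cost_usd * replication_factor / monthly_revenue_usd


# Figures quoted in the thread: $24K per 60TB, provisioned 3x, billed at $6k/mo.
months = payoff_months(24_000, 3, 6_000)
print(months)  # 12.0
```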

iofj · on Jan 6, 2016

3x redundancy doesn't actually require you to triple storage: https://en.wikipedia.org/wiki/Hamming(7,4)

If you check out 3x redundancy, what does it get you? Well, you can correct any 1-bit error (not 2, because you wouldn't know which version is the correct one). Hamming(7,4) with column encoding gets you the same (better in some ways, even). So would you really be lying to your customers if you told them you gave them 3x redundancy while using column-encoded Hamming(7,4)? I'd say no, because it gets them the same thing: any disk can fail, and you can rebuild the data.

If you intend to serve your customers correct bitstreams in the case of bitflips on the disks, you'd need to read 2 disks even in the case of 3x redundancy, exactly the same as in the Hamming case. Of course, people might choose not to do that, but then you only have backups, not redundancy. What can go wrong with 3x replication reading from one disk is that your system updates the 3 disks based on information read exclusively from disk 1, which may turn out to be wrong data.

But Hamming only costs you 175% storage, not 300%. That brings it to ~7 months. And with precomputed lookup tables, Hamming decoding is far, far faster than reading from disk (even without them, I bet it would still beat it).
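To make the 175% figure concrete, here is a rough Python sketch of textbook Hamming(7,4): 4 data bits become 7 stored bits, and any single flipped bit can be located and corrected. It only illustrates the code at the bit level. The function names are hypothetical, the "column encoding" across disks described above is not modeled, and nothing here is known to reflect what Amazon actually does.

```python
def encode(data):
    """Encode 4 data bits as a 7-bit Hamming(7,4) codeword.

    Layout (1-indexed positions): p1 p2 d1 p3 d2 d3 d4.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def decode(codeword):
    """Correct at most one flipped bit, then return the 4 data bits."""
    c = list(codeword)  # work on a copy so the caller's list is untouched
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s3  # 0 means no error detected
    if error_position:
        c[error_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


# Flip one stored bit and confirm the data still comes back intact.
original = [1, 0, 1, 1]
corrupted = encode(original)
corrupted[4] ^= 1
assert decode(corrupted) == original

# 7 stored bits per 4 data bits is the 175% storage figure.
storage_overhead = 7 / 4
# Same drive-only payoff estimate as in the thread: $24K per 60TB, $6k/mo.
hamming_payoff_months = 24_000 * storage_overhead / 6_000
print(storage_overhead, hamming_payoff_months)  # 1.75 7.0
```

Note that a second simultaneous bit flip would be miscorrected by this decoder, which matches the "1 bit, not 2" caveat above.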

Another huge advantage Amazon has is that EBS means they don't have to allocate SSD space unless a customer actually uses it, not just when the customer reserves it (and customers pay for it when reserving it). So in practice you do what? 100% overprovisioning is prudent? Let's say compression, given that these are mostly operating system images, gets you another 30-50% or so. If they dedupe, they could get far more.

On the other hand, the Newegg figure doesn't include power to actually use those disks (SSDs are cheap though). Amazon of course doesn't pay anywhere near full price there either. Then, actually putting stuff onto EBS ... Amazon charges for that. And of course, Amazon needs to develop a lot of software to make this happen. So there are various other things not counted here.

So hardware costs for Amazon would be at most 2-3 months or so until they're repaid, no more.
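One way to read the chain of estimates above, sketched in Python. Everything here is an assumption layered on the comment's own guesses: "100% overprovisioning" is taken to mean selling twice the physical capacity, and compression is taken to save 30-50% of the remaining space. The names are hypothetical and none of the inputs are measured numbers.

```python
def adjusted_payoff_months(base_months, sold_per_physical_tb, compression_savings):
    """Scale a drive-only payoff estimate by thin provisioning and compression.

    base_months: payoff with every sold TB fully backed by hardware.
    sold_per_physical_tb: TB sold per TB actually backed (assumed 2.0 here).
    compression_savings: fraction of space saved by compression (assumed).
    """
    return base_months / sold_per_physical_tb * (1 - compression_savings)


base = 7.0  # the ~7 month Hamming-based figure from the comment above
optimistic = adjusted_payoff_months(base, 2.0, 0.50)
pessimistic = adjusted_payoff_months(base, 2.0, 0.30)
print(round(optimistic, 2), round(pessimistic, 2))
```

Under those assumptions the estimate lands at roughly 1.75 to 2.45 months, in the same ballpark as the "2-3 months" claim, but the inputs are guesses and the costs listed above (power, ingest, software) are still left out.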

acdha · on Jan 6, 2016

You're leaving out things like ops & security staffing, not to mention the physical hardware other than disks: you need servers, cabinets of disks, etc., all of which need to be purchased, monitored and replaced just like everything else.

An SSD might use less power but you need more of them and the rest of the storage server won't change at all.

Finally, I'd love a citation for any compression + dedupe savings at the level you're seeing for large heterogeneous deployments, not to mention reliable performance at their scale.

hrez · on Jan 6, 2016

As an example, purestorage.com is all-SSD storage with inline compression/dedupe and quotes a 5-10x reduction on VMs.

acdha · on Jan 6, 2016

Do they specify the VMs tested? AWS has a lot of different versions of things in play and most of the people I know fall into two camps: fairly generic VMs running compute jobs, which probably would compress well, and VMs running huge databases / image farms / etc. which do not. By VM count I'm sure the former dominate but by total storage consumption I think the latter wins – I would, of course, love to see if anyone has hard data.

hrez · on Jan 6, 2016

AWS doesn't have that many OS versions. Besides, I'd expect huge datasets to go either to S3 or to ephemeral storage (Cassandra clusters, etc.). EBS isn't the best place for them. Your mileage may vary.

mikeyouse · on Jan 6, 2016

Yeah, my bad. I read past "SSD" in the initial post; I was looking at spinning drive prices.

mkching · on Jan 6, 2016

A storage server and network to house those drives, with functionality similar to EBS, would have significant costs and require some degree of management.