
Why do you think AWS is a poor fit for your workload? There are customers who churn petabytes in S3. If I were running a crawl and indexing operation, I'd put the bytes in blob storage like that and aggressively negotiate better pricing with Amazon.
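The mechanics of that are about as simple as storage gets. A minimal sketch with boto3; the bucket and key names here are hypothetical placeholders:

    import boto3

    s3 = boto3.client("s3")

    # One object per crawled document keeps reads and writes parallel.
    page_bytes = b"<html>...</html>"     # stand-in for a fetched page
    s3.put_object(
        Bucket="example-crawl-store",    # hypothetical bucket
        Key="crawl/example.com/index.html",
        Body=page_bytes,
    )

    # Pull it back later when you index.
    obj = s3.get_object(Bucket="example-crawl-store",
                        Key="crawl/example.com/index.html")
    page_bytes = obj["Body"].read()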

There are some pretty obvious reasons mainframes aren't the server format of choice for today's cloud. The people building for today's cloud grew up in a world where fast desktop computers running Linux were ubiquitous and cheap (important for poor students learning to ship code on a budget), while mainframes were something you couldn't easily get access to even while waving large sums of cash at IBM.

No doubt there could be warehouse-scale systems that were more efficient, overall, than the current designs the cloud providers use. Every bit of the stack could be squeezed to provide exactly the hardware you needed for a particular problem. But the economic incentives don't seem to exist at this time for a provider like Amazon (I imagine they make far more money on modest VM configs with no GPUs or InfiniBand than they do on the high-end stuff, even if the latter has a higher profit margin, because volume wins in cloud).

Multi-core, large-memory, SSD is just the latest architectural evolution. We've been pushing the bottleneck around between the various parts of the computer, and custom manufacturers have grown rich and then gone out of business producing systems with 30% more "X" than what you could buy top-of-the-line from Dell (Bull will sell you an x86 system with 24 TB of RAM!). Right now, because multicore and SSD got so big and fast, the pain point is having many separate memory spaces across machines rather than a single super-fast RAM you can access from any processor (the laws of physics and cache coherency suggest this is a pretty hard problem).




   > Why do you think AWS is a poor fit for your workload?
Every time I've done the math it comes out 10x as expensive, and I've done the math with pretty much everyone from the CTO down to the guy who is "sure we can do it cost effectively."

Generally, folks who have petabytes of data in S3 don't have large amounts of read/write churn. A typical file or image sharing site like Imgur or GitHub will be 'read mostly'. When you're doing search you crawl billions of documents, often replacing 30-40% of your store, and you are constantly re-reading them as you index and rank them. Further, as you process the data what you really want to do is push your computation out to the data rather than pull it over the wire, mutate it, and push it back over the wire.

Processing 1 petabyte of data over a 10 Gb/s backbone where you pull it and then push it back, running full duplex (so you're pushing and pulling at the same time), takes a million seconds at a gigabyte per second. That is 11 days, 13 hrs, 47 minutes, and that is what happens when you change 1/3 of a 3 PB data set. Pushing the computation into the data instead (which is to say, running the processors where your data is actually stored on disk), you can process a petabyte in about a day and a half, assuming your data distribution algorithm is good (and ours is). Roughly an eighth of the time.
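The arithmetic is easy to sanity-check; a back-of-the-envelope sketch using the same round numbers (1 PB moved at roughly 1 GB/s effective over a full-duplex 10 Gb/s link):

    SIZE_GB = 1_000_000      # 1 PB expressed in GB
    RATE_GB_S = 1.0          # 10 Gb/s / 8 bits, roughly 1 GB/s

    seconds = SIZE_GB / RATE_GB_S            # 1,000,000 seconds
    days, rem = divmod(seconds, 86_400)
    hours, rem = divmod(rem, 3_600)
    print(f"{days:.0f} days, {hours:.0f} hrs, {rem / 60:.0f} min")
    # -> 11 days, 13 hrs, 47 min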

If you want to coordinate 10,000 worker threads working through your data, you need to be able to pass messages between them; the messages don't have to be frequent, but their latency adds up if each one takes too long.
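To make that concrete, a toy calculation (every number here is invented for illustration) of how per-message latency turns into aggregate waiting across a fleet:

    WORKERS = 10_000
    ROUNDS = 1_000      # coordination messages per worker over a crawl

    # Compare a few plausible round-trip latencies.
    for latency_ms in (0.1, 1.0, 50.0):    # same rack / same DC / WAN-ish
        idle_s = WORKERS * ROUNDS * latency_ms / 1_000
        print(f"{latency_ms:>5} ms/msg -> {idle_s:,.0f} worker-seconds idle")

At rack latencies the waiting is noise; at WAN latencies it dominates the job.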

You end up asking for the same system built in the "cloud" that you've built in a colo: dedicated "fat" machines with lots of memory and disk, all within easy network 'shouting' distance of each other (aka a non-blocking full-crossbar-bandwidth network), without any confounding network traffic going over your backbone. And when you arrive at that inevitable conclusion, the actual loaded cost of the machine falls right out the bottom and lands in the customer's lap, including all the loaded-up margin costs. Three months or so ago (right after the last price war) we ran all the numbers again; Amazon would cost about $1.5M/month to let us do what we want to do.

Now, as I point out to cloud sales guys, and to you, this isn't "bad" or actually a problem; building search engines requires an extraordinary amount of horsepower to be brought to bear on a very large, very noisy data set. There is absolutely no rationale for making that configuration cost effective: it's an outlier, and the number of people who do it you can count on one hand. But it is the same reason people don't just pop open the AWS toolkit and poof, crawl and index billions of web pages :-)

   > volume wins in cloud
I don't agree with that. I think what 'wins' in the cloud is the ability to oversubscribe the hardware. Just like Comcast sells everyone on the street 50 megabits of internet knowing darn well that if more than a handful use that much at the same time everyone will throttle down, cloud providers sell you an 'instance' which probably spends most of its time not doing anything at all. And while yours isn't doing anything, someone else's is. That is the 'magic' that makes this stuff so profitable for Amazon. Not people like me who have 100 GB memory machines running at 85% utilization 24 hours a day[1]. It would be like everyone on the block signing up for high-speed internet and then every one of us downloading a copy of the entire Internet Archive :-) Not a likely situation, so it's rarely considered something the infrastructure needs to support.
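The arithmetic behind the Comcast analogy is worth sketching (numbers invented for illustration):

    SUBSCRIBERS = 100       # houses on the street
    SOLD_MBPS = 50          # what each subscriber is sold
    UPLINK_MBPS = 1_000     # what the shared uplink actually carries

    ratio = SUBSCRIBERS * SOLD_MBPS / UPLINK_MBPS
    print(f"oversubscribed {ratio:.0f}:1")                         # 5:1
    print(f"{UPLINK_MBPS // SOLD_MBPS} can run flat out at once")  # 20

The model only breaks when everyone runs flat out at once, which is exactly what a search cluster does.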

[1] To be fair, they don't do that continuously; crawls start and stop and we switch things around, but when they are in the fetcher/extractor phase and running at R3, it's a wonder to behold :-)


> what 'wins' in the cloud is the ability to oversubscribe the hardware

We're increasing compute exponentially, so it makes sense we'd want to oversubscribe it as much as possible. Demand is like a dog nipping at your heels.


AWS doesn't oversubscribe hardware the way you think.



