
I'm not in the web or cloud business, but I've filled a rack with my stuff before. My impression is that hardware has become a lot more capable even relative to its tasks. With high-IOPS storage, many cores, and obscene amounts of RAM, I would expect that companies of a much larger scale (in $, FTEs, or most other metrics) can be served by one 4U machine, or by one rack, or by one room. Thus I would expect the knowledge of how to handle 5000 hard drives to become more obscure, naturally, but the skill of running a decently sized web application to remain almost constant.

Does this math work out, or have the tasks become more demanding at the same speed that hardware has improved?




IMO your assertion is validated by the excellent overview of Stack Overflow's infrastructure given here:

https://nickcraver.com/blog/2016/02/17/stack-overflow-the-ar...

Very few web apps will ever serve as much traffic as SO.


SO doesn't have a very operationally complex app.

A bank running 50 different services, on different platforms, with serious audit requirements, physical and logical access control, strict change and configuration management, etc., has two orders of magnitude more complexity. And that shit is very expensive in manpower.


"Very few web apps will ever serve as much traffic as SO."

Their traffic is like 80-90% reads and they actually hire good devs and let them work on perf.

Neither of those things are true in typical companies.


There are now businesses that explicitly depend on the elasticity of the cloud and can never really be moved on-premises without a massive up-front investment in hardware that may only be used a few times a year for their biggest customers. Trying to hybridize these workloads hasn't been very successful so far. It is possible that K8s could relieve this problem, but I haven't seen it in practice at scale.


Instant elasticity in the cloud is a myth. If you think you are going to get 1k hosts from AWS just like that, you will have an unpleasant experience.

I work at a decent-sized tech company and we are split between cloud and on-prem. From our experience, you have to inform AWS/GCP in advance (sometimes way in advance) if you are looking to meaningfully increase capacity in a zone/region.

Sure, auto-scaling a few hundred hosts may be possible, but people who run a service that needs a few hundred hosts won't run it directly on AWS; they will run it on some kind of scheduler plus resource manager that keeps an operational buffer anyway (as in, you would already have those hosts, so cloud elasticity is not a factor here).
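The "operational buffer" idea above can be sketched as a back-of-envelope calculation; all numbers here are made up for illustration:

```python
# Back-of-envelope version of the operational buffer: if you already
# keep headroom over peak demand, short spikes never depend on the
# cloud's provisioning speed. All numbers are illustrative.
import math

def hosts_to_provision(peak_hosts: int, buffer_fraction: float) -> int:
    """Hosts to keep warm: peak demand plus an operational buffer."""
    return math.ceil(peak_hosts * (1 + buffer_fraction))

# A service peaking at 400 hosts with a 20% buffer keeps 480 warm, so an
# unexpected spike to 460 is absorbed without any scaling API call.
warm = hosts_to_provision(400, 0.20)
```

The point is that the buffer, not the provider's API latency, is what actually absorbs spikes.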


How early is "way early"? Because as long as it's shorter than the two to three weeks it'd take to order boxes, rack them, provision them (which would be automated but might still take an afternoon), and deal with any QA hiccups... I'd much rather call my AWS rep and say "can we add 30% by Thursday" and have them figure it out (and at such a large scale you might be able to spread it out across a couple of regions anyway, unless you only serve a specific part of the world).


From what I have seen, the lead time is actually on the same order, or sometimes longer. In one region/zone we add a few hundred hosts every week, but that is after telling them we plan to scale up in that region to some big X number.


"Instant elasticity in cloud is a myth"

This times a million. I think SQS standard queues are probably the only thing that, in my experience, actually fulfills that promise.


This is the same with disaster recovery too. The idea of "oh, our main DC went down, we'll just spin it up in another region" is great until you realize that means you need reserved instances in another zone that, just like another physical DC, you won't be using.


Why not go fully on-prem then? You can run kubernetes locally.

Are managed data stores that attractive? You can pay for on-prem management.

What workloads are in the cloud versus on-prem?


Right now there is no specific distinction between what we run in the cloud vs. on-prem. The important thing to note is that we use the cloud as IaaS only. We have our own stack that prepares hosts before they are ingested into clusters as usable capacity.

We actually recommend not using cloud providers' managed databases or any other value-added services.

The decision not to go completely one way (on-prem vs. cloud) was made way before I joined the group, but I think the main reason is to have a tactical edge in the long run by avoiding lock-in. I guess in some ways it also helps us negotiate pricing.

Imagine moving a certain workload from a GCP region to an AWS region as part of a failover drill.


As these are generally scheduled events (end of year, end of quarter, etc.), they can be planned. Beats owning the machines.


Elasticity? Fine. So their, say, single rack will sometimes have limited load and be under-utilized.

About the up-front investment - most hi-tech companies already involve a massive initial up-front (or nearly up-front) investment.


I was talking about at scale, not a rack. If you can get by with a rack, you will pay more for the people to support it than the incremental cost of the cloud.


> If you can get by with a rack, you will pay more for the people to support it than the incremental cost of the cloud.

Probably a whole lot less.

At larger scale, I would guess it's the same thing. If an organization needs more than a rack during peak use, it can probably benefit from setting up its own infrastructure. Only in the uncommon case of short, extreme peaks and almost no use most of the time does such elasticity make a cloud solution attractive. IMHO.


That is very common with the infrastructure startups that I work with, like Snowflake and others.


Don't the various clients even out the usage?


> My impression is that hardware has become a lot more capable even relative to its tasks.

Indeed. The margins are bonkers high. As an example, the amount of RAM you can stuff into a physical machine has at least doubled in the last five years, but the price of the average virtual machine has not.
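The margin math sketched above, with hypothetical prices just to show the shape of the argument:

```python
# Sketch: if RAM per box doubles while the VM's sticker price stays
# flat, the provider's cost per GB roughly halves but revenue per GB
# does not. All prices below are hypothetical.

def cost_per_gb(box_price: float, gb_per_box: int) -> float:
    return box_price / gb_per_box

five_years_ago = cost_per_gb(10_000, 512)   # assumed $10k box, 512 GB
today = cost_per_gb(10_000, 1024)           # same budget, double the RAM
margin_gain = five_years_ago / today        # provider cost per GB halved
```

Under these assumed numbers the provider's cost per GB drops 2x while the customer keeps paying the same.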


You still want HA, failover, and disaster recovery. Then you need to set up stuff like BGP, DNS, security rules, etc. Complexity mounts pretty quickly.


Indeed. It seems that most of the people saying that cloud hosting is expensive have never run into the issues of making their own SAN, managing the provisioning of 20 different teams, etc.

The organizational complexity and specialist knowledge are mind-boggling, and there is zero chance that your in-house knowledge is better than what Amazon can provide.


This is true, but unrelated.

We're talking about Dropbox scale.

At that scale you can (nay, should) hire all the specialists you need.


Installing rack servers and setting up services to run a site used to be a sort of rite of passage 15-20 years ago, but that period of the web was different. Still, I would consider basic familiarity with the infrastructure necessary today as well.

Increasing hardware performance relative to task load created the rationale for virtualization. Virtualization also turned out to be rational with respect to consistency, convenience, maintenance, and so on. At that point, outsourcing to a cloud can be rational too.

But fewer people get hands-on experience with the infrastructure, and it sounds like many consider it almost mythical - for example, the amount of work that can be done in 4U today. What does Amazon charge for 96 cores and 256GB?
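One way to frame that question is a back-of-envelope monthly comparison. Every price below is a placeholder I made up, not a quote; check current list prices before drawing conclusions:

```python
# Rough monthly cost for a ~96-core / 256 GB machine: a cloud VM at an
# assumed on-demand rate vs. an owned 4U box amortized over three years.
# All dollar figures are hypothetical placeholders.

HOURS_PER_MONTH = 730  # average hours in a month

def cloud_monthly(hourly_rate: float) -> float:
    """Monthly cost of an always-on cloud VM at a given hourly rate."""
    return hourly_rate * HOURS_PER_MONTH

def onprem_monthly(server_price: float, amortization_months: int,
                   colo_and_power_per_month: float) -> float:
    """Amortized hardware cost plus recurring colo/power."""
    return server_price / amortization_months + colo_and_power_per_month

cloud = cloud_monthly(4.00)              # assumed ~$4/hr on-demand rate
owned = onprem_monthly(25_000, 36, 500)  # assumed $25k box, $500/mo colo
```

The sketch ignores staff time, which is exactly the cost the rest of the thread is arguing about, so it's a lower bound on the on-prem side.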



