I'm not saying what you think I am saying. The thread was about how the cloud is immensely profitable, and I'm saying that a good chunk of that is built on waste and monetization of programmers' naive tendencies to overcomplicate problems.

I am not arguing that there are no great use cases for these systems. But I would be willing to bet that those are less than half the total load.

It's like big trucks: how many people who drive big trucks actually need them? Personally, I like my company's Prius of an infrastructure. :) And of course we've architected it so it can become a fleet, or even an armada, of Priuses if need be, with maybe just a bit of work; if we get there, I'll be happy to have that problem.




If availability and scale are not important, and you can tolerate having to engage a human in the event of a hardware failure, then sure, a $20 VPS might suffice. You could also just run a single virtual machine in one zone in the cloud.

But I think you might underestimate the number of use-cases that legitimately benefit from and desire a greater degree of reliability and automation. When one of my machines dies, I don't want to be notified, and I don't want to have to do anything about it; I want a new virtual machine to come online with the same software and pick up the slack. Similarly, as my system's traffic grows over time, I want to be able to gradually add machines to the fleet to handle the load, or even instruct the system to do that for me.
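Concretely, on AWS that's about one API call. Here's a rough sketch using boto3; the group name, template name, and sizes are all made up, and the launch template is assumed to already exist:

    import boto3

    autoscaling = boto3.client("autoscaling", region_name="us-east-1")

    # Keep 3 healthy instances running at all times: if one dies, a
    # replacement boots from the same launch template automatically.
    autoscaling.create_auto_scaling_group(
        AutoScalingGroupName="web-fleet",  # hypothetical name
        LaunchTemplate={"LaunchTemplateName": "web-template"},  # assumed to exist
        MinSize=3,
        MaxSize=12,
        DesiredCapacity=3,
        AvailabilityZones=["us-east-1a", "us-east-1b"],
        HealthCheckType="EC2",
        HealthCheckGracePeriod=120,
    )

    # And for "instruct the system to do that for me": target tracking
    # grows or shrinks the fleet to keep average CPU near 50%.
    autoscaling.put_scaling_policy(
        AutoScalingGroupName="web-fleet",
        PolicyName="cpu-target",
        PolicyType="TargetTrackingScaling",
        TargetTrackingConfiguration={
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": 50.0,
        },
    )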

Plenty of use-cases may not require this, but I'm not convinced that describes the majority of systems in the cloud. Every system benefits from reliability, and it's great to get it cheaply and in a hands-off way. In the cloud, I can build a system where my virtual machine runs on a virtual disk, and if there's a hardware failure, my VM gets restarted on another physical machine and keeps on trucking without my involvement. As an engineer and scientist, I can accomplish a lot more with a foundation like this: I can build systems that require nearly zero maintenance and management to keep running, even over long time scales.
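On AWS, for example, that "keeps on trucking" behavior is a single CloudWatch alarm (again a boto3 sketch; the alarm name and instance ID are placeholders). A failed system status check fires the built-in recover action, which restarts the same instance, with the same EBS-backed disk, on healthy hardware:

    import boto3

    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

    # If the host hardware fails its system status check for 2 minutes,
    # the built-in "recover" action restarts the instance (same instance
    # ID, same EBS volumes) on another physical machine. No pager duty.
    cloudwatch.put_metric_alarm(
        AlarmName="auto-recover-web-1",  # hypothetical name
        Namespace="AWS/EC2",
        MetricName="StatusCheckFailed_System",
        Dimensions=[{"Name": "InstanceId",
                     "Value": "i-0123456789abcdef0"}],  # placeholder ID
        Statistic="Maximum",
        Period=60,
        EvaluationPeriods=2,
        Threshold=1.0,
        ComparisonOperator="GreaterThanOrEqualToThreshold",
        AlarmActions=["arn:aws:automate:us-east-1:ec2:recover"],
    )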

I don't disagree that some people overengineer systems, but I do disagree about how much effort it takes to achieve solid availability and a high level of automation. It's not a lot of effort or cost, and it's a huge advantage. Once I build a system, I never want to touch it again.

A certain segment of users adopt these technologies because they want to be prepared to scale. One of the advantages of "big data" products even for small use-cases is that all successful use-cases grow over time. If you plan for success and growth, you may eventually exceed the capabilities of a traditional technology; if you use a "big" technology from the beginning, you can be confident that you'll be able to absorb increases in demand by scaling up rather than by rearchitecting.

As these platforms mature and become easier to use, the scales begin to tip: they no longer require more engineering time than the alternatives, and a strong hosted platform actually requires less time in total, especially once you factor in setup and maintenance. Many of these technologies also do an excellent job "scaling down" for simple use-cases. They have historically been difficult to use, but they're getting easier. For example, MapReduce-paradigm technologies are becoming fairly easy with Apache Hive, fast with Spark, and simpler to set up thanks to hosted variants like AWS's Elastic MapReduce or Google Cloud Dataproc.
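The "scaling down" point is easy to see in PySpark (a sketch; the file name and columns are made-up example data). The same job runs in local mode on a laptop, and on an EMR or Dataproc cluster you only change how the session is configured, not the job itself:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # local[*] uses every core on one machine; on EMR/Dataproc the same
    # code runs against the cluster -- no rearchitecting required.
    spark = (SparkSession.builder
             .appName("daily-counts")   # hypothetical job name
             .master("local[*]")
             .getOrCreate())

    # events.csv and its columns are made-up example data.
    events = spark.read.csv("events.csv", header=True, inferSchema=True)

    daily = (events
             .groupBy("date", "event_type")
             .agg(F.count("*").alias("n"))
             .orderBy("date"))

    daily.show()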



