I'll give you that this 80% number seems pretty out there. I don't know how that is measured or what it would be referencing.
If you step back and remove all the commercial software from the argument (something like 50%+ of enterprise workloads, the kind of things you buy from a 3rd party and just run, like Sharepoint, SAP, or similar) and then look at how many business applications take on a trivial amount of load over time, then the author's post becomes more of an outlier. Few folks have apps that do 100rps realistically. And so for data processing/streaming/batch or web/api workloads serverless actually does work out pretty well. Is this 80%? I am not sure.
There is 100% an inflection point where if your operator cost is low enough (human work+3p tools+process+care and feeding) then the "metal to metal" costs can be comparable. Even the author admits that's leaving something on the floor and so it really comes down to what your organization values most.
I would love for most of our serverless app workloads to be top-down organizationally driven but the reality of it is that it comes often from developers themselves and/or line of business organizations with skin in the game of seeing things move faster in most organizations. This will then typically require buy in from security and ops groups. If these folks you know have the trick to driving top down incompetent strategic management towards serverless I'd buy in on that newsletter.
In terms of HN sentiment and in being a member of this community for almost a decade, I don't know if I'd say it widely represents most of the dev world as it tends to lean way more open-source and less enterprisey. I think there's also a larger number of people that represent IT vendors that would love to see AWS fail here :)
> And so for data processing/streaming/batch [...] serverless actually does work out pretty well.
This is my field of expertise. Serverless in the sense of lambda/functions is not usable for serious analytics pipelines due to the max allowed image size being smaller than the smallest NLP models or even lightweight analytics python distributions. You can't use lambda on the ETL side and you can't use lambda on the query side unless your queries are trivial enough to be piped straight through to the underlying store. And if your workload is trivial, you should just use clickhouse or straight up postgres because it vastly outperforms serverless stacks in cost and performance[1]
For non-trivial pipelines, tools like spark and dask dominate. And it just so happens that both have plugins to provision their own resources through kubernetes instead of messing around with serverless/paas noise.
IaaS is the peak value proposition of cloud vendors. Serverless/PaaS are grossly overpriced products aimed at non-technical audiences and are mostly snake oil. Change my mind.
The issue of the application artifact size is definitely real and it blocks some NLP/ML workloads for sure. Consider that a today problem, one that isn't hard to solve in Lambda.
Missosoup, I see you making changes to your comment, and it greatly changes the tone/context. I won't adjust my own reply to follow suit but will leave it as it was for your original comments on this.
I'm not going to make any elaborations on my comment now. Please feel free to edit yours or post another to answer anything I raised. Your original reply containing some generic sales brochures isn't what I expected from someone representing aws stepping into this discussion.
That article appears to be discussing a migration from Redshift to Clickhouse. Redshift is a managed data warehouse, not a serverless solution in the same vein as Lambda.
I don't understand the point you are trying to make.
Edit: The comment I am replying to was originally just 'Please explain' and a link to the article in question, and contained no other context or details.
Clickhouse is a really strange thing to compare to Lambda here. One is a method of performing small compute jobs, the other is an analytics database. They serve vastly different functions and saying "Clickhouse or postgres is cheaper and more performant than lambdas" is nonsensical.
Yup! Haven't done it in years and created this different account to be more clear/direct in who I am. That is also why I called it out at the start and bottom of all my responses.
No. I've been at AWS for over 7 years in a few different roles. Came to the serverless space >2.5 years ago because I felt passionate about it (could have literally done almost anything). Again, sorry for mis-posting under my older personal account, it was rarely used fwiw.
I wasn't criticizing you. I was pointing out that an equally likely and more charitable interpretation is that you posted as a fan of AWS before you started posting as an employee.
Turns out I was wrong in this case, but you've explained the situation and everything is hunky dory.
> In terms of HN sentiment and in being a member of this community for almost a decade, I don't know if I'd say it widely represents most of the dev world as it tends to lean way more open-source and less enterprisey.
They also like to join the latest hype more often than not so that should even out the anti-enterprise sentiment. I don't think serverless is over that point yet.
General opinion regarding the topic: I haven't done serverless in any way yet, but if it's similar to "regular"/other cloud services then in my experience it only makes sense if you're so big that building a scalable infrastructure yourself is too expensive (unless you're Facebook or Google). The other use case is if your load is actually fluctuating a lot, to a point where having enough resources available just to handle the peaks at all times is too expensive.
Whenever you can somewhat predict your load, having your own infra is almost always less expensive (at least here in Europe/Germany).
Hey there, I lead Developer Advocacy at AWS for Serverless (https://twitter.com/chrismunns) and have been involved in this space since pretty much the start. Thought I'd toss my hat in the ring there as there seem to be some fairly passionate responses across the board here.
This post is accurate. It's accurate in the new dev experience of someone picking up a framework tool and following the standard walkthroughs/examples that folks like my team and I have been creating the past few years. As far as I'm concerned Einar (the author of the post) did everything I would have done in their position with their experience with this tech. The math Einar did also seems to be accurate given a 1gb RAM and <100ms function duration as well as the API-GW costs, which don't fluctuate based on request duration or Lambda configuration.
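As a rough sketch of the kind of math referenced above (not Einar's actual calculation), the Python below adds up Lambda and API Gateway charges for a 1 GB function billed at 100 ms per request. The three price constants are assumptions based on public us-east-1 list prices as commonly quoted at the time and should be checked against the current pricing pages; the free tier is ignored, and the function name is made up for illustration.

```python
# Assumed list prices in USD. These are NOT taken from the post; verify them
# against the current AWS pricing pages before relying on the output.
LAMBDA_PER_MILLION_REQUESTS = 0.20
LAMBDA_PER_GB_SECOND = 0.0000166667
API_GW_PER_MILLION_REQUESTS = 3.50


def monthly_cost(requests, memory_gb=1.0, billed_seconds=0.1):
    """Return a cost breakdown in USD for `requests` invocations (no free tier)."""
    millions = requests / 1_000_000
    lambda_requests = millions * LAMBDA_PER_MILLION_REQUESTS
    # Compute is billed on memory size multiplied by billed duration.
    lambda_compute = requests * billed_seconds * memory_gb * LAMBDA_PER_GB_SECOND
    # The API Gateway charge depends only on the request count.
    api_gateway = millions * API_GW_PER_MILLION_REQUESTS
    return {
        "lambda_requests": lambda_requests,
        "lambda_compute": lambda_compute,
        "api_gateway": api_gateway,
        "total": lambda_requests + lambda_compute + api_gateway,
    }


if __name__ == "__main__":
    for name, usd in monthly_cost(1_000_000).items():
        print(f"{name}: ${usd:.2f}")
```

Under these assumed prices, API Gateway is the largest line item per million requests, and it stays the same whatever the function's duration or memory setting is.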
Einar got feedback later on that they should have looked at ALB (Application Load Balancer) which does provide a different HTTP request interface back to a Lambda invoke. ALB is the right choice for dead simple proxy requests where you aren't using any of the more advanced features of API-GW. It would lower the cost significantly as well as increase the performance due to slightly less overhead in the request path. It would change both the numbers in the blog's title though I can't say it would be 1:1.
We've done a shit job of making this clear and thanks to Einar's post lighting the fire under our butts we'll work to correct that.
What I am happy about in Einar's post is that they were able to get up and running in almost no time, explore, poke, test, and write this post about their experience very very quickly.
The other thing to highlight here is that I am happy to lose a battle to Elastic Beanstalk, or ECS, or Fargate pretty much any day of the week. All of these exist to significantly reduce the operator burden which compared to running your own container management can be incredibly high. Most companies I've run into in the past 7 years that I've been at AWS have struggled to really measure operator cost but when they compare it to just raw time difference both the technologies I just mentioned and "Serverless" tech like Lambda shift the time spent factor so greatly that cost savings becomes painfully obvious.
Anyway, we're here, we're listening, there's a lot going on behind the scenes (see recent announcement about how VPC cold-start pains will be gone soon).
"AWS costs almost twice more than analogous DO servers, but AWS has way more features."
When you use AWS you are paying for more than just a cheap VPS, which essentially is all DO is. It's comparing fast food to a fancy steak house. You can get meat at both, but at one it's microwaved. Not to say DO isn't great, it gets the job done and provides a valued service, but your money gets you what your money gets you.
When you use AWS you pay extra for the fancy steak. But you pay steak prices even if what you buy from them is a cheap burger.
It's really hard to find scenarios where AWS isn't ridiculously overpriced.
Consider that Digital Ocean is also an expensive alternative, yet for some clients who for various reasons insist on using AWS, I deploy caching proxies on DO, because you can save a lot by caching on droplets rather than paying AWS bandwidth costs for all your traffic (once you serve more than 1-2TB a month out of AWS you can start saving money that way).
Not everyone needs or wants a fancy steak, which I guess is why the OP is asking specifically about DO even while acknowledging that "AWS has way more features".
I find the better pattern here is to limit and discourage SSH and then monitor and log the hell out of it. There are numerous tools out there that can capture any actions taken on a host and send them to a centralized log. Outright removing all SSH puts you in a rough spot if things go south with some piece of software that your system monitoring/centralized logging don't cover 100%, and makes it way harder to do things like strace on a process.
Which is certainly a valid case, but I would argue that if that IS the case, and it cannot autorecover, then that is a good candidate for the instance to self terminate.
Also, make it part of the process that each time ssh is used, logging or monitoring is set up to catch what it was used for, much the same way that a test is added when a bug is discovered.
I've always wondered why this type of information hasn't been attached to a driver's license much like organ donorship is in many states. Seems like a prime system to keep such a thing organized and available.
Calling Postgres the leading open source DB is probably a bit more than a stretch these days, but yes it's quite popular with startups (though one would argue nosql DBs are #1 in the eyes of most startups sadly).
Unfortunately one of the places it's not popular is Amazon and the majority of their business customers. MySQL, Oracle, and SQL Server are all much more popular than Postgres, which is probably why they have focused on those more over the years. That said, the features have seemed to be coming pretty regularly.
Can you completely snapshot those volumes at any time, recreate them and attach them to new servers? Could you take these snapshots and easily copy them around the world? (Again assuming you could snapshot.) Are those SSDs automatically replicated to 2 different storage devices behind the scenes to give you near-instant failover? When they go boom are you then driving out to the datacenter to replace them (assuming you have replacements and don't need to wait for them to arrive)? Can you do all of this without any upfront cost or excess in capacity??
Probably not. At all.
EBS is NOT hard disks inside a server. Comparing them to such is missing out on all the things that make it a SERVICE and not disks you buy from Newegg/PCmall/<insert vendor here>. Yes there are disks you can buy to physically put in a server and they are super blazing fast. In fact AWS has those in their i2 instances and they get hundreds of thousands of IOPs as well.
This isn't even comparing apples to oranges, it's apples to space monkeys.
Yes we can and do snapshot them, at several levels actually - I don't think that's a particularly hard thing to do so I'm not sure why that's relevant.
Yes there is replication both to separate disk arrays AND separate physical servers with live failover and load balancing - again nothing new here?
No we don't send out storage to other countries - in fact that would be illegal, and if we were to do so our clients would suffer as Australia's international peering is pretty woeful.
We also gain on-disk compression and encryption on a LUN by LUN basis as we require it, storage is automatically provisioned to new application instances, all the software is 100% open source and mature, we don't have to phone a large corporate that doesn't really care about us, we pass security audits because we can prove where things are and how they're configured.
By the way, none of this is your 'new egg' gear you referenced, we use Intel DC P3600/P3700 PCIe storage.
Oh and as a bonus - there's no licensing or monthly invoices that need attention.
Is shared hosting / hardware outsourcing / cloud computing amazing - yes! Of course it is!
But you must remember it is their intention to sell their product as the only right answer and to tell you what you should care about. In some cases it applies and in some it doesn't. The danger in jumping on the bandwagon and becoming an Amazon 'fanboy' (I'm really sorry for using that term - I hate it) is that you quickly become siloed from external opportunities and security / high vertical performance solutions.
If I was in a small team of devs working on launching a web app that's going to be targeted at an international audience, my growth is highly unpredictable, our future uncertain and our skill set focused on developing great software - I wouldn't think twice about using AWS/Rackspace etc...
But when you understand your environment well, when you have a limited budget, when you have a predictable customer base with strict security requirements and when you're pushing databases pretty hard - would I use AWS? No, it's not cost effective for us, nor is it legally (and perhaps morally) viable. Do we waste lots of time looking after our hardware? No! It's 2015 - hardware is easy.
You say LUN, are these SAN devices, or is it direct attached storage? Was the replication, load balancing, and snapshotting all something that you set up and manage yourselves?
--edit--
Ahh you've been editing your comments so the thread is a bit out of whack! (no problemo)
Fair enough, but again, your comment is about hardware that you are managing, that you've built, that's glued together from a lot of different components, both software and hardware, and this post is about a cloud service that doesn't even compare. So your initial post comes off a bit as trolling for the sake of trolling.
I've done my fair share of rack-n-stack, and I've now spent the past few years "in the clouds" as it were. Wouldn't go back for anything, but I don't think this makes me a fanboy. Sure there is kit that you'd only ever be able to build/buy yourself (for now at least), but most ppl will never need more than 100k IOPs, let alone 500k+.
--edit again--
In regards to security, if you think you are capable of running an infrastructure more securely in a datacenter yourself than on one of the 3 major cloud providers' infrastructure (AWS, GOOG, MSFT), where they have some of the best sec teams in the world, then you are probably not as deeply aware of what's possible in cloud from a security standpoint. Banks, medical institutions, government agencies, and so forth are all trusting their infrastructure to the cloud, across many countries in the world.
Yeah sorry I didn't want it to end up sounding like a threaded argument - and I was sort of brain dumping as I go.
Hardware wise - We use standard servers (super micro), packed with several tiers of SSDs (Intel for the high end, SanDisk for the lower end).
Software wise, again all off the shelf, well understood tools: Debian Linux, DRBD, iSCSI, LACP, LVM, Puppet.
Our compute servers are blades with Debian VMs running Docker containers of our applications.
Edit: something we've gained greatly from that isn't off the shelf is that we moved to running very modern Linux Kernels - we have CI builds triggered as new stable versions are released and they are stock standard except that we do patch them with GRSecurity and ensure SELinux is enforcing.
All this doesn't cost much time to manage at all - we don't even have a storage admin and to be honest - if we needed one we'd be doing something wrong - apart from physical failure (which is very rare these days) there really isn't anything to do with storage - it's almost boring!
I actually have to get some sleep now - it's after 1AM here in Aussie, I wanted to stress that I'm absolutely not against using cloud hosted services - just that they're not the answer to all situations and there's a lot to be gained from ensuring you don't get sucked in to too much of the 'Spin' that vendors provide.
> most ppl will never need more than 100k IOPs, let alone 500k+
These types of statements are always false. If there is anything the computing industry has taught us, it is that people always need more resources. Always.
I can think of many examples why even small businesses need more than 100k IOPS. Case in point: 5 years ago I did consulting work for an email marketing company that was generating a daily report on a database of about 1TB. The report took 10+ hours to generate due to the SQL queries aggregating data from joined tables in more or less random patterns. I upgraded their DB server from a 2-way RAID0 on 15kRPM HDD (about 500 IOPS) to a single SSD (20k IOPS), and it cut down report generation time to 15 minutes. 4 years later their database has continued growing and generation took 1 hour. They called me up again, I upgraded them to a 4-way SSD-based RAID5 (I benchmarked 250k IOPS) and again it cut down report generation to 6-8 minutes. This was a small company: a dozen marketers, 1 software guy.
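The figures in this anecdote line up with simple arithmetic if one assumes the report is bound almost entirely by random I/O and each device sustains its quoted IOPS. The sketch below is only a back-of-the-envelope check in Python under those assumptions; the function names are made up for illustration.

```python
def total_ios(hours, iops):
    """Number of I/O operations completed in `hours` at a sustained `iops`."""
    return hours * 3600 * iops


def minutes_at(ios, iops):
    """Minutes needed to issue `ios` operations at a sustained `iops`."""
    return ios / iops / 60


# First upgrade: roughly 10 hours on the 500 IOPS HDD array, moved to a 20k IOPS SSD.
hdd_job = total_ios(10, 500)
first_upgrade_minutes = minutes_at(hdd_job, 20_000)

# Second upgrade: 1 hour on the 20k IOPS SSD, moved to the 250k IOPS array.
ssd_job = total_ios(1, 20_000)
second_upgrade_minutes = minutes_at(ssd_job, 250_000)

if __name__ == "__main__":
    print(f"{hdd_job:,} I/Os -> {first_upgrade_minutes:.1f} min at 20k IOPS")
    print(f"{ssd_job:,} I/Os -> {second_upgrade_minutes:.1f} min at 250k IOPS")
```

The first estimate comes out at 15 minutes, matching the reported result; the second comes out a little under 5 minutes, somewhat better than the reported 6-8 minutes, which is plausible since a real job is never purely I/O bound.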
FYI - there are laws around where data is allowed to be hosted and what country the parent company is located in. Even if there weren't laws on this - I think it would be pretty irresponsible to trust all your data and servers to one large offshore corporation, especially one in a country that has a fragile political climate (mind you, what countries don't!)
I think the fallacy in your comment is that you think about storage as if it were not using energy and network. If you add those costs to the bill, are you sure that you are still cheaper? On the other hand, I would much rather pay a monthly fee that I can turn off if things go sideways than buy extremely expensive gear that I cannot get rid of at all.
Just to summarize:
- monthly cost is almost all the time better for small businesses
- your security is way worse than Amazon's
- the overall cost of your operation has to include electricity and network for the complete comparison
My experience is that companies rarely need expensive network storage gear and most of the time it is better for everybody to split up the problem and make it horizontally scalable. There are also other solutions, using distributed storage engines running on commodity HW. Having said that, there are quite a few companies out there with SAN/NAS solutions, because this is what traditional computer vendors were selling for a long time. I think over time we are going to see more horizontally scalable storage solutions.
Amazon's bandwidth rates are more than 10 times what we can get locally, and power is included in our colo rental fees. I'm assuming that will pretty much be the situation for mrmondo too.
In terms of monthly costs, all the gear I deal with is lease to own: We pay less per month when the servers are new, and 3 years down the line our bills drop. There's no reason to have large capital expenditures just because you want your own gear.
As for security, it's not really that simple. Amazon's physical security may be top notch, and their patching for Xen and network security may be just fine, but beyond that you're pretty much on your own with Amazon just as you are with your own gear. You still need to understand how to configure firewall settings, and understand how to keep your VMs secure. Amazon's security needs to be top notch because it adds an additional layer that you don't have direct control over, but that does not provide any additional security that you would not have in most reasonable colo facilities where the physical network devices past the service provider's network drop are totally in your control, in a locked environment.
> My experience is that companies rarely need expensive network storage gear
The thing is, this gear isn't expensive. For about $250/month I can lease to own a 2TB PCIe SSD delivering 2.8GB/s read, 1.9GB/s write, 450k read and 150k write IOPS. That's in the UK, with 20% VAT, and without shopping around. Or if I want something with the performance profile of Amazon's new offering, I can pay $25/month. Amazon's cheapest EBS offerings, which are nowhere near what this article is about, cost $200/month for 2TB space. Go for provisioned IOPS and the EBS cost skyrockets.
AWS is the expensive network storage option, not leasing your own.
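To put rough numbers on that comparison, here is a small Python sketch. The $250 and $200 monthly figures for 2TB come from the comment above (2TB is treated as 2000 GB); the provisioned-IOPS rates and the 20,000 IOPS volume size are assumptions recalled from EBS list pricing of roughly that period, not figures from the thread, so treat the last number as illustrative only.

```python
def per_gb(monthly_usd, capacity_gb):
    """Monthly price per GB of capacity."""
    return monthly_usd / capacity_gb


def provisioned_volume_cost(capacity_gb, iops, usd_per_gb, usd_per_iops):
    """Monthly cost of a volume billed per GB plus per provisioned IOPS."""
    return capacity_gb * usd_per_gb + iops * usd_per_iops


CAPACITY_GB = 2000  # "2TB" from the comment, taken as 2000 GB

leased_ssd_per_gb = per_gb(250, CAPACITY_GB)  # figure from the comment
cheap_ebs_per_gb = per_gb(200, CAPACITY_GB)   # figure from the comment

# ASSUMED rates (not from the thread): $0.125 per GB-month and
# $0.065 per provisioned IOPS-month, for a hypothetical 20,000 IOPS volume.
provisioned_ebs = provisioned_volume_cost(CAPACITY_GB, 20_000, 0.125, 0.065)

if __name__ == "__main__":
    print(f"leased SSD:      ${leased_ssd_per_gb:.3f}/GB-month")
    print(f"cheapest EBS:    ${cheap_ebs_per_gb:.3f}/GB-month")
    print(f"provisioned EBS: ${provisioned_ebs:,.0f}/month (assumed rates)")
```

On capacity alone the two are close ($0.125 vs $0.10 per GB-month); under the assumed rates, the gap opens once IOPS are billed separately, and even then the provisioned volume would deliver a small fraction of the leased SSD's quoted 450k read IOPS.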
I can lease servers to put it in to get me "free" compute capacity for the difference in cost of the raw storage and still have money left over after spares and hosting/power.
Factor in bandwidth and it gets downright comical - Amazon's bandwidth prices are so totally out of whack that where for managed/colo setups CDN is an expense, for AWS setups a good CDN can save you vast amounts of money by cutting your bandwidth charges. And that's without discounted rates. Start putting decent volumes through and host at a carrier neutral facility and paying even 1/20th for bandwidth vs. AWS is well within reach with peering arrangements and a good mix of transit providers.
I was going to reply to the parent comment but you've summarised exactly what I would have said - sounds like you've done your research and come to similar conclusions to us as well.
> Can you do all of this without any upfront cost or excess in capacity??
The price premium of using AWS is high enough that it's trivial to afford leasing tons of excess capacity to handle failures and still save tons of money.
But by and large it's not really necessary - most hosting providers can provide rapidly provisioned managed servers or VPSs in the same data centres as their colo offerings these days, which provides an excellent fallback if we get into capacity issues, meaning that thanks to the existence of cloud services, the cost of running your own base load can be pushed down significantly (everything I deploy is deployed in VMs or containers, and sometimes containers in VMs (don't ask...), and whether they run on our hardware or on a cloud provider's hardware is merely a configuration issue).
In fact I have a couple of Xen based VPSs we rented in New Zealand to serve a customer that's tied seamlessly into our UK based infrastructure because it's not somewhere we can justify operating our own setup.
AWS certainly is convenient, but it's also so expensive I'm charging my highest day rates ever for projects to help clients move off AWS these days. It's easy to justify high fees when people see how much they can save.
"The price premium of using AWS is high enough that it's trivial to afford leasing tons of excess capacity to handle failures and still save tons of money."
"AWS certainly is convenient, but it's also so expensive I'm charging my highest day rates ever for projects to help clients move off AWS these days."
This is FUD and nothing more than FUD. As bad as the article the other day that said a company saved 50% moving off AWS. Show me a company that can save 50% by moving off AWS, and I'll show you a company that isn't using AWS properly at all.
When you compare a single piece of hardware that you can buy and run yourself to an instance in EC2, you are leaving a lot off the equation.
- Monitoring tools (CloudWatch; sure, it leaves some to be desired)
- Machine image building/tracking tools
- Hardware provisioning tools
- Security tools (Security Groups, NACLs in VPC, plus the stuff you don't see in the infrastructure)
- More varied hardware than you'd have in house (different amounts of ram/cpu/storage)
- More hardware than you'd have in house (no need for spare parts cabinets, waiting on vendors for replacements)
- Storage (EBS, S3, local)
When you pay for EC2, you get all of this. Doing this yourself isn't free in any way shape or form, even with opensource tools. A company with a good ops team that is cloud savvy is going to be several times more effective at a smaller size than a team that has to manage a datacenter, all the hardware, and all these other bits. Folks discount all of these when they do an apples to apples comparison of hardware you can buy to a service that you use.
Let alone that most people don't understand per-server network, space, and power cost over its lifecycle. (I've spent months with companies doing datacenter ROI analysis and having no idea what things cost).
And going to a managed hosting provider, you're locking yourself into the frozen tech world they are in, with small development resources and typically constrained capacity. You don't hear of managed hosting providers building any of the above themselves, so you continue to have a higher management overhead than with a true cloud provider like GOOG, AWS, MSFT.
There are languages inside of Amazon that are first class citizens (java & ruby mostly, node.js more now). Go is too new, moving too fast, and hasn't seen the same kind of adoption yet to warrant an SDK (despite what the folks here might think..).
This sort of argument is very odd. Amazon can't justify putting the resources into building a golang SDK for AWS but it appears that Stripe can. Perhaps even more strangely, someone who works at Stripe can justify doing it in their spare time, but Amazon hasn't got the money to justify it?
Companies that provide programmable systems should be the first to provide SDKs, even for technologies not yet widely adopted. That goes for any company that provides something programmable.
Amazon appears to have no trouble at all justifying the massive development effort required for every new back end web service they build, but front end programming API access just isn't valued as much.
> Amazon appears to have no trouble at all justifying the massive development effort required for every new back end web service they build, but front end programming API access just isn't valued as much.
Honestly, is that really a surprise at all?
REST web services work well. Most languages have frameworks with excellent support for them.
Historically, the community has provided wrappers for popular APIs pretty quickly.
In the specific case of Go, it's still in early adopter territory. That means they are unlikely to have a problem using the raw APIs.
> Amazon appears to have no trouble at all justifying the massive development effort required for every new back end web service they build, but front end programming API access just isn't valued as much.
Their back end services make them money. Creating a Go API does not.
Makes it hard to justify investing development resources.
I'll give you that this 80% number seems pretty out there. I don't know how that is measured or what it would be referencing.
If you step back and remove all the commercial software from the argument (something like 50%+ of enterprise workloads, the kind of things you buy from a 3rd party and just run, like Sharepoint, SAP, or similar) and then look at how many business applications take on a trivial amount of load over time, then the author's post becomes more of an outlier. Few folks have apps that do 100rps realistically. And so for data processing/streaming/batch or web/api workloads serverless actually does work out pretty well. Is this 80%? I am not sure.
There is 100% an inflection point where if your operator cost is low enough (human work+3p tools+process+care and feeding) then the "metal to metal" costs can be comparable. Even the author admits that's leaving something on the floor and so it really comes down to what your organization values most.
I would love for most of our serverless app workloads to be top-down organizationally driven but the reality of it is that it comes often from developers themselves and/or line of business organizations with skin in the game of seeing things move faster in most organizations. This will then typically require buy in from security and ops groups. If these folks you know have the trick to driving top down incompetent strategic management towards serverless I'd buy in on that newsletter.
In terms of HN sentiment and in being a member of this community for almost a decade, I don't know if I'd say it widely represents most of the dev world as it tends to lean way more open-source and less enterprisey. I think there's also a larger number of people that represent IT vendors that would love to see AWS fail here :)
Thanks, - Chris Munns - AWS - Serverless - https://twitter.com/chrismunns