Some orgs are looking at moving back to on-prem because they're figuring this out. For a while it was in vogue to move from capex to opex, and C-suite people were incentivized via comp structures to do exactly that, hence "digital transformation", i.e. migration to public cloud infrastructure. Now those same orgs are realizing that renting computers actually costs more than owning them, once you're utilizing them to a significant degree.
I was once part of a company acquired by a much larger corporate entity. The new parent company was in the middle of a huge cloud migration, and as part of our integration into their org, we were required to migrate our services to the cloud.
Our calculations said it would cost 3x as much to run our infra on the cloud.
We pushed back, and were greenlit on creating a hybrid architecture that allowed us to launch machines both on-prem and in the cloud (via a direct link to the cloud datacenter). This gave us the benefit of autoscaling our volatile services, while maintaining our predictable services on the cheap.
After I left, apparently my former team was strong-armed into migrating everything to the cloud.
A few years go by, and guess who reaches out on LinkedIn?
The parent org was curious how we built the hybrid infra, and wanted us to come back to do it again.
My funny story is built on the idea that AWS is Hotel California for your data.
A customer had an interest in merging the data from an older account into a new one, just to simplify matters. Enterprise data. Going back years. Not even leaving the region.
The AWS rep in the meeting kinda pauses, says: "We'll get back to you on the cost to do that."
The sticker shock was enough that the customer simply inherited the old account, rather than making things tidy.
R2 is great. Our GCS bill (almost all egress) jumped from a few hundred dollars a month to a couple thousand dollars a month last year due to a usage spike. We rush-migrated to R2 and now that part of the bill is $0.
I've heard some people here on HN say that it's slow, but I haven't noticed a difference. We're mainly dealing with multi-megabyte image files, so YMMV if you have a different workload.
Awesome. I remember reading about this a while ago but never tried it. Since it has the same API, I can imagine it's not daunting as multi-cloud infrastructure goes.
I guess permissions might be more complex, as in EC2 instance profiles wouldn't grant access, etc.
Just to make sure nobody is confused by this - R2 has the same API as S3, not GCS. We had to build a simple abstraction around GCS/S3 to perform the migration. But if you're migrating from S3, it's pretty much drop-in. We even use the AWS-provided S3 Java library (especially convenient for making signed URLs).
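For what it's worth, the "drop-in" part really is just pointing a standard S3 client at an R2 endpoint. We happen to use the Java SDK, but here's a minimal sketch of the same idea in Python/boto3 (the account ID, bucket name, and keys are placeholders, not anything from our setup):

    import boto3

    # Point the standard S3 client at the R2 endpoint; everything else
    # is the same S3 API you already know.
    r2 = boto3.client(
        "s3",
        endpoint_url="https://<ACCOUNT_ID>.r2.cloudflarestorage.com",
        aws_access_key_id="<R2_ACCESS_KEY_ID>",
        aws_secret_access_key="<R2_SECRET_ACCESS_KEY>",
        region_name="auto",
    )

    # Upload an object exactly as you would to S3...
    r2.upload_file("photo.jpg", "my-bucket", "images/photo.jpg")

    # ...and hand out a signed URL that expires in an hour.
    url = r2.generate_presigned_url(
        "get_object",
        Params={"Bucket": "my-bucket", "Key": "images/photo.jpg"},
        ExpiresIn=3600,
    )
    print(url)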
My (peripheral) experience is that it is much cheaper to get data in than to get data out. When you have the amount of data being discussed — "Enterprise data. Going back years." — that can get very costly.
This becomes an issue at the amount of data where it makes more sense to put hard drives on a truck and drive them across the country than to send the data over a network (actually, probably a bit before then).
AWS actually has a service for this - Snowmobile, a storage datacenter inside of a shipping container, which is driven to you on a semi truck. https://aws.amazon.com/snowmobile/
> Q: Can I export data from AWS with Snowmobile?
>
> Snowmobile does not support data export. It is designed to let you quickly, easily, and more securely migrate exabytes of data to AWS. When you need to export data from AWS, you can use AWS Snowball Edge to quickly export up to 100TB per appliance and run multiple export jobs in parallel as necessary. Visit the Snowball Edge FAQs to learn more.
"Data transferred "in" to and "out" from Amazon EC2, Amazon RDS, Amazon Redshift, Amazon DynamoDB Accelerator (DAX), and Amazon ElastiCache instances, Elastic Network Interfaces or VPC Peering connections across Availability Zones in the same AWS Region is charged at $0.01/GB in each direction."
Yes, I do believe autoscaling is actually a good use case for public cloud. If you have bursty load that requires a lot of resources at peak - resources that would sit idle most of the time - it probably doesn't make sense to own what you need for those peaks.
There are two possible scenarios here: either they can't find the talent to support what you implemented... or, more likely, your docs suck!
I've made a career out of inheriting other people's wacky setups and supporting them (as well as fixing them), and almost always it's the documentation that has prevented the client from getting anywhere.
I personally don't care if the docs are crap, because usually the first thing I do is update (or actually write) the docs to make them usable.
For a lot of techs, though, crap documentation is a deal breaker.
Crap docs aren't always the fault of the guys implementing, though; sometimes there are time constraints that prevent proper docs being written. Quite frequently, though, it's outsourced development agencies that refuse to write it because it's "out of scope" and a "billable extra". Which I think is an egregious stance... docs should be part and parcel of the project. Mandatory.
I agree that bad documentation is a serious problem in many cases. So much so that your suggestion to write the documentation after the fact can become quite impossible.
If there is only one thing that juniors should learn about writing documentation (be it comments or design documents), it is this: document why something is there. If resources are limited, you can safely skip comments that describe how something works, because that information is also available in code.
(It might help to describe what is available, especially if code is spread out over multiple repositories, libraries, teams, etc.)
(Also, I suppose the comment I'm responding to could've been slightly more forgiving to GP, but that's another story.)
> Quite frequently, though, it's outsourced development agencies that refuse to write it
It's also completely against their interest to write docs as it makes their replacement easier.
That's why you need someone competent on the buying side to insist on the docs.
A lot of companies outsource because they don't have this competency themselves. So it's inevitable that this sort of thing happens and companies get locked in and can't replace their contractors, because they don't have any docs.
Unfortunately it's also possible that, e.g., the company switched from SharePoint to Confluence and lost half the knowledge base because it wasn't labeled the way they thought it was. Or that the docs were all purged because they were part of an abandoned project.
> the first thing I do is update / actually write the docs to make them usable.
OK, so the docs are in sync for a single point in time: when you finish. Plus you get to keep the context in your head (bus factor of 1 - job security for you, bad for the org).
How about we just write clean infra configs/code and stick to well-known systems like Docker, Ansible, k8s, etc.?
Then we can make this infra code available to an on-prem LLM and ask it questions as needed, without it drifting out of sync over time as your docs surely will.
Wrong documentation is worse than no documentation.
Just to be clear, after I (and a few others) left, they moved everything entirely to the cloud.
Even with documentation on the hybrid setup, they'd need to get a new on-prem environment up and running (find a colo, buy machines, set up the network, blah blah).
"Crap docs aren't always the fault of the guys implementing though, sometimes there are time constraints that prevent proper docs being written."
I can always guarantee a stream-of-consciousness OneNote that should have most of the important data, and a few docs about the most important parts. It's up to management whether they want me to spend time turning that OneNote into actual robust documentation that is easily read.
Context: I build internal tools and platforms. Traffic on them varies, but some of them are quite active.
My nasty little secret is that for single-server databases I have zero fear of over-provisioning disk IOPS and running them on SQLite, or of running a single RDBMS server in a container. I've never actually run into an issue with this. It surprises me how many internal tools I see that depend on large RDS installations despite having piddly requirements.
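To make it concrete, for a typical internal tool this is roughly the entire "database setup" (the table and file names are made up; WAL mode is just a sensible default for read-heavy tools):

    import sqlite3

    # A single file on an over-provisioned disk; no server to manage.
    conn = sqlite3.connect("internal_tool.db")

    # WAL mode lets readers keep working while a write is in progress,
    # which covers most internal-tool traffic patterns.
    conn.execute("PRAGMA journal_mode=WAL")

    conn.execute("""
        CREATE TABLE IF NOT EXISTS reports (
            id INTEGER PRIMARY KEY,
            created_at TEXT DEFAULT CURRENT_TIMESTAMP,
            payload TEXT
        )
    """)
    conn.execute("INSERT INTO reports (payload) VALUES (?)",
                 ('{"status": "ok"}',))
    conn.commit()

    for row in conn.execute("SELECT id, created_at FROM reports"):
        print(row)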
The problem with a single instance is that while performance-wise it's best (at least on bare metal), there comes a moment when you simply have too much data and one machine can't handle it. In your scenario, it may never come up, but many organizations face this problem sooner or later.
I agree; my point is that clusters are overused. Most applications simply don't need them, and it results in a lot of waste. Much of this has to do with engineers being tasked with an assortment of roles these days, so they obviously opt for the solution where the database and upgrades are managed for them. I've just found that managing a single container's upgrades isn't that big of an issue.
That's made possible by orchestration platforms such as Kubernetes becoming standardized; with them you can get pretty close to a cloud experience while keeping all your infrastructure on-premise.
Yes, virtualization, overprovisioning and containerization have all played a role in allowing for efficient enough utilization of owned assets that the economics of cloud are perhaps no longer as attractive as they once were.
Same experience here. As a small organization, the quotes we got from cloud providers have always been prohibitively expensive compared to running things locally, even when we accounted for geographical redundancy, generous labor costs, etc. Plus, we get to keep know-how and avoid lock-in, which are extremely important things in the long term.
I am sure for some use-cases cloud services might be worth it, especially if you are a large organization and you get huge discounts. But I see lots of business types blindly advocating for clouds, without understanding costs and technical tradeoffs. Fortunately, the trend seems to be plateauing. I see an increasing demand for people with HPC, DB administration, and sysadmin skills.
> Plus, we get to keep know-how and avoid lock-in, which are extremely important things in the long term.
So much this. The "keep know-how" part has been badly neglected over the past 10 years; I hope people with these skills start getting paid more as more companies realize the cost difference.
When I started working in the 1980s (as a teenager, but getting paid), there was a sort of battle between the (genuinely cool and impressive) closed technology of IBM and the open world of open standards and interop: TCP/IP and Unix, SMTP, PCs, even Novell sort of, etc. There was a species of expert who knew the whole IBM product offering, all the model numbers and recommended solution packages and so on. And the technology was good - I had an opportunity to program a 3093K(?) VM/CMS monster with APL and REXX and so on. Later on I had a job working with AS/400 and SNADS and Token Ring and all that, and it was interesting; thing is, they couldn't keep up, and the more open, less greedy hobbyists and experts working on Linux and NFS and DNS etc. completely won the field.

For decades, open source, open standards, and interoperability dominated, and one could pick the best thing for each part of the technology stack and be pretty sure the resultant systems would be good. Now, however, the Amazon cloud stacks are like IBM in the 1980s - amazingly high quality, but not open; the cloud architects master the arcane set of product offerings and can design a bespoke AWS "solution" to any problem.

But where is the openness? Is this a pendulum that swings back and forth (many IBM folks left IBM in the 1990s and built great open technologies on the internet), or was it a brief dawn of freedom that will be put down by the capital requirements of modern compute and networking stacks?
My money is on openness continuing to grow and more and more pieces of the stack being completely owned by openness (kernels anyone?) but one doesn't know.
Even without owning the infrastructure, running in the cloud without know-how is very dangerous.
I hear tell of a shop that was running on ephemeral instance based compute fleets (EC2 spot instances, iirc), with all their prod data in-memory. Guess what happened to their data when spot instance availability cratered due to an unusual demand spike? No more data, no more shop.
Don't even get me started on the number of privacy breaches because people don't know not to put customer information in public cloud storage buckets.
I was part of a relatively small org that wanted us to move to cloud dev machines. As soon as they saw the size of our existing development docker images that were 99.9% vendor tools in terms of disk space, they ran the numbers and told us that we were staying on-prem. I'm fairly sure just loading the dev images daily or weekly would be more expensive than just buying a server per employee.
I do that. In fact I've been doing it for years, because every time I do the math, AWS is unreasonably expensive and my solo-founder SaaS would much rather keep the extra money.
I think there is an unreasonable fear of "doing the routing and everything". I run vpncloud, my server clusters are managed using ansible, and can be set up from either a list of static IPs or from a terraform-prepared configuration. The same code can be used to set up a cluster on bare-metal hetzner servers or on cloud VMs from DigitalOcean (for example).
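As a sketch of how the "same code either way" part works (the IPs, group name, and terraform output key below are all made up for illustration): a tiny script can emit an Ansible inventory from either a static IP list or terraform output -json, and the same playbooks consume it.

    import json
    import subprocess
    import sys

    # Option 1: a hand-maintained list of bare-metal (e.g. Hetzner) IPs.
    STATIC_HOSTS = ["203.0.113.10", "203.0.113.11", "203.0.113.12"]

    def hosts_from_terraform(output_key="cluster_ips"):
        # Option 2: read IPs from `terraform output -json`; the output key
        # "cluster_ips" is hypothetical - use whatever your config exports.
        raw = subprocess.run(
            ["terraform", "output", "-json"],
            capture_output=True, text=True, check=True,
        ).stdout
        return json.loads(raw)[output_key]["value"]

    def main():
        hosts = hosts_from_terraform() if "--terraform" in sys.argv else STATIC_HOSTS
        # Emit a minimal Ansible INI inventory; the playbooks don't care
        # where the hosts came from.
        print("[cluster]")
        for ip in hosts:
            print(ip)

    if __name__ == "__main__":
        main()

Run it as something like "python make_inventory.py > hosts.ini" (optionally with --terraform), then "ansible-playbook -i hosts.ini site.yml" - file names hypothetical - and the same playbooks run against either target.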
I regularly compare this to AWS costs and it's not even close. Don't forget that the performance of those bare-metal machines is way higher than that of overbooked VMs.
100% agree. People still think that maintaining infrastructure is very hard and requires a lot of people. What they disregard is that using cloud infrastructure also requires people.
You usually just do colocation. The data center will give you a rack (or space for one), an upstream gateway to your ISP, and redundant power. You still have to manage a firewall and your internal network equipment, but it's not really that bad. I've used pfSense firewalls, configured by them for like $1500, with roaming VPN, high availability, point-to-point VPN, and as secure as reasonably possible. After that it's the same thing as the cloud, except it's physical servers.
Could you please explain what you mean by "physical backbone connection", as I can't think of a meaning that fits the context.
If you mean dealing with the physical dedicated servers that can be rented from Hetzner, that's what the person you replied to was talking about being not so difficult.
If you mean everything else at the data centre that makes having a server there worthwhile (networking, power, cooling, etc.) I don't think people were suggesting doing that themselves (unless you're a big enough company to actually be in the data centre business), but were talking about having direct control of physical servers in a data centre managed by someone like Hetzner.
(edit: and oops sorry I just realised I accidentally downvoted your comment instead of up, undone and rectified now)
No worries, easy to not foresee every possible way in which strangers could interpret a comment!
But I think that people (at least jwr, and probably even nyc_data_geek saying "on prem") are talking about cloud (like AWS) vs. renting (or buying) servers that live in a data centre run by a company like Hetzner. That can be considered "on prem" if you're the kind of data centre client who has building access to send your own staff there to manage your servers (while still leaving everything else, possibly even legal ownership and therefore depreciation etc., to the data centre owner).
What you're thinking of - literally taking responsibility for running your own mini data centre - is, I think, hardly ever considered (at least in my experience), except by companies at the extremes of size. If you're as big as Facebook (not sure where the line is, but it obviously includes some companies not AS big as Meta but still huge), it makes sense to run your own data centres. At the other extreme, if you're a tiny business getting fewer than a few thousand website visits a day, and the website (or whatever is being hosted) isn't so important that a day of downtime every now and then would be a big deal, it's not uncommon to host from the company's office itself: a spare old PC or a cheap second-hand 1U server, maybe a cheap UPS, connected to the same internet connection the office uses, and managed by a single employee (or the company owner) who happens to be geeky enough to find it simple or fun, or both, to set up a basic LAMP server - or even a Windows server for its oh-so-lovely GUI.
When talking about Hetzner pricing, please don’t change the subject to AWS pricing. The two have nothing in common, and intuition derived from one does not transfer to the other.
If all you need are some cloud servers, or a basic load balancer, they are pretty much the same.
If you need a plethora of managed services and don't want to risk getting fired over your choice or specifics of how that service is actually rendered, they are nothing alike and you should go for AWS, or one of the other large alternatives (GCP, Azure etc.).
On the flip side, if you are using AWS or one of those large platforms as a glorified VPS host and you aren't doing this in an enterprise environment, outside of learning scenarios, you are probably doing something wrong and you should look at Hetzner, Contabo, or one of those other providers, though some can still be a bit pricey - DigitalOcean, Vultr, Scaleway etc.
Why? The only reason I'm using Hetzner and not AWS for several of my own projects (even though I know AWS much better since this is what I use at work) is an enormous price difference in each aspect (compute, storage, traffic).
Well, in my case at least, what they have in common is that I can choose to run my business on one or the other. So it's not about intuition, but rather facts in my case: I avoid spending a significant amount of money.
I (of course) do realize that if you design your software around higher-level AWS services, you can't easily switch. I avoided doing that.
It's not an either/or. Many businesses both own and rent things.
If price is the only factor, your business model (or your executives' decision-making) is questionable. If you only buy the cheapest shit and spend your time building your own office chair rather than talking to a customer, you aren't making a premium product, and that means you're not differentiated.
Ordering that number of servers takes about an hour with Hetzner. If you truly want a complete rack of your own, maybe a few days, as they have to do it manually.
Most companies don't need to scale up full racks in seconds. Heck, even weeks would be OK for most of them to get new hardware delivered. The cloud planted the lie in everyone's head that most companies don't have predictable and stable load.
Most businesses could probably know server needs 6-12 months out. There's a small number of businesses in the world that actually need dynamic scaling.
The rental period is a month.
You can also use Hetzner Cloud, which is still roughly 10x less expensive than AWS, and that doesn't take into account the vastly cheaper traffic.
One other appealing alternative for smaller startups is to run Docker on one burstable VM. This is a simple setup that allows you to go beyond the CPU limits and also to scale up the VM.
There might be alternatives to using Docker, so if anyone has tips for something simpler or easier to maintain, I'd appreciate a comment.
Just like any other asset.