Run Ordinary Rails Apps Globally (2021) (fly.io)
269 points by goranmoomin on Jan 26, 2022 | 113 comments



Alas, I feel like the word "Ordinary" is doing a lot of heavy lifting in this title!

The upshot is, so long as your postgres + rails app isn't doing some kind of DB write with every request, and doesn't do much - or ideally anything - with Redis, and you want 200ms time-to-first-byte globally (or at least, in multiple locations across the world) then you can easily get better performance with Fly.io than most other solutions.

It's a really clever solution, and one day I'd love to work on a project that fulfils all of those constraints. Haven't found one yet though.


Running a Fly app backed by DynamoDB Global Tables is an option. DDB keeps a copy of your data in all the regions you specify, each Fly instance can connect to the nearest region, and writes are propagated with eventual consistency & last write wins.

And most Redis commands can be mapped to DDB; I worked on a lib to do that:

https://github.com/dbProjectRED/redimo.go

https://github.com/sudhirj/aws-regions.go

https://aws.amazon.com/dynamodb/global-tables/
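The "connect to the nearest region" part can be as simple as mapping Fly's `FLY_REGION` environment variable to the closest Global Table replica. A minimal sketch (the region mapping below is illustrative, not taken from the linked libraries; adjust it to the replicas you actually provision):

```ruby
# Map Fly.io region codes to the nearest AWS region hosting a
# DynamoDB Global Table replica. Illustrative mapping only.
NEAREST_DDB_REGION = {
  "iad" => "us-east-1",      # Ashburn, Virginia
  "lax" => "us-west-2",      # Los Angeles -> Oregon
  "fra" => "eu-central-1",   # Frankfurt
  "syd" => "ap-southeast-2", # Sydney
  "sin" => "ap-southeast-1", # Singapore
}.freeze

DEFAULT_REGION = "us-east-1"

# Pick the replica closest to the Fly region this instance runs in.
def nearest_ddb_region(fly_region = ENV["FLY_REGION"])
  NEAREST_DDB_REGION.fetch(fly_region, DEFAULT_REGION)
end
```

Each instance then builds its DynamoDB client against `nearest_ddb_region`; writes still propagate to every replica with last-write-wins semantics.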


The other option here is using Spanner from Google (https://cloud.google.com/spanner) which now has full PostgreSQL compatibility.

That gets you global scalability with minimal meaningful changes to your application. They even made a Rails-specific guide for it here:

https://cloud.google.com/blog/topics/developers-practitioner...

If you combine that with, say, Cloud Native Buildpacks (buildpacks.io) and Cloud Run, you now have a basically infinitely scalable solution that hooks into all of Google's infrastructure at just the right points and requires almost zero work on your behalf. Plus you get to keep the one-command deploy that a lot of the Rails community is used to from the Heroku days.

No K8s to configure, no servers to maintain, per second billing on your app server, everything autoscales, very little to learn in the way of new technology. You can complicate things at your own pace as far as the rest of “the cloud” goes but there’s no need to do so unless you want to. It’s a pretty sweet deal in terms of cost to benefit ratios.


> full PostgreSQL compatibility

By "full", you mean "some"? From [the documentation] on Spanner's PostgreSQL Interface:

> This means that applications written against the PostgreSQL interface can be readily ported to another PostgreSQL environment, obviously without the non-functional benefits, like scalability and availability, that are unique to Cloud Spanner. However, 100% compatibility with PostgreSQL is not the goal.

> The PostgreSQL interface provides a rich subset of the open-source PostgreSQL SQL dialect, including common query syntax, functions, and operators

It seems like existing simple Postgres applications could switch to Spanner, but it does require running a sidecar if your app wants to continue speaking the Postgres protocol on the wire, instead of integrating a Spanner API client.

When it comes to [datatypes], Spanner's PostgreSQL interface is very limited. According to the docs, only these types are supported:

> bool, bytea (byte array), float8 (64-bit IEEE-754 float), int8 (signed 64-bit int), numeric/decimal, timestamptz, text.

Notably missing for [my use-case] are UUID, JSON/JSONB, array types like UUID array, and custom enum types. The indexing support is also quite sparse. Full Postgres compatibility, this is not :(

[the documentation]: https://cloud.google.com/spanner/docs/postgresql-interface

[datatypes]: https://cloud.google.com/spanner/docs/reference/postgresql/d...

[my use-case]: https://www.notion.so


I didn’t know Notion was a Rails product, that’s pretty cool!

Just for disclosure, I used to do a bunch of Rails but I haven't touched it since the start of covid, so its applicability to some of what I mentioned isn't something I've actively built out, and I haven't figured out where all the rough parts are yet because it's just not a use case of mine these days.

But that second link I mentioned sounds maybe more promising? I initially skimmed it and thought it was there to handle the PostgreSQL to Spanner integration but looking at it a second time it sounds more like native Spanner integration but keeping the otherwise Rails native ActiveRecord experience.

If you're building against a bunch of PostgreSQL-specific functionality then maybe you're still stuck with the same problem, but generally speaking I don't think there's a more capable publicly available SQL database out there than Spanner, is there?

But as a side note: if you're Notion in this situation, and not a random small team of a few people, I don't imagine you'd have much of a problem reaching out to the GCP team to explain your current setup / requirements. They'd be more than happy to give you a custom roadmap to migrate to Spanner; they have entire teams dedicated to exactly those kinds of projects.


Notion isn't a Rails product - but we do use Postgres as our primary datastore.


The difference being: if you want multi-region availability with Spanner, be ready to pay a minimum of ~$2k per month per region your data is in ($3/hr * 24 hours * 31 days = $2,232).


I think there's probably an argument to be made that if that's a scary and unreasonable-sounding number for your use case, then it's not the right tool for the job. You might want to consider PostgreSQL, knowing that you can migrate later if needed; or, if you don't have a lot of many-to-many data relations, even a document database might get you that kind of scalability benefit at a fraction of the cost.


Yes, let's throw data integrity out the window and make schema changes painfully tedious and hard.


Does DynamoDB give transactions, rollbacks, joins?


Yes, you can apply a set of updates atomically so they all fail or none, with conditions. No save points or nested transactions, though. And the transactions need to be grouped into a single request.

Joins, no. Most of the NoSQL systems distribute data by partitioning it on the primary key, and give you scalability by not doing any work across partitions, which means that traditional joins won't be supported. You can either join in the application, or maintain a secondary schema that has your joined data organised in the access pattern you plan to use.
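Joining in the application just means fetching both item collections by key and stitching them together in memory. A minimal sketch, with plain hashes standing in for DynamoDB items (the attribute names are made up for illustration):

```ruby
# Application-side hash join: index one collection by id, then do a
# constant-time lookup per row instead of a SQL JOIN.
def join_posts_with_users(posts, users)
  users_by_id = users.each_with_object({}) { |u, h| h[u[:id]] = u }
  posts.map do |post|
    post.merge(author: users_by_id[post[:user_id]])
  end
end
```

This is the same hash-join strategy a relational database would use internally; you are just doing it in Ruby after two separate key-based queries.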

Google "DynamoDB Alex DeBrie" or "DynamoDB Rick Houlihan" for a ton of tutorials on how to restructure data for partitioned/distributed NoSQL systems.


and max 100 keys in a transaction or whatever the limit is


> The upshot is, so long as your postgres + rails app isn't doing some kind of DB write with every request, and doesn't do much - or ideally anything - with Redis

Yeah that's the thing I never understood about Fly.

Globally scaling a stateless web app is very nice for reducing latency, but what happens when you have your web apps in US West, US East, EU, India and Australia while Postgres and Redis are sitting in US East?

Anytime you perform a DB / Redis write aren't you back to waiting 70ms-300ms depending on where in the world you are?

Background jobs that flow through Postgres or Redis are super common in a bunch of tech stacks (even Elixir / Phoenix with Oban), and with Rails specifically any type of websocket broadcast goes through Redis too which kind of defeats the purpose of trying to achieve low latency websocket connections since every broadcast goes through Redis in US East while your web servers are around the globe.

Am I missing something important here? Personally I haven't developed a single web app in the last 8 or so years that didn't heavily use Postgres and / or Redis. These are mostly Flask, Rails, Django and Phoenix apps.

I want to like Fly but every time I think about the above I talk myself out of considering it because each web app instance costs money. If you have let's say 6 instances spread across the globe with 1 dedicated CPU + 2 GB of memory each that's $31 x 6 ($186) just for your apps on Fly[0]. If you can't get around high latency for DB / Redis writes and writes are fairly common, it's a hard sell to pay $186 / month for that when you can get a 1 CPU + 2 GB VPS on DigitalOcean[1] for $10 / month and not think about globally distributing your web apps.

I get that the $186 vs $10 isn't a totally fair comparison because with Fly you're getting 6 CPUs and 12 GB of compute while with DO you're getting 1 CPU and 2 GB but with Fly even if your app can happily run with 1 CPU + 2 GB of memory you have to pay for each distributed instance if you want that global reach. With DO you have the option to use that single $10 / month VPS to run your app and forget the idea of global distribution, so if your app fits on that instance size it really is directly and fairly comparing $186 vs $10.

[0]: https://fly.io/docs/about/pricing/

[1]: https://www.digitalocean.com/pricing


> Anytime you perform a DB / Redis write aren't you back to waiting 70ms-300ms depending on where in the world you are?

But that's always going to be the case surely with Postgres? If you're running your one DO box config in say US East, every request is going to have that 70-300ms response time for people not in US East region.

With Fly, I think what they're saying is if you are doing writes to DB then you are going to get no optimisation over having one box. However, if you are just doing reads, then you are going to get a lot of optimisation from having it closer to the user.

If you had a simple blog, for example, reading blog articles would be fast as they could be served locally, but creating a blog article would be 'slower' as you would have to write to the DB. Given most blogs are 99.99% reading articles, you are going to get a significant speedup.

The problem is though that many applications do writes regardless of what the user is doing (analytics/audit logs and that kind of thing).

I think what Fly needs (maybe they have this already) is a specialised message queue which sits in the same region as your servers and doesn't require waiting for global sync. You could then put many write operations through that, and only have to wait for global sync when you need to show something in the UI immediately.
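Absent such a feature, you can approximate it in-process: accept the write locally, return immediately, and let a background thread drain a queue toward the remote primary. A sketch using only Ruby's stdlib (the sink block is a stand-in for your real DB/Redis call):

```ruby
# Fire-and-forget writer: enqueue locally (fast), flush remotely in
# the background so the request doesn't wait on cross-region latency.
class AsyncWriter
  def initialize(&sink)
    @queue = Queue.new
    @sink  = sink # e.g. ->(op) { redis.call(*op) } -- hypothetical
    @worker = Thread.new do
      loop do
        op = @queue.pop
        break if op.nil? # sentinel pushed by #shutdown
        @sink.call(op)
      end
    end
  end

  # Returns immediately; the remote write happens in the background.
  def write(op)
    @queue << op
  end

  # Push the sentinel, then wait for the worker to drain the queue.
  def shutdown
    @queue << nil
    @worker.join
  end
end
```

The trade-off: this buys latency at the cost of durability, since acknowledged writes can be lost if the instance dies before the queue drains, which is exactly why a platform-provided regional queue would be nicer.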


> But that's always going to be the case surely with Postgres? If you're running your one DO box config in say US East, every request is going to have that 70-300ms response time for people not in US East region.

Absolutely, but in DO's case you'd be paying $10 / month instead of $186 on Fly since you're accepting the situation for what it is (you have 1 server in 1 location).

> However, if you are just doing reads, then you are going to get a lot of optimisation from having it closer to the user.

Yes, but this also comes at an additional cost I think. Their shared-CPU + 2 GB PG cluster is $27.40 a month. Based on their pricing page it's not clear how this would be priced across multiple regions with read replicas only, but if it's like their app servers then it's multiplied by the number of regions. So if you had 6 of them across the world it would be $164.40, plus the $186 for your app servers, for $350.40 a month, and you still have high-latency writes. We haven't gotten to Redis costs either. Keep in mind this is for a 1 CPU + 2 GB set up too; the costs get multiplied if you need more compute power.

> The problem is though that many applications do writes regardless of what the user is doing (analytics/audit logs and that kind of thing).

I think there's a lot of cases where you'll have a good amount of writes independent of audit logs. For example a chat system, or leaving comments on anything. There's a lot of scenarios where writes are common and this isn't a problem related to DB scaling. Each one of those writes will cause the user to wait a long time (location dependent) since your controller action will wait for your database to return before it sends an HTTP response back to the user. But I do agree there's tons of opportunities to take advantage of read-replicas for reads.


> I think there's a lot of cases where you'll have a good amount of writes independent of audit logs. For example a chat system, or leaving comments on anything. There's a lot of scenarios where writes are common and this isn't a problem related to DB scaling. Each one of those writes will cause the user to wait a long time (location dependent) since your controller action will wait for your database to return before it sends an HTTP response back to the user. But I do agree there's tons of opportunities to take advantage of read-replicas for reads.

Yes, definitely, but many applications are 99.9% read, and stuff like chat doesn't really require instant <100ms responses. It really enables a load more performance for those use cases.

For many of those B2C sites $350/month would be nothing to save a load of latency and increase conversions. It's obviously not a solution for every single web app out there.


I'm right there with you. You need roughly 1 data center per continent to get fast enough response times. The 10-30ms you save in transit time from the edge to a central location isn't worth the complexity or cost.


This is why we don't say "edge". Most apps work best with 3-6 regions. Apps that benefit from ~20ms websocket round trips might use more.


How can I ping the regions without having to register?


Like you point out, compute and network are but one part of the infrastructure equation for stateful apps, one which Fly is adept at. Fly's database story will only improve with time (and I believe they are already looking to partner with other database SaaS vendors), but till then, read-replicas are a pretty resourceful solution.

What Fly really excels at is the dev-ex of deploying apps globally (they don't really compete with VPS providers, that's a different market, imo). There's no mucking around with deployment, orchestration, configuration (and to an extent monitoring) to distribute apps over Fly's infrastructure.

> If you have let's say 6 instances spread across the globe with 1 dedicated CPU + 2 GB of memory each that's $31 x 6 ($186) just for your apps.

Agree. They need to work on their pricing tiers. It is a steep jump from $2/mo with 256M RAM + 0.05vCPU (?) to $31/mo for 1vCPU + 2G RAM.


Unless they are a hobby company, they should increase the prices. The money is in the head, not in the tail. If someone is balking at spending $200/mo on the lowest level, they are simply not a customer.


> Unless they are a hobby company, they should increase the prices. The money is in the head, not in the tail. If someone is balking at spending $200/mo on the lowest level, they are simply not a customer.

It's not balking at the $200 by itself.

It's $200 with the message that I should now have a globally distributed app with low latency because that's what Fly talks about all over their site.

The balk is because all of that goes away as soon as you use Postgres or Redis (which is a ton of apps -- especially Rails). It makes Fly come off as being dishonest because this is a huge thing that's not mentioned anywhere on the pages that try to sell you their service.

It feels like they are leading you into using their platform: you hook everything up, finally deploy your app, and then realize that a large percentage of your customers are still getting +200-300ms location-related response times because they happen to be outside the location where you're hosting Postgres / Redis. This gets even worse if you heavily invested in Websockets and bet everything on having low latency but still have high latency. This is something that could make or break your business, all because of mixed messaging on Fly's site.

I don't think they are doing this on purpose btw. I just think it's easy to amplify a problem so it looks really bad and scary so you can come in and offer a solution. This is a common sales tactic. I just wish they were more open about what happens when using common tech like Postgres and Redis.


The article you're responding to is exactly about Postgres. Here's Redis: https://fly.io/blog/last-mile-redis/

Here's a boring Rails app that uses Postgres: https://fly-global-rails.fly.dev/

And here's more Postgres:

https://fly.io/docs/getting-started/multi-region-databases/

https://fly.io/blog/globally-distributed-postgres/

Rails and websockets are not very good; Phoenix and websockets work great, though.
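The multi-region Postgres setup those docs describe hinges on one trick: replicas are read-only, so a write raises an error, which you catch and turn into a `fly-replay` response header telling Fly's proxy to re-run the whole request in the primary region. A minimal Rack-style sketch (the error-matching and the `PRIMARY_REGION` env var are assumptions; check the linked docs for the exact details):

```ruby
# Rack-style middleware: on a read-only-replica error, ask the Fly
# proxy to replay the request in the primary region instead of failing.
class FlyReplayWrites
  PRIMARY_REGION = ENV.fetch("PRIMARY_REGION", "iad") # assumed env var

  def initialize(app)
    @app = app
  end

  def call(env)
    @app.call(env)
  rescue => e
    raise unless read_only_error?(e)
    # Fly's proxy intercepts the fly-replay header on the response and
    # re-runs the request against an instance in the named region.
    [409, { "fly-replay" => "region=#{PRIMARY_REGION}" }, []]
  end

  def read_only_error?(e)
    # With ActiveRecord this would be PG::ReadOnlySqlTransaction;
    # matching on the message keeps the sketch dependency-free.
    e.message.include?("read-only")
  end
end
```

Reads stay local and fast; only the occasional write pays the cross-region round trip, once, via the replay.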


Half of everything their website talks about is Postgres exactly because that is the main source of issues.

If you can make requests without writing even once, then Fly can be a great system (especially because you probably don’t need their larger instances in those cases).

I think they do a fairly good job of telling you that it’s not for you if you want to write to the DB on every request.

I agree with you that I cannot see any use-case for that, which is why I’m not using Fly.


> I think they do a fairly good job of telling you that it’s not for you if you want to write to the DB on every request.

The biggest thing they say on their home page is:

    > Deploy App Servers
    > Close to Your Users
    > Run your full stack apps (and databases!) all over the world. No ops required.
Technically it's true because "databases" includes reading, but they don't mention anything about how this model falls apart as soon as you want to perform a database write with Postgres.

The entire top area of their home page is optimized to make you think "my real world app is going to be deployed close to users and look how easy it is to get going, I just run 2 commands and it's multi-region deployed".

Right below that they have a whole section dedicated to Postgres but they don't mention anything about writes being limited. Instead they talk about highly available clusters and specifically throw in read replicas to leave the burden on the user to infer what they really mean is that writes aren't multi-region and will involve high latency based on the user's location.

It's a conflict of messaging. If this service is meant for folks who aren't hardcore into ops they might not even know to think about database writes on their own because the messaging is so geared towards "we do all of this for you, just run 2 commands and you're deployed".

> I think they do a fairly good job of telling you that it’s not for you if you want to write to the DB on every request.

I didn't find any of this on their home page, and if it happens to be mentioned a few times in passing within blog posts or deep in their documentation, that feels more like misdirection than helpfulness. It's like other vendors with a sales page optimized for conversion rates, but then somewhere on page 76 of their terms and conditions there's fine print that says "btw, except for when you want to do anything like DB writes with a SQL database, none of these benefits apply", and if they get questioned they point to this condition and say you should have read it.

I Googled for "fly.io postgres writes" and only found 1 article related to postgres writes, but it's really unfortunate because the blog post misleads folks immediately.

It's at https://fly.io/blog/globally-distributed-postgres/

I took a screenshot of it in case it gets edited: https://imgur.com/a/eY6EJ0e

In the first non-bolded paragraph they have:

> We won’t bury the lede: we’re going to talk about how you can deploy a standard CRUD application with globally-replicated Postgres, for both reads and writes, using standard tools and a simple Fly feature.

Right off the bat they are saying they solved the problem of having a globally distributed postgres set up for both reads and writes using Fly. They even go as far as bolding "for both reads and writes" so that if someone skims this page that might be enough for them to think "awesome, ok Fly is perfect for me" and now head to their page to sign up because all they wanted to know is if that's possible or not, the details can be filled in later.

But then later down in the article, below the fold they write:

> But these schemes break down when users do things that update the database. It's easy to stream updates from a single writer to a bunch of replicas. But once writes can land on multiple instances, mass hysteria! Distributed writes are hard.

So now they contradict their premise that they solved reading and writing in a distributed way and walk it back: on second thought, never mind, we can't handle distributed writes because it's hard. Then later on, near the bottom of the article, they mention you should avoid using Postgres if you have a lot of writes and use CockroachDB instead.

What kind of messaging is that?

I appreciate what they're trying to do but they send mixed signals around solving a problem they haven't solved which also happens to be tied into their biggest selling point (globally distributed deployments so that end users have low latency responses).


> So now they contradict their premise that they solved reading and writing in a distributed way and go back to say on second thought, nevermind, we can't handle distributed writes because it's hard.

Fwiw, even Postgres-centered "real-time" platforms like Supabase do not do distributed writes (but they sure can benefit from distributed reads).

> I appreciate what they're trying to do but they send mixed signals around solving a problem they haven't solved which also happens to be tied into their biggest selling point (globally distributed deployments so that end users have low latency responses).

I get where you are coming from, but I'd imagine Supabase on Fly is instantly faster than its default single-region deploys on AWS. Fly's CEO, Kurt Mackey, is an investor in Supabase, so let's see how that partnership pans out.

Fly's primary/read replica setup is as good as it gets without complicating matters any further. My money would be on Fly.io to rope in CockroachDB, PlanetScale, Yugabyte, or Citus Data as solution partners, but that's easier said than done.

> I took a screenshot of it in case it gets edit here: https://imgur.com/a/eY6EJ0e

You could use archive.is (https://archive.is/rlxsS) or web.archive.org/save, too (:


Increase their prices? Their prices are already completely insane. A dedicated 8-core CPU with 64GB RAM for $558.16/mo? And that's not even bare metal, you're sharing all of the other resources of the server. How is it possible that netcup can give you two more cores for under 50€ [0]? Or Hetzner a dedicated server, not just CPUs, for a similar price [1]? Housing servers is expensive without your own datacenter? No, it's not, it's cheaper than ever [2]. What is Fly doing with their servers? Or is that all just markup? It seems their customer is somebody who doesn't care at all about money or somebody who is completely unaware of how much servers actually cost.

[0] https://www.netcup.eu/bestellen/produkt.php?produkt=2632

[1] https://www.hetzner.com/sb?ram_from=5&ram_to=8&cpu_from=9000...

[2] https://dc6-cz.translate.goog/cenik-server-housingu/?_x_tr_s...


A few things here:

1. We pay basically the same prices for dedicated hardware you would. Hetzner is cheap, in part, because they run in a single facility and optimize for price. We run in 22 locations and optimize for "cheap enough for most app devs".

2. All our servers are AMD Epyc CPUs – meaning more expensive than consumer CPUs. Our margin on hardware is almost exactly 70%, if that helps.

We're not competing with Hetzner, Hetzner is a great f'n company. If they do what you need we'd rather you use Hetzner. I could give you some TCO nonsense but I won't. I do think we're a pretty good value though!


The money is in the head, not in the tail. If you cannot afford $200 on a distributed application, go rent a server at hetzner


> If someone is balking at spending $200/mo on the lowest level, they are simply not a customer.

>> Ultimately, we want to accomplish two things:

>> It should scale well, from small mostly text based web apps to large, intense apps that handle a lot of data. You should be able to build a CDN on Fly.io without worrying too much about your bandwidth costs. You should also be able to run a hobby project for close to free.

>> Transparency and predictability are good. People should be able to glance at pricing, do minimal mental math, and judge how well it works for their app.

from: https://fly.io/blog/we-cut-bandwidth-prices-go-nuts/

Btw, technology adoption begins at the "lowest level", if Geoffrey Moore and Clayton Christensen are to be believed: https://archive.is/42NS6#selection-586.0-586.1


Eric Schmidt disagrees. You might have heard about the company that adopted that as the approach. It is called Google.


Oh, thanks for clearing that up. I thought you might have been talking about Novell.


Novell was trying not to be the expensive kid on the block. We know how well it worked out for them.


Just a note that people interested in this solution should really take a look at Ruby on Jets, as it lets you run an essentially standard, drop-in replacement for Rails on Lambda, and the performance seems to be better as well in terms of time to first byte, request latency, etc. We have a 0.995 Apdex score running 100% on Jets. Right now we are single region, but this could easily be set up in AWS to exist in every AWS edge location and use latency-based routing to map to the proper API gateway. Database-wise you're free to tackle that how you like; no limitations on that end.


There are an incredible number of websites that have incredible read:write ratios. I would suggest doing everything you can to avoid writing to the database on every request.


I wonder, of those websites with incredible read:write ratios, how many have the read-only requests pretty cacheable at CDN level.


Not many. I worked on Ars Technica back in the day, it's the most CDN-like workload you can imagine. We couldn't ship features we thought were valuable, though, because the CDN was in the way.

CDNs have gotten better, and you can write an app in JavaScript to sit in front of your Rails app to make things more dynamic if you want. I believe most devs are better off running their fullstack app where they need it and skipping the additional infrastructure layer.

CDNs are an architectural misfeature that only exist because Fly.io wasn't around 20 years ago. I'm being extreme, but I think that's fundamentally true. ;)


> CDNs are an architectural misfeature that only exist because Fly.io wasn't around 20 years ago. I'm being extreme, but I think that's fundamentally true. ;)

Spicy take :) Although building for a dynamic CDN (I'm thinking Varnish here) feels like building inside-out, I can't see a way to deliver pageviews as efficiently (in terms of CPU or Watts per view) without using a CDN and a lot of caching.

With Fly I guess you flip the architecture around and run code but cache a lot of partials that you assemble together. Nicer than working with ESI and its "cache it all but punch holes in the page and render those bits again" approach, but does the efficiency hold up?

My architectural misfeature would be ESI [1,2]. So useful and so painful to use. And that's before we get to varying support in different proxies and caches...

1. https://www.mnot.net/blog/2011/10/21/why_esi_is_still_import...

2. https://twitter.com/peterbowyer/status/1366324396026118149


It is a spicy take. I agree that Varnish and commodity CDNs are more efficient per pageview than hitting any kind of full stack app process. Most devs can probably ignore efficiency, though. Making things fast for users will take them a long way.

My spicy take is really "in the context of full stack apps". CDNs are amazing for, like, Netflix. And Wikipedia. They're really, really good for one to many files.

With Fly, you do flip it around. And you might find out you don't really even need a cache. Rails + Postgres read replicas are pretty dang fast with one less moving piece. We have an awful lot of users who just do in process caching and hit their DB when they need to, it's pretty cool.

ESI seemed so promising when I first read about it. Then I realized I couldn't actually _use_ it anywhere. I'm still kind of aggravated at how excited it made me.


I am currently finding Rails view generation to be shockingly slow. (no DB query involved, just the view generation computation). I guess I do too many complicated things in my view generation.

(But one thing I don't currently do that has always looked shockingly slow when I've measured it, is Rails i18n. But in general, I'm kind of surprised to find you saying Rails view generation is pretty dang fast! It's always seemed to me like the Rails answer was "yeah, we know, that's why you cache." But now I wonder what I'm doing terribly wrong...)


> I am currently finding Rails view generation to be shockingly slow.

I haven't checked this in a few years, but the last time I looked each `render` call reads the partial off disk, parses it, and executes it.[0] So the common pattern of looping over a collection of objects and rendering a partial for each one is going to hurt a lot. And of course Rails makes tons of allocations, so often you'll see one of those iterations take x00ms as Ruby does some garbage collection.

[0] https://softwareengineering.stackexchange.com/a/365912/29612...


> each `render` call reads the partial off disk, parses it, and executes it

I think it does not do that with 'production' settings, but I'm not sure. I know it does less touching of disk and parsing in production mode; it doesn't do all of what you mention (that's what the `cache_template_loading` setting controls). But I'm not sure it does none of it; it may still do things that conceivably seem like they ought to be cached.

In default "development" mode settings -- yes, partial template is definitely much slowed down by going to disk and parsing on every request and/or every invocation.
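The cost being discussed is easy to demonstrate with stdlib ERB: compiling the template is the expensive step, and caching the compiled object (roughly what `cache_template_loading` enables in production, as a sketch, not a claim about Rails internals) means each render just executes already-generated Ruby:

```ruby
require "erb"

TEMPLATE_SRC = "<li><%= item %></li>"

# Naive: re-parse the template source on every render, roughly
# what development mode does per request.
def render_uncached(item)
  ERB.new(TEMPLATE_SRC).result(binding)
end

# Cached: compile once, reuse the compiled template for every render.
COMPILED = ERB.new(TEMPLATE_SRC)

def render_cached(item)
  COMPILED.result(binding)
end
```

Rendering a partial for a 1,000-item collection goes from 1,000 parses down to one, which is where much of the loop slowdown discussed above comes from.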


I just re-tested on Ruby 3 & Rails 7, and even in development it no longer minds if I rename the erb file in the middle of iterating---so that's an improvement for sure!

I found a Rails issue tracking partial performance in a loop (https://github.com/rails/rails/issues/41452), and it looks like they are still working on improvements. No activity since May though.


Rails view generation is shockingly slow. I would definitely cache partials, I just wouldn't run Redis to do it if I could help it!

Rails is basically the worst case framework for our model, so we were excited when we got it working. You still have to work hard to make Rails fast, we just give you the last mile for almost-free.


OK, phew, it's not just me!

I guess I misinterpreted when you said "And you might find out you don't really even need a cache. Rails + Postgres read replicas are pretty dang fast."

You actually meant, you might not need a CDN cache... if you have some other kind of cache (which btw ideally isn't redis if you're using our model anyway)... because Rails actually isn't pretty dang fast at all?

OK then! :)


Rails' very slow view rendering is one of the worst parts of it. GitHub's ViewComponent library has some performance improvements that can be substantial in some cases. I wish someone would port how Elixir's Phoenix does views, which is extremely fast.


ESI, at least on fastly, are run in serialized fashion. On cloudflare workers, you can have up to 6 concurrent connections to try to do ESI as fast as the slowest request.


There's now a free Varnish VMOD to run ESI requests in parallel: https://code.uplex.de/uplex-varnish/libvdp-pesi


I agree that Ars Technica seems like a very CDN-friendly workload. I'm curious what those features were that you couldn't ship because of the CDN. It's gotta just be ads, right?


No, the opposite actually. When the ad market fell apart we launched a paid subscriber program. It immediately turned us into an app that cared about a smaller number of high value users. Almost everything we wanted to add was for those folks - personalization and collaboration features, mostly.

Ads have never been fun to build around.


There are good reasons to do it independent of simple ratio; for example, if you want the site to function as much as possible with a read-only database - not uncommon in recovery, failover or big migration situations.


You posting on this website is a thing that totally matches; it's just the value proposition keeping you from big VC bucks


Absolutely agree! (Though... I suspect there may be some other caching tech in there to help handle the load, and I don't think Fly.io would be cost effective enough for HN - would be interested to know for sure though)

But if we are being honest with ourselves, HN is very much a featherweight unicorn in an ocean of lumbering behemoths. And that's part of the attraction for this particular crowd.

Would be interested in other high-profile rare cases, like HN.


I think the thing is, most people won't need the global distribution that Fly.io is pioneering; what they want is a better, cheaper Heroku. Fly.io is already exactly that too.

For the 1% of people who do need that global distribution, what Fly.io have built is incredible.

I currently host on Heroku and plan to switch to Fly.io once they have an equivalent to Heroku Postgres's WAL point in time restore.


I think that a lot of people don't do global distribution by default, and thus miss out on many customers from countries that aren't close to one of their datacentres. But they don't realise this (how can they - they have no data on what might be) and thus think that they don't "need" global distribution because, in their current world, they're fine.


If the site doesn't make many requests in order to work (like for instance HN) then you can also use it just fine despite being very far away from their DC.

And by optimizing for few requests and small size, you also get the additional benefit that it will work nicer if the user has a bad internet connection!


You are quite right, and Fly.io can be that "gateway drug" to taking advantage of it. I suspect that is also their exact business model: attract customers when they are small and run in one region, then, as they grow, encourage global distribution, extracting more business from them.

My business doesn't "need" global distribution but it could definitely be argued that it could "benefit from" being so.


It can also be harmed - GDPR is not easy to comply with / stay on top of as an example


Exactly. Not even a better Heroku, just a cheaper one is the main reason why people switch from it.

I've said it many times but a startup whose elevator pitch is "Heroku, but cheaper at scale" (and I mean exactly the same UX) would be worth billions


As far as I can tell, that's Render's pitch.

They aren't there just yet, but it wasn't far off last time I tried it ~6 months ago.


We're getting there. Which features would tip the scales for you? (in the last 6 months we've added managed Redis, SSH access, DDoS protection, a public API, a free tier, an improved dashboard, one-off jobs, and more: https://feedback.render.com/changelog)


We specifically needed SOC2 compliance. Annoying, but necessary.

I see you're working on it, that's great!


Most people need to do something on the gradient of "run in one city that's nearest my users" to "run in every city everywhere". We have excited single region users in Sydney, because the internet sucks for Australia/New Zealand.

Running in the right first location is good for users. And Virginia is almost never the right first location.


Curious how Heroku compares to the new JS environments — like Netlify and Vercel. The experience and price there seems extremely nice. Wonder how Heroku has kept up in that regard.


Keep an eye on Render, AFAIK they have PITR and HA for Postgres on a roadmap. It's a solid Heroku replacement.


Yes, I think you're right. I've used Heroku a lot and it's great for small apps which I'm not too interested in optimising by the ms. I always thought fly was very optimisation heavy - but in reality it is a better cheaper heroku if you ignore all the global distribution stuff!


We're building PolyScale.ai[1], which solves global latency challenges for apps using caching. Clearly that is a different proposition/architecture to database read replicas, but if the use case is a fit, it's a powerful solution to a hard problem. PolyScale plugs into your existing database and maintains your transactionality and consistency (we automatically invalidate globally). DML queries pass through to the origin DB and reads (SELECTs, SHOWs) get cached locally at the edge.

We move the database data and query compute closer to the end user for consistent 1ms query execution times at scale. It intelligently manages the cache for every individual SQL query so no configuration is needed, unless of course you want to set manual TTLs.

1. https://www.polyscale.ai/


You should email me. :D kurt at fly.io


Will do :)


I appreciate that they have made globally deploying a Rails app (Postgres DB, CDN, network routing) fast and easy. There are always lots of finicky configuration issues, and it takes a lot of time to get this all set up right. Like lots of IaaS, they are making something hard easier. They are not, though, removing the need to build your app to this paradigm. The conventions and restrictions you need to follow to realize the benefits remain, and are similar to what you'd have to do yourself. So they've cleaned up lots of application and deployment gunk, but the app engineering work remains. This is progress. Very nice.


Maybe I missed this in the link but I didn't see how one of the common problems in a distributed application is addressed:

* A POST request is redirected to the primary data source, the resource is created and the id is returned

* A GET request is immediately issued for that id, but the resource has not yet been replicated to the regional data source

* :boom:


The article touches on this "create-and-redirect-to-show" pattern at the end of the "It's not actually magic" section. https://fly.io/blog/run-ordinary-rails-apps-globally/#it-s-n...


Our Rails gem pins requests to the primary region until the data is replicated.

We talk about this more here: https://fly.io/blog/globally-distributed-postgres/
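The gem's actual mechanism is described in the linked post; as a rough illustrative sketch of the idea (the cookie name, `REPLAY_WINDOW`, and method names here are made up, not the gem's API): after a write, stamp a cookie, and for a short window afterwards route requests to the primary region instead of the local replica.

```ruby
# Hypothetical sketch of write-then-read pinning. Not fly-ruby's real API.
REPLAY_WINDOW = 5 # assumed worst-case replication lag, in seconds

# After handling a write (POST/PUT/DELETE), set a cookie marking how long
# subsequent requests from this client should be served by the primary.
def pin_cookie(now = Time.now)
  { "fly-pinned-until" => (now.to_i + REPLAY_WINDOW).to_s }
end

# On each request, check the cookie: if the pin hasn't expired yet,
# replay the request in the primary region rather than reading a replica.
def use_primary?(cookies, now = Time.now)
  now.to_i < cookies["fly-pinned-until"].to_i
end
```

Once the window elapses, reads fall back to the nearest replica, so the pin only costs cross-region latency for the few seconds after a write.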


Fly is compelling enough to use without all the global deployment options they offer[1], but it's reassuring to know that if you ever do need to deploy globally, they can support it better than most.

[1] - https://github.com/superfly/fly-ruby


Fly.io continues to impress me with their pragmatic and clever approaches to otherwise challenging problems.


I deployed a little side-project (https://thecitymapquiz.com) to fly.io yesterday. The deployment went really smoothly, but friends testing it out complained their connection (websockets in Phoenix LiveView) dropped. It might be because it's running on the cheapest VMs, I don't know yet. That said, it was really satisfying seeing the application running in Seattle, Hong Kong, and Amsterdam within minutes after launch.


This is probably a websockets bug that crops up during deploys – the connection breaks and reconnects keep getting sent to the now-gone VM for >60s. We're furiously rebuilding our service discovery plumbing because it's slow to propagate. I'm hoping we have the websockets issue fixed this week.


Sounds good. But I probably should implement something to store the state between websocket connections anyway. Losing all progress when a connection drops is probably not optimal :-)


A bit off topic, but your game's brilliant fun. Just did the worldwide one, got a semi-respectable 23/34. Funny how easy it is to identify North American cities vs European ones. Will definitely share with some friends, great work!


Thank you. I think 23 is pretty good - I have looked a lot at these maps during development and I only managed 24 after I deployed the site yesterday. I'm not really satisfied with the site yet, but I thought it was time to find out if anyone other than me found it interesting - now there is at least one other person :)


If this global replicated architecture is now a nice, polished, improving hosting product - how come so many well-funded global brands don't appear to benefit from an architecture like it?

Like Airbnb taking many seconds to load many parts of their app from the UK, or business apps like Xero which respond tens of times slower than a local app from 30 years ago.

Surely these are the sorts of companies which should be on top of global hosting, now that a hosting firm can box and sell it?


This is almost exactly Shopify's architecture. It's out of reach for 99.9% of devs.

But really the answer is that AWS doesn't solve this for people. And AWS is the default hosting choice for most tech companies.


Do you have a reference for shopify working like this?


I wonder what the costs are to run something like this for a real world app. Fly.io charge outbound bandwidth fees. You've got instance / volume costs for all the regions for the DB and app servers. IP addresses are charged.

Can someone try it for a medium sized app and send through their monthly bill, that'd be great thanks :).


That is a very clever implementation, and indeed feels like magic for someone who makes relatively small Rails apps.


I'm really interested in fly.io after their Postgres post the other day, but I've not seen anywhere what their recommended solution is for ActiveStorage – is it still writing to S3 (or equivalent), or would it be somehow using their volumes? Are there any published examples (blog posts etc)?


Using volumes as some sort of S3 would require you to basically build S3 from scratch: you would still need to create an auth system, a server to handle uploads, a way to manage uploaded content while preserving content types (maybe using another DB), a static server, etc.


Nobody's stopping you from running MinIO on top.


ActiveStorage has local disk support, right? If you do need S3-compatible storage on the edge, you could deploy MinIO[0]

[0] - https://fly.io/docs/app-guides/minio/
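Since MinIO speaks the S3 API, wiring ActiveStorage to it is mostly a `storage.yml` entry plus `config.active_storage.service = :minio` in the environment config. A hypothetical sketch (the endpoint, bucket, and env var names are placeholders):

```yaml
# config/storage.yml (sketch; values are assumptions)
minio:
  service: S3
  endpoint: http://my-minio-app.internal:9000   # placeholder internal address
  access_key_id: <%= ENV["MINIO_ROOT_USER"] %>
  secret_access_key: <%= ENV["MINIO_ROOT_PASSWORD"] %>
  region: us-east-1        # MinIO ignores this, but the SDK requires it
  bucket: uploads
  force_path_style: true   # MinIO doesn't use virtual-host-style buckets
```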


S3 or the equivalent is best, yes.


Check out Filebase [0]. It's powered by Web3 technologies including decentralized and distributed storage networks, effectively creating one giant "global" S3 region. You could almost think of it as the storage-equivalent to Fly.io. It also works with ActiveStorage out of the box since Filebase has an S3 compatible API.

[0] https://filebase.com/


Nice to see that they support HTTP/2 (which BTW is 6 years old now), unlike Heroku


Amazing what started with Heroku* has turned into.

*Started for me at least. Heroku was how I got started as a Rails developer, and it made it so easy to get a deployment available to interact with from anywhere.


Can anyone speak to using a Rails app to reach global users with Fastly's surrogate keys or Cloudflare's cache tags (Enterprise only)?

After reading the OP, the tradeoffs seem more complex than a single origin in us-east and having CDNs cache HTML pages.

The most appealing advantage I see with running Rails in all these regions is that you can authenticate users really quickly and redirect them to an HTML page in a CDN.


Before we built Fly, I spent a huge amount of time trying to bend CDNs to my will. Fastly's surrogate keys are pretty good _if_ you define a good HTTP cache schema and if every request is worth caching.
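The "cache schema" part usually amounts to tagging each response with keys for the records it renders, then purging by key when those records change. Fastly's `Surrogate-Key` and `Surrogate-Control` headers are real; the helper below is a made-up illustration of the tagging half:

```ruby
# Hypothetical helper: build Fastly-style surrogate headers for a record.
# A controller would merge these into response.headers; on writes, you'd
# purge the "#{record_type}-#{record_id}" key via Fastly's purge API.
def surrogate_headers(record_type, record_id, ttl: 86_400)
  {
    # one broad key for the whole type, one narrow key for the record
    "Surrogate-Key" => "#{record_type} #{record_type}-#{record_id}",
    # cache at the edge for `ttl` seconds unless purged first
    "Surrogate-Control" => "max-age=#{ttl}"
  }
end
```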

My general takeaway was that HTTP is a terrible interface for "programming" things. Worse, the whole stack was difficult to test and inflexible to iterate on. And the apps I've found myself wanting to build lately rarely serve the same HTML twice.

You may find differently, but I strongly believe "just run your app and DB, no other layers" is the simplest possible way to do most full stack work.


Your response got me to really look at fly...

I have to try it now. After reading the free Postgres and last-mile Redis blog posts, I realized something...

I can use probably the most under-used cache store Rails provides: the file_store, which is twice as fast as trying to read from a memcached/redis process. This was not even an option on Heroku b/c of the ephemeral file system.
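For reference, that's a one-line change; a sketch, assuming a Fly volume mounted at `/data` so the cache survives restarts:

```ruby
# config/environments/production.rb (sketch; the path is an assumption)
config.cache_store = :file_store, "/data/cache"
```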


I'm struggling to deploy a basic Docker application that works fine on my machine. It handles HTTPS traffic by itself, so the Fly.io docs say to exclude services.ports in fly.toml and TCP will be passed on as-is, but unfortunately it seems to still be messing with my TCP traffic, maybe because it's on a non-standard port.


You'll need [services] to accept public TCP traffic, just set `handlers = []` to turn off our TLS and HTTP layers. We do run a proxy so we definitely mess with your TCP traffic! You can enable PROXY protocol to get some of the original connection info (ip, port, etc): https://fly.io/docs/reference/services/#proxy-protocol
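Concretely, a raw-TCP fly.toml along those lines might look like this (8888 as an assumed example port; this is an illustrative sketch, not copied from the docs):

```toml
# fly.toml sketch: accept public TCP and pass it through untouched
[[services]]
  internal_port = 8888   # port the app listens on inside the VM
  protocol = "tcp"

  [[services.ports]]
    port = 8888          # public-facing port
    handlers = []        # no TLS or HTTP termination by Fly's proxy
```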


Yup, I've tried that, but sadly the curl command that works on my machine fails with `curl: (35) error:0A000126:SSL routines::unexpected eof while reading` on fly.io

Locally I run with docker run -p 8888:8888, so I assume the correct thing to do when deploying on fly is to build the image with EXPOSE 8888 in the docker file, and set the service internal_port = 8888 in fly.toml. Have I assumed this correctly?


You do need the internal_port, you don't need `EXPOSE` though. Will you post your config file at https://community.fly.io so we can have a closer look?


I deployed my company website / blog with fly.io. It's a simple Phoenix app w/o a DB, and it was trivially easy to set up. After having used K8s for Rails and Phoenix hosting before, their product is definitely something to keep in mind.


Why do you need all these regions? Is there any real money to be made outside of G7 that will be greatly decreased due to increased latency?


That's a good question but not one you could answer easily without proper research into your chosen market.

If I was to speculate, though, I think you'd possibly find that you have a long tail of potential customers spread outside of the G7 countries.

That's leaving out that there are a lot of very wealthy non-G7 countries.


To me the real key here is making multi-datacenter instances easy, whether those are multiple datacenters in the same country or 37 countries.

But even to stick to your question, the G7 spans probably a dozen time zones. Even if you stuck there, you'd still have a pain with a multi-region app and see benefits.


So you're saying that I can run my "ordinary" Rails-2.3.18/Ruby-1.9.3 app as well?


Yep. Anything you can build a Docker image for, but "fly launch" may just do the right thing for it.


To be fair, Rails 2.3.18 was released two days before Docker itself was, so OP might have some general issues to fix up first, hah. :)


Stop making me feel old.


Did you know that the first release of Rails was actually closer to when the Pyramids were first built than they were to us today?


When to support JVM languages?


Fly just runs container images, so you can already run any language you want, including the JVM.


They don’t do enough, in my view, to sell the basic story here. So it’s non-obvious.


We added this yesterday! https://fly.io/docs/getting-started/dockerfile/

It's getting less non-obvious, I hope.



