> What kind of speedup is available for big Rails applications? If 90% of the time in an application is spent in database calls, then there’s little opportunity for improvement via JIT technologies.
This is written as speculation of course, but it matches some long-time "conventional wisdom", that most Rails apps spend most of their time on I/O waiting for something external.
That may have been true once, but I don't believe it's true anymore.
My Rails apps now, once things like "n+1 database query problems" are properly eliminated, all spend more of their time on CPU than on I/O, 60-80%. (I think most of that CPU time is ActionView rendering. I'm less sure about the CPU parts of ActiveRecord that aren't waiting on the DB, such as converting db results to in-memory AR objects, etc.)
When this comes up and I ask around on social media who else has actually looked at their app and can confirm or deny, almost all who respond agree. I would be curious to see here too.
Definitely some more systematic rather than anecdotal investigation is called for. But I think it's time to stop repeating that it's likely that a Rails app is I/O-bound rather than CPU-bound, my experience leads me to think the reverse is currently likely.
[*edit* after I wrote this comment, I noticed multiple other comments in this thread saying basically the same thing. OK, so. Can we put this rumor to rest?]
> I think it's time to stop repeating that it's likely that a Rails app is I/O-bound rather than CPU-bound, my experience leads me to think the reverse is currently likely.
I'd like to offer an anecdotal counterpoint. I worked at Shopify for several years. The big problems were always due to I/O. Usually that was in the form of long-running network calls. You're calling the FedEx API for shipping rates in the cart. All of a sudden FedEx starts taking 5 seconds instead of 1 second, and all of your worker capacity evaporates. That kind of thing.
Or, if databases are more your speed, we faced another problem with the GraphQL API needing to rattle off a bunch of requests to the DB, without being able to do so in parallel. So the more different slices of data you needed to fetch, the longer the single GraphQL call would take.
We spent significant amounts of time finding ways to work around Rails' inability to do these sorts of things in parallel (or at least, the inability to do it without lots and lots of Spell-o-Tape). Writing proxy services in other languages that could parallelize an incoming request and fan it out to the actual Ruby boxes, etc.
I like Ruby but I feel like it's pretty unequipped to deal with the problem of long-running I/O and worker saturation. There's work happening there but it will be a long time before you have the same ease-of-use as languages with better concurrency support out of the box.
You could argue that these sorts of long-running calls are not the norm, but in my experience it's way more common for a web app to need to perform some call to an external service at some point during its lifetime than not, and that's not a use case that Ruby/Rails are built to elegantly handle.
Thanks for the anecdotal counterpoint, that is what I was looking for!
I think we're talking about different things though.
I think OP was suggesting/questioning whether a typical Rails app may spend 90% of its overall time in I/O.
I am suggesting the reverse, that typical Rails apps (so long as they don't have bugs where they fetch more than they need) will spend 60-80% of their time in CPU rather than I/O.
I am curious if this is true of Shopify's app; my guess is it still would be (especially because I think Shopify will not have very many bugs where it's fetching things inefficiently or fetching things it doesn't need; Shopify will be higher quality than a random Rails app).
You are suggesting, I think, that at Shopify the atypical, problematic requests were always due to I/O.
I think all of this can be simultaneously true.
I think a side issue to what I'm talking about, but:
> We spent significant amounts of time finding ways to work around Rails' inability to do these sorts of things in parallel
Rails 7 actually recently introduced a feature to do multiple ActiveRecord fetches concurrently... perhaps motivated by/developed by Shopify? https://pawelurbanek.com/rails-load-async
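For reference, a minimal sketch of how that feature is used (controller and model names are made up here; note that `load_async` only actually runs queries in the background when `config.active_record.async_query_executor` is configured):

```ruby
class DashboardsController < ApplicationController
  def show
    # Both queries are fired immediately on background threads;
    # the relations block only when the view first iterates them.
    @recent_orders = Order.where(status: :open).order(created_at: :desc).limit(20).load_async
    @top_products  = Product.order(sales_count: :desc).limit(10).load_async
  end
end
```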
I haven't used that one yet. In the past, when I've needed to do things like external API requests in parallel, I've had success just using futures/promises from concurrent-ruby, without even telling Rails about it. I've had to do this only rarely, and I haven't run into too much trouble. One thing you can run into, though, if your threads use ActiveRecord themselves, is running out of database connections, because of Rails' "long connection checkout" model, which requires explicit handling of pool checkouts once you manage your own threads. Sequel (vs ActiveRecord) has a much more connection-efficient way of pooling database connections: they are transparently checked out of the pool under the hood, usually without you needing to manage it explicitly, for only as long as it takes to execute a single statement. When doing heavily concurrent things with AR, I have often wished it used Sequel's concurrency/connection-pooling model.
With concurrent-ruby, I found similar weirdnesses until I wrapped things in `Rails.application.executor`. Rails has its quirks, and purists can be put off by all the "extra work" that needs to happen for things to play nicely together.
To that I say: Rails gives me superpowers and I need to abuse them appropriately.
Totally -- I've been trying to do multi-threaded concurrency like this since before Rails actually had `Rails.application.executor` -- it was possible, there were some gotchas. It is very helpful to now have an API available at that level of abstraction. It can still get weird sometimes. There are still some gotchas. Usually about ActiveRecord.
With what we know now from seeing how it all works out, if we were able to do it over again greenfield, I think it's pretty clear that Sequel's concurrency/pooling model is preferable to ActiveRecord's. Although I'm sure it would have its own downsides I'd run into if I used it more and at scale.
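For concreteness, here's a rough sketch of the concurrent-ruby pattern described above (the URLs and the `ApiResult` model are invented for the example):

```ruby
require "concurrent"
require "net/http"

urls = ["https://api.example.com/a", "https://api.example.com/b"]

futures = urls.map do |url|
  Concurrent::Promises.future do
    # executor.wrap keeps Rails' per-thread state (autoloading, query cache,
    # etc.) behaving inside our own threads.
    Rails.application.executor.wrap do
      body = Net::HTTP.get(URI(url))
      # Check a connection out of the AR pool only for as long as we need it,
      # instead of holding one for the life of the thread.
      ActiveRecord::Base.connection_pool.with_connection do
        ApiResult.create!(url: url, body: body)
      end
    end
  end
end

Concurrent::Promises.zip(*futures).value!  # wait for all, re-raising any error
```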
You're right, I think we're using IO/CPU bound in different senses. You're talking about where an average request spends most of its time; I was thinking more in terms of Shopify's scaling bottlenecks. Even though individual requests probably spent a lot of their time in CPU, we didn't really have to worry about CPU as a bottleneck for the larger application. Whereas IO was a different matter, and we had a lot of difficulties ensuring that we had enough capacity for spikes in the IO-heavy stuff. So the overarching Shopify application was "IO bound" in that sense.
I think it also depends on architecture and what the specific endpoint does. Let's say it's a microservice architecture with lots of network calls (to fetch data from different services, or even outside API calls)... how would CPU be 80% of the time? Doesn't sound reasonable.
But if there aren't many calls at all and it's rendering one huge view, then yeah, maybe it's mostly CPU...
> Usually that was in the form of long-running network calls.
That's why doing a network call to another service in the middle of a request is pretty much banned from the monolith. Anything that doesn't have a strict SLO is done from background jobs that can take longer, be retried etc.
Now you mention FedEx, so I presume you were working on the shipping service, which in essence is kind of an API bridge, so that's probably why it was deemed acceptable there. But that's a far cry from what a typical Rails app looks like, except maybe in companies that do a lot of micro-services and are forced to do inline requests.
> that's not a use case that Ruby/Rails are built to elegantly handle.
I'd argue the contrary: the elegant way to handle this is to perform these calls from background jobs. Look at shipit [0] for instance; it syncs with the GitHub API constantly but doesn't make a single API call from inside a request cycle. Every single API call is handled asynchronously from background jobs.
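As a minimal illustration of that shape (the job and client names here are invented, not taken from shipit): the request cycle only enqueues, and the slow external call happens in the job, where it can be retried:

```ruby
# app/jobs/sync_shipping_rates_job.rb
class SyncShippingRatesJob < ApplicationJob
  queue_as :default
  retry_on Net::ReadTimeout, wait: 30.seconds, attempts: 5

  def perform(order_id)
    order = Order.find(order_id)
    # The slow external API call happens here, outside any request cycle.
    rates = FedexClient.new.rates_for(order)
    order.update!(shipping_rates: rates)
  end
end

# In the controller, we only enqueue and return immediately:
SyncShippingRatesJob.perform_later(order.id)
```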
I was actually referring to background job workers in my comment. Queuing is still queuing, whether it's request queuing due to a shortage of web workers or job queuing due to a shortage of job workers.
Yes, there are more levers one can pull for job workers, and it's probably easier to horizontally scale those workers than web workers for various reasons. But regardless of which workers are performing the long-running I/O, there's still a hard bottleneck imposed by the number of available workers. They're still going to inefficiently block while waiting for the I/O to complete. The bottleneck hasn't been truly eliminated; it's just been punted somewhere else in the application architecture where it can be better mitigated.
Background jobs may be the most elegant solution for handling long-running I/O in a typical Rails app, but that's still less elegant than simply performing those requests inline and not having to worry about all the additional moving parts that the jobs entail.
> that's still less elegant than simply performing those requests inline and not having to worry about all the additional moving parts that the jobs entail.
I strongly disagree here. Going through a persisted queue gives you a lot of tools to manage that queuing.
If you were to just spawn a goroutine or some similar async construct, you lose persistence, a lot of control over retries, resiliency through isolation, etc. When you have "in-process jobs", re-deploying the service becomes a nightmare, as it becomes extremely muddy how long a request can legitimately take.
Whereas if you properly defer these slow I/Os to a queue, and only allow fast transactional requests, you can then have a very strict request timeout, which is really key for the reliability of the service.
Those are all fair points. My only counterargument is that for a certain class of requests, the simplicity of not needing to worry about a separate background jobs queue outweighs the benefits that the job queue provides. There's some fuzzy line where those benefits become worth it. And you're probably going to cross that line earlier with a Rails app than with an evented one. There are lots of cases in Rails where problems are solved via background jobs that would most likely just stay in the parent web request in an IO-friendlier environment.
Rails or not, on any platform/framework, if you do an API request inline in a web request, and that API call could take max time m, then the total request time could take max time N+m instead of N. So if you don't want requests to take >N, you don't do those requests inline.
What am I missing, how does Rails make this especially bad?
Or is it that in another platform/framework, it's easier to allow requests to take the longer N+m to return if you want? True, that would be easier in, say, an evented environment (like most/all JS back-ends), but... you still don't want your user-facing requests taking 5 or 10 seconds to return a response to the user, do you? In what non-Rails circumstances would you do long-running I/O inline in a web request/response?
> Or is it that in another platform/framework, it's easier to allow requests to take the longer N+m to return if you want? True, that would be easier in, say, an evented environment (like most/all JS back-ends), but... you still don't want your user-facing requests taking 5 or 10 seconds to return a response to the user, do you? In what non-Rails circumstances would you do long-running I/O inline in a web request/response?
Yeah, that's what I'm getting at. It's true that even in evented backends there's a line beyond which it's probably better to put the long-running stuff in a background queue, but it's a higher bar than in Rails. I've run pretty high-throughput Node and Go apps that had to do a lot of 1-5s requests to external hosts (p95's probably up to 10s) and they didn't really have any issues. In my opinion, it wouldn't have been worth it to add a separate background queue; the frontline servers were able to handle that load just fine without the additional indirection.
byroot makes good points in a sibling comment about retries and more explicit queue control being advantages of a job queue pattern regardless of whether you're evented or not. I just think that those advantages have a higher "worth it" bar to clear in an evented runtime in order to justify their overhead (vs Rails).
I'm curious if you could share more on the GraphQL part of the situation you describe here, and if being architected on your more traditional "RESTful" (however you want to define that, but usually Rails defaults get you a wonderful implementation of "RESTful") API convention would have made any difference in the performance issues you encountered at Shopify.
We didn't have the same problem with the old REST API, because the requests were naturally sliced by whatever resource type people were requesting. Whereas the GraphQL API allowed clients to request big slices of disparate data all in the same request. If we had (for whatever reason) exposed a REST API that fetched lots of different slices of data and combined them like GraphQL does, those requests would have faced the same problem.
So it was really due to the nature of GraphQL that the problem materialized, rather than being a GraphQL vs REST issue per se.
The whole I/O is the bottleneck thing is the same as saying “memory is cheap”.
It’s true in a way, but that doesn’t mean we should all ignore it. It’s usually not a big problem, until it is.
I love Ruby, but I love Crystal in terms of performance, except for compilation times.
My preference at this point is Rails in API mode with a separate JavaScript frontend. I haven't had to scale enough to really see it, but I often see rendering discussed as CPU-heavy, and I do wonder how much it helps that all my rendering goes through the OJ gem.
If you go this route you end up having a drastically more complicated front end, and usually without the speed benefit. Unless you really need a SPA I would stay away.
> This is written as speculation of course, but it matches some long-time "conventional wisdom", that most Rails apps spend most of their time on I/O waiting for something external.
> Can we put this rumor to rest?
You have only given one example though. What rumor is there to put to rest? I think the statement is still accurate: "most Rails apps spend most of their time on I/O waiting for something external"
To your specific concerns though - I have also seen ActionView and ActiveRecord casting take up non-trivial amounts of time (I recall a situation at a past job where the MySQL driver was notoriously bad with allocations for enums - probably our own doing, however).
For Rails specifically, I can't say enough good things about AppSignal. Being Rails specific gives it really great performance analytics right out of the box (it hooks into Rails standard instrumentation), and it handles error reporting for you as a bonus.
Of course mileage varies. I think it is fair to say that for many Rails apps, most of the request/response time is still spent waiting on I/O. DB I/O performance can be greatly improved by minimizing the number of queries, indexing, query optimizations, etc. A DB I/O performance bottleneck is not a problem unique to Rails, but it is maybe felt by Rails devs more because the abstraction ActiveRecord provides doesn't make it obvious when you're misusing the DB.
Views are definitely one of the places that can get slow. Each partial is an IO read and they happen sequentially. Looping and rendering a partial is a common mistake that kills performance.
For example, using the data from the db (the model instance) as the cache key is quite an effective solution for being able to deliver most views/fragments from cache.
The downside is that some care must be taken with keys and with which parts are cached (e.g. logged-in pages).
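A small ERB sketch of both points (a hypothetical `_product` partial): render the collection in one call instead of looping, and let the record itself be the cache key so fragments expire when `updated_at` changes:

```erb
<%# Slow: one partial lookup + render per iteration %>
<% @products.each do |product| %>
  <%= render "product", product: product %>
<% end %>

<%# Faster: a single collection render with per-record fragment caching;
    the cache key is derived from each record (id + updated_at). %>
<%= render partial: "product", collection: @products, cached: true %>
```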
After almost 4 years, I recently started working on a small Rails app. Typical Rails responses used to take hundreds of milliseconds before, now it takes tens of milliseconds (with latest versions). I was not expecting that at all. Together with turbo and stimulus, responses seem almost instant. Kudos to the Ruby and Rails teams.
Not OP, but looking at our metrics the first response I picked was a load for a blog article. It does 3 db queries (because it is not very well optimized, the blog is not super high traffic), one for the tenant, one for the actual article and finally one for the (image of the) team member who wrote the article.
This endpoint runs in ~26 milliseconds average, 6 ms of which are in ActiveRecord and 19 ms are in ActionController. I assume the remaining millisecond is an artifact of rounding in the monitoring and/or some overhead in other bits of the framework. Most other endpoints are pretty similar, though there are some internal queries for reporting etc that take much longer of course.
At the point that you have large or lots of queries on the DB or something similar, you really aren't talking about Ruby performance any more, and you'd be dealing with more or less the same problem in any language.
But in my experience, here's some data from one of our most hit routes:
Basically this route will do 5 fast DB queries (each loading a single record by PK) and a couple of Redis lookups (one of which returns fairly large keys).
Our mean performance is 44 ms, 95th percentile is 80 ms. Almost all of that is taken up by DB / Redis; actual Ruby execution is not quite a rounding error, but it would not be an optimization target.
The controller itself is fairly typical - 3 or 4 before actions, a decent amount of object instantiation, some branching / logic. It doesn't involve service classes, just models.
Ruby / Rails done in a fairly "vanilla" way can be pretty performant. We've had some performance issues with our app, but Ruby itself has never been the problem.
> If 90% of the time in an application is spent in database calls
This is what a lot of people say, but I'm not sure that this is the case.
I think apps spend about 50-70% of their time in compute in the interpreter, at the low end.
People don't often make metrics from their apps public, but here's one example from a web site that renders text - so it's reading from the database and not doing some unusually computational task with the result https://genius.com/James-somers-herokus-ugly-secret-annotate....
I agree, optimizing db calls in my experience has been low-hanging fruit. But (in Rails) going through and making your own POROs + ActiveModel objects can give significant speed increases, especially in cases where the Rails overhead is too high.
Object instantiation and database serialization/deserialization seem to be a pain point that gets overlooked more than people realize.
But in this case I’m preaching to the choir on ruby app optimization.
Do you foresee projects like Rails needing to be rewritten in order to be more favorably JIT-compiled and interpreted by TruffleRuby and Ruby's JIT?
More than once I've run into db serialization performance issues. Postgres reports everything as nice and fast, yet requests are slow. It took me a while to figure out that ActiveRecord was busy converting my data into one large query string, and then busy sending that string over the wire. The issue was a field using Postgres jsonb, and I was filling in a large array of text lines in some cases. Workarounds are easy, but I never pinned down why it would take multiple hundreds of milliseconds to serialize into SQL.
I haven't seen too much ruby in the wild lately. Most of my customers use other stacks (python, go, php, jvm, etc.). I default to using Kotlin these days which gives me a nice compromise of an expressive language, nice frameworks, and excellent scaling abilities and performance. But I deal with whatever comes on my path.
Mostly, throwing money at the problem and scaling horizontally is a perfectly acceptable solution and when it isn't, using ruby is probably not the way forward either and you might want to consider more performant stacks. That being said, unless you are doing very strange things, you should be able to make ruby perform acceptably.
Something I've observed in other people's Ruby code is that it sucks you into doing suboptimal things. Rails and ORMs can lead people into doing things that just result in a lot of needless database traffic. That's not a Ruby problem but just poor design. I've seen people get into trouble with Java/Hibernate as well. Solution: optimize your queries and database schema and think about what kind of indices you need.
Another common issue is people doing silly things with parsing and processing the same things over and over again. Caching helps, good algorithms help, etc. This too is not really a language problem but a design issue.
In terms of percentages, most Ruby apps are single-threaded and use blocking I/O, which is why you fire up many Ruby processes on a server. So whenever the app is dealing with I/O, which should be most of the time, it shouldn't be using much CPU. That's why you can have many more Ruby processes than CPU cores; most of those processes would be idling or blocked on I/O most of the time (orders of magnitude slower than any computation). This is of course not a great use of memory, and you are likely to exhaust that sooner than the CPU. If you can load test, you should be able to find the limit.
As for anecdotal evidence seen in a consultancy function: You are wrong in assuming that companies always hire database experts or data scientists to write their db queries who know how to not make the db the bottleneck.
In reality you'll find developers who claim "we can't use Ruby here because it's too slow!" while they're unaware that the reason their page needs seconds to load is because of hilariously inefficient queries to the db.
In my consulting experience, I'm shocked if I find that inefficient queries aren't the root cause of poor performance. In fact, thinking about it, I don't think it has ever happened. I always look at the DB layer first because it's virtually guaranteed that someone wrote a "SELECT * FROM MassiveTable" and added it to the common header code used by every page.
Matches my experience too, both consulting and in-house as the person who'd be the first to even consider looking at the database query logs or running "explain" on queries.
A reason for ORMs is that a lot of developers fear the database (EDIT: not the only reason, to be clear; I love and use ORMs). A result of ORMs is that a lot of developers think they can avoid understanding the database.
I'm pretty sure the highest value per character code I've written so far was a 10 line monkey punch back in the rails 3 era that would crash the app if you tried to use an active record query without a limit or with too high a limit.
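Roughly in the spirit of that patch, a modern-day sketch (not the original Rails 3-era code, and the row threshold is arbitrary):

```ruby
# Refuse to load any relation without a sane LIMIT. Deliberately crashes
# loudly so unbounded queries get caught in development/CI.
module EnforceQueryLimit
  MAX_ROWS = 1_000

  def load(&block)
    if limit_value.nil? || limit_value > MAX_ROWS
      raise "Refusing to run unbounded query (add a limit <= #{MAX_ROWS}): #{to_sql}"
    end
    super
  end
end

ActiveRecord::Relation.prepend(EnforceQueryLimit)
```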
Hehehe, that reminds me of a similar active record monkey patch at an old job, the name of which was used as a swear word everywhere except where that particular engineer was present.
+1 to this. I've led performance optimization on enough real-world problems to be conditioned to just go straight to the database access patterns from the start... it's always there.
Modern languages, including Ruby, are all plenty fast enough computationally for the vast majority of business workloads that aren't Google scale. When things slow down...it's the database or something similar like N+1s calling external APIs.
>What kind of speedup is available for big Rails applications? If 90% of the time in an application is spent in database calls, then there’s little opportunity for improvement via JIT technologies.
This has always been the premise for enabling multi-threading on RoR applications despite the GIL. The conventional wisdom is that you spend 80-90% of your time waiting for the DB and other I/O. After 15 years of hosting Rails applications, I'm pretty sure that is not generally true for real-world applications.
It depends on the application obviously how this works out, but for a larger, complex saas app I work on it works out to 64% ruby time for web requests. For another application, that does a lot more external calls, it's 52%.
So raw performance of ruby can have a pretty significant impact on real performance for RoR applications. To the point where a faster CPU (new gen on AWS) improved response times by 15-20% for us.
I agree. Also, it is hard to do things in parallel (or in a separate thread), e.g. make two remote calls in parallel, wait for them to complete, and aggregate the results. Trivial to do this in Node.js or on the JVM, not so much with Ruby.
True, but the VM itself is non-blocking on I/O, so it's not such a huge problem. If you really have to do it, I think the new async gem might become standard at some point, but I'm speculating.
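For reference, a tiny sketch of what that looks like with the async gem (placeholder URLs; on Ruby 3 with async 2.x, the fiber scheduler makes plain `Net::HTTP` non-blocking inside the reactor):

```ruby
require "async"
require "net/http"

Async do |task|
  urls = ["https://example.com/a", "https://example.com/b"]

  # Each request runs in its own fiber and yields to the reactor while
  # waiting on the socket, so the two calls overlap.
  tasks = urls.map { |url| task.async { Net::HTTP.get(URI(url)) } }

  bodies = tasks.map(&:wait)  # results, in order
end
```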
It's one reason why ViewComponents exist. It's pretty easy to render a small partial too often and generate a lot of overhead by doing that. It's not really obvious that partial rendering is an order of magnitude slower.
To be precise, it's not so much the partial rendering that's slow, but the partial lookup.
When you do `<%= render "something" %>`, Action View has to do a stupid amount of work to figure out which file it has to render. However, having looked at it recently, I doubt we can improve that much without breaking backward compatibility.
This has been known for years now, and it's really depressing that nobody's been able to improve it.
Since you've looked at the code a lot more than me, I believe you when you say: "I doubt we can improve that much without breaking backward compatibility." -- although it doesn't make sense to me why partial lookup can't be cached at a given call site in a way that's perfectly backwards compat as well as fast (after first partial lookup from a given call-site).
But if true -- I wonder if it's time to consider backwards incompat? Perhaps in a way you can opt into, not just app-wide, but per-file/view/action or per-call, and the opt-in not just as a temporary deprecation situation, but planned for long-term support of being able to do it "both ways", the backwards compat way or the performant way.
But in the limited amount of time I have spent looking at the code, I quickly get lost, it has evolved into pretty convoluted code. If the problem is not really "there's no semantics for caching partial lookup that are both backwards compat and higher performance", but "the code is so convoluted we can't really figure out how to change it to do what we want without rewriting the whole thing in a way that's going to be hard to be backwards compat for all edge cases" -- that may be even harder.
> it's really depressing that nobody's been able to improve it.
That's quite unfair, several people improved this already...
> why partial lookup can't be cached at a given call site
Well first you have no state at the call site in which you could store that.
But even if we did, a single callsite can render a different template based on request parameters, e.g. `render "foo"` may be either `foo.en.html.erb` or `foo.fr.html.erb` based on the request locale. But then you also have format, format variants, etc. So the number of possibilities is huge.
> I wonder if it's time to consider backwards incompat?
Of course we consider it, the explicit locals declaration is one step in that direction, but it's not like we can break people's code willy nilly.
It is to some extent; the problem is that, semantically, `<%= render "foo" %>` may end up rendering different partials based on the context it's invoked from, e.g. the current locale, etc.
So the cache can't just be a simple hash.
Additionally, once you have identified the template to render, if some locals are passed, you then need to look up the compiled template for that specific combination of parameters, because `render "foo"` and `render "foo", name: "George"` end up compiling two versions of the same template.
For that latter issue, we just introduced a way to declare what locals a template accepts, which should allow us to improve that part: https://github.com/rails/rails/pull/45602
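With that change, the declaration is a magic comment at the top of the partial, roughly like this (names invented):

```erb
<%# locals: (name:, role: "member") %>
<li class="user user--<%= role %>">
  <%= name %>
</li>
```

With the signature declared up front, the template should be compilable once rather than once per combination of locals a caller happens to pass.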
- Ruby would benefit from object shapes and JIT compiling
- Good news, both of those are being added / improved
- The jury is out about C extensions
I hope ruby has a renaissance in the next few years. People can and will howl about its performance, but it’s far faster than it was even a few years ago, let alone 10. The changes mentioned in the article, the async gem, and ractors are all exciting things that could push it even faster and unlock previously unavailable design patterns.
Byte for byte, it’s one of the nicest languages (in my opinion of course), and for many use-cases Rails is like having developer superpowers.
I hope renewed excitement leads to more investment from folks, because it’s honestly just such a nice, ergonomic ecosystem to use.
I'm trying to remember why Ruby fell out of favor ~10 years ago. I don't think it was all down to performance because it was mostly on par with Python. I seem to recall that it came down to community practices that were a little too freewheeling - widespread monkey patching, etc. and these had become reflected in Ruby's flagship product (Rails). (I used to do a lot of Ruby programming in the 2001-2010 timeframe and like the language, but it's mostly been Python since then)
Ruby didn't "fall out of favor" 10 years ago, usage of Ruby as well as Rails moved into the enterprise space. Instead of startups using Rails, it was larger companies trying to circumvent the bureaucracy of their existing toolchains. I was a Ruby developer that whole time, and only stopped writing it at my day job in 2020.
Others in this comment thread have pointed out JavaScript, and I think that's a pretty good bet on why Ruby isn't really used by newer companies anymore. If you're trying to build an application for the web, it makes a lot more sense to work with _one_ language rather than two. Additionally, because JavaScript is always going to have a good amount of developers in the hiring pool due to its platform exclusivity over the web frontend, it's much easier to build out your engineering team than it was when your code is written in Ruby. Ryan Dahl described JS as "the best dynamic language in the world". I'm not entirely sure if I agree with that, but I do know where he's coming from. JavaScript is much easier and more ergonomic to use these days, and some of the most innovative technologies for web application development are coming out of this space.
I love Rails, but given the way things are going, I'm not sure I'll ever build a project with it again. We just don't need it anymore. That's not to say it doesn't have a place, and there will definitely always be jobs in Rails since it's being heavily used by the enterprise world, just that I don't think I'll be all that interested in it again now that I can use JS for everything.
Sure, but Ruby could've been a player in data science as well. But for some reason Python won handily in that space. Was it all down to numpy, scipy, etc?
Whenever this subject comes up, people seem to forget how poorly Ruby plays with Windows. WSL helps a lot and gives you a practical experience, but it'll never be a first-class citizen on Windows like Python is.
Maybe, but most data science jobs I see (and have been around) seem to be using Linux... with a few on MacOS. Lack of Windows support doesn't seem like a problem in the realm of data science.
Additionally, I think there are a lot of non-programmers, non-engineers working in data analysis or data "science" and Python syntax is much more approachable for a beginner.
Lots of starter programming courses, whether university or online, are in Python, etc. Again, if you're not actually an engineer or developer of sorts, you're not likely to spend much time and effort moving on from the language you already know and that everyone else in your field is using.
The answer to that is yes. Python was already used a lot in science, and when Data Science started to blow up it already had numpy, scipy, scikit-learn and pandas. Also, those libraries are really fast because they're wrapping highly optimized Cython, C, C++, and Fortran. I don't think Ruby had anything like that at the time.
Any language with FFI can have similar bindings to the same libraries used by Python, see Java, .NET (C#/F#), Julia, Swift,..... yet it doesn't seem to happen for Ruby.
My memory matches yours. Thanks to things like monkey patching you had the problem where changing the order of imports would break other code. And if a dependency of yours added a new import, your code would randomly break for no apparent reason.
As a result I saw more than one organization say, "No new projects in Ruby."
I'm sure that there were many different experiences in many different organizations. And I have no idea how widespread my impression is. But it certainly discouraged some people from using Ruby.
I see it as a pendulum. In the old times, we used to write web applications in CGI, where there was no fixed structure whatsoever; the application programmer was responsible for creating the entire web stack with their own bare hands. Then the pendulum started swinging towards more structured frameworks and reached a peak with J2EE, where the application programmer only had to write a tiny piece of code (servlet) that went into a massive framework (servlet container), in a language that had a huge standard library (Java), following clear industry standards (e.g. JavaBeans, design patterns). By the time Rails appeared, the pendulum had started swinging back towards less structure and less formalism. Rails was still a framework, with a strong set of conventions, but it was quite simplified compared to J2EE. The pendulum then continued and people switched to Node.js, which was much more barebones and flexible than Rails. Right now, the pendulum seems to be reaching the opposite extreme with Go, where not only there isn't a framework, but there also isn't a virtual machine (everything is compiled to native code), the language itself is almost as simple as C, and there is barely any standard library.
Meanwhile, those of us on the JVM/CLR ecosystems saw it come, influence some of the JEE/Spring/MVC designs, come and go as Groovy and IronRuby did, influence CoffeeScript, and eventually fade away. We still keep using the JVM/CLR ecosystems, and the savvy ones even know how to AOT compile our applications if needed.
Since several people are chiming in with suggestions for what might make Rails slow, I will throw in my two pence.
Profile things, and make sure you understand the results of the profile (because profilers can lie to you in a whole variety of ways).
When I last checked on TruffleRuby some time ago, the slowest thing in a small Rails app was all the layers of framework and all the points at which it might log things, while the actual database access and the use of that data was pretty quick.
This might not continue to be true for larger applications, but it probably means there is a high minimum amount of work that must be done per request, and it’s so deep on the stack that it’s unlikely to be optimised away.
The situation may have improved, we’ve done a lot of work on TruffleRuby since then and completely changed the inlining strategy at least once, and the method lookup and dispatch. But the profiler has also changed significantly so the way it lies will also be different now.
A profiler will not show the load time until all the framework and the includes are loaded, though.
Which is where a lot of time is spent: just having that whole dependency tree parsed in and hashed up.
Wish there were more ways to reduce libraries to subsets and not lose that on update.
Sure it will. It didn't dominate the profiles I was looking at because I was measuring peak performance so had waited until after everything had been loaded and had time to be JIT compiled etc. but it's quite common to see an extremely significant chunk of time used by initial requires and configuration.
I think implementing shapes is very likely to significantly improve the raw compute speed of Ruby applications.
That said, if you're implementing an application in Rails, there are a number of low-hanging fruit that will significantly improve performance. In most cases they'd apply in any language, but since it's trivial to get "something running" in Rails, it's easy to forget the basics for performance. E.g.:
* Index your database for all queries that matter.
* Use caches. In the case of Rails, cache partials (fragment caching); it is amazingly effective at improving performance.
* Use a tool to detect N+1 queries, and fix them.
* When querying with an ORM (ActiveRecord in Rails), request just the specific fields you need instead of "downloading everything" into the object. Getting unnecessary fields increases the database response, the memory use, and the garbage collection effort, and all of that unnecessary work eventually adds up.
In general, you can get a lot of performance improvements by doing only what needs to be done and nothing else. Caching to avoid repeated work, requesting only the data you need, etc., can provide a lot of performance improvement and often require relatively little effort.
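A couple of those points in miniature (model, column, and association names invented):

```ruby
# 1. Index the columns you actually filter on:
class AddCustomerStatusIndexToOrders < ActiveRecord::Migration[7.0]
  def change
    add_index :orders, [:customer_id, :status]
  end
end

# 2. Fetch only the fields the page needs instead of SELECT *:
order_rows = Order.where(status: :open).select(:id, :total_cents, :created_at)

# 3. Eager-load associations instead of one extra query per row (the classic N+1):
orders_with_customers = Order.where(status: :open).includes(:customer)
```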
> If 90% of the time in an application is spent in database calls, then there’s little opportunity for improvement via JIT technologies.
Even if it is true (for a given application), it doesn't mean that you don't need to optimize CPU usage. It just means that you may run out of RAM first (if you don't use async I/O) before you saturate the CPU. Or you may not.
First, in most non-small projects the database and app (e.g. Ruby) layers usually either use separate hardware or share hardware in a way where we can still account for the cost of each. If the DB layer uses much more hardware than the app layer, then yes, it makes little sense to reduce CPU usage in Ruby. But usually the app layer is bigger, especially if a language like Ruby is used. Let's focus on the app layer, assuming our DB layer works well (request latency can be non-negligible; as long as it is stable under any load we have, that's fine). Assume we need N servers for Ruby, and ask what prevents us from using N/2 servers and saving money: for web apps it is either CPU or RAM (disk I/O is handled by the DB layer, and 10Gbit network bandwidth should not be a bottleneck in most cases). For contemporary server hardware and web apps, in my experience it is more common to saturate CPU before you run out of RAM, so by reducing CPU usage you can save money. But your mileage may vary.
I love Ruby so much as a scripting language, I ask myself why I don't just use it as an "acceptable lisp" [1] for my research, rather than Haskell. Haskell is more easily parallelized than any other language I've experienced. Alas, my experiments with Ruby 3.1 parallel extensions flunked. Who cares about single core speed?
I was always put off by Rust syntax (it's a fly-pollinated flower, to attract C programmers), but I'm starting to get the genius in its machine model. Learning Rust forces one to understand what's happening at the machine level, but offers better control as a reward. Its expressiveness is a lot closer to Ruby than I had imagined, it won't put me back in the C99 stone ages. So my provisional answer is that Rust is a Faster Ruby.
I think that one underrated point about web apps performance is JSON serialization for non-JS languages. JavaScript is obviously fast with that, but non-JS languages exposing API, like Rails in a separated FE/BE architecture, spend much time transforming data into JSON, and it doesn't seem stressed enough to me.
Why is JS faster than Ruby at JSON serialisation/deserialisation? The algorithm and data structures seem exactly the same in the two languages to me. In fact Ruby should be faster as it has simpler arrays without things like holes.
I guess that, since JSON is JavaScript Object Notation, it should be faster for JavaScript to serialize/deserialize it, but that might just be a psychological bug of mine. Anyway, my main point is that JSON serialization/deserialization seems like an area of improvement for API web apps. As a Rails developer it requires a lot of effort to deal with that part of the application's performance. Just providing a list of 50 db records with 10 attributes each, with some transformations (let's say e.g. 4 attributes are virtual and some keys are renamed), adds significant overhead to the response.
Take Ruby. One of the most common improvements you make as a Rails developer to a Rails web app is adding the `oj` gem, which replaces the default JSON ser/deser with a more performant one. But it doesn't come by default with Rails. Can you guess how many Rails web apps performing JSON ser/deser don't use it? I'd be curious about that.
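For what it's worth, the wiring is small (this is the standard oj setup, nothing exotic):

```ruby
# Gemfile
gem "oj"

# config/initializers/oj.rb
require "oj"
Oj.optimize_rails # route Rails' JSON encoding (to_json, render json:) through Oj
```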
Moreover, it's really hard to find a presenter library oriented toward performance. There are wonderful DSLs, but none of them seems optimized for speed, IMHO.
JS VMs are generally better engineered than Ruby VMs due to more investment, including better JSON handling code yeah. But that’s nothing to do with JSON being (almost) a subset of JavaScript. Unless you can think of a concrete reason why?
> JS VMs are generally better engineered than Ruby VMs due to more investment
Is that still true? Shopify is investing a whole lot (you don't need me to tell you, of course). Yes, I know Google/Firefox etc. had teams working on JS, but did they invest that much more than Shopify?
No no, I agree with you, I think I was wrong above; as I wrote, it was probably just a psychological bug. But I still think there's much room for improvement with the tools we already have.
> js does not need to de/serialized its own native format
This doesn’t make any sense. JSON is a string. Converting a Ruby array or hash to or from a string is exactly the same work as converting a JS array or object to or from a string.
Individual implementations may be better in JS, but there’s no inherent reason for that.
JSON is a text format that looks like a JavaScript Object when it (the object) is in text form. It has nothing to do with its memory layout. It's just a name.
The actual object in memory has to be serialized/deserialized to/from JSON.
ECMA-404 (the standard for JSON) allows values that cannot possibly be represented in JavaScript, such as integers and floats bigger than 53 bits; at the same time, it forbids JavaScript's totally legal NaN and Infinity.
JavaScript is faster at JSON because any modern JS VM is a lot faster than the Ruby VM. If JSON didn't require serialization/deserialization, it would be inherently unsafe at any speed.
There are ways to handle exchange without serialization/deserialization safely, but what you're suggesting is that the memory holding the JSON text is just cast to a JSON object.
Another thing though: if at least one of your constraints is speed, a dynamically typed language probably shouldn't be your first choice. If you need to maintain memory safety, Java or Rust would be more applicable.
That being said, once you have a codebase written in a certain language, you might just be stuck there due to the hired talent being familiar with it. I don't think I've really seen a company pull off a polyglot move cheaply. Instead, a parallel move to something like Crystal may require minimal re-training.
… just thoughts I would have if I were in charge of investing R&D time at a large org like the ones mentioned in the article.
I would also recommend Go for web developers who want both speed and memory safety. It works well and is hard to beat, but I haven't seen any Go web framework that comes close to Ruby on Rails. If anything like that ever does come along, I think it will be very popular.
I like Ruby a lot. I'm not knocking it. It's probably the most enjoyable language I've ever used.
Rust is overkill for projects that don't need very high scale, also because of the paradigm shift of the frameworks. Golang is a more conventional choice, for the opposite reason.
Crystal is not ready for large-scale deployment, which includes the fact that it is not as well supported as more widespread languages (e.g. on AWS).
I agree insofar as that there is no dedicated runtime for crystal in AWS Lambda, but it can well be used there or in fargate containers. I wrote a (proof of concept) minimal runtime for Lambda some time ago [1] and had pretty decent results.
Given that both fly.io and Heroku have working buildpacks for Crystal/Lucky, I’d say you can serve pretty heavy load before you’d need to reach for more specialized machines. Of course, that depends on the nature of your application as well. Maybe my understanding of 'large scale' is different from yours, though.
> I’d say you can serve pretty heavy load before you’d need to reach for more specialized machines
At large scale, I expect a programming language to support multithreading, in order to avoid the resource occupation of multiple processes. When one runs dozens of serving instances on a server, threads and processes make a significant difference in terms of memory requirements.
Crystal is only superficially similar to Ruby and doesn't seem to get you very far if you're a Ruby shop. It's not a bad language, but it is different, and there's probably more value to be had not trying to adhere to some similar-ish syntax and adopting something more mainstream.
> What kind of speedup is available for big Rails applications?
When working with both ActiveRecord and Mongoid, I noticed that the vast majority of time spent outside of the database is in runtime type-checking and type conversion. That is, when an object is passed in from the DB, it takes the model a bit of time to get all of the data set up correctly. This doesn't become a huge problem unless you have a lot of embedded data in the record, which tends to happen more frequently in MongoDB.
Somewhat related, what do people think of Crystal (a compiled language with types with Ruby-like syntax)? How is it for web applications? Is there a Rails equivalent for it yet?
Crystal is fine, we wrote a microservice in it for a particularly CPU-intensive task and it performed as promised. It definitely loses more and more Ruby-likeness as you start optimizing though.
There are a few Rails-like frameworks but they all lack what Rails has: critical mass. Rails has over 4500 contributors and quite a few people working on it full-time at companies like Github and Shopify. Almost everyone in the Ruby community knows and uses Rails, and almost every gem available takes Rails into consideration. The Crystal web framework with the most contributors seems to be Amber, with less than 100 contributors.
I don't mean to disparage Crystal, as I like the concepts that the language is built on, but at the same time I don't see it catching up to the Rails juggernaut anytime soon.
With all due respect, I feel like this is a vacuous statement. While there is no doubt overhead, I think the network latency of DB connections far outweighs a well-written ORM. What may be happening is an overreliance on ORMs in the sense of using them as a hammer and treating all functionality as a nail: generating thousands of ORM instances vs. one collection containing multiple records (O(n) vs. O(1)). This is no longer an ORM problem, but just bad practice in general.
In my experience with Laravel, the Eloquent ORM and the Fluent query builder both generate semi-optimal SQL for the majority of cases I've experienced them for, and they're easy to debug and see what's happening. Profiling tools like Debugbar let you see how many instantiations, objects, queries, and Models your page has, none of which you'd have if you skipped the ORM. These are critical to writing sensible code and also having a deeper understanding of how your application works when you utilize abstractions. But they also save you enormous amounts of time. I doubt writing it all by hand would be better or faster except by an expert-level programmer with deep DB knowledge.
I don't begrudge the value of knowing what's happening behind the scenes – in fact, I'm saying these tools can help open up understanding what's happening. I learned more about subqueries by seeing the SQL generated by the query builder, as well as how to write them efficiently for MySQL, than by reading about them for years.
The thing I learned from working with EntityFramework (ASP.NET) and the Django ORM is that people don't really care to learn how those ORMs work; they (the devs) end up generating utterly unoptimized queries, from adding unnecessary inner joins to loading a ton of nested tables. I'd say for both ORMs I mention, you can pretty much have total control to tailor your SQL to your needs (and still have the benefit of easier refactoring).
Of course, in the end a raw SQL string will always give an edge (see Dapper).
When your next project runs on a different framework or language, you'd be using a different ORM. It feels like a waste of time learning an ORM. Plus, it's difficult to optimize.
In two past jobs, as a non-Ruby person in Ruby shops, I tried to convince them to write raw SQL queries instead of relying on ActiveRecord, but with no luck. I won't argue about performance, but most of the time they didn't even know what queries, or how many, were being generated.
There has to be a really good reason for that.
It's just like joining a Spring shop and telling everyone to not use JDBC (or whatever the name of the ORM in Java land)...that's not gonna fly well usually. For a good reason imo.
The good reason was that queries were not optimal, and when we came up with optimal ones which did not follow the access pattern or syntax that ActiveRecord introduces, the excuse was "…but ActiveRecord…".
If I were a dev, I would appreciate some data guy's opinion on queries, indices, etc. while my db crashes every now and then, but it's either a power-play/politics thing to ignore the new guy, or the "I know ActiveRecord/Hibernate/Kafka/whatever" thing and "I won't bother with anything else that will solve the problem."
It's a tradeoff.
Everything is a tradeoff...I think ORMs like ActiveRecord do make the code easier to reason about in most cases and are worth the performance hit.
All abstractions cost us performance... we could all be writing bare-metal C/C++, but not many web shops want to do that.
If you really want a speed bump then that’s exactly what you want to do. And I think Ruby’s overlooking a golden opportunity to join forces with Crystal.
This is not true at all. Unoptimized, sloppy code will still leave a lot of performance on the table, whether compiled or interpreted. You might speed it up, but you'll still have problems scaling and your external services will still be hit just as hard (e.g., excessive queries vs. intelligent ones).
I generally agree with you that compiling to machine code is not a magic wand; code that does more is always slower than code that does less in a fair fight. But interpretation adds additional overhead that makes slow code even slower, and the more complex/unoptimized the interpreter, the more overhead it adds. It's good for a language to have a baseline expectation of "don't make things worse": shit code will always be slow, but at least it's not made slower by things outside of the developer's responsibilities.
No it's not. You could describe Elixir, with a lot of caveats, as a "ruby-fied Erlang" because a lot of the niceties are there. It's faster than Ruby, but it's totally not Ruby.
Perhaps the "faster ruby" moniker could be arguably assigned to Crystal.
Not really, it's a pretty different language that in some ways looks superficially like ruby.
To start with, Ruby is object-oriented; everything in Ruby is an Object of a Class. Elixir is functional, and doesn't even have Objects and Classes.
But sure, other languages than ruby certainly exist, some of which will be faster, in various use cases/contexts!
Yes, but it has its own ecosystem. It can't use gems from Ruby. The main selling point is that it uses the Erlang VM, and that thing is uncrashable; it was designed from the start to be fault-tolerant and distributed.
So boosting the speed of Ruby itself is always good news for everybody using it.