RJIT, a new JIT for Ruby (github.com/ruby)
336 points by pmarin on March 8, 2023 | 131 comments



Honest question, as I don't know Ruby's semantics well. But, as someone who has worked on many JITs in the past: how is it that in these results, three different JITs failed to get more than a 2x performance improvement? Normally, a JIT is a 10-20x improvement in performance, just from the simple fact of removing the interpreter dispatch loop. What am I missing?


I'll take a stab at this.

YARV (Ruby's VM) is already direct threaded (using computed gotos), so there's no dispatch loop to eliminate. YARV is a stack based virtual machine, and the machine code that YJIT generates writes temporary values to the VM stack. In other words, it always spills temporaries to memory. We're actively working on keeping things in registers rather than spilling.
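
You can actually see that stack discipline from plain Ruby with the bundled disassembler (a quick sketch; exact instruction names vary by version):

    puts RubyVM::InstructionSequence.compile("a = 1 + 2").disasm
    # prints stack-machine instructions along the lines of:
    #   putobject_INT2FIX_1_   (push 1)
    #   putobject 2            (push 2)
    #   opt_plus               (pop two, push the sum)
    #   setlocal a             (pop into the local)

Those pushes and pops are exactly the temporaries that currently get spilled.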

Ruby programs tend to be extremely polymorphic. It's not uncommon to see call sites with hundreds of different classes (and now that we've implemented object shapes, hundreds of object shapes). YJIT is not currently splitting or inlining, so we unfortunately encounter megamorphic sites more frequently than we'd like.

I'm sure there's more stuff but I hope this helps!


> direct threaded (using computed gotos)

I've seen different people mean different things by this, do you mean the IR is a list of bytecode handler addresses, and then the end of every handler is a load+indirect jump? Or is there also a dispatch table? In my experience the duplication of the dispatch sequence (i.e. no dispatch "loop") is worth 10-40% and then eliminating the dispatch table on top of that a bit more.

CPUs work hard to predict indirect branches these days, but the BTB is only so big. Getting rid of any indirect call or jump, regardless of whether it goes through a dispatch table, is a big win, perhaps 2-3x, because CPUs have enormous reorder buffers now and can really load a ton of code if branch prediction is good, which it won't be for any large program with pervasive indirect jumps.

> it always spills temporaries to memory. We're actively working on keeping things in registers rather than spilling.

In my experience that can be a 2x-4x performance win.

> It's not uncommon to see call sites with hundreds of different classes

Sure, the question is always about the dynamic frequency of such call sites. What kind of ICs does YARV use? Are monomorphic calls inlined?


> I've seen different people mean different things by this, do you mean the IR is a list of bytecode handler addresses, and then the end of every handler is a load+indirect jump? Or is there also a dispatch table? In my experience the duplication of the dispatch sequence (i.e. no dispatch "loop") is worth 10-40% and then eliminating the dispatch table on top of that a bit more.

It's the former. Each bytecode is the handler address and every handler does a load + jump. There's no dispatch table (there are compilation options that let you use one, but I doubt anybody does, since you'd have to specifically opt in when you compile Ruby).

> Sure, the question is always about the dynamic frequency of such call sites. What kind of ICs does YARV use? Are monomorphic calls inlined?

In one of our production applications, the most popular inline cache sees over 300 different classes and ~600 shapes (this is only for instance variable reads, I haven't measured method calls yet but suspect it's similar).

The VM only has a monomorphic cache (YJIT generates polymorphic caches), and neither the VM nor the JIT do inlining right now.


Thanks for the replies. I could keep picking your brain, but maybe it's more efficient for me to read some documentation. Are there some design docs or FAQs or summaries of the execution strategies that you can point me to? Thanks.


> In my experience that can be a 2x-4x performance win.

What's the state of the art in register allocation? I see that the Android Runtime makes use of SSA form to allocate registers in linear time [0]. Are other language runtimes pushing the boundaries further and in different ways?

[0] https://www.arxiv-vanity.com/papers/2011.05608/


> In other words, it always spills temporaries to memory. We're actively working on keeping things in registers rather than spilling.

Curious: What register allocation algorithm do the current Ruby JITs use? Is that influencing the work on this new JIT too?


Author of Ludicrous JIT here (one of the earliest Ruby JITs).

It is easy to get a 10-20x speedup, if you limit yourself to compiling a subset of Ruby. When I first wrote Ludicrous JIT, I saw huge gains, but as I implemented more of the language, performance improvements over MRI (and later YARV) became more modest. Off the top of my head:

Implicit promotion from Fixnum to Bignum adds run-time type checking and overflow checking for integer math. This significantly cuts into performance on math-heavy benchmarks.
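
A quick sketch of what that costs (assuming 64-bit CRuby, where tagged integers top out at 2**62 - 1):

    a = 2**62 - 1       # largest value that still fits in a tagged machine word
    b = a + a           # overflows the tagged range; silently promoted to a heap bignum
    a.class == b.class  # => true on modern Ruby (both Integer) -- the split is
                        #    invisible, so compiled math must overflow-check every add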

Ruby does not store Fixnums and Floats in their native representation. This means they must be converted from Ruby's internal representation when doing math.

Anything that uses eval prevents local variables from being optimized away, stored in registers, etc. The call to eval may lie outside the method being compiled. For example, a method may pass a block to another method. The other method may convert that block to a Proc, which it can use as a binding when calling eval (I don't know if Rails still uses this idiom, but this was one of the reasons why performance gains when running Rails on JRuby were more modest than otherwise might be expected).
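
A minimal sketch of the idiom (method names made up):

    def callee(&blk)
      # The block carries its defining scope, so eval can read (or write!)
      # the caller's locals -- the compiler can't optimize them away.
      eval("secret", blk.binding)
    end

    def caller_method
      secret = 42
      callee { }   # the block is never called; its binding alone is enough
    end

    caller_method  # => 42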

Any program that uses set_trace_func can get a binding for any method invoked while the trace func is active. A Ruby implementation that supports set_trace_func is severely limited in what optimizations it can make. IIRC JRuby disables set_trace_func by default.
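
For reference, the hook looks like this -- every call event hands you a live Binding:

    set_trace_func proc { |event, file, line, id, binding, klass|
      # The locals of any traced method can be read (or mutated) here,
      # which is why supporting this rules out keeping them in registers.
      p [event, id, binding.local_variables] if event == "call"
    }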

Exceptions in Ruby are implemented using setjmp/longjmp (or at least they used to be -- I haven't done low-level ruby programming in a long time). Other languages can use zero-cost exceptions without breaking backward compatibility.


Dynamic language, which allows anything to be changed at any time.


Common Lisp is also a dynamic language.

In CL, classes can be redefined at runtime and the changes affect already-instantiated instances! This is just an example, not the only dynamic feature of course.

Common Lisp has SBCL which generates very fast AOT native code despite the dynamic nature of the language, so I am not sure that being dynamic is a great excuse for being slow.


It's clearly more complicated than that. JavaScript is also highly dynamic and JITs there often give 10-100x speedup.


My impression is that Ruby is more dynamic than pretty much everything else. I think this is true in terms of language features, but also in terms of the style that code is written in practice.


This matches my (day job) experience.

My greatest practical frustration from this is the difficulty of trying to claw back performance when it becomes important. I'd like more ways to say "from this point on, none of X, Y, or Z will change", and get some performance guarantees in return. For example, we have some code that dynamically generates a whole lot of classes from protobuf definitions. It's bad enough that it takes nearly a minute just to load and run all that code, but even after that I'm paying the cost of assuming that any of those definitions might change at any moment. So I have awful load times, and awful runtime performance.

I guess what I'm asking is: do you see a future where there is more explicit control afforded to people who want to pick their own tradeoffs without resorting to writing everything performance-sensitive in extensions written in C/Rust/whatever?


> I guess what I'm asking is: do you see a future where there is more explicit control afforded to people who want to pick their own tradeoffs without resorting to writing everything performance-sensitive in extensions written in C/Rust/whatever?

An approach exists already in the present, and it's Stripe's Sorbet AOT compiler (https://github.com/sorbet/sorbet/tree/master/compiler).


Unfortunately the Sorbet compiler doesn't seem to have much activity past early 2022.


> do you see a future where there is more explicit control afforded to people who want to pick their own tradeoffs without resorting to writing everything performance-sensitive in extensions written in C/Rust/whatever?

In Ruby: probably not. In general: yes. Julia is probably the closest we have to this today with its gradual typing. I would like to see more of this (and I suspect we will at some point).


JSVMs will optimize top-level script variables to assumed-const and then inline said constant into accesses that are known through scope resolution to bind to those globals, deoptimizing that code if the global is ever modified. Is Ruby dynamically scoped in a way that makes that infeasible?


Not as far as I can tell. It is very dynamic but so is JavaScript (and Self).

I think the problem is different: Ruby is severely underspecified and the only way to get good enough compatibility is to piggyback on the official implementation with its interpreter, gc, libraries, C interface, build tools, package management, etc. I also think, but this is just a hypothesis, that most programs use library code implemented in C almost all the time, precisely because Ruby is so slow. That means a JIT has to reimplement lots of library code in a way that makes it JITable/inlineable and that is a lot of work + it's hard to keep it exactly compatible.

It is hard to do that in a way that removes all the (probably very obvious) inefficiencies.

Python has the same problem + it used to have a JIT-hostile leadership. Ruby is quite friendly towards JITs.


One potential option would be to rewrite those bits in Crystal-lang. The languages are often code-compatible, and it doesn't sound like that task has a lot of external dependencies.


This tends to come up, also as a reason why Python has yet to have a proper JIT, yet we have Smalltalk, Common Lisp, SELF and Dylan to prove otherwise.

In Smalltalk's case, it was its JIT that ended up powering Hotspot.


C2 was a clean rewrite by the Rice folks and C1 was, afaik, part of the Animorphic acquisition, which was written from scratch, though by Lars Bak and co who did indeed work on Smalltalk before. But AFAICT all they reused was the assembler.


Sure, I also don't mean that the code was taken 1:1, rather that there are a couple of languages that are as dynamic, with relatively good performance on dynamic compiler implementations.


They didn't profile against TruffleRuby, which does indeed get massive speedups of the type you're expecting. It's just really hard to JIT-compile Ruby to the level you'd expect having worked on V8, and GraalVM is the only VM that can do it. However, the Ruby community seems to want a JIT of their own, written in C, more than they want performance.


> However, the Ruby community seems to want a JIT of their own, written in C, more than they want performance.

YJIT is written in Rust, not C, but it's also not just a matter of wanting to write our own JIT for fun. There are a number of caveats with TruffleRuby which make a production deployment difficult:

1. The memory overhead is very large. Can be as much as 1-2GB IIRC.

2. The warm-up/compilation time is much too long (it can be minutes for large applications). In practice this can mean that latency numbers go way up when you spin up your app. In the case of a server application, that can translate into lots of requests timing out.

3. It doesn't have 100% CRuby compatibility, so your code may not run out of the box.

There's a reason why you don't see that many TruffleRuby (or TrufflePython, TruffleJS, etc.) deployments in the wild. Peak execution speed after a lengthy warm-up is not the only metric that matters.


Thanks for the correction re: C vs Rust.

W.R.T. memory usage that's true, but I think they've been making big improvements there lately with things like node inlining so it may not be true in the near future.

W.R.T. 2, what are you comparing against here? Is the TruffleRuby interpreter that much slower than the CRuby interpreter, also once you use the native image version? Because it seems like once it starts compiling the hotspots the program must get faster than a purely interpreted version. Whilst it may take minutes to reach peak performance for that engine, how long does it take to reach the same performance as YJIT?

W.R.T. 3, yes, but is it easier to fix that or to develop a new JIT from scratch? What is your solution to the C extensions problem for example? My understanding is that this is a major limit to accelerating Python and Ruby without something like Sulong and its ability to inline across language boundaries.


I love all the attention Ruby performance is getting lately!


Congrats to the Ruby developers; now they are on the way to having more than one production-grade JIT available in the reference implementation. I hope Python catches up soon, and the proposal to merge CPython and Pyston goes forward.


Several points discussed in these comments are addressed by the author in the linked https://bugs.ruby-lang.org/issues/19420


I work everyday with Rails.

In my experience, Ruby is not super slow.

On my machine, I can create 1M empty hashes in 0.17 sec.

  Benchmark.measure{1000000.times{Hash.new}}
  @total = 0.1706789999999998s
It's very good for a dynamic language.

But ActiveRecord (and Rails) are incredibly slow.

On my machine, only 2,000 models can be created in 0.17 sec.

  Benchmark.measure{2000.times{User.new}}
  @total = 0.17733399999999833s

Some SQL+network round trips run in less than 10ms; in those cases, Active Model creation is slower than the database access itself.

Yes, Rails can be slower than database access.


I would guess Hash.new does little more than one or two allocations (one for the object, possibly one for an empty hash table that can later be resized), and if it did two allocations, linked one to the other.

If so, that is probably more a benchmark of your memory allocator (which is probably written in C) than of Ruby.

I also guess you ran the benchmark from a new ruby instance. That means memory wasn’t fragmented. That certainly doesn’t make the allocator’s job more difficult.


Yes, the benchmark is not scientific or correct or ... It's just to give you an idea of how slow Rails is.

And anything in Rails is slow, from templates to ActiveJob.


> It's just to give you an idea of how slow Rails is.

IMO, it doesn’t. Your first benchmark mostly hits your memory allocator, the second exercises both ruby and ActiveRecord (a component of Rails), and, I guess, does more (depending on the complexity of your User class)

That it takes more time may be because ruby is slow, because ActiveRecord is slow, or because of it doing more. It probably is a bit of each, and may be largely because ActiveRecord is slow, but that benchmark doesn’t tell us that.

A fairer benchmark would create a pure ruby class similar to your User class, populate it similar to what ActiveRecord does, and compare timings.
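
Something along these lines, say (PlainUser is a made-up stand-in, of course):

    require "benchmark"

    # Plain-Ruby stand-in for the AR model, populated with similar attributes.
    class PlainUser
      def initialize(attrs = {})
        @attributes = { id: nil, name: nil, email: nil }.merge(attrs)
      end
    end

    puts Benchmark.measure { 2000.times { PlainUser.new(id: 1, name: "a", email: "b") } }
    # ...then compare against: Benchmark.measure { 2000.times { User.new } }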


A better benchmark, with Rails in position 131 of 142. https://www.techempower.com/benchmarks/#section=data-r21&tes...


Again, that Rails is low in that list doesn’t support your claim “ruby isn’t slow, Rails is”.

Pointing out an item a lot higher up on that list that’s also written in ruby and about as easy to use would be a way to support your claim.


In fact, Rails is one of the slowest of the major frameworks. Based on TechEmpower's composite benchmarks:

   1. drogon (C++)        92.3
   2. actix (Rust)        90.7
   3. asp.net core (C#)   83.7
       ~~ mind the gap ~~      
   4. gin (Go)            23.0
   5. spring (Java)       21.8
   6. phoenix (Elixir)     8.1
   7. express (Nodejs)     7.3
   8. laravel (PHP)        4.4
   9. rails (Ruby)         4.3
  10. django (Python)      3.2
https://www.techempower.com/benchmarks/#section=data-r21&tes...


I have no idea why people/companies don't pick asp.net core more often. The amount of headroom you have before bottlenecks is immense. It has world class tooling to boot too.


It's been a while since I did C#/dotnet but I imagine the ecosystem (i.e. Windows, SQL Server etc) in terms of cost and, shall we say, inflexibility, would be high in people's minds. That, and the kind of boilerplate required that means people don't start with Java nowadays either.

I moved from C# to Ruby, and along the way I read Design Patterns in Ruby[1] (which I highly recommend), and it went through several of the patterns I used to have to apply in C# to make it workable, and it would then show me they weren't needed in Ruby because of the structure of the language. Maybe C# has improved in these kind of things, but things like that matter when developers are also expensive and you want to crank out features.

[1] http://designpatternsinruby.com/


Speaking as a C# dev, IME most people are working off of severely outdated views of the .Net ecosystem. .Net has had first-class support for Mac and Linux for a good 6-7 years now. Postgres support is every bit as good as SQL Server's, and plenty of other DBs can be used as well. All of the new projects I've started in the past 5 years have used Postgres DBs and deployed on Linux. If you don't like boilerplate then I have good news: nowadays you can make a fully functional Asp.Net API with about 20 LOC.


So what is it then? Is there a quick-start for Asp.Net projects? Is there a canonical "make a blog"-type tutorial project? If I wanted to try making and deploying a couple tiny asp.net apps in order to learn the framework, is there a service that can easily host them like Heroku used to do?


Well bear in mind that Asp.Net is a large framework. The minimal approach I mentioned is for creating APIs [0] which you can use with whatever JS front end you please. If you don't want a JS front end Asp.Net has four different options for building web front ends [1].

The hosting question has me a little confused I guess. You don't need anything special to deploy or run modern .Net code? You can run your app anywhere that the Asp.Net Core runtime is supported [2]. Or bake your app into a container and run it anywhere that understands containers. MS has tutorials for deploying to Azure [3], but I honestly know nothing about it (Azure). Every place I've worked at that had cloud infrastructure has used AWS, and I've never needed to deploy a hobby app.

0: https://learn.microsoft.com/en-us/aspnet/core/tutorials/min-...

1: https://learn.microsoft.com/en-us/aspnet/core/tutorials/choo...

2: https://dotnet.microsoft.com/en-us/download/dotnet/7.0

3: https://learn.microsoft.com/en-us/azure/app-service/quicksta...


> The hosting question has me a little confused I guess.

I'm not sure how much you've played with the Node, Ruby, etc. BE ecosystems, but there are plenty of "just `git push` and watch your app automatically deploy" services out there that make it really easy to get started playing around with the actual framework and not get bogged down in Dockerfiles or AWS configuration. Maybe I'm spoiled, but I'm also... busy. And I'd like to be able to try out a framework first before I decide if I wanna invest a lot of time working on a full-fledged project with it, and these sorts of tools are perfect for that.

Thanks for the links


Do you not run your code locally when testing and evaluating things? I guess in my mind where and how an app is hosted is largely orthogonal to the choice language or framework.

Edit: Assuming a modern framework that can run basically anywhere. Something like old school .Net Framework that is essentially restricted to Windows and IIS would be a different story.


Kind of true, however I still mostly work on .NET Framework, because of third party stuff that might never be migrated to .NET Core.

When it does get migrated, surprising as it may be, many companies are more willing to move to another platform than keep it on .NET, as a full rewrite is almost required anyway. Dependencies on missing APIs, COM wrappers and such.

It is quite different to use an ecosystem born and raised on UNIX, and one that only in the last 5 years kind of matured into UNIX.

Additionally there is the whole issue with politics in VS4Mac/VSCode versus VS for tooling, which makes anyone outside of Windows shell out for Rider if we want the same level of tooling.


Completely agree with you.

Btw, regarding Postgres support: the guy maintaining the Postgres .NET drivers works for Microsoft on Entity Framework Core.

Regarding the boilerplate and the minimal hosting model and the minimal api… it starts to look like a nodejs express setup. Personally I think boilerplate was not that bad to begin with and saving a couple of lines of code (sometimes at the cost of legibility) is not worth it for me in a decent sized project.


Historically it was the pricing of Windows/IIS, and that I'm a person completely enveloped by the Apple ecosystem.

Not sure if either of those points matter 8+ yrs on since I last looked, but I know I'm not the only developer who has these concepts burned into their soul.


Asp.net core is fully cross-platform and self-hosting, so you do not need to incur Windows/IIS licensing costs.

For this site's audience, I would recommend taking a long look at F# over C#. If you aren't aware of it, it's functional-first, OCaml-inspired, but you have access to all the Asp.net core framework you need (similar to accessing Java from Clojure). A batteries-included framework that seems popular is the "SAFE Stack"[0].

[0]https://safe-stack.github.io/docs/overview/


Given that game dev in Unity is also driven by C# I wouldn't actually be against picking it up as a web dev. But it's hard to find any beginner-friendly resources out there. Is there anything like Heroku, Vercel, Fly, etc that lets you really quick build and deploy a full-stack app? My path to learning a new framework and language is full of half-finished project ideas and this process of quick and easy iteration has proven really useful to me being able to learn something quickly

Genuinely curious if anyone out there has some wisdom to share about this since I know next to nothing about the ecosystem


You can get a similar process to Heroku's (just push to a GitHub repo and everything automagically deploys) on Azure using App Services. The problem is there is no free tier anymore, although I didn't really look up exact prices. There are plenty of very basic tutorials in the official Microsoft documentation; if you search for stuff like "how to deploy .Net core app on azure" you'll surely find some tutorials that go through the whole process, from scaffolding a basic project to running it in the cloud.


confusing names


Microsoft highly games this benchmark for ASP.NET core. I’m not disputing it’s faster than Rails, but the actual real-life gap will be much smaller.


Almost everyone does. If you look at a number of the implementations used for TechEmpower, they're very, very carefully created to provide optimal behaviour under the benchmarking. They work in ways that you wouldn't normally write stuff.

It makes for a really interesting value proposition for the benchmark.


It's a large set of benchmarks so I'd be really curious to hear how they can game it. Got any good readings?


This article was quite eye-opening: https://dusted.codes/how-fast-is-really-aspnet-core

Now please note, this only talks about what Microsoft is doing, maybe others do the same.

Edit: I saw the article does in fact look at other frameworks.

In the end, it just means these particular benchmarks are pretty much worthless for any kind of real-life comparison.


Thanks for sharing. From the article, even the worst of asp.net was often still outperforming Go and other major frameworks. And the versions explicitly pointed out as actually fairly realistic still placed in the 73-86 ranges. In addition, you could do the same exercise with gin, rails, etc and find a lot more variation (though rails and django do basically always suck in comparison)

It's interesting because even the maker of actix (Rust) was pretty explicit about their goal to optimize for these benchmarks. And it really worked to improve the perception of Rust for web dev applications.[0] The current leader is actually JUST, a JavaScript-based framework that's pretty much solely focused on these benchmarks.

I think the strategy for optimizing like this and sometimes even fudging numbers a bit is pretty standard for a new framework or platform trying to make a name for itself (ahembunahem).

Actix is new, JUST is more of an experiment, and tbh I've never even heard of drogon. But when it comes to frameworks like ASP.NET Core, Laravel, Rails, Express, etc. you can at least trust that they've been around for a long time and there are many proven examples of their application to both commercial and hobby projects. IMO the fact that ASP.NET has been around for so long yet can still top benchmarks like these is still a noteworthy feat. And it's undeniable that Django, Rails, Laravel, etc. are leagues behind in terms of performance

[0] https://www.arewewebyet.org/


> From the article, even the worst of asp.net was often still outperforming Go and other major frameworks.

Hmm no:

"the expensive Go implementation ranks 22nd overall in the TechEmpower Fortunes Benchmark with an equally impressive 381k requests/sec. Not quite as fast as the Java one but still more than 2x faster than the equivalent test in ASP.NET Core."

But as said, in the end the whole discussion probably says more about the value of this benchmark than the value of the frameworks (also considering of course that pure performance is only one factor among many)


God ASP.Net is incredible. I mean it is what runs StackOverflow on "minimal" hardware.

I'm a Rails, Django guy and we run into exactly what you're talking about. The speed we can pump out features is fantastic in both, but once you get to a large number of objects, it's better to drop into pure sql/array handling.

This is mostly for data processing, background jobs where we need to work with >1k separate objects.


I think the idea of Ruby being slow dates back to 1.9, way before Matz announced the 3x3 goal for Ruby 3.


I think the idea comes from many other mainstream languages being relatively much faster. The classic rejoinder was about expressiveness and developer time, but with newer languages that have learnt from Ruby and others (most obviously Elixir and Crystal, but any of the newer generation could probably be chosen) that's less of an argument. Now it relies on the legacy of Rails more than anything, which isn't so bad, is it?

I wouldn't pick it for a new project though if I could write Rust or Elixir or Crystal etc unless they lack something in their ecosystem that Rails has, but over time this will become less of a difference.


No, it is actually from even before that, from Ruby 1.8 which did not have a proper VM.


How involved is your user model and what type of machine do you have?

I'm on an 8-year-old i5 3.2GHz workstation running in WSL 2 with Docker.

With Rails 7.0.4 and Ruby 3.2.1 (YJIT not enabled):

1M empty hashes:

    Benchmark.measure{1000000.times{Hash.new}}
    @total=0.192643
But in 0.192 seconds I'm able to make 12,500 user models:

    Benchmark.measure{12500.times{User.new}}
    @total=0.19225799999999982
That is over 6,000 more than your ratio but based on your first number something tells me you have a much faster dev box than me.


11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz, WSL 2.

A better benchmark https://www.techempower.com/benchmarks/#section=data-r21&tes...


May I recommend OccamsRecord? It's a gem allowing you to retrieve queries as hashes without instantiating all the objects, and it is much faster than AR.

Also, Sequel exists. Yes it’s not the Rails way but is fantastic.


> May I recommend OccamsRecord?

If ActiveRecord is affecting code speed, I use raw SQL, which returns an array of hashes. And if the application needs to be fast while staying in Ruby, the best way is to use Rack directly, without frameworks.
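
For example (table and column names hypothetical):

    # Bypasses model instantiation entirely; returns plain hashes.
    rows = ActiveRecord::Base.connection
                             .select_all("SELECT id, name FROM users")
                             .to_a
    # => [{ "id" => 1, "name" => "a" }, ...]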


Quite interesting. My takeaway is that it can be on par with YJIT or even outperform it despite being in early development.

Btw one project I work on switched to YJIT in production and there are no problems so far (but no noticeable perf gains either)


You're right that the peak performance could be on par (or even better), but, and I acknowledge that I'm biased since I'm tech lead of the YJIT team, my takeaway is:

1. Kokubun, who works with us on the YJIT team, is leveraging insights he's learned while working on YJIT to build this. He has said so in his tweets; some of the code in RJIT is a direct port of YJIT's code to Ruby. This is his side project.

2. One of the challenges we face in YJIT is memory overhead. It's something we've been working hard to minimize. Programming RJIT in Ruby is going to make these problems worse. Not only will it be hard to be more memory-efficient, you're going to increase the workload of the Ruby GC (which is also working to GC your app's data).

3. Warm-up time will also be worse due to Ruby's performance not being on par with Rust. This doesn't matter much for smaller benchmarks, but for anything resembling a production deployment, it will.

On your second point, if you're not seeing perf gains with YJIT, we'd be curious to see a profile of your application, and the output of running with `--yjit-stats`. You can file an issue here to give us your feedback, and help us make YJIT better: https://github.com/Shopify/yjit/issues


Maxime, just wanna say I'm a big fan of your work https://twitter.com/fulligin/status/1524652417559646208


> Programming RJIT in Ruby is going to make these problems worse. Not only will it be hard to be more memory-efficient, you're going to increase the workload of the Ruby GC

Or it could be a win. Java went that way too.


Yeah, I have the same experience with SaaSHub. I moved to YJIT a week ago: no issues, but no noticeable perf gains either (unfortunately).


What about memory usage?


From a compiler nerd's point of view, this is a great piece of work.

Another meta-circular JIT implementation, instead of going down the tried-and-true path that most keep treading.

Looking forward to seeing how it evolves from there.


Does RJIT get JIT-compiled... by itself? That would be lovely in the sense that as RJIT finds more optimizations to speed up code, it would itself become faster.


I mean, that's what happens with JITs like Graal. It can present a warmup issue which is part of the reason Graal did so much work to enable AOT compilation.


How many levels of JIT would be too many I wonder


AOT + JIT + PGO (profile guided optimization) is where things are at.


Are they adding a new JIT each version now?


It is replacing MJIT, by the same author.


Is there any use of Ruby in the deep learning space? It's my language of choice, but Python seems to be ubiquitous.


It looks like there was a little movement in the space not too far back:

https://ankane.org/new-ml-gems

But I don't know how often this is used in production. I've ended up training the models in python and then loading them in Ruby.


Thank you! I think I just need to embrace Python and get over my hatred of whitespace as syntax. People call Ruby omakase, but at least I can structure it how I like.


Why would one use this over YJIT?


Being pure Ruby, it should make it much easier for Ruby programmers to hack on. I am really impressed with the latency numbers. This will be some enjoyable code to read. It also sounds like, through this and others' work, Ruby will have a de facto JIT interface to the VM. This could open the door for domain- or framework-specific JITs, AI-powered JITs, JITs that reload their past state, etc.

I am a huge Rust fan, but going up the stack and writing your JIT as a first-party project is pretty cool. Maybe after RJIT is well factored, the internals could be done in Rust again.

Really happy for Ruby!


Right now maybe you wouldn't, very much (though you should profile your code to see which performs better). But having a JIT written in Ruby potentially makes further development on it more accessible to the community. We will see!


Seems strange, since most Rubyists aren't compiler engineers. I feel like you'd still want to keep writing your compiler in Rust, and try to eke out your performance there.

I'm still scratching my head: other than accessibility, why Ruby over Rust?

Note: I'm a Ruby dev, I don't know Rust. I've written few toy interpreters in Elixir and OCaml. This is my very limited understanding of compiler design, etc.


There are benefits to be had for the average developer from being able to easily read the compiler/JIT/VM code of their language.

I write Crystal code, and the fact that the Crystal compiler is written in Crystal allows me to make my code better/faster by taking a peek into the compiler code occasionally to see how things are done, without having to get good at another language.


Using a JIT seems different from creating a JIT, though, and doesn't seem to require compiler engineers. From what I understand this converts Ruby code directly into C object code? Or does it transpile to C and then compile it? Either way, transpiling doesn't seem as complicated as compiling.


The JIT being replaced did that; this one compiles directly to machine code.


> most rubyist aren't compiler engineers.

You could say that about any language.


Can you say that about Standard ML? There are traditionally certain langs with particular ergonomics that lend themselves to this kind of work, like having parser and lexer libs as part of the lang's stdlib.

Rust's ADTs seem particularly useful in this context; they really make refactoring a breeze. (OCaml has a similar type system.)

Your point is generally true though.


What if you turn it around? "Most compiler engineers aren't Rubyists."

I don't know if that's true or not, but I imagine most compiler engineers tend to be more engrossed in languages like *ML, Rust, or Haskell. Or languages that are common in general, like C++ or Python. Ruby isn't that popular (outside of the Rails niche at least?), and it doesn't fit very well in a compiler niche either, I think.


But why would a compiler engineer that's not familiar with Ruby work on Ruby? They have so many other languages to choose from. I don't necessarily think that writing a JIT in Ruby is a good strategy to attract people to work on it, but if you are going to attract compiler people, you most likely want those who are also Rubyists.


Speaking mostly for myself, at least some compiler engineers like difficult source or target languages. The very dynamic ones are difficult to compile efficiently and thus more interesting than some alternatives. Plus Ruby hasn't had the attention paid to it that JavaScript has so the design space is closer to greenfield. I can see the attraction.


I think you could say YJIT fills that role; in any case, having multiple active JIT projects leaves open a lot of room for experimentation. In the end, you might be right and RJIT will fade away — we will see!


To add to sibling comments, YJIT needs the Rust compiler, so if you have to build your own Ruby binary, and you have to build it on a system where you can't get a Rust compiler, then RJIT will make your life easier. Not sure how common this is in the real world though.


There is no concrete reason to use it right now, and it's marked as experimental, but being pure Ruby allows for exploration and experimentation more easily.


Is JRuby still a thing? I didn't see it in the perf comparisons.


Yes it’s definitely still a thing.

They actually had a new release at the beginning of March.

https://www.jruby.org/2023/03/08/jruby-9-4-2-0.html


Neither JRuby nor TruffleRuby are compared to, and both are quite fast especially TruffleRuby.


These comparisons seem to be to other Ruby implementations. How does this compare to LuaJIT?


That is kind of irrelevant if you are running a Ruby application. And I think that if a developer is looking to start working on a new web server where performance is a significant concern, they are more likely to look at Go, Rust, or even JavaScript rather than either Lua or Ruby.


LuaJIT should be much faster than JavaScript.

Also how slow is still ok? People can talk about things that aren't 'performance sensitive' but at some point it's going to matter. If a program is serving up web pages, that's an interactive application and people are waiting on the program.


    Also how slow is still ok? People can talk about things 
    that aren't 'performance sensitive' but at some point 
    it's going to matter
Done a fair amount of Rails perf tuning over the years. One of my favorite things to work on.

"Fast enough" for me, is when your web framework is nowhere near your bottleneck. On your average web app endpoint you're probably spending 95-99% of your time on external calls to Redis/Postgres/etc and Postgres is probably your specific bottleneck.

For these apps, Rails is most definitely fast enough. You could rewrite your app layer in well-tuned C or assembly and guess what, it's getting maybe 1-5% faster if you're lucky. Maybe you get from 100ms down to 95ms and all you had to do was rewrite 100,000 lines of Ruby in 200,000 lines of C.

For other cases, obviously, maybe Rails is your bottleneck. Maybe you're providing a read-only API and everything is cachable in RAM. Rails will be fast, maybe 10ms per request, but a faster framework can spew out responses in 2ms and now you have 5x the capacity and your P95s during peak hours are really smoothed out.


For Rails as an API? Yes. There are plenty of examples where Rails as an API, and specifically not using ActiveRecord, gets you 10ms response time excluding DB.

But most Rails apps, especially those with ActiveRecord and server-side rendering, spend 60+% of their response time (in many cases even more) before hitting the DB. They fall into the 100 to 200ms response time category.


Even a slow framework is still fast for humans. My Django site renders the homepage in 10ms, and Django is roughly in the realm of Rails performance-wise.

It's all about cost, really. But you can just tell Nginx to cache pages and then it's not a problem for the vast majority of use cases.


Oh well, if every Rails site could render a medium-size app with 10-20ms response times, then no one would be complaining.


Well if you start fetching thousands of objects without projections and maybe make a couple writes per page load then yeah, it's gonna be slow.


JavaScript is so much more widely used for so many applications that it’s hard to justify LuaJIT, even if it is faster. They’re both much faster than most scripting languages like Python and Ruby.

If JavaScript really isn’t fast enough, then I think you should be thinking about something like Go or Rust instead.

LuaJIT is an incredible piece of technology, but in this performance tier JavaScript has won out due to sheer ubiquity and the size of its ecosystem.

Edit: I never said slow was okay. I’m not advocating for slow, I’m just saying that the target audience for RJIT is developers who are already using Ruby. For significantly better speed, you probably want to look elsewhere.


Performance is a weird metric for a web application. You can say Go or Rust will be more performant than Ruby or Lua, sure. But with web applications, so often your performance has nothing to do with the language. You aren't just processing N requests and spitting out a response; you are communicating with other services and databases. IO is almost always a bigger source of response latency than language speed, until you have a large enough service that you can start to worry about those small issues. Before that, developer performance matters more.


Jesus Christ, can this meme die already?


It probably doesn't. LuaJIT is still state-of-the-art, despite being in maintenance mode for almost a decade...


Do you have a particular task in mind? The languages have different semantics and that has an effect on performance. E.g., in Ruby nearly everything is a method call. Ideally a JIT would eliminate that overhead, but you'll likely see varying degrees of performance depending on what your task is. And that's before you get to core library methods, which often are written in C and not handled by the JIT.


> …many methods are direct translations of the Rust code into Ruby.

Impressive


Boy, I lost track of all the Ruby JIT attempts.

According to the computer language shootout all micro-optimizations


YJIT sped up my Rails app by about 30%. It has a memory overhead, but it's worth it.
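
For anyone who wants to try the same, enabling it on Ruby 3.2 is a one-liner (a sketch; adjust to your setup):

    # Either pass --yjit on the command line:
    #   ruby --yjit app.rb
    # or set an env var before boot:
    #   RUBY_YJIT_ENABLE=1 bundle exec rails server
    # Then verify at runtime:
    RubyVM::YJIT.enabled?  # => true when the JIT is active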


For anyone curious, we've been working to reduce the memory overhead and have added some stats to keep track of memory usage over time. On this graph, you can see a comparison with the CRuby interpreter:

https://speed.yjit.org/memory_timeline#railsbench


What happened on jun 14? (the dramatic drop in memory usage)

Edit: I guess this (https://github.com/ruby/ruby/pull/5944) PR was merged.


Yes. Prior to that point we used to allocate a large chunk of executable memory upfront. We switched to mapping that memory on demand, and that alone was a huge improvement.


That is huge, did it also reduce(~~bring in~~) tail latency?

*edit, fix confusing vernacular


Nope -- P99 also decreased. I've heard similar things for other Rails apps as well.


Sorry, in my usage "bring in" means move tail latency P99/P100 more to the left not as introduce, I'll be more clear next time.

So yes! That is great news.


https://ruby-compilers.com/ is a comprehensive list of the various Ruby compilers. There's a table summarizing them along with detailed descriptions for some.


Is there any alternative to RubyMine as an IDE for Ruby newbies?


For IDEs, Shopify provides the Ruby Extension Pack for VS Code [1]. Closely related to RubyMine is the Ruby plugin for IntelliJ IDEA. Those seem to be the biggest set of IDEs for several languages. There's Ruby support for editors like Vim and emacs, but that's a different experience from an IDE.

[1] -- https://github.com/Shopify/vscode-shopify-ruby


For those who can Emacs, robe and inf-ruby provide the best developer experience.

Robe runs inside a Ruby process, like Common Lisp's SLIME. This is (far) superior to typical IDEs because it's no longer restricted to static analysis. In a highly dynamic language like Ruby, static analysis can do only so much, so RubyMine and LSP servers have to guess data types. Robe doesn't have to.

Inf-ruby allows interactive evaluation of code, even including redefinition of methods & classes. The items being redefined don't have to be at the top-level, since inf-ruby automatically extracts the surrounding method and class definitions. This means you can just open any source file of your program, edit any method/class that may or may not be deep inside layers of modules & classes, and just ask inf-ruby to send the edited definition to the running Ruby process. Instant code reload.

Of course, it's still Emacs, so sharp corners and rough edges abound. But when it works, it's a boon.


Yeah, I noticed trouble with Ruby; RubyMine is the only thing that "works", i.e. provides what I've come to expect out of an IDE for a language. Similar experiences for me were Emacs with Clojure, or Scala with Neovim.

I'll try the Emacs setup.


VS Code + Solargraph + Ruby extension


I'd definitely be more apt to have this as part of a production system instead of the Rust one.

Rust has got to be the ugliest, most unfriendly programming language I've ever laid my eyes on. And I wrote Perl for 10+ years, so that's really quite a feat of aesthetics failure.

Anyhow, I'm pretty impressed with the performance thus far. I like the idea of having multiple JITs available for a single-language ecosystem, regardless of how disgusting the language used to implement them is. I think having competition means that there will be a race to the bottom and towards the "center" of general work. It's already really cool to see how the different approaches have clear preferences for the tasks they excel at and where they fall short.

This is hugely valuable because it pushes Ruby forward for everybody, and will hopefully result in not only a faster Ruby for X, but a faster Ruby for everything, which is just an objectively good thing.

Python is in a weird spot in this arena, because it is very clearly and very strongly orienting itself to continue to dominate practical data science work, and that means the need for JITs to handle regular jobs like text munging and whatnot falls by the wayside in favor of the latest NumPy and Jax stuff, whatever is the current hot shit in the AIverse. Ruby doesn't suffer from that because it's pretty solidly lodged in the web development sphere, while also having a capable presence in netsec tools, application scripting, and probably a few more areas that I'm not aware of.

If you're interested in some of the cutting-edge Python stuff, I'd recommend taking a look at exaloop/Codon. Codon will soon be able to output Python extensions that are compatible with Python's setuptools, so it will soon be possible to just include some .codon files with your project, use setup.py, and have decorators that can (literally) 100x your hot loops.


Be honest, you mostly wrote this post so you could say you hate Rust, didn't you?


> Rust has got to be the ugliest, most unfriendly programming language I've ever laid my eyes on.

No, that would be C++ or PHP.

I've written Ruby for 10 years. I can't wait for a future in which my team never needs to worry about mutated shared state. Rust is beautiful in this regard. The only languages that come close are Clojure and Haskell, but Rust accomplishes this without immutability or a bloated runtime.


Wait until Rust gets generic keywords for async with ? as prefixes.


C++ is a beaut', compared to Perl. It's ugly too, don't get me wrong. I'll be the last one to hold up C++ or PHP as beautiful, but Rust is just too gross to look at, I have NO idea how people program in it.

I think the only thing Rust has really brought to the table is the idea that alternative memory schemes like ownership can be valuable. That idea will slowly disseminate into Python/Ruby, and other languages from there, and those of us who appreciate beauty will continue to program with proper tools.

I'm actually looking into OCaml lately. The 5.0 release is really interesting, and the next release of OPAM will have proper Windows support and everything. Rust takes a lot from OCaml; as I understand it, the original Rust compiler was written in OCaml and inspired by it.

I think mutable shared state is bad, as you do, but I think so less because of the advantage that a tool like Rust or Erlang brings to the table in handling that state, and more because of the programming practices it brings about regardless of the language. Treating a session as stateful, or really treating any IO-capable object as mutable, is definitely a big mistake. That much is clear. I think as the years go on, more languages will embrace this wisdom and replace Rust's incredibly rigid and unfriendly experience with things like proper object capability systems, effect handling, and whatever instruments would be useful from formal verification methodology.

And again, all of that could conceivably land in Python/Ruby eventually, or as extensions, whatever.

I guess what I'm saying is that over a long enough timespan, it would seem that ideas are always going to win out and language designers will slowly advance towards that front, while retaining beautiful code.

It's interesting that you mention Clojure, because a lot of really cool ideas bouncing around the Ruby/Python sphere originated or at least came to widespread popularity in Clojure. I'd put Erlang up there as equally influential as well. Clojure also shares the feature of highly-writable and highly-readable code, which qualifies it for beautiful, imo. How often do you come across a Clojure function that's just cluttered with {}{}<T><T><<T>RefCell>(); bullshit, to the point where it's not immediately clear what is happening? Now, compare that with Clojure. If ever there's a complicated Clojure function, I'd wager that it either a) deals with Java interop, which, God bless you. or b) is just a highly abstracted function, and can be grokked by walking backwards up the calls one-by-one. Terseness can be daunting, but it's almost always a simple case of gathering context to understand those functions ending in )))))})).

Anyhow, I've wasted too many bytes here.

Since you've a decade in Ruby, have you tried Crystal at all? I heard about it last year sometime, but was turned off by no Windows support.


Treating a session or IO object as mutable seems perfectly reasonable. Problems arise only when multiple parts of a program mutate these without coordination.

Even everyday decisions like whether to copy an array passed to a constructor can be fraught with danger. Is the caller going to retain a reference and mutate the array after I check invariants? This simple yet intractable problem lurks in every imperative language except Rust.
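
A concrete Ruby sketch of that constructor hazard (class and names made up):

    class Series
      def initialize(values)
        @values = values  # keeps the caller's array -- no defensive dup
        raise ArgumentError if @values.any?(&:negative?)
      end
    end

    arr = [1, 2, 3]
    s = Series.new(arr)   # invariant checked and satisfied here...
    arr << -5             # ...then silently broken by the caller afterwards

    # The defensive fix is `@values = values.dup`, at the cost of an extra copy.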

Channels in Go are another great example. CSP is a sound approach for safe concurrency, yet the Go language offers no assurance that references passed over a channel are exclusive to the receiving goroutine.

I don't see how languages like Ruby or Python or Go could incrementally add a borrow checker. It would break too much. Returning to other programming languages after internalizing Rust compiler warnings is really eye-opening.

I have not tried Crystal. I also haven't written OCaml, though I did dabble in F#. Rust's match expression was definitely inspired by OCaml.



