An experiment in static compilation of Ruby: FastRuby

kintamanimatt · on Sept 17, 2012

This is just transcompilation to Java, not actual static compilation. The performance gains will be offset by the increased memory footprint of the JVM.

Interesting nonetheless.

headius · on Sept 17, 2012

I disagree :)

A transcompiler would have to be able to produce pretty much 1:1 mapping between one language and another. In this case, it would have to be emitting the equivalent of dynamic calls, which isn't really possible on the JVM (other than through invokedynamic, which does work very well).

Instead, this is using a static view of the world to produce the resulting code...assuming only the method names it sees statically at compile time will ever exist, and generating a static picture of that world. Hell...it's actually turning all dynamic dispatch into static dispatch. How could a transcompiler possibly do that?

Also...performance gains being offset by the size of the JVM? I don't understand this at all.
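To make the "static view of the world" idea above concrete, here is a rough Java sketch. It is an illustration only, not fastruby's actual output: the names `RObject`, `RString`, `Dog`, `Robot`, `speak`, and `beep` are all hypothetical. The assumption is that every method name seen at compile time becomes a virtual method on a common base class, with a default body that raises, so every call site compiles to an ordinary Java virtual call.

```java
// Hypothetical base class: one virtual method per method name seen at compile time.
// The default bodies stand in for Ruby's "undefined method" error.
class RObject {
    RObject speak() { throw new UnsupportedOperationException("undefined method 'speak'"); }
    RObject beep()  { throw new UnsupportedOperationException("undefined method 'beep'"); }
}

// Hypothetical boxed string value.
class RString extends RObject {
    final String value;
    RString(String value) { this.value = value; }
}

// A class that defines 'speak' simply overrides the base method.
class Dog extends RObject {
    @Override RObject speak() { return new RString("woof"); }
}

// A class that defines only 'beep' inherits the raising default for 'speak'.
class Robot extends RObject {
    @Override RObject beep() { return new RString("beep"); }
}

class StaticDispatchDemo {
    // The receiver's concrete type is unknown here, yet the call is a plain
    // virtual dispatch with no name lookup at run time.
    static RObject callSpeak(RObject receiver) {
        return receiver.speak();
    }

    public static void main(String[] args) {
        System.out.println(((RString) callSpeak(new Dog())).value);
        try {
            callSpeak(new Robot());
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The trade-off in this sketch is that methods defined at run time under names the compiler never saw cannot be called, which is the static assumption described above.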

kintamanimatt · on Sept 17, 2012

No, what you've produced is a transcompiler which takes the source code from one language and converts it to source code in another language at the same high level. The fact there isn't a 1:1 mapping between the source language and the destination language isn't relevant. Strangely, it doesn't appear the Dragon Book mentions transcompilation, but Wikipedia has a decent definition: https://en.wikipedia.org/wiki/Transcompiler

This isn't static compilation in any sense of the definition as you're not producing a low-level language, i.e. assembly or machine code. Ultimately the JVM will do some combination of static and JIT compilation, but your code isn't doing this.

In no way am I deriding your effort, you've just got your definitions mixed up.

The JVM tends to run in a relatively large memory footprint compared to the MRI, although the JVM will execute faster when it's warmed up. The JVM isn't known for being frugal when it comes to memory usage. For long running processes, your transcompiler could be very useful.

kristianp · on Sept 17, 2012

I guess headius works with what he knows best, which is the JVM.

kintamanimatt · on Sept 17, 2012

You may have missed the point of my comment: this isn't static compilation, but rather transcompilation. This isn't a bad thing, it's just mislabeled. FastRuby is implemented in a similar way to CoffeeScript.

headius · on Sept 17, 2012

CoffeeScript is a transcompiler because it's largely just taking language A with semantics similar to JavaScript but different syntax and converting it directly to JavaScript. There's no static worldview as in fastruby, no exploitation of that worldview, and no fundamental differences between CoffeeScript and JavaScript.

kintamanimatt · on Sept 17, 2012

CoffeeScript was designed with transcompilation to JavaScript in mind. This is why you've correctly noted they're conceptually similar in many respects, and therefore relatively easy to transcompile.

Transcompilation can be relatively simple in the case of CS to JS, or significantly more involved in the case of Ruby to Java. One of the most significant examples of transcompilation is PHP to C++, as in the case of Facebook's HipHop compiler.

headius · on Sept 18, 2012

Yeah, I still don't agree with you. If I were compiling to JVM bytecode, would you no longer consider it a transcompiler? And if that bytecode were exactly what javac produced after compiling the Java source I emit now...what exactly is the difference?

FWIW, I'm probably going to just emit JVM bytecode anyway, since it's easier than emitting syntactically correct Java source.

kintamanimatt · on Sept 18, 2012

This is where the lines get blurry. Anything that translates source code to bytecode would clearly not be a transcompiler as bytecode can hardly be considered source code, even if an abstract implementation of the machine ends up executing it.

Something is considered source code if it is intended to be written and read primarily by humans. Once upon a time, machine code would have been the same as source code as there were no compilers, and punchcards were punched with opcodes and data directly. Now, nobody in their right mind considers machine language to be source code, and I could only make a very tenuous argument in favor of JVM bytecode being source code.

Compilation is the process of producing something the "machine" understands natively. Traditionally this was the CPU. Your CPU doesn't understand Ruby, but it (probably) understands x86 machine code, along with various extensions. In the case of Java, the machine is no longer the CPU, but an abstraction of it. The machine in this case only understands JVM bytecode, not Java. Same with the CLR which only understands IL, not C#, F#, VB.NET, etc.

If you emit bytecode directly, you'd have a compiler, not a transcompiler.

ryanbraganza · on Sept 17, 2012

I don't disagree. Although, I think CoffeeScript is more similar to JRuby than FastRuby.

batista · on Sept 17, 2012

In what possible way is CoffeeScript more similar to JRuby than FastRuby???

JRuby is an actual compiler to JVM bytecode. FastRuby is translating to Java. If anything, FastRuby is the SAME THING as CoffeeScript.

python-guy · on Sept 17, 2012

I tried this with Python a while ago. Simply translating Python code into C code using the objects in the Python/C API gave up to a 30% speedup. However, that is not much, considering that Python is up to 100 times slower than C. For more significant speedups, you really need type inference.

jhchabran · on Sept 17, 2012

This reminds me of Cython, which provides optional types to speed up the critical path. The speed gains are pretty impressive. Having such a tool in Ruby would be wonderful!

headius · on Sept 17, 2012

Did your experiment still do dynamic calls where Python would do dynamic calls? fastruby does not; all dynamic calls are converted into Java virtual dispatches, which optimize and inline like normal Java code.

The 30% improvement I got on "fib" was not as much as it would be for inferring the actual numeric types, but that's indeed possible here too. You should also keep in mind that the resulting code was by definition as fast as Java code would be doing the same work (i.e. all boxed math) so anything not math-related would actually be Java speed too.
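As a rough picture of what "all boxed math" behind virtual dispatch could look like for "fib", here is a minimal Java sketch. It is not fastruby's generated code; the names `RValue`, `RInt`, `BoxedFib`, `plus`, `minus`, and `lessThan` are assumptions made up for illustration. Every arithmetic operation is a virtual call on a boxed object, with no numeric type inference.

```java
// Hypothetical runtime base: operators seen at compile time become virtual methods.
abstract class RValue {
    RValue plus(RValue other)  { throw new UnsupportedOperationException("undefined method '+'"); }
    RValue minus(RValue other) { throw new UnsupportedOperationException("undefined method '-'"); }
    boolean lessThan(RValue other) { throw new UnsupportedOperationException("undefined method '<'"); }
}

// Hypothetical boxed integer; each operation allocates a new box.
final class RInt extends RValue {
    final long value;
    RInt(long value) { this.value = value; }

    @Override RValue plus(RValue other)  { return new RInt(value + ((RInt) other).value); }
    @Override RValue minus(RValue other) { return new RInt(value - ((RInt) other).value); }
    @Override boolean lessThan(RValue other) { return value < ((RInt) other).value; }
}

class BoxedFib {
    static final RValue ONE = new RInt(1);
    static final RValue TWO = new RInt(2);

    // Ruby equivalent: def fib(n); n < 2 ? n : fib(n - 1) + fib(n - 2); end
    static RValue fib(RValue n) {
        if (n.lessThan(TWO)) return n;
        return fib(n.minus(ONE)).plus(fib(n.minus(TWO)));
    }

    public static void main(String[] args) {
        System.out.println(((RInt) fib(new RInt(20))).value); // prints 6765
    }
}
```

Inferring that `n` is always an integer would let a compiler replace the boxes with primitive `long` arithmetic, which is the further gain mentioned above.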

jvoorhis · on Sept 17, 2012

This reminds me of HipHop for PHP. HipHop lowers PHP to C++, with a direct mapping for control flow (control structures, exceptions) and a runtime library for everything else. Apparently it works very well.

qxcv · on Sept 17, 2012

Interestingly, Facebook sponsored a project to build a prototype PHP interpreter using the PyPy[0] toolchain for JIT compilation, which turned out to be even faster than HipHop[1]. I have to wonder whether statically compiled Ruby would actually be any faster than a JIT interpreter like Rubinius, cold start JIT penalty aside.

Edit: looks like JRuby can/does do JIT compilation, but it is also capable of doing AOT compilation[2]. If JRuby can do AOT, then what is the point of FastRuby?

[0]: http://pypy.org/

[1]: http://morepypy.blogspot.com.au/2012/07/hello-everyone.html

[2]: https://github.com/jruby/jruby/wiki/JRubyCompiler

headius · on Sept 17, 2012

JRuby does have an AOT compiler, but the resulting code must be shackled to the rather large JRuby runtime. It also does a slower dynamic call than fastruby (except on invokedynamic).

My goal with fastruby is more to have a still dynamically-typed Ruby-like language without requiring anything more than virtual invocation and a modest runtime library (that can be statically optimized to only what's needed by e.g. Android toolchain).
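For contrast with plain virtual invocation, a dynamic call in general has to resolve the method by name at run time. The following Java sketch shows that idea with a per-object method table. It is a generic illustration and not JRuby's actual call path; `DynObject`, `define`, and `send` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical dynamic object: methods live in a table keyed by name.
class DynObject {
    private final Map<String, Function<DynObject, Object>> methods = new HashMap<>();

    // Methods can be added at any time, including names unknown at compile time.
    void define(String name, Function<DynObject, Object> body) {
        methods.put(name, body);
    }

    // Every call pays for a hash lookup before the body can run.
    Object send(String name) {
        Function<DynObject, Object> body = methods.get(name);
        if (body == null) {
            throw new UnsupportedOperationException("undefined method '" + name + "'");
        }
        return body.apply(this);
    }
}

class DynamicCallDemo {
    public static void main(String[] args) {
        DynObject obj = new DynObject();
        obj.define("greet", self -> "hello");
        System.out.println(obj.send("greet"));
    }
}
```

A virtual call avoids the lookup entirely and is something the JVM can inline, at the cost of fixing the set of method names ahead of time.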

hencq · on Sept 17, 2012

Could this be mixed with Mirah to have it dynamically typed, but with types added where needed for performance? Sort of like Dart's optional typing, but for the JVM.

regularfry · on Sept 17, 2012

I've had thoughts about doing a Rubinius bytecode interpreter in PyPy, but I can't find a decent reference to Rubinius' bytecode format other than the source. That'll take more time to wade through than I've got right now.

masklinn · on Sept 17, 2012

Another project in the same class is Shed-Skin[0], a Python-to-C++ compiler (it only handles a subset of Python, though).

[0] http://shed-skin.blogspot.be/

damian2000 · on Sept 17, 2012

Has anyone tried this approach via C# (CLR)? It would be interesting to see the performance compared to the JVM. I'll give it a go if I get some time.

fsiefken · on Sept 17, 2012

Is this related to using IronRuby vs JRuby? In benchmarks JRuby is significantly faster, which is odd, as I suspect the CLR is theoretically slightly more efficient than the JVM.

headius · on Sept 17, 2012

Efficient is a loaded term, so I'll skip that...but the JVM is most definitely a faster platform than CLR.

ps2000 · on Sept 17, 2012

Mirah? Anyone?

VeejayRampay · on Sept 17, 2012

Extremely exciting.