Why is it that 35 years after the Futamura projections were first described, when we want to produce a new compiler, we don't write an interpreter and pass it through a high quality, general partial evaluator?
Whenever the Futamura projections are described, it's always at a conceptual level, in general terms. Reading a blog post like this one, I'm always left to wonder (a) why does this not end with a link to a github implementation of a general partial evaluator and (b) why is it the case that none of the compilers I use were produced with these techniques?
There must be some set of tradeoffs preventing this approach from living up to the "compilers for free" promise, but I haven't found anyone with a real explanation. Either there are hard challenges that make implementing a general partial evaluator very difficult, or the results we'd get from such a partial evaluator are poorer than what we get through other means, right?
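For anyone who hasn't seen partial evaluation in the concrete, here's a minimal Python sketch of the textbook example (the names, like specialize_power, are mine, purely for illustration): given one statically known argument, the specializer emits residual code with that argument baked in and the loop unrolled away.

  # A general function: neither argument is known until runtime.
  def power(base, exp):
      result = 1
      for _ in range(exp):
          result *= base
      return result

  # A toy "partial evaluator" for power: the exponent is static,
  # so it generates residual code with the loop already unrolled.
  def specialize_power(exp):
      body = " * ".join(["base"] * exp) if exp > 0 else "1"
      src = "def power_spec(base):\n    return " + body + "\n"
      namespace = {}
      exec(src, namespace)              # "compile" the residual program
      return namespace["power_spec"]

  power_3 = specialize_power(3)         # residual body: base * base * base
  assert power_3(5) == power(5, 3) == 125

The Futamura projections are "just" this trick applied where the general function is an interpreter and the static argument is the program being interpreted.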
Perhaps the greatest success in this direction is PyPy. At some level the RPython toolchain is general enough that people have used it to produce JITted versions of Ruby, PHP and a few other languages. But these are a handful of alternative implementations of existing languages, where the predominant implementation doesn't use this toolset. And PyPy's own docs note that the JIT compiler needs some amount of hinting from the interpreter; you can't just throw any interpreter (written in RPython) at it.
If you're interested in looking into some active research in the area of partial evaluation, you should look at Truffle [1] and Graal [2] from my team at Oracle Labs. Graal is a new JIT compiler for HotSpot written in Java, with an API to interact with the compiler from Java. Truffle is a toolkit for implementing programming languages as self-optimizing AST interpreters. Truffle uses the Graal hooks to perform partial evaluation of Truffle-based interpreters to generate highly optimized native code.
The partial evaluator doesn't know anything about the language being compiled, save for a few annotations that Truffle provides to control inline caches and such. We're using this approach to create new optimizing runtimes for Ruby (TruffleRuby), JavaScript (Graal.js), R (FastR), Python (GraalPython), and even LLVM bitcode (Sulong). There are languages being implemented outside of Oracle Labs as well [3]. So, it's being stressed by a wide range of language semantics.
As for performance... I work mostly on TruffleRuby. There we're seeing considerable performance advantages against other Ruby implementations across a wide array of benchmarks and workloads. We really do get the JIT for free by virtue of being a Truffle-based language implementation. It's quite liberating as the compiler technology is free to evolve separately from our Ruby implementation, in contrast to the tight coupling you'd have with a language-specific JIT.
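To make the "partial evaluation of an interpreter" idea concrete, here is a toy Python sketch (only an illustration of the first Futamura projection, not how Truffle/Graal actually work internally): specializing a tiny expression interpreter on a fixed AST folds away all the dispatch and leaves behind straight-line residual code.

  # Toy AST nodes: ("lit", n), ("var", name), ("add", l, r), ("mul", l, r)
  def interp(node, env):
      tag = node[0]
      if tag == "lit": return node[1]
      if tag == "var": return env[node[1]]
      if tag == "add": return interp(node[1], env) + interp(node[2], env)
      if tag == "mul": return interp(node[1], env) * interp(node[2], env)
      raise ValueError(tag)

  # Toy "partial evaluator": the AST is static, the environment is dynamic,
  # so the tag dispatch disappears and only the arithmetic remains.
  def specialize(node):
      tag = node[0]
      if tag == "lit": return repr(node[1])
      if tag == "var": return "env[" + repr(node[1]) + "]"
      if tag == "add": return "(" + specialize(node[1]) + " + " + specialize(node[2]) + ")"
      if tag == "mul": return "(" + specialize(node[1]) + " * " + specialize(node[2]) + ")"
      raise ValueError(tag)

  ast = ("add", ("mul", ("var", "x"), ("lit", 2)), ("lit", 1))   # x * 2 + 1
  compiled = eval("lambda env: " + specialize(ast))              # residual program
  assert compiled({"x": 20}) == interp(ast, {"x": 20}) == 41

A real system has to handle control flow, mutation, dynamic dispatch and deoptimization rather than just arithmetic, which is presumably where the interesting engineering lives.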
Fair enough. You could date the project back to the Maxine days as well :-)
Joking aside, the Java-Level JVM Compiler Interface [1] was only added about a year ago, in Java 9. This is what allows using Graal with a stock JVM. So, for many people this is the point at which Graal became available.
Graal has a history going back through Maxine to HotSpot's C1. It's been evolving for years, yes, but its application as a partial evaluator only goes back to 2012. TruffleRuby was started in 2013.
> Why is it that 35 years after the Futamura projections were first described, when we want to produce a new compiler, we don't write an interpreter and pass it through a high quality, general partial evaluator?
I think the rationale for abandoning partial evaluation for PyPy [1] explains most of the complications with it. Or, put in different words, PE-based compilers don't have all the knobs that real compilers need.
> Why is it that [...] when we want to produce a new compiler, we don't write an interpreter and pass it through a high quality, general partial evaluator?
As far as I know, it has been tried numerous times and has been an area of active research in the past, but results turned out unsuccessful. Mainly because error reporting and performance, both compile-time and run-time, are hard to achieve.
It's a little bit like pure functional programming. Super elegant in principle, hard to get right and usable in practice.
> As far as I know, it has been tried numerous times and has been an area of active research in the past, but results turned out unsuccessful.
TruffleRuby is a Ruby implementation that works exactly this way: by writing an interpreter and passing it through a general partial evaluator. It's successful enough to pass almost all of the Ruby spec and to run real Ruby applications, including their C extensions.
> Mainly because error reporting and performance, both compile-time and run-time, is hard to achieve.
Do you have more information on the error reporting bit? Is it due to losing important metadata during compilation? If so, couldn't a different, more metadata-rich representation than the basic AST most of us are familiar with help here?
Wild guess: maybe it's like yacc vs. hand-written recursive descent parsers. The former doesn't allow enough control, so people prefer putting in the effort themselves so they can change bits here and there.
The more I learn about interpreters and compilers, the more everything looks the same.
Snigl [0] is designed as a VM interpreter. But it has a separate compile stage where code may be evaluated using the same interpreter and the results reused at runtime.
And it traces code before running it, evaluating as much as possible statically.
Finally, it supports emitting C code for an interpreter preloaded with the current, traced program, which is where compile-time evaluation really shines, as the results are embedded in the executable.
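This isn't Snigl's actual code, just a sketch in Python of the general "evaluate as much as possible statically" idea: a tiny compile stage that folds any subexpression whose operands are already known, so only the genuinely dynamic work is left for runtime.

  # Expressions as nested tuples: ("lit", n), ("arg", i) for a runtime
  # value, ("add", a, b) and ("mul", a, b) for operations.
  def fold(node):
      tag = node[0]
      if tag in ("lit", "arg"):
          return node
      left, right = fold(node[1]), fold(node[2])
      if left[0] == "lit" and right[0] == "lit":
          # Both operands known statically: evaluate now, embed the result.
          op = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}[tag]
          return ("lit", op(left[1], right[1]))
      return (tag, left, right)

  # (2 * 3) + x  folds to  6 + x  before the program ever runs.
  expr = ("add", ("mul", ("lit", 2), ("lit", 3)), ("arg", 0))
  assert fold(expr) == ("add", ("lit", 6), ("arg", 0))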
I have written far more compilers/interpreters in my career than I probably should have, and I reached the same conclusion. For most of them, I probably couldn't even tell you whether or not what I had involved a compiler. At this point, my working distinction is that a compiler produces a serialized representation. If you feed the AST directly into the runtime, then you just have an interpreter. I don't see any practical benefit to this distinction, but it seems to match up with what people consider a "compiler".
As an aside, I had a similar insight when I built a computer from logic gates [0]. Essentially, the control unit had an EEPROM of microcode, and the logic to execute an actual instruction was to use the upper 4 bits of the instruction as an index into a lookup table, then write the result to the microcode program counter.
Eg:
  switch (OP >> 4):    // upper 4 bits select the microcode routine
    case 1: jmp &add
    case 2: jmp &sub
    ...
Which looks a lot like an interpreter.
It wouldn't surprise me if modern CPUs actually added an additional step, where they first translate (compile?) the input stream into some internal instruction set, and then have microcode that acts as an interpreter for that.
All modern x86 CPUs do this to some extent. The details have changed over the decades, but the basic principle these days is that the CPUs decode the x86 instruction set into a secret, internal-only, more orthogonal sequence of instructions. My understanding is that these sequences can be quite long for the more involved CISC operations. And according to the manuals written by Agner Fog, some instructions even have special cases based on their operands to do something equivalent, but much faster. One particular example that I remember is xor ax,ax, which is often used to set ax to zero and is a very compact instruction. Modern CPUs don't compute the xor in the ALU, but just rename ax to a register that is set to 0.
So in practice, nobody uses actual x86 assembly anymore, but everybody uses it as a target for compilation. It’s just like The Birth & Death of JavaScript¹, but non-fictional, and on a lower level – a Birth & Death of x86 assembly, if you will.
"It wouldn't suprise me if modern CPUs actually added an additional step, where they first translate (compile?) the input stream into some internal instruction set, and then have microcode that acts as an interperator for that."
Among Java processors, there was quite a mix of methods, ranging from CISC to RISC designs, and from full in-hardware execution to interpretation on a regular CPU:
To me the difference is that an interpreter reads the code and executes every operation immediately as it encounters it, while I expect a compiler to process a large portion of the code and potentially perform more transformations on it before executing it. In some ways I see an interpreter as potentially invoking a compiler on a single operation at a time and executing the result of the compilation before moving to the next operation.
The lines are grey, to some extent. Many interpreted languages (such as Tcl) compile to bytecode, then interpret that. The compilation happens not at the operation level, but at the block (or maybe file) level.
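A minimal sketch of that "compile a whole block to bytecode, then interpret the bytecode" split, in generic Python (not how Tcl actually does it):

  # Compile an entire expression to a flat list of opcodes up front...
  def compile_expr(node, code):
      if node[0] == "lit":                       # ("lit", n)
          code.append(("PUSH", node[1]))
      else:                                      # ("add", left, right)
          compile_expr(node[1], code)
          compile_expr(node[2], code)
          code.append(("ADD", None))
      return code

  # ...then a separate loop interprets that bytecode one opcode at a time.
  def run(code):
      stack = []
      for op, arg in code:
          if op == "PUSH":
              stack.append(arg)
          elif op == "ADD":
              b, a = stack.pop(), stack.pop()
              stack.append(a + b)
      return stack.pop()

  bytecode = compile_expr(("add", ("lit", 1), ("add", ("lit", 2), ("lit", 3))), [])
  assert run(bytecode) == 6

Whether you call the first half a "compiler" is exactly the grey area the parent comments are circling.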
The Haxe compiler is also an interpreter of the Haxe programming language, and it is properly treated as one of the many back-end targets of the compiler.
Moreover, the interpreter is also used as the macro runtime, which means Haxe macros can run arbitrary "real" Haxe code during compilation, e.g. checking the current git branch name or current time and injecting the result as a static read-only string in the code.
DMD also comes with a built-in D interpreter, RDMD, allowing you to use D as a scripting language with a compiled binary. This can be embedded or called from your D program, allowing your program to have a scripting layer written in the same language as the compiled binary. As far as I know, executing a .d file with RDMD is nearly as fast as running a compiled binary.
I've read about D's compile time evaluation but have no personal experience. I know both Rust and C++ are pushing in the same direction.
From what I can see, it's simply an embedded interpreter for the same language; which means that it probably runs about as fast as interpreters in general (that is, significantly slower than D). I would love to be wrong and learn something new, though.
Starting from an interpreter means it's already there, the only difference is when code is evaluated. And embedding it simply means linking the interpreter library. Snigl and D are sort of approaching Lisp from different angles.
The description immediately reminded me of a book called "Understanding Computation." Scrolling down I see the video is by Tom Stuart, the author of that book. I'd really recommend it (the book) for anyone interested in learning about "big step" (operational) and "small step" (structural) semantics.
Tom Stuart, the author of the post, also has written an interesting article called "Programming with Nothing" in which he demonstrates how to use first-class functions in Ruby to explain the Lambda Calculus.[0]
I have implemented the method of parsing an expression into objects with an evaluation method in the scripting language that I built into an editor for performing complex manipulations of images. See http://www.iwriteiam.nl/MySample.html
It depends on what you want. If I wrote aerospace code I'd probably buy adacore's ada and spark compilers and tools.
Since I mostly do scripting work, I usually stick to Python, Bash, & Powershell though.
Some commercial compilers are very affordable with extremely sane licenses. One that has interested me lately is called 8th and is $250 for a professional license which lasts forever and can create encrypted binary executables on Linux, Mac, Windows, raspberry pi, iPhone, android, or embedded all with a cross-compiler, builtin graphics, sound, database support, full REPL, good support...etc. There is also a free version that just has some features missing (not a dumb temporary trial) and a $50 hobby license with many of the features. The Enterprise license would be expensive for me, but fairly cheap for most companies and includes priority support and the source code, so that is neat. I'm all for open-source, but if we're honest, some commercial tools are great and don't abuse users. It is not an only option situation.
I will say closed source can really stink too. I once had to use a product that all of the vendor's customers had to use, but it wasn't officially supported (although it was critical for infrastructure) and I was running into a bug they couldn't figure out or didn't care to look into. The terminal would tell me the problem was in line x of the encrypted script, but wouldn't let me know what line x did so I could get the app working.
Yea, it's this one. The main developer wrote an open source Forth called Reva-Forth awhile back. I think he does a lot of paid development across various fields, from phone app development to embedded work and machine learning stuff. When he was trying to do phone apps he realized (like most of us) that it is a pain to need to know Java for Android (+ API) and Objective-C or Swift for iOS, and then there is WinForms or Gtk or wxWidgets for desktop development, so you have to learn a lot of different technologies if you want to write software to run on all major platforms, and 8th helps solve most of that problem. 8th is also a Forth, but he has added some features like using JSON as a generic data structure (not generally done in Forth), bundled the C++ JUCE GUI framework and associated licenses, added ODBC/SQLite/MySQL support, sound, cross-compilation to many platforms, etc.
Most of my coding needs fall into either general scripting (Python, PowerShell, Bash, Perl, etc.) or numerical work (Python + NumPy, or Julia). 8th seems to be pretty speedy, but Forths in general usually aren't lightning fast due to their stack usage and emphasis, and you need really fast code for numeric work. He has made it a priority to work on performance though, and continuously adds scientific features like adjacency graphs, so he is on the right track and I'll keep my eyes on it. For general scripting work and utility apps it should work pretty nicely for my needs, and distributing a tiny .exe to users with a GUI is nice. I'm not sure how good the file I/O & filesystem support is though (need to look at the docs), because I use those common features a lot. I also need to check if he wrote functions (called words in Forth speak) for web scraping, as there is a project I'd like to do where 8th would be ideal.
If you've never seen Forth before, try to keep an open mind with the syntax, as it can look alien at first. There are a lot of 8th examples on Rosetta Code, and although they are short, it might look like gobbledygook if you're not used to thinking about the stack, where parameters are popped off and used in functions which then return data to the stack so functions can be chained (kind of like Unix pipes). Leo Brodie's books on Forth (classics and free online) are called "Starting Forth" and "Thinking Forth" and are recommended on HN sometimes.
I did try out Forth a little, much earlier, and had read some of the Starting Forth book - had bought the hardcover edition, it looked really good, including the cover - dark brown, either was leather or had a leather effect (more likely the latter). And I liked Leo's writing.
Will check out 8th at some time. The cross-platform GUI thing with tiny EXEs is a pro (Rebol and Red are similar, though I'm not sure Rebol supported EXEs, except maybe via 3rd-party tools), although not having good I/O and filesystem support is a con, since a lot of the stuff I do involves file I/O.
I was going to mention that it is a lot like Rebol and Red in philosophy, but you beat me to it.
Rebol was mostly built by one person (Carl Sassenrath) but has to package the entire interpreter (I think maybe 5 MB) to make an .exe if you have the paid version. Red can already make executables and has made fantastic progress, but it is a larger effort. I check up on Red once a month and think 1.0 will be amazing.
>I was going to mention that it is a lot like Rebol and Red in philosophy, but you beat me to it.
I agree, the philosophies are similar. Minimalism, relying more on your own code than libraries, not generalizing your code for a lot of cases that might not happen, etc.
One thing that puzzles me about Forth, though:
I did read a few books about it (much earlier, and I'm not a language designer, so what I say next could be wrong). Some of them talked a bit about Forth's internal design. I understood from that, that when multiple words (Forth subroutines) are defined, they are "threaded", from which I understood that they are added to a linked list of word definitions. Wondered why it needed to be a linked list. If it was a lookup table of some kind, like a Python dict or a Ruby hash (basically an associative array), looking up word definitions to run them would be faster than in a linked list, which is linear, I'd think.
Edit: I did just have an idea of what the reason for that may be, but will wait a bit to see if anyone replies with their explanation. Hint about my idea: it may have to do with Charles Moore's Forth philosophy - just a guess on my part.
I'm very much a noob when it comes to Forth and I've asked myself the same question before. I think "threaded" is generic enough to not necessarily dictate the data structure. It more or less means that a word (if it isn't being defined and added to the dictionary, which is not a dictionary in the Python sense but what Forth calls its list of words) will read from the stack, modify the stack, and then call the next word.
Someone on here or on the Forth subreddit should be able to answer though.
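Not an authoritative answer, but here's a guess illustrated with a Python sketch (all hypothetical, not real Forth internals): searching a linked list newest-first means a redefinition naturally shadows the old word, while anything compiled before the redefinition keeps pointing at the old entry, and defining a word is just pushing one node onto the front of the list.

  # Hypothetical sketch of a Forth-style dictionary as a linked list.
  class Entry:
      def __init__(self, name, body, prev):
          self.name, self.body, self.prev = name, body, prev

  latest = None                            # head of the list: the newest word

  def define(name, body):
      global latest
      latest = Entry(name, body, latest)   # O(1): push onto the front
      return latest

  def find(name):
      entry = latest
      while entry is not None:
          if entry.name == name:
              return entry                 # first (newest) match wins
          entry = entry.prev
      raise KeyError(name)

  define("double", lambda n: n * 2)
  old = find("double")
  define("double", lambda n: n + n)        # redefinition shadows the old word
  assert find("double") is not old and old.body(3) == 6

A hash table would make lookups faster, but it costs more memory and code, and the shadowing semantics wouldn't fall out as naturally, which seems in keeping with the minimalist Forth philosophy.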
>Rebol was mostly built by one person (Carl Sassenrath)
Yes, I had read up about him after learning that he was involved with creating both Amiga OS and then Rebol. Amazing skills. IIRC, I read in some interview of him, that he said he read everything he could lay his hands on, about operating systems, before working on the Amiga OS. Maybe did the same for languages, before working on Rebol.
>but has to package the entire interpreter (I think maybe 5 MB) to make an .exe if you have the paid version.
Interesting. I had not tried the paid version of Rebol, only the free one, and that too, only for a bit, as a hobby. But did like it a lot even then.
>Red can already make executables
Yes, I had tried that, and the executables are really small too, just like the Red interpreter/compiler (in one file!) and the Rebol interpreter.
>I check up on Red once a month and think 1.0 will be amazing.
Likely so. I only hope they do not get off the tracks with their coin-funding stuff, I felt somewhat bad about that, would have preferred if they had gotten funding by more traditional means, but that is just my thoughts.
>maybe 5 MB
Just did a DIR and checked on my PC, the sizes of Rebol Core and Rebol View (bytes):
The page seems to spend all its effort promoting how the language is cross-platform. Given the number of cross-platform languages nowadays, I wouldn't think that would be a good selling point (more of a check box).
They do seem to do that a good bit, but is there really a one-stop language you can code in from a high level with a REPL that can run on several desktop operating systems AND Android/iPhone AND embedded? I'm not aware of any, but that isn't really my area of use. C obviously can, but C is pretty low level and not a super productive language for most. So when they say cross-platform they mean write once, test on your laptop, and then cross-compile to all those supported platforms. That is different than Java's write once and run on 3 out of the 6 platforms I mentioned with the bulk (and power, I suppose) of the JVM. Note that I'm not saying this or similar philosophical languages like Rebol/Red are the answer to everything, as they most certainly are not. If I'm writing something that has to scale to millions of users (something I know next to nothing about), I would likely use Java or Erlang. These languages (8th, Red, Rebol) are really good at allowing you to build reasonably performant applications in a super easy and interactive manner, and satisfy a definite need that isn't satisfied by most common enterprise languages. Python is easy to script in, but doing GUI work or trying to deploy is painful. Java requires a nearly painful amount of boilerplate.
> "Programs that manipulate other programs are powerful, interesting and fun. I’ll use Ruby to take you on a tour of how interpreters and compilers work, introduce the ideas behind a technique called partial evaluation, and explain a surprising computer science result which allows compilers to be generated automatically."
As usual, the only relevant post is downvoted to submission. Can you all stop downvoting honest and helpful comments? This is one of the most hostile communities on the internet today. The post title is 100% misleading, it is not a large list of Compilers that are free. Which is what I expected. It's all about Ruby... Why was Ruby not in the title? OP should fix the title.