Hacker News new | past | comments | ask | show | jobs | submit login

I think it's really cool that Haoran Xu and Fredrik Kjolstad's copy-and-patch technique[0] is catching on, I remember discovering it through Xu's blog posts about his LuaJIT remake project[1][2], where he intends to apply these techniques to Lua (and I probably found those through a post here). I was just blown away by how they "recycled" all these battle-tested techniques and technologies, and used it to synthesize something novel. I'm not a compiler writer but it felt really clever to me.

I highly recommend the blog posts if you're into learning how languages are implemented, by the way. They're incredible deep dives, but he uses the details-element to keep the metaphorical descents into Mariana Trench optional so it doesn't get too overwhelming.

I even had the privilege of congratulating him the 1000th star of the GH repo[3], where he reassured me and others that he's still working on it despite the long pause after the last blog post, and that this mainly has to do with behind-the-scenes rewrites that make no sense to publish in part.

[0] https://arxiv.org/abs/2011.13127

[1] https://sillycross.github.io/2022/11/22/2022-11-22/

[2] https://sillycross.github.io/2023/05/12/2023-05-12/

[3] https://github.com/luajit-remake/luajit-remake/issues/11




Copy and patch is a variant of QEMU's original "dyngen" backend by Fabrice Bellard[1][2], with more help from the compiler to avoid the maintainability issues that ultimately led QEMU to use a custom code generator.

[1] https://www.usenix.org/legacy/event/usenix05/tech/freenix/fu...

[2] https://review.gerrithub.io/plugins/gitiles/spdk/qemu/+/5a24...


Ultimately, most good ideas were first implemented by Fabrice Bellard.


Copy and patch goes all the way back to Grace Hopper's original compiler implementation


They were called "template-based" JIT and copy-and-patch approach is not new in this regard. The novel idea of copy-and-patch JIT was an automatic code generation via relocatable objects. (By the way, QEMU is indeed cited as an inspiration for copy-and-patch JIT.)


Heard of him; he's done a lot of stuff. So is he the Bourbaki of software?

https://en.m.wikipedia.org/wiki/Nicolas_Bourbaki


While I'm sure some people theorize that Fabrice Bellard is actually a pseudonym of a collective of 10x programmers, he is as far as I know just one person.


He he, I forgot to add a wink at the end of my question above.


I am happy to see him working on QuickJS in the last month or so. It could really use some ES2023 love!


Which itself is how I understand the compilation step of quaject code in the synthesis kernel to have worked. Ie. do constant propagation and dead code elimination on the input format, and use a simple template driven backend to dump out native machine code.

In fact I wouldn't be surprised if the earliest compilers were template based, as that's about the only implementation that would fit in a RAM as a compiler pass.


M. Anton Ertl and David Gregg. 2004. Retargeting JIT Compilers by using C-Compiler Generated Executable Code. In Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques (PACT '04). IEEE Computer Society, USA, 41–50.

https://dl.acm.org/doi/10.5555/1025127.1025995


While bears a significant resemblance, Ertl and Gregg's approach is not automatic and every additional architecture requires a significant understanding of the target architecture---including an ability to ensure that fully relocable code can be generated and extracted. In comparison, the copy-and-patch approach can be thought as a simple dynamic linker, and objects generated by unmodified C compilers are far more predictable and need much less architecture-specific information for linking.


Full copy of the paper: https://www2.cs.arizona.edu/~collberg/Teaching/553/2011/Reso...

There's also this which seems to use the same technique:

Templates-based portable just-in-time compiler, https://dl.acm.org/doi/abs/10.1145/944579.944588

Nice to see there's still room for innovation in the VM space!


Does Ertl and Gregg's approach have any "upsides" over copy-and-patch? Or is it a case of just missing those one or two insights (or technologies) that make the whole thing a lot simpler to implement?


I think so, but I can't say this any more confident until I get an actual copy of their paper (I used other review papers to get the main idea instead).


The copy-and-patch also assumes the compiler will generate patchable code. For example, on some architecture, have a zero operand might have a smaller or different opcode compared to a more general operand. Same issue for relative jumps or offset ranges. It seems the main difference is that the patch approach also patches jumps to absolute addresses instead of requiring instruction-counter relative code.


Anton Ertl! <3

Context: I've been on a concatenative language binge recently, and his work on Forth is awesome. In my defense he doesn't seem to list this paper among his publications[0]. Will give this paper a read, thanks for linking it! :)

If they missed the boat on getting credit for their contributions then at least the approach finally starts to catch on I guess?

(I wonder if he got the idea from his work on optimizing Forth somehow?)

[0] https://informatics.tuwien.ac.at/people/anton-ertl


Thanks a lot!! I'm something of a beginner language developer and I've been collecting papers, articles, blog posts, anything that provides accessible, high level description of these optimization techniques.


Reminds me of David K who is local to me in Florida, or was, last I spoke to him. He has been a Finite State Machine advocate for ages, and its a well known concept, but you'd be surprised how useful they can be. He pushes it for front-end a lot, and even implemented a Tic Tac Toe sample using it.

https://twitter.com/DavidKPiano


Oh hey it's me!




Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: