I think it's really cool that Haoran Xu and Fredrik Kjolstad's copy-and-patch technique[0] is catching on. I remember discovering it through Xu's blog posts about his LuaJIT remake project[1][2], where he intends to apply these techniques to Lua (and I probably found those through a post here). I was just blown away by how they "recycled" all these battle-tested techniques and technologies and used them to synthesize something novel. I'm not a compiler writer, but it felt really clever to me.
I highly recommend the blog posts if you're into learning how languages are implemented, by the way. They're incredible deep dives, but he uses the `<details>` element to keep the metaphorical descents into the Mariana Trench optional, so it doesn't get too overwhelming.
I even had the privilege of congratulating him on the 1000th star of the GH repo[3], where he reassured me and others that he's still working on it despite the long pause since the last blog post, and that the silence mainly has to do with behind-the-scenes rewrites that wouldn't make sense to publish piecemeal.
Copy and patch is a variant of QEMU's original "dyngen" backend by Fabrice Bellard[1][2], with more help from the compiler to avoid the maintainability issues that ultimately led QEMU to use a custom code generator.
Such approaches were called "template-based" JITs, and copy-and-patch is not new in that regard. The novel idea of copy-and-patch is automating the generation of the code templates via relocatable object files. (By the way, QEMU is indeed cited as an inspiration in the copy-and-patch paper.)
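The "simple dynamic linker" view of this can be sketched in a few lines: a stencil is pre-compiled code with a hole recorded as a relocation, and JIT-ing an instruction is just copying the bytes and filling the hole. A toy sketch (the stencil below is a hand-written x86-64 `mov rax, imm64` encoding purely for illustration; a real implementation extracts stencils and relocations from compiler-generated object files):

```python
# Toy sketch of copy-and-patch as a tiny dynamic linker.
# Illustrative only: the stencil and relocation are hard-coded here,
# whereas the real technique harvests them from relocatable objects.

# A "stencil": pre-compiled code template with an 8-byte hole for a constant.
# 0x48 0xB8 is the x86-64 encoding of `mov rax, imm64`.
STENCIL = bytes([0x48, 0xB8]) + b"\x00" * 8

# The relocation names the hole: (byte offset into the stencil, size in bytes).
RELOCATION = (2, 8)

def copy_and_patch(stencil, reloc, value):
    """Copy the stencil and patch the hole with a runtime-known value."""
    offset, size = reloc
    code = bytearray(stencil)                      # copy...
    code[offset:offset + size] = value.to_bytes(size, "little")  # ...and patch
    return bytes(code)

code = copy_and_patch(STENCIL, RELOCATION, 42)
```

The point is that all the architecture-specific knowledge lives in the stencil the C compiler already produced; the runtime only performs the copy and the relocation fix-up.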
While I'm sure some people theorize that Fabrice Bellard is actually a pseudonym of a collective of 10x programmers, he is as far as I know just one person.
Which is itself how I understand the compilation step of quaject code in the Synthesis kernel to have worked. I.e., do constant propagation and dead code elimination on the input format, then use a simple template-driven backend to dump out native machine code.
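To make the shape of that pipeline concrete, here's a hypothetical miniature: fold constants in a tiny stack IR, then emit code by pasting a fixed template per op. The templates are pseudo-assembly strings standing in for machine-code bytes; the op names and register conventions are invented for the example:

```python
# Hypothetical template-driven backend: constant-fold the IR, then emit by
# concatenating fixed per-op templates (pseudo-asm strings stand in for
# real machine code).

TEMPLATES = {
    "push_const": "mov  r0, #{imm}\npush r0",
    "add":        "pop  r1\npop  r0\nadd  r0, r0, r1\npush r0",
}

def constant_fold(ops):
    """Fold ('push_const', a), ('push_const', b), ('add',) into one push."""
    out = []
    for op in ops:
        if (op[0] == "add" and len(out) >= 2
                and out[-1][0] == out[-2][0] == "push_const"):
            b = out.pop()[1]
            a = out.pop()[1]
            out.append(("push_const", a + b))
        else:
            out.append(op)
    return out

def emit(ops):
    """Dump 'native code' by pasting one template per remaining op."""
    chunks = []
    for name, *args in constant_fold(ops):
        tmpl = TEMPLATES[name]
        chunks.append(tmpl.format(imm=args[0]) if args else tmpl)
    return "\n".join(chunks)

asm = emit([("push_const", 2), ("push_const", 3), ("add",)])
```

After folding, `2 + 3` collapses to a single constant push, so the `add` template is never emitted at all: the optimization happens on the IR, and the backend stays a dumb template paster.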
In fact, I wouldn't be surprised if the earliest compilers were template-based, as that's about the only implementation of a code-generation pass that would fit in RAM at the time.
M. Anton Ertl and David Gregg. 2004. Retargeting JIT Compilers by using C-Compiler Generated Executable Code. In Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques (PACT '04). IEEE Computer Society, USA, 41–50.
While it bears a significant resemblance, Ertl and Gregg's approach is not automatic, and every additional architecture requires significant understanding of the target architecture---including the ability to ensure that fully relocatable code can be generated and extracted. In comparison, the copy-and-patch approach can be thought of as a simple dynamic linker, and objects generated by unmodified C compilers are far more predictable and need much less architecture-specific information for linking.
Does Ertl and Gregg's approach have any "upsides" over copy-and-patch? Or is it a case of just missing those one or two insights (or technologies) that make the whole thing a lot simpler to implement?
I think so, but I can't say so with any more confidence until I get an actual copy of their paper (I relied on other review papers for the main idea instead).
Copy-and-patch also assumes the compiler will generate patchable code. For example, on some architectures a zero operand might produce a smaller or different opcode than a more general operand would. The same issue arises for relative jumps and offset ranges. It seems the main difference is that the patch approach also patches jumps to absolute addresses instead of requiring program-counter-relative code.
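The relative-vs-absolute distinction is easy to see in miniature: a rel32 jump immediate encodes `target - end_of_immediate`, so its patched bytes depend on where the copied code lands, while a 64-bit absolute address does not. A toy sketch (addresses are made up; byte layouts follow the usual x86-64-style conventions but nothing here is real machine code):

```python
# Sketch of why patching absolute addresses is simpler than patching
# relative ones. Addresses below are arbitrary examples.

def patch_rel32(code_addr, hole_offset, target):
    """Bytes for a 32-bit PC-relative immediate: relative to the byte
    *after* the 4-byte immediate, so the value depends on code_addr."""
    rel = target - (code_addr + hole_offset + 4)
    return rel.to_bytes(4, "little", signed=True)

def patch_abs64(target):
    """Bytes for a 64-bit absolute address: independent of code_addr."""
    return target.to_bytes(8, "little")

# Same jump target, two copies placed at different addresses:
rel_a = patch_rel32(0x1000, 1, 0x2000)
rel_b = patch_rel32(0x3000, 1, 0x2000)
abs_a = patch_abs64(0x2000)
abs_b = patch_abs64(0x2000)
```

The relative patch must be recomputed for every copy (and can overflow its 4-byte range), whereas the absolute patch is identical wherever the copy lands, which is part of what makes the linker-style patching predictable.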
Context: I've been on a concatenative language binge recently, and his work on Forth is awesome. In my defense he doesn't seem to list this paper among his publications[0]. Will give this paper a read, thanks for linking it! :)
If they missed the boat on getting credit for their contributions, then at least the approach is finally starting to catch on, I guess?
(I wonder if he got the idea from his work on optimizing Forth somehow?)
Thanks a lot!! I'm something of a beginner language developer, and I've been collecting papers, articles, blog posts, anything that provides accessible, high-level descriptions of these optimization techniques.
Reminds me of David K, who is local to me in Florida, or was, last I spoke to him. He has been a finite state machine advocate for ages, and it's a well-known concept, but you'd be surprised how useful they can be. He pushes it for front-end work a lot, and even implemented a Tic-Tac-Toe sample using it.
[0] https://arxiv.org/abs/2011.13127
[1] https://sillycross.github.io/2022/11/22/2022-11-22/
[2] https://sillycross.github.io/2023/05/12/2023-05-12/
[3] https://github.com/luajit-remake/luajit-remake/issues/11