Interesting. I packaged up OCaml native aarch64 (arm64) and ppc64le backends for Fedora a few weeks ago. (Most of the hard work was done by Benedikt Meurer and Michel Normand.) Neither required cross-compilation, since the native compiler compiles itself using the OCaml bytecode interpreter.
It did reveal an actual bug in [edit: a branch of] qemu. It wasn't emulating the aarch64 RET xN instruction correctly. Apparently no other software in the whole of Linux uses this strange variant of RET.
Ha, I'd forgotten about that bug report (just closed it) -- it was only in the SuSE version of the aarch64 QEMU, not in the mainline one.
I expect most code generators use BR for "jump to destination in arbitrary register". BR and RET behave identically, except that RET hints to the branch predictor and to debuggers that this is a return from a subroutine rather than an arbitrary jump. So "RET LR" is really common and "RET <any other register>" is kind of weird.
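To make the BR/RET point concrete, here is a rough Python sketch of how a toy emulator might treat the two instructions. It is not QEMU's decoder; the function names (`encode`, `decode`, `execute`) are made up for illustration, and it assumes the usual A64 encodings where the register number sits in bits 5-9 of the instruction word.

```python
# Toy model of AArch64 BR/RET, for illustration only (not QEMU code).
# Assumption: BR Xn and RET Xn differ only in their base opcode, with the
# register number Rn stored in bits 5..9 of the 32-bit instruction word.
BR_BASE = 0xD61F0000   # BR Xn
RET_BASE = 0xD65F0000  # RET Xn (a plain "RET" means Xn = X30, the link register)
LR = 30
RN_MASK = 0x1F << 5


def encode(base, rn=LR):
    """Build the instruction word for BR/RET through register Xrn."""
    if not 0 <= rn <= 30:
        raise ValueError("this sketch only handles X0..X30")
    return base | (rn << 5)


def decode(insn):
    """Return (mnemonic, rn), or None if the word is not BR/RET."""
    rn = (insn >> 5) & 0x1F
    base = insn & ~RN_MASK & 0xFFFFFFFF
    if base == BR_BASE:
        return ("BR", rn)
    if base == RET_BASE:
        return ("RET", rn)
    return None


def execute(insn, regs):
    """Return (new_pc, is_return_hint) for a BR/RET instruction.

    Both instructions set the PC to the value of Xn; RET only adds the
    hint that this is a subroutine return. An emulator that ignored Rn
    for RET and always used X30 would mishandle "RET xN".
    """
    mnemonic, rn = decode(insn)
    return regs[rn], mnemonic == "RET"


if __name__ == "__main__":
    regs = [0] * 31
    regs[LR] = 0x1000   # normal return address
    regs[19] = 0x2000   # return address stashed in a callee-saved register
    print(execute(encode(RET_BASE), regs))       # plain RET, through X30
    print(execute(encode(RET_BASE, 19), regs))   # RET X19
    print(execute(encode(BR_BASE, 19), regs))    # BR X19: same target, no hint
```

The only difference between the last two calls is the hint flag, which is why a code generator can get away with never emitting RET through anything but LR.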
What do you think? Looks like a return to me, albeit using an unusual register. Note that this code is especially speed critical because every time OCaml calls a C function that could allocate GC'd memory, it has to go through this function.
Yeah, that's a legitimate use; it is (judging by the code) relying on the fact that it's glue between two calling conventions where the outer one has more callee-saves registers, so we can save LR in a register rather than putting it on the stack.
https://bugs.launchpad.net/qemu/+bug/1263747