Porting GHC: A Tale of Two Architectures

rwmj · on April 15, 2014

Interesting. I packaged up OCaml native aarch64 (arm64) and ppc64le backends for Fedora a few weeks ago. (Most of the hard work was done by Benedikt Meurer and Michel Normand.) Neither required cross-compilation, since the native compiler compiles itself using the OCaml bytecode interpreter.

It did reveal an actual bug in [edit: a branch of] qemu. It wasn't emulating the aarch64 RET xN instruction correctly. Apparently no other software in the whole of Linux uses this strange variant of RET.

https://bugs.launchpad.net/qemu/+bug/1263747

pm215 · on April 15, 2014

Ha, I'd forgotten about that bug report (just closed it) -- it was only in the SuSE version of the aarch64 QEMU, not in the mainline one.

I expect most code generators use BR for "jump to destination in arbitrary register" -- BR and RET behave identically except for RET providing a hint for branch prediction and for debuggers that it's a return-from-subroutine rather than a random jump; so "RET LR" is really common and "RET <any other register>" is kind of weird.

rwmj · on April 15, 2014

This is the function using ret x19:

https://git.fedorahosted.org/cgit/fedora-ocaml.git/tree/asmr...

What do you think? Looks like a return to me, albeit using an unusual register. Note that this code is especially speed critical because every time OCaml calls a C function that could allocate GC'd memory, it has to go through this function.

pm215 · on April 15, 2014

Yeah, that's a legitimate use; it is (judging by the code) relying on the fact that it's glue between two calling conventions where the outer one has more callee-saves registers, so we can save LR in a register rather than putting it on the stack.

pedrocr · on April 15, 2014

Does anyone have any good pointers on why Ubuntu is adding a little-endian PPC port? I thought they had dropped the regular PPC port a while ago.

Googling around it seems the use for little-endian PPC is to be more easily compatible with GPUs. Is there really a market for running GPU computing with PPC64 CPUs?

cliffbean · on April 15, 2014

This document lists some reasons [0]:

• Growing interest in running entire OS in little-endian mode
  – Ease porting of programs from other architectures
  – Ease porting of programs which access files containing LE binary data
  – Ease communication with GPUs
• New OpenPower Consortium
  – IBM, Google, Tyan, Nvidia, Mellanox
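To make the "files containing LE binary data" point concrete, here is a minimal Python sketch. The record layout (a 32-bit length followed by a 16-bit tag) and the function names are made up for illustration and are not taken from the linked slides. It shows how the same bytes decode differently depending on the byte order the reader assumes, which is roughly what happens when code written for x86 reads in host order on a big-endian machine.

```python
import struct

# Hypothetical on-disk record: a 32-bit unsigned length followed by a
# 16-bit unsigned tag, both written little-endian (as an x86 program would).
RECORD = bytes([0x10, 0x00, 0x00, 0x00, 0x02, 0x01])

def parse_explicit_le(data):
    """Portable parse: '<' forces little-endian on any host."""
    return struct.unpack("<IH", data)

def parse_as_big_endian_host(data):
    """Simulates code that reads in host order on a big-endian machine."""
    return struct.unpack(">IH", data)

print(parse_explicit_le(RECORD))         # intended values
print(parse_as_big_endian_host(RECORD))  # misread values
```

Code that spells out the byte order (the first function) ports cleanly either way; code that silently relies on host order is the kind that a little-endian OS mode lets you avoid fixing.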

Also, see [1].

[0] http://www.linux-kvm.org/wiki/images/7/70/Kvm-forum-2013-Mac...

[1] https://www.ibm.com/developerworks/community/blogs/fe313521-...

spatulon · on April 15, 2014

Most PPC chips have been bi-endian for a long time but, as far as I know, everyone treated them as big-endian processors. For example, it's big-endian all the way in the automotive industry, where PPC is incredibly popular.

Where has the demand come from to start supporting little-endian mode suddenly? I assume the existence of a Debian/Ubuntu port is evidence of such demand.

gsnedders · on April 15, 2014

POWER8 (not yet shipping) has far better little-endian support than prior PPC chips (many converted to/from big-endian when in little-endian mode!). This is because many of IBM's customers apparently want better little-endian support (to interface with existing little-endian hardware, to be more similar to other hardware, etc.). IBM is also paying for almost all the engineering work going on for the ppc64le ports. IBM need the software to work to be able to sell the hardware, so they're investing a lot in this.

pedrocr · on April 15, 2014

>Where has the demand come from to start supporting little-endian mode suddenly?

I found a Debian discussion on this and the only reasons presented were GPU computing and porting apps that assume little-endian. I assume the GPU computing is the big reason. As we move to unified memory architectures between CPU and GPU they need to both use the same endianness and I guess GPUs are usually little-endian to match x86.

dman · on April 15, 2014

Can you post the link about GPU computing? I am a bit perplexed about what powerpc has to do with GPUs.

_delirium · on April 15, 2014

> I am a bit perplexed about what powerpc has to do with GPUs.

Since, as others note, IBM is funding most of the work, my guess is the main use-case here is IBM's POWER-based HPC clusters that have PowerPC CPUs augmented by a bunch of GPU coprocessors for CUDA/OpenCL offloading. IBM is trying to position POWER clusters as competing with x86-based clusters for certain kinds of work (especially scientific computing), and to match x86-based clusters decked out with GPUs they probably need to have GPU options for their POWER-based clusters as well.

The idea, as I read it (have not had occasion to encounter it myself), is that things are a lot easier if your CPU and GPU have the same endianness; otherwise you have to byte-swap whenever transferring data to/from the GPU (and there are even more complications if you have unified memory spaces). Since most (all?) commercially available GPUs are little-endian, if you want PowerPC CPUs along with GPUs for auxiliary processing, and want the endianness matched, the little-endian mode of PowerPC becomes important.
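A rough Python sketch of the byte-swapping cost, purely for illustration: `swap32` is a made-up helper, not any real GPU API, and the buffer is assumed to hold 32-bit words. A big-endian host would have to run something like this over every buffer before handing it to a little-endian device, and again on the way back.

```python
import struct

def swap32(buf):
    """Reverse the byte order of each 32-bit word in buf (hypothetical helper)."""
    if len(buf) % 4 != 0:
        raise ValueError("buffer length must be a multiple of 4 bytes")
    out = bytearray(len(buf))
    for i in range(0, len(buf), 4):
        out[i:i + 4] = buf[i:i + 4][::-1]
    return bytes(out)

values = [1, 2, 0xDEADBEEF]

# What a big-endian host has in memory.
host_be = struct.pack(">3I", *values)

# What a little-endian device expects to receive.
device_le = swap32(host_be)

print(host_be.hex())
print(device_le.hex())
```

With matched endianness this pass disappears entirely, and with a unified memory space there is no obvious point at which to do the swap at all.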

dman · on April 16, 2014

Interesting!

rwmj · on April 15, 2014

IBM have been leading the work and contributing a lot of patches for ppc64le in Fedora.