Agreed, the XOR swap is a clever trick that might be useful in some esoteric niche cases, but otherwise don't bother. GCC will even optimize the XOR operations away when the trick is used for simple things like ints on amd64. I did a simple compile test: with -O1 and higher I got identical machine code for the XOR version and the naive swap through a temporary. Without optimization I did get the XOR instructions, but the XOR version needed more machine instructions than the naive one and used the same number of CPU registers.
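For anyone who wants to repeat this kind of comparison, here is a minimal sketch of a test file. It is not the exact code I compiled; the file name, the struct and the function names are made up for illustration, and the result may differ between compiler versions and targets.

```c
/* swap.c: two ways to swap a pair of ints.
 * All names here are illustrative, not taken from any real project. */

struct pair {
    int first;
    int second;
};

/* Naive swap through a temporary variable. */
struct pair swap_tmp(int a, int b)
{
    int t = a;
    a = b;
    b = t;
    return (struct pair){ a, b };
}

/* XOR swap: no temporary in the source, but three dependent operations.
 * Safe here because a and b are distinct local objects; through pointers
 * that alias the same object, this sequence would zero the value. */
struct pair swap_xor(int a, int b)
{
    a ^= b;   /* a now holds a ^ b          */
    b ^= a;   /* b now holds the original a */
    a ^= b;   /* a now holds the original b */
    return (struct pair){ a, b };
}
```

Compiling with something like `gcc -O1 -S swap.c -o -` prints the assembly for both functions so they can be compared side by side; replacing `-O1` with `-O0` shows the unoptimized versions.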