Careful abusing these tricks. Over 10 years ago I decided to implement an RC4 (arcfour) cypher to generate pseudorandom noise for a test program.

The algorithm looks like (from wikipedia):

  i := 0
  j := 0
  while GeneratingOutput:
    i := (i + 1) mod 256
    j := (j + S[i]) mod 256
    swap values of S[i] and S[j]
    K := S[(S[i] + S[j]) mod 256]
    output K
  endwhile
Being a smartass 1337 coder (and since declaring intermediate variables is always a bit of a bother in C89), in my implementation I decided to do the swap using the xor trick:

   s[i] ^= s[j];
   s[j] ^= s[i];
   s[i] ^= s[j];
I then ran a few basic test vectors and everything seemed to work fine.

A little while later I noticed that the output of the cypher was weird: the longer it ran, the more zeroes I would get, until eventually I got only zeroes.

The keen reader will already have spotted the problem: whenever i and j happen to be equal, instead of doing nothing the code sets the entry to 0 (since x ^ x == 0 for any x).

And that's the story of how I never used this trick ever again.

In general it's always better to write clear, concise code and let the compiler optimize it. If it turns out that the code is performance critical and the compiler doesn't do a good enough job, then you can see if the smart solution is worth it.
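
For instance, the boring swap with a temporary (a minimal sketch, assuming s is the unsigned char state array from the RC4 code above) is both obviously correct when i == j and trivial for the compiler to optimize:

  unsigned char tmp = s[i];  /* plain swap through a temporary */
  s[i] = s[j];
  s[j] = tmp;                /* still correct when i == j */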

And if you fail an interview because the interviewer expected you to know how to find a duplicate in an integer list by using xor then you probably dodged a bullet anyway. What a silly trivia question.




The code from the Wikipedia article for XOR swapping [1] checks if the values are equal.

  void XorSwap(int *x, int *y) {
    if (x != y) {
      *x ^= *y;
      *y ^= *x;
      *x ^= *y;
    }
  }

[1] https://en.wikipedia.org/wiki/XOR_swap_algorithm


> checks if the values are equal

Not the values (the values being equal doesn't break the trick), but the pointers. That is, if you use the XOR trick to swap a value with itself, then it will be set to zero, instead.

As if you had written:

  x ^= x
  x ^= x
  x ^= x
Yeah, now x would be zero, not itself.


At that point it's probably almost always better to just use some temporary space, rather than dealing with all the overhead of a branch (what if it's mispredicted? what about the branch predictor resources tied up by this that might cause something else to be mispredicted?).

On the other hand, there are many situations where you can guarantee that x and y are not the same location in memory (for example because they are local variables). There this trick might still be interesting (for the compiler or assembly programmer).


The xor swap trick is rarely better. It causes pipeline stalls, and so is likely to be significantly slower than the trivial swap. The temporary storage will be in a register, so it's (usually) not causing memory accesses.


Yeah, exactly this. It only really makes sense in some very limited situations in embedded software, for example when you might be in an interrupt handler and have no free registers.


If they’re local variables, the compiler may not need to generate any code for a swap. It could do the equivalent of register renaming (https://en.wikipedia.org/wiki/Register_renaming)


And if your compiler can't do it, your hardware might.


On top of that, the CPU can literally elide movs at runtime, so the whole swap could end up being zero latency depending on the code motion.


Wouldn't the extra branch kill any performance improvement you were getting from the xor trick?


As with every performance optimisation, it's better to test and profile rather than to assume.


On the topic of intermediate variables, when I'm forced to use c89 I'm really shameless about introducing extra code blocks in the middle of other blocks. For example,

    {int i; for(i=0; i<N; i++){
       /**/
    }}
Still bothersome but better than having to declare everything at the top of the function, IMO.


I do something like this in "modern" Java to get for-each with index:

    { int i = -1; for (V v : list) { i++;
        // stuff
    } }
While declaring i = -1 to start is a little gross, I like that it has some of the same properties as a normal for loop. No effect on parent scope, and all loop-logic is concentrated on the first (textual) line.


I agree, but I think back then I didn't know this trick.

I also abuse blocks in Rust, but it's more in order to placate the borrow checker...


That's not necessarily abuse; it may be quite appropriate to explicitly limit something's lifetime.


As an example, in our C++ codebase at work we always place mutex locks inside their own scope blocks along with the critical sections of code they’re synchronizing. Helps readability and the scope determines when the mutex is released.
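
A minimal sketch of that pattern with std::lock_guard (names here are illustrative, not from our codebase):

  #include <mutex>

  std::mutex m;            // protects shared_counter
  int shared_counter = 0;

  void update() {
    // unsynchronized work ...
    {
      std::lock_guard<std::mutex> lock(m);  // mutex acquired here
      ++shared_counter;                     // critical section
    }                                       // mutex released at end of scope
    // more unsynchronized work ...
  }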


Indeed! I somewhat over-use this in Rust to initialize an immutable variable with mutating code.

  let var: Vec<u32>;              // this one is immutable, will be initialized later
  {
    let mut var_;                 // mutable
    …                             // initialize var_ with some mutating code
    var = var_;                   // now move var_ to var, initializing the latter
  }


This is a bit simpler as:

  let var = {
      let mut var = ...

      var
   };
that is, you can assign the result of the block directly. That way you don't need two variables, with a trailing _ on the second one's name.


Couldn't you just write:

  let var: Vec<u32> = {
    let mut var_;
    …
    var_
  };
Or, better still:

  let mut var: Vec<u32>;
  …
  let var = var;


You can also initialize it to an expression, e.g.

    let var = {
      let mut var_ = Vec::new();
      var_.push(1u32);
      var_
    };


If I was forced to use C89 I would resign and find a better job...


The XOR trick also has the disadvantage of being 3 dependent instructions that cannot overlap. The one with a temp variable has only 2 of them dependent and can therefore save one cycle on an OoO CPU.


Even on an old OoO core, because of register renaming and friends, the swap could have a latency of zero in the first place.


> And that's the story of how I never used this trick ever again.

I agree with your point, and I would go on to say that this is an excellent example of the value of tests. Especially with something so relatively small and self-contained, which makes it easy to test.


I sometimes use your "bug" on purpose in tests, when checking that two values are either both defined or both undefined, but not one defined and one undefined:

  $ python
  >>> a, b = 1, 2
  >>> bool(a) ^ bool(b)
  False
  >>> a, b = 1, 1
  >>> bool(a) ^ bool(b)
  False
  >>> a, b = None, None
  >>> bool(a) ^ bool(b)
  False
  >>> a, b = None, 1
  >>> bool(a) ^ bool(b)
  True

(Note: this doesn't work if a value can be 0, because bool(0) is False in Python).

This helps to avoid having to write something like `(a is not None and b is not None) or (a is None and b is None)`.


Your note is very important, I think it would be good to give it more emphasis.

What works flawlessly, however, is:

  In [1]: a = None
  In [2]: b = 0
  In [3]: (a is None) ^ (b is None)
  Out[3]: True
Alternatively, as suggested in another comment, you can use inequality as a replacement for XOR:

  In [4]: (a is None) != (b is None)
  Out[4]: True

Another error prone pattern is the following:

  In [5]: a or b
  Out[5]: 0
Which can behave differently depending on the order of elements/values:

  In [6]: b or a
  Out[6]: None # should be 0
Since b is zero, it doesn't count. Less error prone, but also more verbose is:

  In [10]: a if a is not None else b
  Out[10]: 0


Lesson for myself: check my working. The test I was referring to, where I actually used the XOR, uses the first example you wrote - I agree, this is much better. However, I also like your second example, which is arguably clearer. Thanks!


Not a python person, but I usually do this in other languages via an equality check `(a != nil) == (b != nil)` etc


You are aware that ^ on a 1 bit value (like a boolean) is just !=, right?


Hah, I wrote that code on the fly and didn't check the aforementioned test implementation, where I had `(a is None) ^ (b is None)` like the other commenter suggested.


You could just write this as

    (a is None) is not (b is None)
or

    (a is None) != (b is None)
It won't make a difference (and is 1 to 5 characters longer), but it seems "cleaner" since it's more specific.


I like this as an example of how compilers cause bugs. This code looks reasonable and concise, but it shouldn't. If you were writing microcode for a non-superscalar processor, you would do two separate memory to register loads, then three xors, then two register to memory stores. But the compiler teaches us to use shorthand and it will fill in the details.

In this case the resulting code is ridiculous, with loads and stores on each xor because the compiler can't guarantee that i!=j, and it would normally perform terribly. But we've architected CPUs around compilers to use caches that make up for these problems, and they are so prevalent that it's worth it to dedicate most of the CPU die area to branch prediction based on conventions of the C compiler, and most of the compiler code to conventions of the CPU branch predictor. It's getting quite ridiculous, and maybe it's time to move to architectures and languages that make everything explicit.

Edit: also, there is an xchg instruction that you have to pull out of intrinsics to use in C (good luck in most other HLL). It seems like a language should be more about how to use vocabulary than restricting it.


People like to harp on C++, but templates permit code that is simultaneously maximally clear and efficient:

  #include <utility>

  std::swap(s[i], s[j]);


I prefer the pythonic style:

  #include <tuple>

  std::tie(b, a) = std::make_tuple(a, b);


Why? That's about 400% harder to read. Even someone who doesn't know how to program in any language can probably infer what the C++ example does.


I know Python is normally simpler and easier to read, but this is not one of those times.

The C++ example is much easier to read and more intuitive as to what it does.


To be fair, the Python syntax is just:

  a, b = b, a

The "std::tie(b, a) = std::make_tuple(a, b)" is an attempt to replicate the Python style in C++... which, I agree, is certainly not clear, and, unlike using the "swap" algorithm, it's also unclear whether it will result in efficient code.


> because the interviewer expected you to know how to find a duplicate in an integer list by using xor

Maybe they expect you to come up with a solution right there. This way they can see that you're curious and trying to solve a problem even if you don't know its solution beforehand. (And by looking at your attempts they can see whether your mind is even working in the right direction. They could give you small hints and observe how you process them.)

I've never conducted interviews myself, but some years ago my supervisor asked me for advice on interview problems for his interns. He wanted something that would help spot a person inclined to algorithmic thinking, but the job was not 100% algorithmic; he needed programmers. If he asked me the same today, I would advise him to look at these xor search problems.


> how to find a duplicate in an integer list by using xor

I've never heard of that trick and would never have thought of it on my own. It takes a moment's thought to see why it works, and I like it.

Bit twiddling is a bit like symbolic integration. There isn't a systematic approach; it's just an accumulation of formulae that people have stumbled upon over time. You can buy books of them. (These days, of course, you just use a program that has them all hardcoded into it.)
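
For reference, here's a sketch of the trick in question, assuming the usual interview setup: the list contains every number in 1..n exactly once, plus one extra copy of some value. The function and variable names are mine, not from the article:

  #include <cstdint>
  #include <vector>

  uint32_t find_duplicate(const std::vector<uint32_t>& v) {
    uint32_t n = static_cast<uint32_t>(v.size()) - 1;
    uint32_t acc = 0;
    for (uint32_t x : v) acc ^= x;               // xor of all list elements
    for (uint32_t k = 1; k <= n; ++k) acc ^= k;  // xor of 1..n
    return acc;  // everything cancels in pairs except the duplicated value
  }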


n00b question: What is the ^= operator, and what language is it from?


It's bitwise XOR in C (and most other languages, I guess).

https://en.wikipedia.org/wiki/Operators_in_C_and_C%2B%2B#Bit...


x ^= y is the same as x = x ^ y, with ^ being XOR. Bunch of languages have it, e.g. C, C++, Java, ...


I have an alternative to the xor trick that doesn't suffer from zeros.

    def swap(a,b):
        a=a-b
        b=a+b
        a=b-a
        return a,b


That suffers from the same problem if a and b are pointers to the same thing (which was the issue in my code, since I indexed the same cell of the array).

Note that the problem is not that a and b have the same value, or even that one of them is 0, it's that a and b alias to the same memory location, effectively being two handles to the same variable.


Hmm, swaps the zero problem for an overflow/underflow one?


If you stick to multiplication and addition then the beauty of modular arithmetic ensures that over and underflow are only a problem if your final result under/overflows.

In particular you can solve the problem in this article by just subtracting the numbers from 1+2+...+(n-1)+n. The remainder will be your missing number (though be careful with the 1+2+...+(n-1)+n = n(n+1)/2 formula; division by 2 is very much not compatible with arithmetic modulo a power of 2, so divide first then multiply, or keep a running total if you want to be flexible).
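
A sketch of that approach (the function name is illustrative), assuming the list holds 1..n with exactly one number missing; unsigned arithmetic wraps modulo 2^32, so intermediate overflow is harmless as long as the answer itself fits:

  #include <cstdint>
  #include <vector>

  uint32_t find_missing(const std::vector<uint32_t>& v) {
    uint32_t n = static_cast<uint32_t>(v.size()) + 1;
    uint32_t expected = 0;
    for (uint32_t k = 1; k <= n; ++k) expected += k;  // running total avoids dividing by 2
    uint32_t actual = 0;
    for (uint32_t x : v) actual += x;
    return expected - actual;  // subtraction also wraps correctly mod 2^32
  }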


Worse. The zero problem does not exist if the two variables are at different memory locations. The overflow problem of this solution always exists. Plus, if a and b are at the same memory location, it suffers from the same zero problem.


Uh,

  def swap(a, b):
      return b, a
?

Or since this is presumably python, just inline it without a function:

  (a, b) = (b, a)


Did you use this in a contest? I seem to remember this exact thing being used to leak data in a contest.


that's a very sharp edge case.


You're wrong. Remember that XOR works on individual bits. If what you're saying was true, then swapping bits that are both set to 1 would also fail, which means this algorithm wouldn't work at all.

Edit: Ignore this post, I misread the original post as saying swapping the same values would fail.


They're not saying that swapping equal-valued variables breaks it. It's that when the pointer is the same, using the trick to swap a variable with itself will set it to 0.


Yeah, I misread that. Thanks for pointing that out!


> swapping bits that are both set to 1 would also fail

No, swapping two 1 bits works fine. Work it out more slowly, the article covered this and why it always works.

(1,1) => (1^1,1)=(0,1) => (0,1^0)=(0,1) => (0^1,1)=(1,1)



