
As mentioned in this article, x ^ x == 0. Fun fact, this is frequently used by compilers as a "cheap" way to zero out a register.

In addition, there are comparatively few cases in programming where we XOR. Sure, it happens in things like games quite a lot, but the main use is actually _cryptography_.

Between these two facts (more like hints really), I managed to reverse engineer the bulk of a piece of malware I was given to analyse in an internship. I was handed the malware and a copy of IDA Pro, and given a few days to see what I could find. All I could remember when presented with a wall of hex-encoded machine code were the hints above. I looked for XORs of different values, assumed it was crypto, and extrapolated from there. I found a routine happening three times in quick succession and guessed it was triple-DES. Then I guessed that writing your own 3DES from scratch was unlikely, so I googled for crypto libraries and happened to find one that nearly matched (I think an earlier/unmodified version), and worked my way up tagging the operations until I got to the purpose: exfiltrating various registry keys and browser history to [somewhere].

It was a fun exercise, so I'm sure these facts will stay with me for far longer than they remain accurate!




Xor'ing registers isn't a compiler trick or arcane piece of lore, it's the canonical way to zero a register on most architectures. It's the only universally recommended way for both Intel and AMD x86 and x64 processors.


Not necessarily true for most architectures. Many RISC architectures have a "zero" register, so a canonical "move short immediate" instruction could be "OR Rn, R0, #n" ("OR R0 which is always zero with n and store into Rn") or the same with ADD. Then clearing Rn will usually be "OR Rn, R0, R0", "ADD Rn, R0, #0" or something like that.


I should have been clearer: when I say most architectures I mean most x86/x64 architectures, not most ISAs. Obviously ISAs with a dedicated zero register don't need the zeroing idioms of x86.


In that case it's all of them. Some might like having both a MOV and a XOR in some special cases (one to avoid partial register stalls and one to avoid partial flags stalls, if I remember correctly) but even in those cases it's usually easy enough to avoid partial register stalls in some other way and just use XOR.


In fact, modern x86 CPUs know that the result of "x XOR x" is independent of x, and use this fact to optimize operations that would otherwise have a dependency.


You are correct, but using programmers' terminology here is confusing.

The opcodes 31 C0 mean "Set EAX register to zero"

The opcodes 31 D8 mean "Set EAX to EAX xor EBX"

The trick part is in the mnemonics used by assemblers and literature that transcribe the first as "XOR EAX, EAX".

It isn't actually implemented as such, so it isn't really an "optimization"


I'm not sure what you mean by "programmers' terminology".

0x31 0xC0 is disassembled to "xor eax, eax".

This is not nitpicking, as there's an important difference. The "xor eax, eax" instruction affects CPU flags [1], while "mov eax, 0" doesn't [2].

> It isn't actually implemented as such

The implementation is independent of the meaning of the instruction set. There are many implementations of x86 instructions with differing levels and kinds of optimization, so we can't make general statements about that.

[1] https://c9x.me/x86/html/file_module_x86_id_330.html

[2] https://c9x.me/x86/html/file_module_x86_id_176.html


You make it sound as if there were four Intel/AMD x86/x64 architectures. And well, there aren’t. The vendor split is not a thing, we’re all running the same software on Intel and AMD x86/x64 processors. And you could argue that the x86/x64 split doesn’t really matter for this, since x64 is a superset of x86 and inherits this tradition from the 16-bit and 32-bit eras.


There are dozens of x86/x64 uarchs; x86 is just an ISA, and it doesn't say anything about how the chip is built.


Thanks! Learning even more!


I feel like xor comes up all the time in programming! Perhaps generations that have almost entirely focused on web front ends and backends never have use for it, but the libraries underneath it all certainly do.


xor reg, reg is indeed the standard way to zero out a register in x86 assembly (it's not just a compiler trick) as it is both shorter and faster than loading the register with zero via a mov.

Apart from that, I'd say that a common use of XOR operations in general is interaction with hardware peripherals, where manipulating bit fields is needed.


Why does mov'ing a constant end up having overhead?


It's not overhead, it's about dependency breaking. 32-bit xors on a single register are universally recognized as a zeroing idiom, which means the CPU doesn't have to wait for the results of previous operations in order to set the value of the applicable register to zero.

In modern CPUs zeroing idioms aren't even executed; they only get as far as the register allocator. The register allocator will allocate a zeroed physical register for the architectural register that had the idiom applied to it, and the job is done.


Could you share a source about register allocation describing optimizations such as the one you described?


You might find what you are looking for by googling "register renaming"


The magic words for this is "zeroing idiom".


It's larger, because it needs to fit a 32-bit value of 0 in the instruction, and thus e.g. on x86 needs 5 bytes, whereas xor reg,reg needs 2. As such it was a common code size optimization, which in turn has led to CPU manufacturers optimizing their CPUs to recognize it and treat it even more efficiently.
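To make the size difference concrete, here's a minimal C sketch with the usual x86 encodings in the comments (an optimizing compiler targeting x86 will typically pick the xor form):

    #include <stdint.h>

    /* Zeroing a 32-bit value. Typical x86 encodings:
         mov eax, 0     ->  B8 00 00 00 00   (5 bytes: opcode + imm32)
         xor eax, eax   ->  31 C0            (2 bytes: opcode + ModRM)
       The xor form is smaller, and modern CPUs also recognize it as a
       dependency-breaking zeroing idiom. */
    uint32_t zero(void) {
        return 0;
    }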


A comprehensive answer:

https://stackoverflow.com/questions/33666617/what-is-the-bes...

Basically, xor leads to smaller code and more efficient use of resources.


IIRC immediate operands have to be the same size as the destination, so to zero a 32-bit register you need a 32-bit constant, and it simply makes the whole instruction larger than a simple xor.


A somewhat common use of XOR is p != q on booleans: testing two truth values for inequality is actually an XOR.

Another useful way to think of XOR is "either p or q is true, but not both, and not neither."


> Another useful way to think of XOR is "either p or q is true, but not both, and not neither."

That's literally what eXclusive OR means. This should be the first way to think about it.


I think most people think of XOR as “difference of bits”, which it is.


FWIW, I mostly think of it as addition in Z_2^n.


That is more traditional. It's addition without carry.


Although in Z_2^n, "addition without carry" and "subtraction without carry" are the same operation.
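A tiny C check of both facts, with arbitrarily chosen values:

    #include <assert.h>
    #include <stdint.h>

    int main(void) {
        uint8_t a = 0x6, b = 0x3;        /* 0110 and 0011 */
        assert((a ^ b) == 0x5);          /* per-bit sum mod 2: 0101, no carries */
        assert(((a ^ b) ^ b) == a);      /* xor-ing b again undoes it: every
                                            element is its own additive inverse */
        return 0;
    }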


Compilers can optimize "a != b" to "a xor b" if they know that both operands are 0 or 1. There are crazy many expression rewriting rules.


They don’t need to be 0 or 1. We already know that a ^ a = 0, a ^ 0 = a, and a ^ (b ^ c) = (a ^ b) ^ c, so a ^ b = 0 exactly when a = b. Therefore, in the C convention where 0 means false, a ^ b = not (a = b) = (a != b) (this equality only holds for expressions that are going straight into a Boolean operation, or a test in e.g. an if statement, as a != b proper should always evaluate to 0 or 1).

The compiler may use this by xoring a and b and then jumping if the zero flag was set.
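For illustration, a small C sketch of the equivalence (not what any particular compiler is guaranteed to emit):

    #include <stdbool.h>
    #include <stdint.h>

    /* a ^ b is zero exactly when a == b, so inequality can be tested by
       xor-ing and checking whether the result is zero (on x86, a branch
       on the zero flag). */
    bool differs(uint32_t a, uint32_t b) {
        return (a ^ b) != 0;    /* same truth value as a != b */
    }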


> (this equality only holds for expressions that are going straight into a Boolean operation

Right, though that's more commonly turned into just a cmp instruction (a subtraction).

However, if the result is assigned to a variable, on x86 "a != b" would be two or three instructions (possibly clearing a register because setne takes an 8-bit register operand only, and then a cmp/setne pair). Instead the xor would be one instruction, or two if a mov is needed.


"In addition, there are comparatively few cases in programming where we XOR. Sure, it happens in things like games quite a lot, but the main use is actually _cryptography_"

I'd be curious if that's actually true. I know XOR is used for non-cryptographic checksums, parity bits, maintaining key traversal order for associative arrays, overflow detection, etc. Lots of general purpose "stuff" that isn't cryptography.
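As one concrete non-crypto example, a plain xor (longitudinal parity) checksum, sketched here with illustrative names:

    #include <stddef.h>
    #include <stdint.h>

    /* Xor every byte together: flipping any single bit of the input
       flips the corresponding bit of the result. */
    uint8_t xor_checksum(const uint8_t *data, size_t len) {
        uint8_t acc = 0;
        for (size_t i = 0; i < len; i++)
            acc ^= data[i];
        return acc;
    }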


XOR is used a ton in the theoretical underpinnings of cryptography. It's used in the one-time pad, which is essentially the "smallest" cryptographic scheme that is perfectly secure ("perfectly secure" has a mathematical definition in this context; it's not saying there can never be any attacks).

In general the reason why is that if you have two random variables x and y, where x has any distribution (so for example x could even be "attack Normandy on June 6" with certainty) and y is uniformly distributed across all n-bit strings and independent of x (so it could be any string of n zeros and ones with equal probability), then you can show that x ^ y is also uniformly distributed across all n-bit strings.

Because of this property it's used frequently in many higher order methods as well.
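A minimal sketch of that property in use, i.e. a one-time pad in C (names are illustrative; the pad must be uniformly random, as long as the message, and never reused):

    #include <stddef.h>
    #include <stdint.h>

    /* One-time pad: xor the message with a random pad of the same length.
       The ciphertext is itself uniformly distributed, so it reveals nothing
       about the message. Applying the same xor again decrypts. */
    void otp_xor(uint8_t *msg, const uint8_t *pad, size_t len) {
        for (size_t i = 0; i < len; i++)
            msg[i] ^= pad[i];
    }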


A lot of that sort of thing is abstracted away by common standard libraries. I don't tend to run into a whole lot of XOR on a day-to-day basis, apart from when digging into low-level libraries.


Hashing is another big use case.

For example, XOR is the most convenient combining function for Zobrist hashes in chess programs. OR/AND would be bad choices: given random input, their output is biased to the values 1 and 0 (respectively).
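A rough sketch of how that looks in a chess program (the table and names are illustrative, not from any particular engine):

    #include <stdint.h>

    #define PIECES  12   /* 6 piece types x 2 colours */
    #define SQUARES 64

    /* One random 64-bit key per (piece, square), generated once at startup. */
    extern uint64_t zobrist_keys[PIECES][SQUARES];

    /* Moving a piece updates the position hash incrementally: xor out the old
       (piece, square) key, xor in the new one. Since xor is its own inverse,
       unmaking the move is the exact same operation. */
    uint64_t update_hash(uint64_t hash, int piece, int from, int to) {
        hash ^= zobrist_keys[piece][from];
        hash ^= zobrist_keys[piece][to];
        return hash;
    }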



