
There is something to be said for having LOTS and LOTS of non-optimizing C compilers written in many languages, even languages like, say, Python, and plenty of C compilers written in C. A virtue of being non-optimizing is that they are easier to study and to verify for correct operation.

What we need is for the source code of the smaller number of optimizing compilers to be compilable by the many, many different inefficient C compilers.

We need to be sure that the binary of the optimizing compiler is not compromised by the compiler that compiled it.

You could compile compilers with compilers, including the various compilers written in C. At some point you compile the desired optimizing compiler's source using different compilers produced through various chains of compilation. Those binaries of the optimizing compiler may all differ, but the code generated by those multiple binary versions of the optimizing compiler should be identical.

Finally, use the now-trusted optimizing compiler to compile itself. In fact, use the multiple binaries of the optimizing compiler to compile it and verify that they all produce the same final binary (see the sketch below).

Cross-compilation doesn't hurt either. For example, a Raspberry Pi cross-compiles the optimizing compiler for x86-64; the binary of that compiler should still match the same compiler built by the other C compilers.

Now we have to be paranoid not only about the binary of a compiler being compromised by another compiler, but also about the Intel Management Engine: compromise baked right into the hardware.

Every paranoid thing I thought ten years ago turned out to be true.




See Ken Thompson's Turing Award lecture, where he talks about the original C compiler backdoor:

http://wiki.c2.com/?TheKenThompsonHack

http://vxer.org/lib/pdf/Reflections%20on%20Trusting%20Trust....


See David Wheeler's work on "Diverse Double Compiling". What the parent describes seems to be similar.

https://arxiv.org/abs/1004.5534


> Every paranoid thing I thought ten years ago turned out to be true.

That seems like confirmation bias. But yeah, good idea about compiler integrity.


So you want verifiable builds, but with different algorithms producing the same result.

Is this even possible?

And wouldn't it be easier to just verify the binary manually?


I've not been able to confidently follow what the parent wrote, but if they're describing diverse double compiling, the way it works is that you compile the same source code (S) with a bunch of different compilers (C0...CN). This gives you a bunch of executables (E0...EN), each a compilation of S. Because C0...CN are different compilers, E0...EN will probably differ bitwise. But because behavior should depend only on the source code, if those compilers are sufficiently correct, E0...EN should all behave the same. So if you compile S with each of E0...EN to get (say) E0'...EN', those should all be bitwise identical.


That's right. Details, demos, and formal proofs about diverse double compiling (DDC) can be found here on my web page: https://www.dwheeler.com/trusting-trust/


You said it way better than I did.

Also, doing cross-compilation on different platforms should produce identical binaries, or at least identically behaving binaries.



