
If you really find that attack interesting, you might also find it interesting to read the paper Thompson ripped it off of. Paul Karger co-invented INFOSEC and more attacks/defenses than about anyone in the field if you're counting foundational stuff. He wrote during his landmark pentest of MULTICS that the PL/I compiler could be subverted with a trap door. He added that you could even do a compiler/compiler trap. Trap doors were their favorite technique. Thompson was working on MULTICS and received that evaluation. His initial citation for his idea was "an unknown, Air Force document." They got him to change it later by giving him another copy. Everyone still credits Thompson despite Karger inventing both the attack and the original mitigations. Those mitigations became part of the Orange Book class A1 requirements for security certification, so all those products came with defenses against subversion by malicious developers.

The misattribution, and how they got Thompson to correct it, is in section 3.2.4 of their lessons learned paper:

http://hack.org/mc/texts/classic-multics.pdf

Here's the original paper where the so-called Thompson attack is described on page 17.

https://www.acsac.org/2002/papers/classic-multics-orig.pdf

Also note they were inventing both hacking techniques and INFOSEC while doing this evaluation. It was part of forerunner work happening among a small number of people with little to draw on. It's why you see me say "the legendary Paul Karger" when describing the results they got. Due credit might be "Paul Karger's compiler attack popularized & further explored by Ken Thompson's paper, Trusting Trust." I'll keep mentioning it until more people give it.

Back on topic. These days we have verified and certifying compilers, too. Even typed assembly language with correctness proofs. Lots of stuff to base it off of that's close to what most developers can understand. Basic refinement from Rust compiler code with no optimizations to macro assembly or local scripting languages is what I've been recommending outside verified compilers, since any developer can do it without special tooling. I even proposed bash one time, although as a compile target more than something I'd try to code it in lol. I see using local scripting is in your suggestions, too. That different people are thinking along the same lines here more often might mean it's worth exploring further.


As someone who worked on Plan 9 for over a decade, it would be incredibly difficult. The first question out of everyone's mouth, even back in the early 2000s: "So can you run [Mozilla/Firefox] on it?" No, we couldn't and that was with a very POSIX-like system; the browser is the killer app today and it's also an operating system all its own, meaning it's one of the hardest things to port. We had enough of a basic browser that you could read HTML pages, but otherwise you're stuck with 'linuxemu' which only worked up to a certain (old) version of Debian because the Linux kernel changed shit. If you decide POSIX is a bad paradigm, you're going to have an even harder time getting a browser running.

Of course, most of the shit we do with junky web apps today could just be presented as a 9P service with maybe a couple shell scripts in front of it, but the junky web apps already exist and are in use.


To provide more precedents and a little history:

The first C "interpreters" I know of were for Lisp machines: Symbolics' C compiler (http://www.bitsavers.org/pdf/symbolics/software/genera_8/Use...) and Scott Burson's (hn user ScottBurson) ZetaC for TI Explorers/LMIs and Symbolics 3600s (now in the public domain: http://www.bitsavers.org/bits/TI/Explorer/zeta-c/). Neither of them is an interpreter, just an "interactive" compiler like the Lisp ones are.

I am writing a C to Common Lisp translator right now (https://github.com/vsedach/Vacietis). This is surprisingly easy because C is largely a small subset of Common Lisp. Pointers are trivial to implement with closures (Oleg explains how: http://okmij.org/ftp/Scheme/pointer-as-closure.txt but I discovered the technique independently around 2004). The only problem is how to deal with casting arrays of integers (or whatever) to arrays of bytes. But that's a problem for portable C software anyway. I think I'll also need a little source fudging magic for setjmp/longjmp. Otherwise the project is now where you can compile-file/load a C file just like you do a Lisp file by setting the readtable. There's a few things I need to finish with #includes, enums, stdlib and the variable-length struct hack, but that should be done in the next few weeks.
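
To make the pointers-as-closures idea concrete, here is a minimal sketch in Python rather than Common Lisp. It is not taken from Vacietis or from the linked text; the names `Pointer`, `address_of_elem`, `swap` and `increment_local` are hypothetical, and it only assumes that a pointer can be modeled as a getter/setter pair closed over the location it refers to.

```python
# Sketch only: a C pointer modeled as two closures over a storage location.
# All names below are hypothetical and chosen for illustration.

class Pointer:
    def __init__(self, getter, setter):
        self._get = getter
        self._set = setter

    def deref(self):
        """Read through the pointer, like *p in C."""
        return self._get()

    def store(self, value):
        """Write through the pointer, like *p = value in C."""
        self._set(value)


def address_of_elem(array, index):
    """Rough equivalent of &array[index]: close over the list and the index."""
    def setter(value):
        array[index] = value
    return Pointer(lambda: array[index], setter)


def swap(p, q):
    """Rough equivalent of void swap(int *p, int *q)."""
    tmp = p.deref()
    p.store(q.deref())
    q.store(tmp)


def increment_local():
    """Rough equivalent of: int x = 10; int *p = &x; (*p)++; return x;"""
    x = 10

    def getter():
        return x

    def setter(value):
        nonlocal x  # the closure captures the variable itself, not a copy
        x = value

    p = Pointer(getter, setter)
    p.store(p.deref() + 1)
    return x


xs = [1, 2, 3]
swap(address_of_elem(xs, 0), address_of_elem(xs, 2))
print(xs)                 # [3, 2, 1]
print(increment_local())  # 11
```

Pointer arithmetic and casts between element types are left out on purpose; the second is exactly the hard case mentioned above.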

This should also extend to "compiling" C to other languages like JavaScript, without having to go through the whole "emulate LLVM or MIPS" garbage that other projects like that do. I think I figured out how to do gotos in JavaScript by using a trampoline with local CPS-rewriting, which is IMO the largest challenge for an interoperable C->JS translator.
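
For readers who haven't seen the trampoline trick, here is a rough sketch, in Python instead of JavaScript, of how `goto` can be lowered. It is only a guess at the general shape, not the actual scheme described above: each label becomes a small function that returns the next block to run (a locally CPS-style rewrite), and a driver loop keeps calling blocks until one returns a result. The names `Return`, `trampoline` and `sum_to` are hypothetical.

```python
# Sketch only: lowering C-style labels and gotos to a trampoline.
# The C function being imitated is roughly:
#
#   int sum_to(int n) {
#       int i = 0, total = 0;
#   loop:
#       if (i > n) goto done;
#       total += i; i++;
#       goto loop;
#   done:
#       return total;
#   }

class Return:
    """Marker a block hands back to say 'the function returns this value'."""
    def __init__(self, value):
        self.value = value


def trampoline(block):
    """Run blocks until one yields a Return; each call returns to this loop,
    so the Python call stack never grows with the number of jumps."""
    while not isinstance(block, Return):
        block = block()
    return block.value


def sum_to(n):
    i = 0
    total = 0

    def loop():                    # label "loop:"
        nonlocal i, total
        if i > n:
            return done            # goto done
        total += i
        i += 1
        return loop                # goto loop

    def done():                    # label "done:"
        return Return(total)

    return trampoline(loop)


print(sum_to(4))       # 10
print(sum_to(100000))  # 5000050000
```

The same structure carries over to JavaScript directly, since it only needs closures and a `while` loop.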

As to how to do this for C++, don't ask me. According to the CERN people, CINT has "slightly less than 400,000 lines of code." (http://root.cern.ch/drupal/content/cint). What a joke.

