Hacker News

"We all agree that reproducible builds and DDC do not solve all problems. No one said they did."

This is the crux of my complaint. I know you, I, and some people here believe that. I don't think the majority do in practice, though. When it's compiler risk, the Thompson paper, yours, and reproducible builds are about all anyone talks about on the majority of forums. There are also warnings about optimizations reducing security, or compilers just screwing up, with the solution being "look at the assembly." There's been virtually no interest outside the high-assurance and functional-programming fields in building robust compilers using whatever techniques can be applied, or in addressing common security risks.

I think that's why you can make some of your later claims about what's out there for various languages. Virtually nobody cares enough to put time or money into it. They'll happily use crap compilers that are potentially subverted or full of 0-days so long as the binary was reproducible. That's the root problem on the demand side. At least we've got the CompSci people staying on it and FOSSing key results. Look up the K Framework's KCC compiler & C semantics for what a great reference might look like. It was designed to look like GCC for usability, passed 99.2% of the torture suite by 2012, and didn't get locked up like CompCert. Yay!

http://www.kframework.org/index.php/Main_Page

http://fsl.cs.illinois.edu/pubs/ellison-rosu-2012-popl.pdf

" But CompCert only supports a subset of the C99 programming language, is proprietary, and produces executable code that performs significantly worse than Freely-available gcc and clang/llvm when used at their usual optimization levels (-O2)"

It's true. The proper response would've been to do a non-formal, open-source knockoff of CompCert or a similar tool in a safe, simple language, then start adding optimizations. Most FOSS people in compilers didn't try, but there are some more CompSci teams working on it. The QBE backend was exciting, though, as a starting point for the first C compiler in the chain if rewritten with medium-assurance techniques:

http://c9x.me/compile/

" I guess you'll use formal proofs. How do you know that those formal methods tools aren't subverted?"

Oh I saw that coming cuz it worried me, too. Let me present to you pages listing work to knock about everything related to formal methods out of the TCB except the specs & logics themselves:

https://www.cl.cam.ac.uk/~mom22/

https://www.cs.utexas.edu/users/jared/milawa/Web/

Just follow all of Myreen's and his associates' links. They're all great. Quick summary: a verified LISP 1.5 compiler to machine code in HOL; a verified SML (CakeML) done the same way; translation validation that in effect compiles specs to assembly in HOL; a theorem prover composed of simpler provers whose root of trust starts with their verified LISP/ML/assembly tooling; an extraction mechanism in Coq or HOL to ML that's untrusted; HOL in HOL; a HOL Light designed to be tiny and use a verified core; and now a HOL-to-hardware compiler in the works. These people are on a win streak.

I specifically included the paper that addresses your concern in addition to Myreen's page. It does for provers what my scheme would do for compilers: working from simple to complex, with each step proving the next, plus verified implementations. They then take the extra step of converting each layer back to the preceding layers to verify it with those. It's remarkably thorough, with you basically trusting only simple specifications and implementations at each step plus a tiny first step. Were you as impressed reading it as I was that they both tackled the prover problem and were that thorough?
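That simple-to-complex layering can be caricatured in a few lines. This is a crude toy, nothing like the real Milawa logic; the function names and the Horn-style "proof" format are invented for illustration. The point is only the structure: a richer checker translates its reasoning down to steps a tiny trusted checker re-verifies.

```python
# Layer 0: a tiny checker, small enough to trust by inspection.
# It replays a chain of claimed implication steps from the axioms.
# (A real checker would also validate each step against the logic's
# inference rules; this toy takes the steps as given.)
def layer0_check(axioms, steps, goal):
    known = set(axioms)
    for premise, conclusion in steps:
        if premise in known:
            known.add(conclusion)
    return goal in known

# Layer 1: accepts a richer proof format (here just a chain a -> b -> c),
# but translates every acceptance down to layer-0 steps so the tiny
# trusted checker re-verifies it -- the "convert each layer to the
# preceding layer" move described above.
def layer1_check_with_translation(axioms, fancy_proof, goal):
    if not fancy_proof or fancy_proof[-1] != goal:
        return False
    steps = list(zip(fancy_proof, fancy_proof[1:]))
    return layer0_check(axioms, steps, goal)

print(layer1_check_with_translation(["a"], ["a", "b", "c"], "c"))  # True
print(layer1_check_with_translation(["a"], ["x", "c"], "c"))       # False
```

Trust in layer 1 then reduces to trust in layer 0 plus whatever layer 0 checked, which is the shape of the Milawa-style bootstrap.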

Note: It's for a FOL. The K Framework, some certified compilers, hardware verification, an Alloy-to-FOL scheme, Praxis's method, and Design-by-Contract are among the stuff I found done in FOL. Already useful. Plus, I found a HOL-to-FOL converter proven in HOL. Just do that, or the HOL Light stuff, in FOL... if possible... and then we're covered from specs down to machine code with verified extraction & the tiny TCB in Jared's link.

"And there are subverted binaries attacks - quite a number of them - which can be countered by reproducible builds."

This one I'm more open to considering. The question being: if you have to build it to check it in the first place, why aren't we countering the binary risk by solving the source distribution and authentication process? Users vet that they got the right source & makefile, then produce their own binary from it using local tools that they hopefully checked. No more binary-subversion risk. That was the requirement for A1-class systems. The source-authentication aspect is very easy to implement: (a) have a central location with the source, (b) replicate it to other locations where someone at least looks at it, (c) publish matching hashes for the source files, and (d) make signed archives of it available at multiple locations for comparison. I'd also point out that one needs something like this anyway for source-level security, signing/checking distributed items is already a thing, and people constantly develop more SCM bullshit for fun. Seemed low friction.
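Steps (a)-(d) amount to comparing independently replicated digests. A minimal sketch, assuming SHA-256 and in-memory stand-ins for the archives (the function names and the toy archive are invented for illustration):

```python
import hashlib

def digest(data: bytes) -> str:
    # SHA-256 hex digest of a source artifact (step (c)).
    return hashlib.sha256(data).hexdigest()

def verify_against_mirrors(source: bytes, mirror_digests: list) -> bool:
    # Accept the source only if every independently replicated mirror
    # published the same digest and our local copy matches it (steps (b)-(d)).
    d = digest(source)
    return bool(mirror_digests) and all(m == d for m in mirror_digests)

# Toy example: three mirrors all publish the digest of the same archive.
archive = b"int main(void) { return 0; }\n"
mirrors = [digest(archive)] * 3
print(verify_against_mirrors(archive, mirrors))      # True
print(verify_against_mirrors(b"tampered", mirrors))  # False
```

In practice the archives in (d) would also carry detached signatures (e.g., GPG) so the digests themselves are authenticated, not just replicated.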

So, at that point, you are working only with stuff that wasn't MITM'd. Getting backdoored then means the problem was in the supplied source or in an existing binary (e.g., a compiler) on your system. The former requires the verifiable compilers I was advocating, which have the source checking & distribution above. The latter, since distribution was secure, usually means the user is already compromised: if you find a subverted, subverting compiler on your machine, you erase that system (maybe buy new hardware, given firmware-level attacks these days), reinstall everything, carefully get a new compiler (preferably as source), and reinstall it. You'll have to get the source or binary of the replacement compiler from a trusted place, vet that it came from there, build it if source, apply an act of faith that it's not malicious, and then run it.

It's just not buying a lot versus secure distribution with diverse checking of the source and testing of the binary. Diverse source checking against hashes and public keys is also super easy with the prerequisite tools on most Linux distros I've seen. From there, the source or a local tool might still be malicious. From there, you can do a C interpreter with checks, compile it with SoftBound+CETS followed by strong sandboxing, whatever you want if you're avoiding building a whole certifying compiler. Thing is, all of these still have a compiler in the root of trust whose code might be incorrect or malicious even if the binary output matches. I want to see a Hacker News post with 200-500 votes on countering that with an open-source project. Other than QBE and TCC, which were a nice start, got my props, and I found here. :)

Note: I have other middle-ground ideas, too. Maybe rewrite an optimizing compiler like GCC, Clang, or SGI's open-source one to use just a subset of C that a simple interpreter or C compiler can handle, one a user can verify by eye, by hand, or by coding it themselves. Bug-for-bug matching to get that initial bootstrap. Maybe even pay AbsInt to use CompCert one time on the initial bootstrap, with a simple, OSS knockoff allowed just for verification of that; there's precedent with TrustInSoft and PolarSSL. We're still going to need to rewrite them over time for easier verification of correctness, of the effect of optimizations on security features, and of covert/side-channel production. Inevitable. Might as well start while top compilers are only (absurdly large number) lines of code instead of (that number + a similarly large number) later.




>> " I guess you'll use formal proofs. How do you know that those formal methods tools aren't subverted?" > Oh I saw that coming cuz it worried me, too. Let me present to you pages listing work to knock about everything related to formal methods out of the TCB except the specs & logics themselves: > https://www.cl.cam.ac.uk/~mom22/ > https://www.cs.utexas.edu/users/jared/milawa/Web/ > Just follow all of Myreen's and his associates' links. They're all great. [...] These people are on a win streak.

WOW. Cool stuff! Hadn't seen that before, thanks for the links.

>> "And there are subverted binaries attacks - quite a number of them - which can be countered by reproducible builds." > This one I'm more open to considering. The question being: if you have to build it to check it in the first place, why aren't we countering the binary risk by solving the source distribution and authentication process?

The problem is that most people don't want to rebuild all the binaries. If you can reproduce them, then those who do rebuild can check that the distributed binaries match.
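That rebuild-and-check step is just a bit-for-bit comparison of digests; a minimal sketch, with byte strings standing in for real binaries (all names invented for illustration):

```python
import hashlib

def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def is_reproduced(distributed: bytes, rebuilt: bytes) -> bool:
    # A build is "reproduced" when an independent rebuild from the same
    # source (in an equivalent build environment) yields a bit-identical
    # binary, so its digest matches the distributed one.
    return sha256(distributed) == sha256(rebuilt)

upstream = b"\x7fELF...release bytes..."
local    = b"\x7fELF...release bytes..."
print(is_reproduced(upstream, local))                        # True
print(is_reproduced(upstream, b"\x7fELF...backdoored..."))   # False
```

Once one party has rebuilt and published a matching digest, everyone else only has to compare hashes rather than rebuild themselves, which is the division of labor being pointed at here.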





