And why is Debian not reproducible?

0x0 · on Aug 24, 2017

On almost any system, if you use a compiler to build a binary from source code twice in a row, you will get two different binaries. Simple things like using the "__DATE__" and "__TIME__" macros in C, or the linker embedding a link timestamp in an executable file header, will trigger this even in the same environment. Moving between machines is even trickier, as __FILE__ and __DIR__ and a ton of other inputs conspire to create changes in the output.
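
A minimal sketch of the timestamp problem, in Python rather than C and not taken from the thread: a toy `build` function stands in for a compiler that stamps its output the way `__DATE__` and `__TIME__` would. Honouring the `SOURCE_DATE_EPOCH` environment variable, a convention from the reproducible-builds.org project, is one common way to pin that input. The function names here are hypothetical.

```python
import hashlib
import os
import time


def build_timestamp(env=None):
    """Return the timestamp a build should embed.

    If SOURCE_DATE_EPOCH is set, use it instead of the wall clock, so
    that two builds of the same source embed the same value.
    """
    env = os.environ if env is None else env
    epoch = env.get("SOURCE_DATE_EPOCH")
    return int(epoch) if epoch is not None else int(time.time())


def build(source, env=None):
    """Toy 'compiler': the output is the source plus an embedded timestamp."""
    stamp = build_timestamp(env)
    return source + b"\n# built at " + str(stamp).encode()


def digest(artifact):
    """SHA-256 of the artifact, used to compare two build outputs."""
    return hashlib.sha256(artifact).hexdigest()


if __name__ == "__main__":
    src = b"int main(void) { return 0; }"
    pinned = {"SOURCE_DATE_EPOCH": "1503532800"}  # arbitrary example value
    # Same source, same pinned timestamp: the digests match.
    print(digest(build(src, pinned)) == digest(build(src, pinned)))
    # Without the variable the wall clock leaks into the output, so
    # builds made at different seconds will differ.
    print(build(src, {}))
```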

It's hard work to set up a project where you can actually get exactly the same binary over and over again.

I think asking "why is Debian not reproducible" is missing the mark a little - when everything else (Windows, macOS, FreeBSD, etc etc etc) is probably not reproducible, the better question is perhaps "why is Debian trying to be reproducible, and why aren't other projects talking about this just as much" :)

MrQuincle · on Aug 24, 2017

https://wiki.debian.org/ReproducibleBuilds

"With this we can detect problems related to timestamps, file ordering, CPU usage, (pseudo-)randomness and other things"

https://tests.reproducible-builds.org/debian/index_variation...
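
File ordering, one of the variations named in the quote, can be illustrated with a short Python sketch. It assumes a hypothetical packaging step that hashes every file in a directory. `os.listdir` returns entries in arbitrary order, so sorting the names first is what makes the result independent of the filesystem.

```python
import hashlib
import os
import tempfile


def tree_digest(root, names=None):
    """Hash every file in `root`, visiting names in sorted order.

    os.listdir() makes no ordering promise, so the sort is what keeps
    the digest stable across filesystems and machines.
    """
    if names is None:
        names = os.listdir(root)
    h = hashlib.sha256()
    for name in sorted(names):
        h.update(name.encode() + b"\0")
        with open(os.path.join(root, name), "rb") as f:
            h.update(f.read())
    return h.hexdigest()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for name in ("b.o", "a.o", "c.o"):
            with open(os.path.join(root, name), "wb") as f:
                f.write(name.encode())
        # Feeding the names in two different orders gives one digest.
        print(tree_digest(root, ["a.o", "b.o", "c.o"])
              == tree_digest(root, ["c.o", "a.o", "b.o"]))
```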

wiz21c · on Aug 24, 2017

Because during compilation, some packages use date-dependent stuff, for example (I dunno the exact details though).

vertex-four · on Aug 24, 2017

Also some build systems create artifacts as a result of timing-dependent algorithms. Simply put, if two things A and B run simultaneously, and A completes before B, then in some compilers/build systems/etc the result can be different from B completing before A. GHC, as a well-known example, suffers from this problem.
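
A rough Python illustration of the A-before-B effect, not a description of how GHC or any real build system works: the hypothetical `compile_unit` jobs run in a thread pool, and the only difference between the two builds is whether results are joined in completion order or in input order.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def compile_unit(name):
    """Toy compile job with a small random delay to vary finish order."""
    time.sleep(random.random() / 100)
    return f"obj({name})".encode()


def build_completion_order(names):
    """Join outputs as jobs finish: the result depends on timing."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(compile_unit, n) for n in names]
        return b"".join(f.result() for f in as_completed(futures))


def build_input_order(names):
    """Join outputs in input order: timing no longer matters."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return b"".join(pool.map(compile_unit, names))


if __name__ == "__main__":
    units = ["a", "b", "c", "d"]
    print(build_completion_order(units))  # may change from run to run
    print(build_input_order(units))
```

Both builds do the same work in parallel; only the second fixes the order in which results are combined.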

Daviey · on Aug 24, 2017

Often, a package is built using a certain library version (from a different package); that library is then updated, and the package can no longer be built against the new library.

Or... deeper parts of the compiler toolchain change, and the application doesn't re-build without changes.

vertex-four · on Aug 24, 2017

That has nothing to do with reproducible builds - the issue being solved by reproducible builds is that even in exactly the same environment, running the same build script twice can result in differences in the output.
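
The property described here can be checked mechanically. A minimal sketch in Python, assuming a hypothetical `build` callable that returns the artifact bytes: run it more than once in the same environment and compare digests.

```python
import hashlib
import os


def is_reproducible(build, runs=2):
    """Run `build` several times and report whether all outputs match."""
    digests = {hashlib.sha256(build()).hexdigest() for _ in range(runs)}
    return len(digests) == 1


def stable_build():
    """Toy build whose output depends only on its (fixed) input."""
    return b"same bytes every time"


def unstable_build():
    """Toy build with an uncontrolled input, here a random build ID."""
    return b"same bytes plus " + os.urandom(8)


if __name__ == "__main__":
    print(is_reproducible(stable_build))
    print(is_reproducible(unstable_build))
```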

noselasd · on Aug 24, 2017

Surely it can't have "nothing to do"? Many Debian source packages specify dependencies using >=, so something has to account for performing a build using the same minor version of such a dependency.
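
To make the >= point concrete, here is a small Python sketch. It uses a simplified dotted-integer version scheme, which is an assumption for illustration; real Debian version comparison has more rules (epochs, revisions, tildes). The version list is made up.

```python
def parse(version):
    """Parse a simple dotted version like '1.2.10' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def satisfying(available, minimum):
    """Return every available version that satisfies '>= minimum'."""
    floor = parse(minimum)
    return [v for v in available if parse(v) >= floor]


if __name__ == "__main__":
    archive = ["1.1.9", "1.2.0", "1.2.3", "1.10.0"]  # hypothetical versions
    # Several versions satisfy the constraint, so the constraint alone
    # does not say which one a given build actually used.
    print(satisfying(archive, "1.2.0"))  # ['1.2.0', '1.2.3', '1.10.0']
```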

vertex-four · on Aug 24, 2017

Yes, but that's a different problem from reproducible builds. The only thing that reproducible builds are solving is ensuring that the same package built with the same dependencies in the same environment results in the same output.

cstrahan · on Aug 25, 2017

I'm afraid you're mistaken.

Taken directly from https://reproducible-builds.org/:

"Second, the set of tools used to perform the build and more generally the build environment should either be recorded or pre-defined"

FWIW, I'm a committer on a Linux distro specifically constructed to guarantee build reproducibility (nixos.org), so I'm pretty sure I know what is generally meant by "reproducible builds" in common industry vernacular. Byte-for-byte is important, but that's hardly the whole picture.
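
As a rough sketch of what "recorded" could look like, the Python below writes a small description of the build environment next to the artifact. The field names and tool versions are hypothetical and do not follow any particular distribution's format.

```python
import json
import platform
import sys


def build_environment(tool_versions):
    """Collect a description of the build environment.

    `tool_versions` maps tool names to version strings; how they are
    discovered is left out of this sketch.
    """
    return {
        "python": platform.python_version(),
        "platform": sys.platform,
        "machine": platform.machine(),
        "tools": dict(tool_versions),
    }


def serialize(record):
    """Serialize with sorted keys so the record itself is byte-stable."""
    return json.dumps(record, sort_keys=True, indent=2)


if __name__ == "__main__":
    # Hypothetical tool versions, for illustration only.
    print(serialize(build_environment({"gcc": "7.2.0", "binutils": "2.29"})))
```

With such a record, a later rebuild can at least tell whether it ran with the same tools before comparing outputs byte for byte.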

wtallis · on Aug 24, 2017

If the dependency is packaged separately, then you typically wouldn't expect minor version changes in the dependency to affect the contents of the package being built. If there are major changes to the header files being exported or if there's static linking involved then changes are to be expected. But if not, you'd expect the changes to show up in the dynamically linked process image, not the on-disk package.

noselasd · on Aug 24, 2017

It really doesn't take big changes to a library to make the relocation table of an executable linking to it a tad different.

talaketu · on Aug 25, 2017

A hermetic build is part of a reproducible build, as far as the general concepts go.