And why is Debian not reproducible?



On almost any system, if you use a compiler to build a binary from source code twice in a row, you will often get two different binaries. Simple things like using the "__DATE__" and "__TIME__" macros in C, or the linker embedding a link timestamp in an executable file header, will trigger this even in the same environment. Moving between machines is even trickier, as __FILE__ (which embeds the source path), the build directory, and a ton of other inputs conspire to create changes in the output.
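To make the timestamp problem concrete, here is a minimal sketch in Python rather than C. The `build` function is a hypothetical stand-in for a compiler that stamps its output with the build time, much like "__DATE__" would. SOURCE_DATE_EPOCH is the environment variable the Reproducible Builds project defines for pinning such timestamps; the `env` and `clock` parameters are assumptions added only so the behaviour is easy to test.

```python
import hashlib
import os
import time


def build(source, env=None, clock=time.time):
    """Toy 'compiler': returns the source bytes plus an embedded build timestamp.

    If SOURCE_DATE_EPOCH is set in env, that value is used instead of the
    current time, so repeated builds embed the same timestamp.
    """
    env = os.environ if env is None else env
    epoch = env.get("SOURCE_DATE_EPOCH")
    stamp = int(epoch) if epoch is not None else int(clock())
    return source + b"\nbuilt-at:" + str(stamp).encode()


def digest(artifact):
    """SHA-256 of the artifact, used to compare two builds byte for byte."""
    return hashlib.sha256(artifact).hexdigest()
```

With no SOURCE_DATE_EPOCH, two builds made at different times hash differently even though the source is identical; with it set, the digests match.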

It's hard work to set up a project where you can actually get exactly the same binary over and over again.

I think asking "why is Debian not reproducible" is missing the mark a little - when everything else (Windows, macOS, FreeBSD, etc etc etc) is probably not reproducible, the better question is perhaps "why is Debian trying to be reproducible, and why aren't other projects talking about this just as much" :)


https://wiki.debian.org/ReproducibleBuilds

"With this we can detect problems related to timestamps, file ordering, CPU usage, (pseudo-)randomness and other things"

https://tests.reproducible-builds.org/debian/index_variation...


Because during compilation, some packages use date-dependent stuff, for example (I dunno the exact details though)


Also some build systems create artifacts as a result of timing-dependent algorithms. Simply put, if two things A and B run simultaneously, and A completes before B, then in some compilers/build systems/etc the result can be different from B completing before A. GHC, as a well-known example, suffers from this problem.
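A rough illustration of the A-before-B problem, as a sketch only: `link` is a hypothetical toy linker that concatenates object code in whatever order the parallel jobs happened to finish. This is not how GHC or any real toolchain is implemented; it just shows why completion order can leak into the output and how sorting the inputs removes that dependency.

```python
def link(objects):
    """Toy linker: concatenates (name, code) pairs in the order given."""
    return b"".join(name.encode() + b":" + code + b"\n" for name, code in objects)


def link_deterministic(objects):
    """Same as link, but sorts the inputs first so finish order is irrelevant."""
    return link(sorted(objects))


# The same two build results, recorded in the order the jobs completed.
a_finished_first = [("A", b"aaa"), ("B", b"bbb")]
b_finished_first = [("B", b"bbb"), ("A", b"aaa")]
```

Here `link(a_finished_first)` and `link(b_finished_first)` differ, while `link_deterministic` gives the same bytes for both.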


Often, a package is built using a certain library version (from a different package), that library is then updated - and the package can no longer be built against the new library.

Or... deeper parts of the compiler toolchain change, and the application doesn't re-build without changes.


That has nothing to do with reproducible builds - the issue being solved by reproducible builds is that even in exactly the same environment, running the same build script twice can result in differences in the output.


Surely it can't have "nothing to do" with it? Many Debian source packages specify dependencies using >=, so something has to account for performing a build using the same minor version of such a dependency.
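A small sketch of the concern about >= constraints. The `resolve` function is hypothetical, and versions are simplified to tuples of integers (real Debian version comparison is more involved). The point is only that the same constraint can pick a different dependency once the archive gains a newer version.

```python
def resolve(minimum, available):
    """Pick the newest available version that satisfies '>= minimum'."""
    candidates = [v for v in available if v >= minimum]
    if not candidates:
        raise LookupError("no version satisfies the constraint")
    return max(candidates)


# The archive on two different days; the constraint itself never changes.
archive_monday = [(1, 2, 0), (1, 2, 1)]
archive_friday = [(1, 2, 0), (1, 2, 1), (1, 3, 0)]
```

Resolving `>= (1, 2, 0)` gives `(1, 2, 1)` against the first archive and `(1, 3, 0)` against the second, so the exact version that was used has to be recorded somewhere if the build is to be repeated.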


Yes, but that's a different problem from reproducible builds. The only thing that reproducible builds solve is ensuring that the same package built with the same dependencies in the same environment results in the same output.
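That narrower definition can be written down as a check, sketched here under the assumption that a build is just a callable returning the artifact bytes (`is_reproducible` is a made-up helper name, not part of any tool): run it more than once in the same environment and compare digests.

```python
import hashlib


def is_reproducible(build, runs=2):
    """Run the build callable `runs` times and report whether every
    run produced byte-identical output (same SHA-256 digest)."""
    digests = {hashlib.sha256(build()).hexdigest() for _ in range(runs)}
    return len(digests) == 1
```

A build that always returns the same bytes passes; one that embeds a counter, a timestamp, or random data does not.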


I'm afraid you're mistaken.

Taken directly from https://reproducible-builds.org/:

"Second, the set of tools used to perform the build and more generally the build environment should either be recorded or pre-defined"

FWIW, I'm a committer on a Linux distro specifically constructed to guarantee build reproducibility (nixos.org), so I'm pretty sure I know what is generally meant by "reproducible builds" in common industry vernacular. Byte-for-byte is important, but that's hardly the whole picture.
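One way to picture "recorded or pre-defined", as a sketch rather than how any particular distro does it: fingerprint the set of tools and versions used for a build, so two builds are only compared when their environments match. The `environment_id` helper and the tool names and versions below are illustrative assumptions.

```python
import hashlib
import json


def environment_id(tools):
    """Fingerprint a {tool name: version} mapping.

    Keys are sorted before hashing, so the order in which tools were
    recorded does not change the result.
    """
    canonical = json.dumps(tools, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()


# Hypothetical recorded environments for three builds.
env_a = {"gcc": "12.2.0", "binutils": "2.40", "libfoo-dev": "1.2.1"}
env_b = {"libfoo-dev": "1.2.1", "binutils": "2.40", "gcc": "12.2.0"}
env_c = {"gcc": "12.2.0", "binutils": "2.40", "libfoo-dev": "1.3.0"}
```

`env_a` and `env_b` produce the same fingerprint, while `env_c` differs because one dependency moved to a newer version.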


If the dependency is packaged separately, then you typically wouldn't expect minor version changes in the dependency to affect the contents of the package being built. If there are major changes to the header files being exported or if there's static linking involved then changes are to be expected. But if not, you'd expect the changes to show up in the dynamically linked process image, not the on-disk package.


It really doesn't take big changes to a library to make the relocation table of an executable linking to it a tad different.


A hermetic build is part of a reproducible build, as far as the general concepts go.



