I've tried again and again to like Nix, but at this point I have to throw in the towel.
I have 2 systems running Nix, and I'm afraid to touch them. I've already broken both of them enough that I had to reinstall from scratch in the past (yes yes - it's supposed to be impossible I know), and now I've forgotten most of it. In theory, Nix is idempotent and deterministic, but the problem is "deterministic in what way?" Unless you intimately understand what every dependent part is doing, you're going to get strange results and absolutely bizarre and unhelpful errors (or far more likely: nothing at all, with no feedback). Nix feels more like alchemy than science. Like trying to get random Lisp packages to play nice together.
Documentation is just plain AWFUL (as in: complete and technically accurate, but maddeningly obtuse), and tutorials only get you part of the way. The moment you step off the 80% path, you're in for a world of hurt, because the underlying components are just not built to support anything else. Sure, you can always "build your own", but this requires years of experiential knowledge and layers upon layers of frustration that I just don't want to deal with anymore (which is also why I left Gentoo all those years ago). And woe unto you if you want to use a more modern version than the distribution supports!
The strength of Docker is the chaos itself. You can easily build pretty much anything, without needing much more than a cursory understanding of the shell and your distro's package manager. Or you can mix and match whatever the hell you want! When things break, it's MUCH easier to diagnose and fix the problems because all of the tooling has been around for decades, which makes it mature enough to handle edge cases (and breakage is almost ALWAYS about edge cases).
Nix is more like Emacs: It can do absolutely anything if you have the patience for it and the deep, arcane knowledge to keep it from exploding in a brilliant flash of octarine. You either go all-in and drink the Kool-Aid, or you keep it at arm's length - smiling and nodding as you back slowly towards the door whenever an enthusiast speaks.
I've gone down the same path. I love deterministic builds, and I think Docker's biggest fault is that to the average developer a Dockerfile _looks_ deterministic - and it even is for a while (build a container twice in a row on the same machine => same output), but then packages get updated in the package manager, base images get updated w/ the same tag, and when you rebuild a month later you get something completely different. Do that times 40 (the number of containers my team manages) and now fixing containers is a significant part of your job.
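To make that failure mode concrete, here's a sketch (the tag and packages are just illustrative): this Dockerfile builds identically twice in a row today, yet nothing in it is actually pinned.

    # Looks deterministic, isn't: the tag can be repushed upstream, and apt
    # resolves to whatever happens to be in the archive on build day.
    cat > Dockerfile <<'EOF'
    FROM ubuntu:22.04
    RUN apt-get update && apt-get install -y curl ca-certificates
    EOF
    docker build -t myapp .   # same output twice today, different next month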
So in theory Nix would be perfect. But it's not, because it's so different. Get a tool from a vendor => won't work on Nix. Get an error => impossible to quickly find a solution on the web.
Anyway, out of that frustration I've founded https://www.stablebuild.com. Deterministic builds w/ Docker, but with containers built on Ubuntu, Debian or Alpine. It currently consists of an immutable Docker Hub pull-through cache, full daily copies of the Ubuntu/Debian/Alpine package registries, full daily copies of the most popular PPAs, daily copies of the PyPI index (we do a lot of ML), and an arbitrary immutable file/URL cache.
So far it's been the best of both worlds in my day job: easy to write, easy to debug, wide software compatibility, and we have seen zero issues due to non-determinism in the containers we've moved over to StableBuild.
I've worked many years on bare metal. We did acceptance tests (by requirement), so we needed deterministic builds before such a thing even had a name, or at least before it was talked about as much as it is nowadays.
Red Hat has a lot of tooling around versioning of mirrors, channels, releases, updates, etc. But I'm so old that even Foreman and Spacewalk didn't exist yet, Red Hat Satellite was out of budget, and the project was migrating from the first versions of CentOS to Debian.
What I did was simply use DNS + vhosts (dev, stage, prod + versions) for our own package mirrors, and bash + rsync (and of course, RAID + backups), with both CentOS and Debian (and our own project packages).
So we had repos like prod/v1.1.0, stage/v1.1.0, dev/v1.1.0, dev/v2.0.0, dev/v2.0.1, etc., allowing us to rebuild things without praying, backport bug fixes with confidence, and so on.
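In today's terms the machinery was roughly this (host names and paths invented for illustration):

    # Sync upstream into a versioned dev mirror...
    rsync -a --delete mirror.example.org::debian/ /srv/repos/dev/v2.0.0/
    # ...then promote a tested snapshot with a cheap hard-link copy.
    cp -al /srv/repos/dev/v2.0.0 /srv/repos/stage/v2.0.0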
It feels old and simple, but I think it's the same problem people run into now when (re)building containers.
If you need to be able to produce the same output from the same input, you need the same input.
But Nix also solves problems Docker doesn't. For example, if you need to use different versions of software for different projects, Nix lets you pick and choose the software that is visible in your current environment without having to build a new Docker image for every combination, which would lead to a combinatorial explosion of images and is not practical.
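A minimal sketch of what that looks like in practice (the package attribute names are just examples and change over time in nixpkgs):

    # Project A gets one tool set, project B another - no images to build:
    cd ~/project-a && nix-shell -p nodejs_20 postgresql_15
    cd ~/project-b && nix-shell -p nodejs_18 postgresql_13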
But I also agree with all the flaws of Nix people are pointing out here.
I don't have any experience with Nix, but regarding stable builds of Docker: we provide a Java application and have all dependencies as fixed versions, so when doing a release, if someone is not doing anything fishy (re-releasing a particular version, which is bad-bad-bad), you will get exactly the same binaries on top of the same image (again, considering you are not using `:latest` or somesuch)...
Until someone overwrites or deletes the Docker base image (regularly happens), or when you depend on some packages installed through apt - as you'll get the latest version (impossible to pin those).
I am convinced that any sort of free public service is fundamentally incompatible with long-term reproducible builds. It is simply unfair to expect a free service to maintain archives forever and never clean them up, rename itself, or go out of business.
If you want reproducibility, the first step is to copy everything to storage you control. Luckily, this is pretty cheap nowadays.
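For Docker images specifically, even a local pull-through cache gets you most of the way there (this uses the official registry image's documented proxy mode):

    # A local pull-through cache of Docker Hub: images pulled through it
    # stay available even if the upstream tag is later deleted or repushed.
    docker run -d -p 5000:5000 \
      -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
      registry:2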
> Until someone overwrites or deletes the Docker base image (regularly happens)
Any source of that claim?
> or when you depend on some packages installed through apt - as you'll get the latest version (impossible to pin those).
Well... please re-read my previous comment - we do the Java thing, so we use some JDK base image and then slap our distribution on top of it (mostly fixed-version jars).
Of course, if you are after perfection and require additional packages, then you can install them via dpkg or somesuch, but... do you really need that? What about the security implications?
You gave an example of NVIDIA and not Ubuntu itself. What's more, you are referring to a devel(opment) version, i.e. "1.0-devel-ubuntu20.04", which seems like a nightly, so it's expected to be overridden (akin to "-SNAPSHOT" for Java/Maven)?
Besides, if you really need the utmost stability, you can use an image digest instead of a tag and you will always get exactly the same image...
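For example, you can resolve the tag you use today to its digest and pin on that (digest output shown truncated; use whatever your pull resolves to):

    docker pull ubuntu:22.04
    docker inspect --format '{{index .RepoDigests 0}}' ubuntu:22.04
    # prints ubuntu@sha256:...  - put that in your FROM line instead of the tag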
Do you have an example that isn't Nvidia? They're infamous for terrible Linux support, so an egregious disregard for tag etiquette is entirely unsurprising.
> Anyway, out of that frustration I've founded https://www.stablebuild.com. Deterministic builds w/ Docker, but with containers built on Ubuntu, Debian or Alpine.
Another option for reproducible container images is https://github.com/reproducible-containers although you may need to cache package downloads yourself, depending on the distro you choose.
For Debian, Ubuntu, and Arch Linux there are official snapshots available so you don't need to cache package downloads yourself. For example, https://snapshot.debian.org/.
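As a sketch of the wiring (the timestamp is arbitrary; recent Debian images keep their apt config in /etc/apt/sources.list.d, so adjust to taste):

    # Freeze apt on a snapshot.debian.org timestamp so versions stop moving.
    cat > Dockerfile <<'EOF'
    FROM debian:bookworm
    RUN rm -f /etc/apt/sources.list.d/debian.sources && \
        echo 'deb [check-valid-until=no] https://snapshot.debian.org/archive/debian/20240101T000000Z bookworm main' \
          > /etc/apt/sources.list && \
        apt-get update && apt-get install -y curl
    EOF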
Yes, fantastic work. Downside is that snapshot.debian.org is extremely slow, times out / errors out regularly - very annoying. See also e.g. https://github.com/spesmilo/electrum/issues/8496 for complaints (but it's pretty apparent once you integrate this in your builds).
Yeah, but it's impossible to properly pin w/o running your own mirrors. Anything you install via apt is unpinnable, as old versions get removed when a new version is released; pinning multi-arch Docker base images is impossible because you can only pin on a tag which is not immutable (pinning on hashes is architecture dependent); Docker base images might get deleted (e.g. nvidia-cuda base images); pinning Python dependencies, even with a tool like Poetry is impossible, because people delete packages / versions from PyPI (e.g. jaxlib 0.4.1 this week); GitHub repos get deleted; the list goes on. So you need to mirror every dependency.
> Anything you install via apt is unpinnable, as old versions get removed when a new version is released
Huh, I have never had this issue with apt (Debian/Ubuntu) but frequently with apk/Alpine: The package's latest version this week gets deleted next week.
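Either way, the failure looks the same when it finally hits (version string invented for illustration):

    # A pinned install that worked last month, after the archive moves on:
    apt-get install -y curl=7.88.1-10+deb12u4
    # E: Version '7.88.1-10+deb12u4' for 'curl' was not found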
>Documentation is just plain AWFUL (as in: complete and technically accurate, but maddeningly obtuse)
Documentation is often just plain erroneous, especially for the new CLI and flakes, and not even in edge cases. I remember spending some time trying to understand why nix develop doesn't work as described and how to make it work like it should. I feel like nobody ever actually used it for its intended purpose. Turns out that by default it doesn't just drop you into the build-time environment like the docs claim (hermetically sealed, with the stdenv scripts available): it's not sealed by default, the command-line options have confusing names, and you need to fish the knowledge out of the sources to make it work. Plenty of little things like this.
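For what it's worth, what got me closest to the advertised behavior was clearing the inherited environment explicitly (a sketch; the flake reference is made up):

    # Don't inherit the caller's environment; start from the derivation's.
    nix develop --ignore-environment .#mypackage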
>In theory, Nix is idempotent and deterministic
I surely wish they talked more about the edge cases that break reproducibility. Things like floating point code being sensitive to the order of operations with state potentially leaking from OS preemption, and all that. Which might be obvious, but not saying obvious things explicitly is how you get people to shoot themselves in the foot.
> Things like floating point code being sensitive to the order of operations with state potentially leaking from OS preemption, and all that.
That’s profoundly cursed and also something that doesn’t happen, to my knowledge. Unless the kernel programmer screwed up, an x86-64 FPU is perfectly virtualizable (and I expect an AArch64 FPU is too, I just haven’t tried). So it doesn’t matter where preemption happens.
(What did happen with x87 is that it likes to compute things in more precision than you requested, depending on how it’s configured—normally determined by the OS ABI. Yet variable spills usually happened in the declared precision, so you got different results depending on the particulars of the compiler’s register allocator. But that’s still a far cry from depending on preemption of all things, and anyway don’t use x87.
Floating-point computation does depend on the order of association, in that nearestfp(nearestfp(a+b)+c) is not the same as nearestfp(a+nearestfp(b+c)), but the sane default state is that the compiler will reproduce the source code as written, without reassociating things behind your back.)
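You can see the non-associativity in one line with any IEEE 754 doubles (the exact decimal output here is a standard result, not platform luck):

    # (a+b)+c vs a+(b+c) with doubles: same inputs, different results.
    python3 -c 'print((0.1 + 0.2) + 0.3, 0.1 + (0.2 + 0.3))'
    # 0.6000000000000001 0.6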
That doesn't happen in a single thread, but e.g. asynchronous multithreaded code can produce values in arbitrary order, and depending on what you do with them you can end up with a different result (floating point is just an example). Generally, you can't guarantee 100% reproducibility for uncooperative code because there's too much hardware state that can't be isolated even in a VM. Sure, 99% of software doesn't depend on it or do cursed stuff like microarchitecture probing during the build, and you won't care until you try to package some automated tests for a game physics engine or something like that. What can happen inevitably happens.
We don't need to look for such contrived examples, actually; nixpkgs tracks the packages that fail to reproduce for much more trivial reasons. There aren't many of them, but they exist.
> We don't need to look for such contrived examples, actually; nixpkgs tracks the packages that fail to reproduce for much more trivial reasons. There aren't many of them, but they exist
Less than a couple of thousand packages are reproduced. Nobody has even attempted to rebuild the entirety of the nixpkgs repository and I'd make a decent wager on it being close to impossible.
It’s really not that bad. However, with a standard NixOS setup, you still have a tremendous amount of non-reproducible state, both inside user accounts and in the system. I’m running an “Erase your darlings” setup; it mostly gets rid of non-reproducible state outside my user account. It’s a bit of a pain, but then what isn’t on NixOS.
That setup uses Home Manager, so maybe it's not for you, but it's worth mentioning if we're talking about making all state declarative and reproducible. You have to use the Impermanence module and set up some symlinks to persistent home folders on a different drive or partition. But for making all state on the system declarative and reproducible, this is the best way AFAIK.
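Stripped of the Nix vocabulary, the mechanism is roughly this (paths invented; the Impermanence module generates the equivalent links and mounts for you):

    # Root is wiped at every boot; anything worth keeping lives on /persist
    # and gets linked (or bind-mounted) back into place:
    ln -s /persist/home/alice/.ssh       /home/alice/.ssh
    ln -s /persist/var/lib/tailscale     /var/lib/tailscale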
True, I think it's a more elegant setup than the ZFS version. Why actively roll back to a snapshot when ephemeral memory will do it automatically on reboot?
That said, I'll just mention that ZFS support on NixOS is like nothing else I've seen in Linux. ZFS is a first-class citizen on NixOS: painless to configure, and it usually just works like any other filesystem.
I use both Docker and NixOS at work. I've never had any of the problems you describe above. Docker is fine, though performance-wise it's not great on Macs. I love Nix because it's trivial to get something to install and behave the same across different machines.
The Nix docs are horrible, but I've found that GPT-4 is awesome at troubleshooting Nix issues.
I feel like 90% of the time I run into Nix issues, it's because I decided to do something "Not the Nix way."
Give Fedora Atomic (immutable) a try. At this point I have played around with and used pretty much every distro package manager there is, and I have broken all of them in one way or another, even without doing anything exotic (pacman, I am looking at you). My Fedora Kinoite is still going strong even with adding/removing different layers, daily updates, and a rebase from Silverblue. IMHO rpm-ostree will obsolete Nix.
You have to restart to boot into a new image. You use containers for stuff you don't need in your base distro, like CLI tools, and Flatpak for desktop applications.
> Documentation is just plain AWFUL (as in: complete and technically accurate, but maddeningly obtuse)
That has been the case for as long as I can remember. I gave up on Nix about 5 years ago because of it, and apparently not much has changed on that front since then...
I never tried going all-in on Nix, but I don't think it's an all-or-nothing proposition. In my case, I use Ubuntu on my personal notebook and I wanted to prototype something with Elixir. The distro package is versions behind the latest, so I can't use Phoenix 1.7 with it. The solution was simple: there's a Nix package for the latest version, so I simply used nix-shell. Bonus points for it including VSCode, so I didn't have to install that on my personal machine. So for the price of running `nix-shell -p vscode erlang elixir` I got all I needed with very minimal fuss.
I've been a NixOS user for years and I generally had the opposite problem: the latest version of the package you want is not available, but hey, here's a version from months ago - or just build it yourself (which is not hard; oftentimes updates work fine with no build changes, you just point at a different version).
Also, rebuilding everything at every update takes forever (I had a few nix-shells with AI dependencies that would take hours to upgrade).
I love the concept of nix but I'm back to Arch, binary bleeding edge packages and AUR for less supported stuff.
I recently faced a similar hurdle with Nix, particularly when trying to run a .NET 8 AOT application. What initially seemed like it would be a simple setup spiraled into a plethora of issues, ultimately forcing me to back down. I found myself having to abandon the AOT method in favor of a more straightforward solution. To give credit where it's due, .NET AOT is relatively new and, as far as I know, still has several kinks that need ironing out. Nonetheless, I agree that, at least based on my experience, you need a solid understanding of the ins and outs before you can be reasonably productive using Nix.
.NET AOT really is not designed for deployment, in my experience - for example, the compilation is very hard to do in Nix-land, because a critical part of the compilation is to download a compiler from NuGet at build-time. It's archetypical of the thousand ways that .NET drives me nuts in general.
It's intended for 'cloud-native' deployments, as I understand it, so I concur that it's quite disappointing. The concept of downloading compilers via NuGet doesn't sit well with me either. However, I've observed performance enhancements in applications compiled AOT, and I remain optimistic that future versions of .NET will bring further improvements.
I’m not sure exactly why this is being downvoted. It seems pretty fair to want your container builds to not fail because of the “chaos” with docker images and how they change quite a lot. This isn’t about the freedom to build how you want, it’s about securing your build pipelines so that they don’t break at 4am because docker only builds 99% of the time.
I’ll use docker, I like docker, but I can see the point of how it’s not necessarily advantageous if stability is your main goal.
It's more complicated than that. Reproducible builds help build confidence that your build process isn't compromised.
Sure, your compiler, your hardware, or your distro might be compromised, but if you follow the chain all the way through and validate that version X does indeed result in SHA Y, there are now fewer things we're blindly trusting.
It also helps with things like rolling back to an earlier version when you no longer have the binary kicking around, without having to revalidate that binary.
If you're not getting the same SHA on different hardware, weeks apart, then even if it's good enough for you, it's not reproducible.
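Concretely, the test is as simple as it is strict (artifact path invented):

    # Build on machine A today and on machine B a few weeks from now:
    sha256sum build/output.bin
    # The two hashes must match bit for bit, or the build isn't reproducible.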
You complain about the documentation, and the first thing I wonder is whether you’ve tried using one of the prominent chatbots like ChatGPT or Claude to help fill in the gaps in said documentation? Maybe an obvious thing to do around here, but I’ve found they fill documentation gaps really well. At the same time, Nix is so niche there might not have been enough information out there to feed into even ChatGPT’s model…
>I've already broken both of them enough that I had to reinstall from scratch in the past (yes yes - it's supposed to be impossible I know)
Could you mention a bit about how they broke? I'm curious to see how that state looks, as from my perspective switching to a previous configuration seems to cover everything.