I found that a 'refreshing' read - reveling in the old-school sysadmin wizardry, after so many years of livestock-over-pets thinking. This server sounds like the ultimate pet.
It's just nostalgia though, I can't imagine managing systems that way (even at the less skilled level I had) any more.
But at the same time it won't stop me admiring those skills - it's like watching a traditional craftsman using old tools.
I can't imagine managing lots of such systems, but a handful sounds doable, which is all you need sometimes.
I'm currently speccing out a system for an internal application that processes huge amounts of data. So far the plan is to just use standard Postgres on Debian on a huge "pet" server with hundreds of GBs of RAM and redundant 4TB NVME drives and call it a day. It's sized for peak load so no need for any kind of scaling beyond that, and it's a single machine using well-tested software (and default configs whenever possible) so maintenance should be minimal (it'll also be isolated onto its own network and only ever accessed by trusted users so the need for timely updates is minimal too).
It's doable, believe me. Linux was always reliable, but its reliability improvements didn't stop over the years, so keeping a lot of servers up to date and running smoothly is easier than it was 15 years ago.
We have a lot of pet and cattle servers. Cattle can be installed in batches of 150+ in 15-20 minutes or so, with zero intervention after the initial triggering.
Pets are rarely re-installed. We generally upgrade them over generations, and they don't consume much time after the initial configuration, which is well documented.
I prefer to manage some of them via Salt, but that's more an exercise in understanding how these things work than a must.
In today's world, hardware is less resilient than the Linux installation running on top of it, so if you are going to build such a monolith, spec it to be as redundant and hot-swappable as you can. Otherwise Murphy may find ways to break your hardware in creative ways.
They should only be years behind given very unlucky timing, ie, upstream releases a new version right after the stable feature freeze.
Whatever the latest version is at the time of Debian's feature freeze, that will be the version for the life of that Debian release. That's basically the point of Debian—the world will never change out from under you.
> They should only be years behind given very unlucky timing, ie, upstream releases a new version right after the stable feature freeze.
Literally the first package I looked up is shipping a January 2020 version in bullseye despite freezes not starting until 2021. And yes, there were additional stable releases in 2020.
Stable is released every 2 years, so <=2 at most, and yes on purpose? Isn't that kinda the whole point of releases like Windows LTSC, Red Hat, etc.? That you actively do not want these updates, only security fixes?
Backporting security fixes is forking the software though.
There have been instances where upstream and the Debian frozen version have drifted far enough apart that the security backport was done incorrectly and introduced a new CVE. Off the top of my head, this has happened for Apache more than once.
I for one appreciate the BSD "OS and packages are separate" approach, so my software can be updated while my OS stays stable.
For Apache I never heard about that. The issue I heard about instead was that Debian organises/manages Apache quite differently, nothing about version drift.
This will be hosted by Hetzner, OVH or a suitable equivalent, so the “SLA” is based on assuming that they’ll rectify any hardware failures within 2 days. In this case I’ll gamble on backups, with the idea that in the worst-case scenario it takes us less than an hour to rebuild the machine on a different provider such as AWS.
The machine itself is only really required for a few days every quarter, with a few days’ worth of leeway if we fail. Therefore I feel this is an acceptable risk.
That sounds like a fun project, but you definitely want to still automate its setup with something like Ansible, SaltStack, Puppet, etc.
Because someday you'll get a new pet with much more CPU power that you'll want to migrate to. Or maybe, rather than upgrading to a newer version plus reconfiguring disks, etc., it's just easier to move to a new system. Or the system just plain dies, the DC burns down, etc., and you need to quickly use DR to get it set up on a new system. Having all those configs, settings, applications you install, etc. defined in a tool like Ansible, and then checked into git, is just about priceless, especially for pets or snowflakes.
I agree, learning Ansible (or equivalent) is on my todo list.
In the meantime, a document (and maybe a shell script) with commands explaining how to reinstall the machine from an environment provided by the hosting provider (a PXE-booted Debian) is enough, considering the machine is only critically required for a few days every quarter and needs only software that's already packaged by the distribution.
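Even the shell-script version of that document stays short. Roughly along these lines (the package list, paths and backup target here are placeholders, not our actual setup):

    #!/bin/sh
    # Rebuild sketch: assumes a freshly PXE-installed Debian from the provider.
    set -eu

    # Everything comes from the distribution archives.
    apt-get update
    apt-get install -y postgresql nginx borgbackup   # placeholder package list

    # Restore configuration kept in the same git repo as this script.
    cp -a etc-overrides/. /etc/
    systemctl restart postgresql nginx

    # Restore data from the latest backup (placeholder repository URL).
    # borg extract ssh://backup-host/srv/borg-repo::latest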
I was amazed how little work was required, then I thought about what "exim4 has a new taint system" might have looked like if you tried the upgrade not knowing what exim4 was.
When you manage servers for this long, all knowledge starts to compound fast. Many scary looking messages transform to, "Oh, you need X, Y, Z? OK. Let's do it".
The "livestock vs pets" comparison seems off. It's assumed with the livestock you can lose one server and don't care much – though with real animal livestock if your cow gets ill you don't kill it and order more healthy cows.
And the comparison also assumes you cannot kill the "pet" server. I have many pet servers with carefully chosen names, but I still can painlessly kill them and redeploy with the same name because I have Ansible or SaltStack code to do so
The term originates from CERN mostly, which does HPC stuff in its data center. We are also an HPC center, and it's very fitting.
The cattle servers are generally HPC worker nodes. Your users don't notice when a cattle server goes offline. Scheduler reschedules the lost jobs with high priority, so they restart/resume soon.
Pet servers, on the other hand, are generally coordinators of the cattle, like shepherd dogs, keeping them in order or giving them orders. Losing them really creates bigger problems, and you need to tend to them quickly, even if you have failovers, etc. (which we certainly have).
You can re-deploy a pet server pretty quickly, but they generally have an uptime of ~2 years, and reinstallation periods of 5-6 years, if ever. We upgrade them as much as we can.
> The "livestock vs pets" comparison seems off. It's assumed with the livestock you can lose one server and don't care much – though with real animal livestock if your cow gets ill you don't kill it and order more healthy cows.
Honestly, in most environments, it's like that. You don't delete a postgres server because a component crashed weirdly. You take a look at that component and see if there is a deeper reason for that crash and if there is a more important root cause to fix. That would prevent issues on a lot of other systems.
However, it's important to have the option to delete and rebuild the server. For example, we had root drive corruption on a server, caused by some storage issues at the hoster, and binaries would crash in weird ways. At that point I probably could have fixed the server by syncing binaries from other systems and such, but it was much easier to just drop it and rebuild it.
And that's very much how larger groups of animals are handled.
> And the comparison also assumes you cannot kill the "pet" server. I have many pet servers with carefully chosen names, but I still can painlessly kill them and redeploy with the same name because I have Ansible or SaltStack code to do so
Those don't sound like pets. For historic reasons, I have systems on which external agencies and consultancies have done things outside of the configuration management that I don't know about. And given the house of cards piled up on some of these systems, I don't think anyone knows how to redo them. That's a pet. Once I delete that system, it never comes back the same way.
Sorry to derail, but I really am irked by this cattle VS. pets analogy.
I know that a lot of meat is industrially produced with little regard for the animals' well-being (livestock). And I know quite a lot of farmers who grow their herds organically and name every single one. Quite a few farmers told me they would never eat meat if they didn't know the name of the animal it came from. They know the character of every single animal in their herds.
So to me this analogy only works as long as we disregard the fact that these animals have unique characters. And that imho counters the analogy.
I prefer bots VS. pets to differentiate the two sides.
Unique character has nothing to do with it. They may have names but they're still livestock - raised en masse to be sold. Pets don't just have names, they're akin to family members. Livestock don't hang out on the couch with you while you watch TV.
Production systems are very much like livestock - spun up to serve a purpose. Getting overly attached as you might with a personal pet system is probably a mistake that will reduce your efficiency.
> So to me this analogy only works as long as we disregard the fact that these animals have unique characters. And that imho counters the analogy.
No, no... You are not mistaken. Every server in your "cattle" fleet has its own character after some point. Some of them eat through disks, one of them has a wonky Ethernet port that's not quite broken, another one is always a little slower than the rest.
On a more serious note, I really understand what you're saying, but I'd rather not discuss it here; the above paragraph really holds, though.
But the farmer still kills the cow just the same for slaughter, and another one replaces it. "Please meet Sally2," I say to my 10-year-old pet dog. Note, the farmer doesn't have any bots, and even the tractor is a pet.
Analogy is fine and I prefer to get these points of view on Portlandia.
I actually had a similar upgrade to do that I've been postponing for too long. I ended up rebuilding all functionality on the server on a different server -- this time also creating the necessary automation for both provisioning and migrating, so I can rebuild it much cheaper next time.
Did it take longer than slogging through a dist-upgrade? Probably. Does it cause some more inconvenience for users, like changed host keys and upgrades of software that might not have needed upgrades? Yup, definitely.
Does it let me sleep well at night knowing I'm no longer dependent on a specific machine staying un-broken in the near future? Sure as hell.
----
(I also ended up splitting apart the independent functions the machine provides onto multiple machines, one for each purpose. This increases my exposure to machine failure, but simultaneously limits the blast radius. That's a good trade-off, for the most part.)
What has helped me most of all is just keeping a log - nothing special, maybe a readme.txt - of what I did when creating a system or configuring a package.
The HN title is misleading as it was actually upgraded from Debian 8 "Jessie" to Bullseye.
Debian 8 "jessie" support reached its end-of-life on June 30, 2020, five years after its initial release on April 26, 2015.
However that does mean it was potentially vulnerable to critical vulnerabilities for more than two years whilst world+dog used it to download PuTTY for secure access to their servers. Eek
Would be good to know how this was managed if at all.
The original title is: chiark’s skip-skip-cross-up-grade
Since that isn't very informative, I tried to simplify the first sentence: Two weeks ago I upgraded chiark from Debian jessie i386 to bullseye amd64, after nearly 30 years running Debian i386. This went really quite well, in fact!
Realizing now the feat is going from i386 to amd64, rather than just an old Debian version upgrade.
chiark hosts Simon Tatham’s homepage (including PuTTY, his puzzle collection, his notes on C preprocessing hacks, all of which often appear on this site) and my homepage (https://dotat.at) amongst many other things.
Too bad dreamwidth hosted blogs are behind cloudflare and set to block non-corporate (or older) browsers. Try to read this post from a Debian Jessie system and you'll just get a cloudflare page that doesn't work.
The page loads fine in Ladybird[1] on Arch. It's the browser purpose-built for SerenityOS[2], using an in-house HTTP/JS/TLS engine that hasn't matured to the point of practical usability yet. If I were a site administrator using some kind of weird metric to block a browser, this thing would definitely go on the blacklist.
As for a more common uncommon browser, GNOME Web (WebKit) also works fine.
Whatever is causing you to get blocked, it's not the browser engine you're using. Check your plugins, antivirus, MITM engines, and whatever else messes with your connection. It could also be a simple IP block because of a bad IP neighbour or a shared CGNAT server.
I tried via 3 different routable IPv4s from different netblocks. I tried the same browser on 3 different physical computers and OS installs.
I get that "It works for me." for some of you with non-corporate browsers. But please understand "It doesn't work for me." and it's not because I have some weird antivirus or packet mangling or a bad IP. It's because Cloudflare's heuteristics are biased against browsers that implement some, but not the latest, JS features. That's cloudflare and dreamwidth's fault, not mine, and they are in the wrong.
Blocking is bad by default and they must justify and adapt, not the users.
"It works for me" is as useless as "it blocks all non-corporate browsers".
It doesn't block all non-corporate browsers. It apparently blocks your browser, whatever that may be, running from your system, communicating from your network. I don't know what happened to make Cloudflare hate your browser, but my blanket statements are as useful as yours.
They seem to be blocking elinks from non-residential (server) networks. I don't know why so I don't know if it's warranted or not. With the amount of bots Cloudflare has to deal with and the extreme minority of elinks2 users, I can imagine blocking them is a worthwhile tradeoff.
Either way, Cloudflare only provides the defaults, the website operator is responsible for its configuration. In my opinion, a website should be allowed to inconvenience the long tail of weird visitors for any reason they want. I understand that you disagree, but you'll have to convince support@dreamwidth.org if you want to improve the situation, not me.
Thanks for your comment.
I realized now that archive.org's "archive string" (here 20220719195142if_) is updated automatically. So if I use this string + some other URL, then I get redirected to a current snapshot of that other site, e.g.

I suppose the string consists of date + time in hhmmss format + if_? Anyhow, looks like arbitrary strings (e.g. 19991230225818if_) also get redirected to the next existing snapshot counting from that string. This is really nice and simple for text browser scripts.

Is there some straightforward way to list all of archive.org's snapshots (of a particular site) without a javascript-enabled browser?
FWIW, below is a quick and dirty script I use for a variety of purposes, such as accessing www search result URLs so I do not have to (a) use sites that do not support TLS1.3, (b) use sites that require SNI or (c) use DNS. I will call this script "www".
NB. I used curl here because this is an example for HN. That does not mean I am a curl user.
I also have a small script I use for the Common Crawl archives. They also use CDX but the results are WARC files compressed with gzip. I wrote a small program in C to extract the gzip'd results after HTTP/1.1 pipelining. For retrieving results without pipelining (i.e., many TCP connections), I modified tnftp to accept a Range header.
I don't know how I didn't think that Wayback Machine might maybe also have an API. :/ Also, lots of interesting stuff for things like the above on Common Crawl: https://commoncrawl.org/the-data/examples/
I guess my text-only browsing just got a bunch of extra batteries (thus far simply w3m + a few wget-etc scripts).
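For the record, the CDX endpoint seems to be exactly what I was after - plain-text (or JSON) listings of snapshots, no JavaScript needed. A minimal example, with the extra query parameters going from memory:

    # One line per capture: urlkey, timestamp, original URL, mimetype, status, digest, length.
    curl 'https://web.archive.org/cdx/search/cdx?url=dotat.at&limit=5'

    # JSON output and date ranges are also supported.
    curl 'https://web.archive.org/cdx/search/cdx?url=dotat.at&output=json&from=2015&to=2020'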
The number is a timestamp, and the if_ just hides the toolbar; it's optional. (It presumably stands for iframe, since that's what it's used for: rewriting iframe src attributes so they don't show the toolbar.)
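So a full snapshot URL is just the timestamp (YYYYMMDDhhmmss) plus the target; for example (placeholder target URL):

    # Redirects to the nearest snapshot; drop the if_ to get the page with the toolbar.
    curl -sL 'https://web.archive.org/web/20220719195142if_/https://example.com/'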
I’d want confirmation that it’s true first. People can configure custom UA blocking rules on Cloudflare but based on this I’d bet the problem is some custom configuration or plug-in interfering with the normal human activity challenge.
I think that it would be better if I could use 64-bit kernel with 32-bit applications.
32-bit applications are better in almost every aspect: they use smaller ints and pointers and use less memory, they can be used on both 32-bit and 64-bit systems, and they don't waste energy on computing extra bits. The only disadvantage is that they cannot use more than 2 GB of RAM, but almost every application I use doesn't need that much RAM. On x86, 32-bit apps also have slightly fewer GPRs.
So the perfect solution would be a 64-bit kernel with mostly 32-bit applications, except for those cases when you need more than 2 GB of RAM.
But instead many developers refuse to provide 32-bit versions of applications for Linux, and provide worse 64-bit versions. And Linux distributions compile everything for 64-bit, even if the application is never supposed to use even 1 GB of RAM.
So everything is wrong with 64-bit support in Linux.
You can multi-arch a distribution, also there's x64_32 (aka x32) which does exactly what you propose [0].
Running the processor in x64 mode allows wider swaths of data to be processed per clock, which really allows higher performance in data-heavy/CPU bound cases, but horses for courses, definitely.
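Enabling multi-arch on Debian/Ubuntu is a one-liner plus an apt update; for example (the library is just an example):

    # Add i386 as a foreign architecture, then :i386 packages become installable.
    sudo dpkg --add-architecture i386
    sudo apt update
    sudo apt install libc6:i386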
I use multi-arch, but it is buggy. Several times, when trying to install a 64-bit library, apt suggested removing tens or hundreds of packages to satisfy the dependencies. That included packages that I had installed manually, and it is strange that apt tried to remove them.
Also, when using multi-arch, you can accidentally install a 32-bit kernel and then half of applications stop working.
> note that this applies only to x86, on ARM 64-bit mode doesn't provide more GPRs and is even more useless
It does. The 32-bit ARM architecture has 15 GPRs (plus the instruction pointer which is sort of treated as a GPR by many instructions), while the 64-bit ARM architecture has 31 GPRs (plus one which can either be a constant zero register or the stack pointer depending on the instruction).
> Another option would be to compile apps in 64-bit mode, but use 32-bit wide ints and pointers. This way the RAM could be saved too.
There's a third mode for x86 Linux which does exactly that, but it's never been popular.
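If your toolchain and kernel were built with x32 support (CONFIG_X86_X32 plus the x32 runtime libraries), it's just a compiler flag; a quick sketch:

    # x32 ABI: 64-bit instructions and registers, but 32-bit pointers and longs.
    gcc -mx32 -O2 -o hello hello.c
    file hello   # an ELF 32-bit binary targeting x86-64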
Everything you said about memory space & bandwidth is 100% on point.
However:
arm arch 32 has 14 general purpose registers (including the link register).
arm arch 64 has 31 general purpose registers (also including the link register).
I also doubt there are real power savings in modern out of order processors. The few transistors saved in the register file and ALUs are just dwarfed by everything else. Heck, look at how much of a modern soc like the M1 is used by the compute complex compared to everything else.
Knock on wood I’ve had tremendous success with big upgrades like this on Debian. It’s not the most glamorous distro but the stability is well worth it.
My go-to base VM is Debian 11. I usually use it for my base Dockerfile unless there’s a compelling reason not to. I tend to run Ubuntu AMIs on EC2 but methinks I should give vanilla Debian a shot there too.
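i.e. something as boring as this, which is kind of the point (the extra packages are just examples):

    # Minimal Debian 11 base image; add only what the application actually needs.
    cat > Dockerfile <<'EOF'
    FROM debian:11-slim
    RUN apt-get update \
        && apt-get install -y --no-install-recommends ca-certificates curl \
        && rm -rf /var/lib/apt/lists/*
    CMD ["bash"]
    EOF
    docker build -t my-base .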
I used to constantly upgrade Linux (usually Debian/Ubuntu, but also Fedora at some point) since 1994. It was an exercise I hated.
Some 5 or 8 years ago I discovered Docker and it is truly a game changer for self-hosted stuff. I do not care about the OS anymore; I can upgrade/reinstall in a breeze. I actually now run Arch to be independent of upgrades and just roll. If the thing breaks it is not a problem - I can reinstall and have everything running in an hour.
I do not have documentation, but I do disaster recovery tests from time to time. This is how you can try it out:
- download the ISO of a linux distribution, Arch is good because you have continuous updates (there is no "version")
- start it on a VM engine (VirtualBox, Hyper-V on Windows, VMWare, ...)
- from that point on - START DOCUMENTING
- try to install on Docker a program you know that is not too complex network-wise, or just start with "hello-world" (https://hub.docker.com/_/hello-world)
- you will find, when running "docker pull hello-world", that Docker is not installed
- install docker on Arch according to Arch docs. DOCUMENT that step
- you will learn the basics of docker networking, read some docs or just try until you have a curl call working
- at that point you can try a program you know (nextcloud, syncthing, ...), pulling it from Docker Hub and making it work. Pay attention to two things: the network and the persistent volumes (I recommend, at least for the start, the file-based ones, not the docker ones)
- grab a beer, you are 90% done, good work
- have a close look at Caddy - this is a web server similar to apache, nginx but MUCH much better. So much better that I have no words.
- you will use it as a proxy server for your containers, so that you can get to them via https://nextcloud.yourdomain.com (there's a condensed sketch after this list). It is worthwhile to get your own domain even if you do not expose anything, because things are much easier that way (Caddy will manage the TLS part)
- now learn docker-compose and add all your dockers to it (it is a YAML description of your containers).
- add backup; this will be easier if you add this program on the OS itself (it can certainly be in a container, but I preferred having that part independent). I recommend Borg, despite a few poor design choices (that are not likely to bite you at that point)
TADAM! you are done.
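To give an idea of how little is involved, here is a condensed sketch of the end state (Arch package names from memory, and the domain/service are obviously placeholders):

    # Docker + Caddy from the Arch repos.
    sudo pacman -S docker docker-compose caddy
    sudo systemctl enable --now docker

    # Sanity check.
    docker run --rm hello-world

    # Example service with a file-based persistent volume.
    docker run -d --name syncthing \
        -p 127.0.0.1:8384:8384 \
        -v /srv/syncthing:/var/syncthing \
        syncthing/syncthing

    # Caddy as the HTTPS reverse proxy; it handles the TLS certificates itself.
    sudo tee -a /etc/caddy/Caddyfile <<'EOF' >/dev/null
    syncthing.yourdomain.com {
        reverse_proxy 127.0.0.1:8384
    }
    EOF
    sudo systemctl restart caddy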
You are independent of the OS; if you want to install Fedora or whatever, it just does not matter because i) all your programs are maintained by someone else (thank you to the maintainers, it is nice to donate sometimes) and ii) your backup is data that is easily pluggable back into a new instance of the software.
Testing new programs is super easy (you just add them to the docker-compose YAML).
I truly recommend you try with a VM and you will quickly realize it is time to reformat your server and put everything under docker :) AMA if you have questions.
PS. The documentation you wrote (likely 10 lines or so) is now your DRP, you can test it from time to time just to make sure you can recover everything (= the data) from the backup
Docker and docker-compose also make it really easy to store all of the data in a single location, which opens up easy snapshot management with something like ZFS or Btrfs, and backups that way.
This is a thing I've found throws people off - you don't need the OS at all, just those volumes.
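E.g. if the compose files and bind-mounted volumes all live under one Btrfs subvolume (the paths are just examples), backups become:

    # Read-only snapshot of everything the containers persist.
    btrfs subvolume snapshot -r /srv/apps /srv/snapshots/apps-$(date +%F)
    # Then ship the snapshot off-machine with btrfs send, borg, rsync, whatever.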
Your file structure is interesting, I like the fact that the data and composition are together.
I actually went away from separate docker-compose files and use a large one (and wrote a small utility to start, stop, pull, etc.)
I will have a deeper look at that; if I manage to combine this with the Caddy configurations, that would be pretty much awesome :) EDIT: Caddy supports globs, this is won-der-ful. I will switch to your configuration and rewrite my tool accordingly (adding a way to bootstrap the docker-compose file and the Caddy config).
> If you need 7 services together for that feature to work right, write them together. If you need 3 for the next one, group those.
After giving it a thought, this is probably the biggest drawback of that approach: when you have services that other services depend on (say, a db), then you cannot take that into account in the docker-compose file (as the other services will have their own docker-compose file, and the requires element points to a service within the same one).
I can live with that, though - the fact that the docker-compose file + Caddy config (for the reverse proxy) and the data are all together is fantastic and allows for easy bootstrapping.
I've never used Caddy, and I'm not sure how dynamic it is. I use Traefik for reverse proxy so that each docker-compose configures the proxy for the application in that folder using labels. docker-compose up/down adds and removes the proxy config for that application.
There's an externally defined Docker network for the reverse proxy and web applications.
Each app should run its own self-contained everything. App, DB, Redis, you name it.
I use the term "application" fairly vaguely. The number of containers each directory contains really depends on the application.
Real world example. These are each a single docker-compose file:
Nextcloud, has containers for application, PostgreSQL DB, Redis, Draw.io, and Collabora CODE. I use all of this exclusively from Nextcloud so it made sense to bunch them together. Nextcloud, Draw.io, and Collabora are all added to the reverse_proxy network in addition to the one docker-compose automatically creates.
Gitea has the application container and its own PostgreSQL container. Again, the Gitea application is added to the reverse_proxy network.
This simplifies it when I want to back up or move between machines. It also makes it possible to run different DB versions should you run into incompatibilities. It kind of sounds like you're trying to run a single DB server for everything?
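For the curious, the shared proxy network is created once outside compose, and the per-app labels are the only glue. Roughly this (Traefik v2 label names from memory; host and port are placeholders, and in practice these live in each docker-compose file rather than a docker run):

    # One-time shared network that Traefik and the web-facing containers join.
    docker network create reverse_proxy

    # docker-run equivalent of the labels in the Gitea compose file.
    docker run -d --name gitea \
        --network reverse_proxy \
        -l traefik.enable=true \
        -l 'traefik.http.routers.gitea.rule=Host(`git.example.com`)' \
        -l traefik.http.services.gitea.loadbalancer.server.port=3000 \
        gitea/gitea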
> I use Traefik for reverse proxy so that each docker-compose configures the proxy for the application
I've used Traefik v1 and v2 and I did not like it. This is of course a personal opinion and I know it has its strengths. The fact that the config is through labels in docker-compose was putting me off, as well as some other things.
But I know it is good and used a lot.
Caddy is a web server that works great and has a well-thought-out configuration (especially v2). It is not dynamic by default (but there are some images that bring it dynamism à la Traefik, and a REST API).
> Each app should run its own self-contained everything. App, DB, Redis, you name it.
It depends on the setup. In my home environment, running several backends such as MariaDB, PostgreSQL etc. is too much. Yes, it is the right approach (including the fact that you do not have dependencies) but the mileage varies.
(ah, you've edited your answer to add some points so some of my comment is redundant)
Sorry for the edits. One time I'll hit reply and have it say what I wanted it to, but this was not that time.
The benefits of choice! I use Traefik because the config is through labels in docker-compose.
I'm also running it at home. Ryzen 3 2200G (4c/4t), 32GB RAM, $100 Intel NVMe. Running roughly 40 containers in my docker VM, plus a few extra VMs. It's enough for my family and a couple of friends.
You'd be amazed at how low-impact a small PostgreSQL or MariaDB instance is. I/O is the largest bottleneck. You can feel an HDD holding you back with a bunch of DBs churning simultaneously.
Of course, YMMV. If we're talking about a Raspberry Pi, disregard everything I've said. I'd run as few DB instances as possible too.
No, you are right. The curse of premature optimization.
I have an older Skylake with 25 GB RAM or so. Load is 0.51, RAM is at 5 GB or so... Plenty of space to grow, but I am rather new to Docker (~8 years, as opposed to 30+ in standard Linux) so I did not do too much research.
I will go for really independent, contained blocks, it has nothing but advantages.
It gives me control of each application individually.
I can snapshot, back up, and even migrate each application between machines easily.
It also helps out when I have an application that I need to control versions in. Using the latest tag can be fun and exciting (read: dangerous and prone to breakage), so if the tag is in the docker-compose file I immediately know what version that snapshot was running if I need to look back.
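In practice that just means pinning an explicit image tag in the compose file and upgrading deliberately; something like (the version shown is only an example):

    # image: nextcloud:27-apache   <- pinned in docker-compose.yml instead of :latest
    docker-compose pull
    docker-compose up -d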
With an ansible playbook and all my services running through docker, it's trivial to set everything up again within the hour with minimal effort compared to years ago when I did services purely through systemctl. In the future I'd like to build a DIY PiKVM and have ansible do the entire installation of the os too, but that's not really necessary for my home server so it's on the back burner.
I used to use ansible and salt but I realized that all configurations are in the persistent files so there is nothing to configure.
Now, I run services and not elastic containers - in the sense that once I have Syncthing or whatever set up, it stays that way. I do not need to spin up containers and configure them manually.
I like reading these kinds of stories because it's impressive to see system administration pros out there.
For me though (and I've never had a server this old), container images that have a declarative build structure that I can look at and modify in pieces is such a blessing to have.
I've had to wrestle with some weird config stuff that someone changed once without even noticing too many times.
I'm a livestock kind of guy, but I respect those with pets :)
I’ve been upgrading the same Debian VPS since 2009 (starting with Etch IIRC, currently on Bullseye as well), with very few issues. Merging configurations like for Exim is usually most of the work, and it’s often not even strictly necessary.
I like that this included a cross-grade from i386 to amd64. I wonder if anyone has ever migrated an install across different CPU types, like x86 to Arm. I guess it would be easyish with multiarch, qemu-user-static and the new crossgrading tool.
Huh, thanks for this link! Very interesting read, did not know tooling like this existed (I assumed that crossgrading in the post was a manual/unsupported process).
I think the crossgrading in the post was definitely a more manual process than the automated crossgrade tool, but yeah the tool is relatively new in Debian terms.
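For anyone curious, the manual route starts out looking deceptively simple; very much a sketch, since the hairy part is crossgrading dpkg/apt and the essential packages themselves:

    # Add amd64 as a foreign architecture and boot a 64-bit kernel first.
    dpkg --add-architecture amd64
    apt update
    apt install linux-image-amd64
    # ...reboot, then work through crossgrading dpkg, apt and the rest of the packages.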
As a Debian user for 20+ years, I've been impressed how well the upgrades of the `stable` major versions go for my workstations. (Servers just get rebuilt.)
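The routine is pleasantly boring these days; roughly this (always read the release notes first, since e.g. the security suite line changed format in bullseye):

    # Point sources at the new release, then upgrade in two stages.
    sed -i 's/buster/bullseye/g' /etc/apt/sources.list
    apt update
    apt upgrade --without-new-pkgs
    apt full-upgrade
    reboot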
“recommendation to change the meaning of #!/usr/bin/python from Python 2, to Python 3.”
This isn’t exactly true. The recommendation was for a long time that it should alias python2, later it changed to be ambiguous and up for distros to decide, which is possibly even worse, but I guess good in the long run once py2 gets completely eliminated. It can be found in https://peps.python.org/pep-0394/ Only way to be sure is to use more specific shebang or a venv, which is a good idea anyway. Especially when trying to run some really old sw, like in tfa, that is not up to date with other packages.
In its current state, Debian does not contain a /usr/bin/python file, and since 2.7 was eliminated from the system a relatively long time ago, everything is Python 3.
If you are really in dire need, you can install "python-is-python3", which just symlinks /usr/bin/python to /usr/bin/python3. However, no package can depend on it in Debian; it's strictly verboten.
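i.e. it's purely a local convenience (and an explicit python3 shebang is still the safer habit):

    sudo apt install python-is-python3   # just a /usr/bin/python -> python3 symlink
    python --version                     # now reports a 3.x version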
I stopped using debian for a while after a botched upgrade (bo->hamm, some sort of libc4->glibc issue) and switched away. But that was almost 24 years ago, so I guess this system predated that.
One mistake can really turn someone away for good, but also wondering if you have had another poor experience with whatever you (initially) replaced Debian with?
Before debian I used TAMU distribution. After Debian I switched to Red Hat 5.something and stuck with it through the extremely weird Red Hat 9 era, which had its ups and downs. Next I switched to ubuntu and have mostly stuck with it, mainly because the packages are fairly well done and it gets an LTS every few years. I have dabbled with debian (missing some features in ubuntu), and several other OSes, but these days, i stick with what represents the least-effort path for dev compatibility (ubuntu). I admire arch but it's not the standard.
The article says "chiark’s OS install dates to 1993, when I installed Debian 0.93R5, the first version of Debian to advertise the ability to be upgraded without reinstalling." So that's probably the starting point.
On the Intel side, Debian will not run on anything without the i686 instruction set. In theory this should include the Pentium Pro / Pentium II and up.
You can try it for yourself without the hardware, grab https://pcem-emulator.co.uk/index.html, set it up for a Pentium II with contemporary hardware and see if it works. It'll be slow as molasses but it should boot!
It's listed as i386, but current Debian actually requires a more recent instruction set:
> 2.1.2.1. CPU
> Nearly all x86-based (IA-32) processors still in use in personal computers are supported. This also includes 32-bit AMD and VIA (former Cyrix) processors, and processors like the Athlon XP and Intel P4 Xeon.
> However, Debian GNU/Linux bullseye will not run on 586 (Pentium) or earlier processors.