How safe is a dist-upgrade from 7.8 to 8?

corford · on April 25, 2015

I'd wait a month or two for edge cases to be worked out and then go for it.

If you have a fairly vanilla 7.8 system (i.e. no source compiled stuff on there, mainly just big name packages etc.) then you should be fine doing it now or in a few days.

We updated some simple 7.8 installs to the rc jessie release a month or so ago and everything went fine apart from an obscure bug with monit and inherited umasks. Got around that by manually installing the sid deb of monit.

Edit: the main thing you'll want to do is go read up on systemd ahead of upgrading as it's quite a change and there are still some wrinkles ('systemctl daemon-reload' is a new command we've had to use quite a bit).

pas · on April 25, 2015

Well, if you only have one 7.8 machine lying around, then fire up a VM, replicate your setup (best is to stop the machine, tar-gz the whole thing, unpack it in the VM, boot from a rescue disc and install grub[2]; but if you don't want to go that bulletproof, then just copy sources.list [+ .d], install the same packages, copy /etc, reboot a few times) and try the upgrade.
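For the sources.list step in the test VM, here is a minimal Python sketch (not something the commenters posted) that switches the suite codename on active `deb` / `deb-src` lines. It assumes the classic one-line format with at most a single `[option=value]` token, and the codenames `wheezy` and `jessie`; the function name `retarget_sources` is made up for illustration.

```python
def retarget_sources(text, old="wheezy", new="jessie"):
    """Return sources.list text with the suite field switched from old to new.

    Assumption: one-line-style entries ("deb [opts] URI suite components").
    Comment lines and anything else are passed through unchanged.
    """
    out = []
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] in ("deb", "deb-src"):
            # Skip an optional single-token "[option=value]" field before the URI.
            uri_idx = 2 if len(fields) > 1 and fields[1].startswith("[") else 1
            suite_idx = uri_idx + 1
            if len(fields) > suite_idx:
                suite = fields[suite_idx]
                # Match "wheezy" itself plus forms like "wheezy/updates" or "wheezy-updates".
                if suite == old or suite.startswith((old + "/", old + "-")):
                    fields[suite_idx] = new + suite[len(old):]
                    line = " ".join(fields)
        out.append(line)
    return "\n".join(out) + "\n"
```

Only the suite field is touched, so a mirror URL that happens to contain the old codename is left alone. Third-party entries in sources.list.d may not publish a suite under the new codename at all, so those still need checking by hand.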

We've a lot of 7.x instances (bare metal and VMs both) and a few that have been running jessie for about 2 months. The upgrade was flawless. systemd had quirks, but those faded too.

All in all, go ahead, it's yet again a nice little step forward.

kasabali · on April 25, 2015

Debian upgrades are pretty reliable (IMHO the best among Linux distros) but I can't guarantee you won't have problems in your particular setup :)

Silhouette · on April 25, 2015

Counterpoint: Upgrading from 6 to 7, every single machine I use, from a home server to various professionally managed machines at work, had at least one serious problem. RAID and bootloaders raised a few issues for example. We got everything working eventually, but the amount of wasted time even with very experienced sysadmins looking after some of those machines was silly.

Debian is generally very good with stability for things like security updates and we certainly plan to continue using it. However, our plans for updating this time are more along the lines of "set up completely new machine with Debian 8 from the start, install our own choice of packages and applications, and then systematically migrate data/connectivity from the old systems to the new ones". We expect the time and money costs of having the transition period to be less than the potential downtime if direct upgrades take as much effort as they did from 6 to 7.

Your mileage may vary, Linux has infinite possibilities and ours may just have been unlucky, the plural of anecdote is not data, etc.

kasabali · on April 25, 2015

Best strategy is having a backup image and testing the upgrade procedure on a test machine beforehand, if possible. This release may be especially problematic because of the systemd change (it is possible to boot with sysvinit in Grub menu, but remote upgraders should beware).
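For the remote case, one cheap check after the reboot is to see which init actually came up as PID 1. A minimal Python sketch, assuming a Linux system with /proc mounted; the function name `running_init` is made up for illustration.

```python
def running_init(comm_path="/proc/1/comm"):
    """Return the command name of PID 1, e.g. 'systemd' or 'init'."""
    with open(comm_path) as fh:
        return fh.read().strip()
```

This only reports what is running now; it says nothing about which init the next boot will use.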

Did your last upgrade issues stem from the upgrade procedure or were they because of new versions?

Silhouette · on April 25, 2015

Did your last upgrade issues stem from the upgrade procedure or were they because of new versions?

I can't remember all of the different problems now, but one I do remember is that if you had a typical set-up with mirrored (RAID1) drives but the boot-related partitions cloned rather than mirrored, one of the bootloaders got upgraded but not the other. That is, the drives were left out of sync and booting from one of the drives wouldn't work properly if the other failed. The thing that really concerned us wasn't so much the specific details here but that this was essentially a silent failure in the upgrade process, combined with a potentially catastrophic failure in a basic system function as a result.
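One way to catch that sort of silent drift is to compare the two boot copies file by file after the upgrade. A minimal Python sketch, assuming both boot partitions are mounted at paths you supply; the function names are made up for illustration. It compares file contents only and says nothing about the bootloader code written to each disk's boot sector, which would need a separate check.

```python
import hashlib
import os

def tree_digests(root):
    """Map each regular file's path (relative to root) to the SHA-256 of its contents."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.path.isfile(full):
                continue  # skip broken symlinks and other non-regular entries
            h = hashlib.sha256()
            with open(full, "rb") as fh:
                for chunk in iter(lambda: fh.read(65536), b""):
                    h.update(chunk)
            digests[os.path.relpath(full, root)] = h.hexdigest()
    return digests

def boot_drift(primary, clone):
    """Return sorted relative paths that are missing on one side or differ in content."""
    a, b = tree_digests(primary), tree_digests(clone)
    return sorted(p for p in set(a) | set(b) if a.get(p) != b.get(p))
```

An empty list means the two trees match; anything else is a file to look at before trusting the second drive to boot.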

I think there is quite a big difference between the theory of updating a couple of sources files and running a couple of upgrade commands and the practice of manually checking things like basic RAID configuration and reinstalling missing bootloader updates. This time around, the fact that Jessie uses systemd made the discussion of whether to even try a dist-upgrade a very short one, because literally everyone in the room agreed that the probability of failures was too high for that strategy to be worth considering. The substantial discussions were more about migration to fresh machines relatively soon vs. sticking with 7 at least until we know the LTS situation.

chousuke · on April 25, 2015

Am I understanding this correctly: instead of having the boot partitions configured as a (MD?) RAID set, you had somehow manually cloned them between two disks? A mirrored boot partition works just fine if you're legacy booting... With EFI I guess you have to do manual cloning (which is fragile) or rely on hardware RAID.

Did you use some tool to do that? How do you expect the upgrade process to even be able to take that kind of thing into account?

Without knowing any details it's hard to say if it was an actual bug or just plain old human error, but it sounds like the latter.

Silhouette · on April 25, 2015

Did you use some tool to do that?

I've long forgotten exactly why these systems were first set up that way. Presumably it was because at the time someone was leaving their options open about the RAID set-up for the main drives/partitions and bootloaders of that generation didn't support MD well so keeping boot as a non-RAID set-up was not uncommon. Whatever the history, the fact is that before the automated part of the 6-to-7 upgrade there was a fully working system, and after it there wasn't.

How do you expect the upgrade process to even be able to take that kind of thing into account?

I don't think it's rocket science to suggest that if you're migrating to a new bootloader, and you've got a system with multiple drives in it (RAIDed or otherwise), and you're installing an OS that is widely used in server or multiple-OS environments, just assuming that you should upgrade the bootloader on one specific drive and ignore anything else is not a great idea. What if the sysadmin installing the update wasn't the person who installed the original and simply hadn't realised how the /boot was set up?

Without knowing any details it's hard to say if it was an actual bug or just plain old human error, but it sounds like the latter.

There was no "error". The situation before the upgrade was what it was, and after the upgrade the problem was quickly detected and fixed. But it took time and effort to do that, instead of having a smooth, fully automated upgrade process. Again, the fact is that before the automated part of the 6-to-7 upgrade there was a fully working system, and after it there wasn't.

Will the 7-to-8 update now expect everyone performing it to be intimately familiar with the implications of things like systemd? Because I'm betting plenty of people will encounter it for the first time as part of this upgrade cycle.

What about package compatibility? Some packages have been entirely removed in Jessie; see the political debates about FFmpeg vs. Libav for a relatively high-profile example. That is inevitably going to break some people's install scripts/tool recipes/etc.
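A rough way to see which installed packages have no counterpart in the new release is a set difference between the output of `dpkg --get-selections` on the old machine and a list of package names available in the target release. A minimal Python sketch; how you obtain the second list is left open, and the function name `missing_in_target` is made up for illustration.

```python
def missing_in_target(selections_text, target_names):
    """Return installed package names that are absent from target_names, sorted.

    selections_text: output of `dpkg --get-selections` (name, whitespace, state).
    target_names: iterable of package names available in the target release
                  (assumption: you have built this list yourself).
    """
    target = set(target_names)
    missing = set()
    for line in selections_text.splitlines():
        parts = line.split()
        # Only consider packages marked "install"; skip deinstall/purge/hold lines.
        if len(parts) != 2 or parts[1] != "install":
            continue
        name = parts[0].split(":", 1)[0]  # drop a multiarch suffix such as ":amd64"
        if name not in target:
            missing.add(name)
    return sorted(missing)
```

This only catches packages whose names vanish. A package that keeps its name but changes behaviour or configuration format will not show up here.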

My point here is that there are significant changes as part of the upgrade, and upgrades always carry a degree of risk, and my personal experience (based on several different projects) of the 6-to-7 upgrade process was that the risk was real and the fully automated part of the process was not able to do everything necessary itself. Consequently I would not recommend that anyone assume a 7-to-8 upgrade will necessarily go completely smoothly and be fully automated either.

[Edit: To be clear, I'm not saying you shouldn't do it or something awful will happen. Nor am I criticising Debian for not anticipating every possible scenario and handling everything completely automatically. I'm just saying my experience last time around was different to kasabali's experience, and as one data point, projects I work on where the experience was not as smooth last time but the desire is to move to 8 quite quickly are generally favouring a clean install and application migration strategy rather than an in-place upgrade. The expectation of those teams is that this will incur less risk and might be faster anyway once you take all implementation and testing effort into account.]

chousuke · on April 26, 2015

Hmm, this sounds like a difference in expectations. I don't think anyone said that Debian upgrades are fully automated; the package manager does what it can (and it usually does a good job) but it's always the sysadmin's job to verify that the configuration at reboot is sane, especially if there's even a hint of something special in the configuration.

Upgrade scripts certainly could try to predict every crazy thing people do with their computers, but past a certain point, it's not very productive. People are creative.

In the end, the admin must make the decision whether reinstalling and reconfiguring a server has a lower general cost than verifying and potentially fixing an upgraded installation.

Silhouette · on April 26, 2015

Of course in the end it's the sysadmin's job to administer the system, but that's also a convenient way to shift responsibility for problems away from the tools. As I mentioned, the problems were quickly detected and subsequently fixed in the cases I'm aware of. But that still required time and effort, and since realistically no sysadmin is going to be an expert on every part of their system that might be affected by an OS upgrade on this scale, I still think it's fair to highlight the risk.

xorcist · on April 25, 2015

How could any update to the bootloader have been installed properly in that setup?

If you did something unorthodox, such as building a boot process dependent on a manual step to clone a drive, you surely must be prepared to deal with this in any number of situations that can arise?

All non-standard solutions carry a debt where all future admins must understand what you built and how this affects operation.

Silhouette · on April 25, 2015

Given that Debian's standard installers have always been pretty bad at configuring any non-trivial disk set-up without manual intervention, I feel some people here are a little too quick to criticise. As I said in another post, I don't know why the systems where that issue came up were originally set up as they were, but there have certainly been times, particularly before the current generation of bootloaders, when that sort of set-up wasn't unusual.

The point remains that this doesn't matter. Before the upgrade, there was a fully working system. After the automated part of the upgrade, there wasn't. The original question was how safe the upgrade from 7 to 8 is, and this is a demonstration of the fact that such upgrades can carry risk. I'm not saying don't do them, I'm not expecting Debian maintainers to be omniscient, and I'm not telling you your child isn't beautiful. I'm just saying if you're thinking about moving from 7 to 8, be aware of the potential that there will be things the automated tools can't or won't do for you that may break your system, and plan your upgrade or other migration strategy accordingly.

xorcist · on April 25, 2015

No one argues the packaging system can handle every possible situation. It's just that this case seems, on the face of it and without knowing any of the details, to have been one where the system was manually placed in a state where the updater was broken.

I'm not a DD and I have no vested interest in it, but that particular data point is an outlier no matter how you look at it.

There are more obvious situations where updates will break your system. The most common is probably when you've installed third party packages with dependencies on system software. But that's not generally what's referred to when asked if the update process is stable. Such things will break no matter how stable the process in itself is.

marcosdumay · on April 25, 2015

I'd wait a couple of days. Dist-upgrade is normally reliable, but does not have a completely flawless history.