This is a great example of how bad many open source projects are at accepting contributions from 'non core' developers. The patch is just rejected, when it actually looks pretty valid to handle all cases of return value from a kernel interface. While it might not be a perfect solution, accepting it with suggestions for additional improvements could have led to those improvements.
Technically the patch isn't rejected (and I'm kinda peeved TFA claims it is). It's in limbo, waiting for further action from the submitter or another contributor: it's marked as needing improvements since it sweeps the issue under the rug instead of reporting it to the caller/user.
> The OS disobeying your configuration files is really better than it restarting?
It does not restart. It locks up with a mostly useless error message, then, maybe, at some point, possibly restarts. The restart is not a policy decision; it's a side effect of the box being dead. Here's the better option: bubble up the issue to the caller and let it decide what to do.
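To make "bubble up the issue to the caller" concrete, here is a rough sketch of the shape I mean. It is not libnih or Upstart code: `QueueOverflow`, `handle_event` and `caller` are names made up for illustration, and the only real detail is the `IN_Q_OVERFLOW` flag value from `<linux/inotify.h>`.

```python
IN_Q_OVERFLOW = 0x4000  # flag value from <linux/inotify.h>


class QueueOverflow(Exception):
    """Hypothetical error: the watcher learned that events were dropped."""


def handle_event(mask, path):
    # Library layer: report the overflow instead of aborting or ignoring it.
    if mask & IN_Q_OVERFLOW:
        raise QueueOverflow("inotify queue overflowed; events were lost")
    return f"changed: {path}"


def caller(events):
    # Application layer: the caller owns the policy. In this sketch it
    # records a warning and decides to reload everything.
    log = []
    for mask, path in events:
        try:
            log.append(handle_event(mask, path))
        except QueueOverflow as exc:
            log.append(f"warning: {exc}; reloading all config")
    return log
```

The point is only that the library neither aborts nor stays silent; whether the caller reloads, logs, or gives up is its own decision.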
> Looks to me that Upstart kept the lesser of two evils.
Just because the reject status wasn't used does not mean it wasn't rejected. The patch, as it was written, was rejected. It won't be used and the recommendation was a complete rewrite in a completely different direction.
People legitimately object to accepting a patch that fixes one issue but can lead to other problems down the line. IMHO, even declining the patch, but with suggestions for additional improvements, would have been better behaviour.
As it stands, the communication was:
1. There's a problem, here's a patch.
2. NAK, because (valid reasons).
3. Radio silence.
Perhaps better behaviour would have been:
2. NAK, because (valid reasons). However, I acknowledge the problem; perhaps you can fix it (some other way).
This way, the discussion is more likely to keep going and end up with a proper fix to the issue.
2. We won't apply it because it doesn't fix it perfectly. Sure it's better than what we did, and you offered the patch for free out of the kindness of your heart, but we aren't going to apply it until you work on it some more.
3. But... it's better in every way!
If you write a patch for a project that is an improvement in every way, but not yet perfect, and they don't apply it... Well, you're probably not going to spend much more time helping that project, are you?
And this is where we disagree. I believe it's worse. The original crashes the system, which is really bad. The cases where this happens are few, and people are going to be aware of what they did to cause it (unzipping a bunch of files in there was suggested as a trigger). With the proposed patch, nothing will happen. The user will not be aware of the issue and the system will not update with the changes. The user may not even notice their action had no effect and if they do, it'll be a harder mystery than the crash to figure out.
Some would rather have the system crash than silently ignore input. Error codes and messages are better than either of those.
I suspect the 'goodness' or the 'badness' of the patch is contained in what init does when it misses notifies. Clearly the kernel has dropped some on the floor as it ran out of space, and it tells you this. What libnih does is then scream and shout and abort (which is a fine first configuration since you don't know how common this will be), but when you discover it does happen, you consider ignoring it, if the error is idempotent. Meaning, of course: if you ignore it, do you later get a notify when the kernel has more notify buffers to use? To understand that, you need to read the notify code in the kernel to see how it is generating notifications.
If the kernel drops and never returns a notification, then init has to know that it missed some in order to operate under the correct set of init files. That requires a combination patch to init and to libnih.
If the kernel gets around to notifying anyway, just more slowly, then you can safely ignore it, init will eventually get the message the 'regular' way and you're done.
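For what it's worth, the inotify(7) man page says events are lost when the queue overflows, and that robust applications may need to rebuild their cached state, which points at the first case. A minimal sketch of detecting that condition from a raw read buffer follows. The event layout (`wd`, `mask`, `cookie`, `len`, then the name) and the `IN_Q_OVERFLOW` value come from `<linux/inotify.h>`; `parse_events` and `needs_full_rescan` are hypothetical helpers, not anything init or libnih actually has.

```python
import struct

IN_Q_OVERFLOW = 0x4000                # flag value from <linux/inotify.h>
EVENT_HEADER = struct.Struct("iIII")  # wd, mask, cookie, len (native layout)


def parse_events(buf):
    """Split a buffer read from an inotify fd into (wd, mask, name) tuples."""
    events = []
    offset = 0
    while offset + EVENT_HEADER.size <= len(buf):
        wd, mask, _cookie, name_len = EVENT_HEADER.unpack_from(buf, offset)
        offset += EVENT_HEADER.size
        # The name field is NUL-padded to name_len bytes.
        raw_name = buf[offset:offset + name_len]
        events.append((wd, mask, raw_name.split(b"\0", 1)[0].decode()))
        offset += name_len
    return events


def needs_full_rescan(events):
    # An overflow event means some notifications never arrived, so the
    # only safe recovery is to re-read the whole watched directory.
    return any(mask & IN_Q_OVERFLOW for _wd, mask, _name in events)
```

In this sketch a caller would re-read every config file whenever `needs_full_rescan` is true, rather than trusting the individual events in that batch.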
Given the bug, the next step might be to see if systemd suffers a similar challenge in the presence of a lot of config changes.
I'm pretty sure James Hunt is a core Upstart developer, at least these days. Scott James Remnant, the original author, stopped being the lead maintainer when he left Canonical, but I don't remember if that had happened by 2011.