> It at least doesn't lock anything up that has a file open when the network goes down. NFS is a nightmare with that.
Yeah, we've been bitten by this too, around once a year, even with our fairly reliable and redundant network. It's a PITA: your processes just hang, and there's no way to even kill them except restarting the server.
This is too bad. The sweet spot was "hard,intr", at least when I was last using NFS on a daily basis (mid 1990s). Hard mounts make sense for programs, which will happily wait indefinitely while blocked on I/O. This worked well for things like doing a build over NFS, which would hang if the server crashed and then pick up right where it left off when the server rebooted.
Of course this is irritating if you're blocked waiting for something incidental, like your shell searching PATH. In those cases you could just control-C and continue doing what you wanted to do (as long as it didn't actually need that NFS server).
However, I can see that it would be difficult to implement interruptibility in various layers of the kernel.
I think the current implementation comes reasonably close to the old "intr" behavior.
AFAICT the problem with "intr" wasn't that the kernel parts were impossible to implement, but rather an application correctness issue: few applications are prepared to handle EINTR from any I/O syscall. Meanwhile, with "nointr" the process would be blocked in uninterruptible sleep and impossible to kill.
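To make that concrete, here's a minimal sketch (my own illustration, not code from any particular application) of what "handling EINTR" actually demands: every blocking syscall needs a retry loop along these lines, and very little software is written that carefully.

    #include <errno.h>
    #include <unistd.h>

    /* Hypothetical wrapper: under an "intr" mount, any blocking I/O
     * syscall could fail early with errno == EINTR when a signal
     * arrived, so each call site needed a loop like this to be
     * correct. */
    static ssize_t read_retrying(int fd, void *buf, size_t count)
    {
        ssize_t n;
        do {
            n = read(fd, buf, count);
        } while (n == -1 && errno == EINTR);
        return n;
    }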
However, if the process is about to be killed by the signal anyway, then not handling EINTR is irrelevant. Thus in 2.6.25 a new process state, TASK_KILLABLE, was introduced (https://lwn.net/Articles/288056/ ), which is a bit like TASK_UNINTERRUPTIBLE except that the task can be interrupted by a fatal signal, and the NFS client code was converted to use it in https://lkml.org/lkml/2007/12/6/329 . So the end result is that the process can be killed with Ctrl-C (as long as it hasn't installed a non-default SIGINT handler, since Ctrl-C sends SIGINT), but doesn't need to handle EINTR for all I/O syscalls.
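On the kernel side the change boils down to which wait primitive the sleeping code uses. A rough sketch of the difference, with made-up names (reply_wq, reply_done, and nfs_wait_for_reply are all hypothetical; this is not the real NFS client code):

    #include <linux/wait.h>
    #include <linux/types.h>

    static DECLARE_WAIT_QUEUE_HEAD(reply_wq);
    static bool reply_done;

    static int nfs_wait_for_reply(void)
    {
        /* Old "intr" behavior: any signal wakes the sleeper and
         * -ERESTARTSYS bubbles up, surfacing as EINTR whenever the
         * syscall can't be transparently restarted. */
        /* return wait_event_interruptible(reply_wq, reply_done); */

        /* TASK_KILLABLE behavior (2.6.25+): the sleep is
         * uninterruptible except by fatal signals, so a hung process
         * can still be killed, but ordinary signals never produce
         * EINTR. */
        return wait_event_killable(reply_wq, reply_done);
    }

wait_event_killable() only returns early for a signal that is going to terminate the process regardless, so the EINTR case never becomes visible to userspace.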