Most Linux distributions use an optimistic memory allocation scheme, whereby memory (RAM plus swap space) can be over-allocated. On these systems your program can die from lack of memory at any point in time: even if you check the return value of every malloc() call, you still won't be safe.
I did not believe you, but then I did "man malloc" and sure enough, there it is in the NOTES section at the bottom:
>>By default, Linux follows an optimistic memory allocation strategy. This means that when malloc() returns non-NULL there is no guarantee that the memory really is available.
So it's like airlines overbooking seats; the system just hopes that the memory is available when you actually try to use it. I had no idea. That would be an extremely annoying bug to try and track down. How would one even do it? Is there a way to test if you truly have the memory without segfaulting?
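Just to convince myself, here's a minimal sketch (assuming a 64-bit Linux box; the exact behaviour depends on /proc/sys/vm/overcommit_memory) where malloc() hands back far more address space than the machine could ever supply:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        /* Ask for 1 TiB, almost certainly more than RAM plus swap
         * (assumes a 64-bit build so the shift doesn't overflow). */
        size_t huge = (size_t)1 << 40;
        char *p = malloc(huge);

        if (p == NULL) {
            puts("malloc refused the request up front");
        } else {
            puts("malloc returned non-NULL, but nothing says the pages exist");
            /* Writing to every page of p is what could eventually summon
             * the OOM killer if the memory can't actually be found. */
            free(p);
        }
        return 0;
    }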
That's not quite true. The problem is more insidious than that.
Think of the memory requirements of the fork() system call. It clones the current process, making an entire copy of it. Sure, there's lots of copy-on-write optimisation going on, but if you want guaranteed, confirmed memory, you need a backing store for that new process: the child has every right to write to every page of its address space.
So if a 4GB process calls fork(), you will suddenly need to reserve 4GB of RAM or new swap space for it. Or if you can't allocate that, you will have to make the fork() fail.
This can be terrible for users, since most often a process is going to fork() and then immediately exec() a very small program. And it seems nonsensical for fork() to fail with ENOMEM when there appears to be plenty of free memory left. But if memory is to be strictly accounted for, that's what you have to do.
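As a rough sketch of that fork()-then-exec() pattern (nothing here beyond standard POSIX calls, with /bin/true standing in for the "very small program"): the parent could be gigabytes in size, yet under strict accounting the fork() itself is where ENOMEM would surface.

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = fork();

        if (pid < 0) {
            /* Under strict accounting a large parent can get ENOMEM here,
             * even though the child only ever wants to run /bin/true. */
            perror("fork");
            return EXIT_FAILURE;
        }

        if (pid == 0) {
            /* Child: throw away the copied address space immediately. */
            execl("/bin/true", "true", (char *)NULL);
            _exit(127);  /* only reached if exec fails */
        }

        /* Parent: wait for the tiny child to finish. */
        waitpid(pid, NULL, 0);
        return EXIT_SUCCESS;
    }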
The alternative, which most distributions use, is to allocate memory optimistically. Let fork() and the other calls succeed. But you run the risk of the child process crashing at any point in the future, when it touches its own 'confirmed' memory and the OS turns out to be unable to back it. So memory failures are not discovered at system call boundaries: there is no return code to spot the out-of-memory condition. The OS can't even suspend the process until memory becomes available, because there's no guarantee that memory ever will become available.
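If you want to know which policy a given machine is using, one way (assuming the usual Linux /proc layout) is to read the vm.overcommit_memory sysctl: 0 is heuristic overcommit, 1 is always overcommit, 2 is strict accounting. A small sketch:

    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
        int mode = -1;

        if (f != NULL) {
            if (fscanf(f, "%d", &mode) != 1)
                mode = -1;
            fclose(f);
        }
        if (mode < 0) {
            fprintf(stderr, "could not read the overcommit mode\n");
            return 1;
        }

        printf("vm.overcommit_memory = %d (%s)\n", mode,
               mode == 0 ? "heuristic overcommit" :
               mode == 1 ? "always overcommit"    : "strict accounting");
        return 0;
    }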
Well that's even more interesting! So a program can just die on some random memory access and you'd have no idea why! At least now I know I can use mlock() to force promised-but-not-yet-backed memory to be committed up front, and catch the failure as an error return instead of a crash.
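Something like this sketch is what I have in mind, assuming mlock() as documented on Linux. Bear in mind it pins the pages in RAM, which is stricter than mere commitment, and unprivileged processes are capped by RLIMIT_MEMLOCK, so it's more of a diagnostic than a general fix:

    #include <errno.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>

    int main(void)
    {
        size_t len = 16 * 1024 * 1024;   /* 16 MiB, adjust to taste */
        char *p = malloc(len);

        if (p == NULL) {
            perror("malloc");
            return EXIT_FAILURE;
        }

        /* Force the kernel to back (and pin) these pages right now.
         * A failure surfaces here as -1 with errno set (often ENOMEM,
         * or EPERM/ENOMEM when RLIMIT_MEMLOCK is too low), instead of
         * a crash on first touch later on. */
        if (mlock(p, len) != 0) {
            fprintf(stderr, "mlock: %s\n", strerror(errno));
            free(p);
            return EXIT_FAILURE;
        }

        puts("the memory is really there and pinned in RAM");
        munlock(p, len);
        free(p);
        return EXIT_SUCCESS;
    }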