There is still a small difference though: when I execute a write and sync, then I'm pretty sure it either succeeded or didn't. I'm not sure if there is any way for the disk to say "don't know" which the OS would then pass on to my program. Even if there is, I guess such a case is so rare that it's probably excluded from any error model. Like cosmic rays. And I can always retry, since it's very unlikely that there is a connection issue only between my program and the disk but not between other programs and that disk - they all run on the same machine.
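Concretely, what I mean is roughly this sketch (write_durably, the flags and the retry policy are made up for illustration, not from any real codebase):

    import os
    import time

    def write_durably(path, data, retries=3):
        # Hypothetical helper: write + fsync either succeeds or raises an
        # OSError, and since device and program share a machine, a plain
        # retry is usually a reasonable answer to a failure.
        for attempt in range(retries):
            fd = None
            try:
                fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
                os.write(fd, data)
                os.fsync(fd)  # ask the kernel to flush the data to the device
                return True
            except OSError:
                # e.g. EIO from a failing device; re-open and re-write on retry
                time.sleep(0.1 * (attempt + 1))
            finally:
                if fd is not None:
                    os.close(fd)
        return False

Either the call returns and the data is down, or I get a concrete error I can retry on.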
Over a network things are different, because the connection might be gone for a long time, and now I'm not aware of the state and I can't check it. I also can't tell if the connection might come back; the machine I was talking to might have burned down. If I, myself, burn down, then I don't need to worry anyway.
That's the difference. This is specific to write/sync though - as you explained, just because things are running on the same machine does not mean they are necessarily more reliable than a "remote" call.
When modeling RPC calls as async calls returning a result, you can be just as sure that the call has completed (or failed) as with a local call.
And considering write can use an underlying mounted network share, a spun-down HDD, or a thumb drive on a failing USB port, the same failure modes exist in either case.
In either case you're just submitting commands to another device's queue (and waiting for a result) over a hot-pluggable connection.
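To make the "async call returning a result" framing concrete, a minimal sketch (asyncio; RpcResult, call_remote and send are made-up names, not a real API):

    import asyncio
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class RpcResult:
        ok: bool
        value: Optional[bytes] = None
        error: Optional[str] = None

    async def call_remote(send, timeout=2.0) -> RpcResult:
        # `send` stands in for whatever transport coroutine you actually use.
        # From the caller's perspective, awaiting yields either a value or
        # an error - the same shape as a local call returning a result.
        try:
            value = await asyncio.wait_for(send(), timeout)
            return RpcResult(ok=True, value=value)
        except asyncio.TimeoutError:
            # the "don't know" case: the remote side may or may not have acted
            return RpcResult(ok=False, error="timeout")
        except OSError as exc:
            return RpcResult(ok=False, error=str(exc))

The timeout branch is where the uncertainty you describe lives, but the caller still gets a definite result object to act on, same as with a failed local write.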
Because either the disk is down for everyone on that machine or for no one. That it is only down for my program is theoretically possible (e.g. a bug in the OS or driver) but it's so unlikely that it's usually moved into the "cosmic rays" category.
I don't disagree with that, but does that matter for your program?
Your code still has to handle the same failure modes regardless.
I've had many situations where e.g. reading locally cached files on a smartphone's eMMC had worse latency and throughput than doing the same read over the network from the server. The cache actually made performance worse in every way.
Well, I wouldn't use that as an example, as Postgres actually misimplemented write/sync in the past, causing data corruption ;)
Regarding having a distributed Postgres, that's not actually an issue. As long as all writing workers have access to the same synchronization primitives, you can easily use Postgres with e.g. k8s volumes in single-writer, multiple-reader mode. And single-master, multiple-read-replica Postgres deployments are common too.
The guarantees you're talking about aren't given by the storage implementation, but by the fact that all write workers run on the same machine.