There is still a small difference though: when I execute a write and sync, then I'm pretty sure it either succeeded or didn't. I'm not sure if there is any way for the disk to say "don't know" which the OS would then pass on to my program. Even if there is, I guess such a case is so rare that it's probably excluded from any error model. Like cosmic rays. And I can always retry, since it's very unlikely that there is a connection issue only between my program and the disk but not between other programs and that disk - they all run on the same machine.
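Concretely, what I mean is roughly this sketch (write_durably, the flags and the retry policy are made up for illustration, not from any real codebase):

    import os
    import time

    def write_durably(path, data, retries=3):
        # Hypothetical helper: write + fsync either succeeds or raises an
        # OSError, and since device and program share a machine, a plain
        # retry is usually a reasonable answer to a failure.
        for attempt in range(retries):
            fd = None
            try:
                fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
                os.write(fd, data)
                os.fsync(fd)  # ask the kernel to flush the data to the device
                return True
            except OSError:
                # e.g. EIO from a failing device; re-open and re-write on retry
                time.sleep(0.1 * (attempt + 1))
            finally:
                if fd is not None:
                    os.close(fd)
        return False

Either the call returns and the data is down, or I get a concrete error I can retry on.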
Over a network things are different, because the connection might be gone for a long time, and now I'm not aware of the state and I can't check it. I also can't tell if the connection might come back; the machine I was talking to might have burned down. If I, myself, burn down, then I don't need to worry anyway.
That's the difference. This is specific to write/sync though - as you explained, just because things are running on the same machine does not mean they are necessarily more reliable than a "remote" call.
When modeling RPC calls as async calls returning a result, you can be just as sure that the call has completed (or failed) as with a local call.
And considering write can use an underlying mounted network share, a spun-down HDD, or a thumb drive on a failing USB port, the same failure modes exist in either case.
In either case you're just submitting commands to another device's queue (and waiting for a result) over a hot-pluggable connection.
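To make the "async call returning a result" framing concrete, a minimal sketch (asyncio; RpcResult, call_remote and send are made-up names, not a real API):

    import asyncio
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class RpcResult:
        ok: bool
        value: Optional[bytes] = None
        error: Optional[str] = None

    async def call_remote(send, timeout=2.0) -> RpcResult:
        # `send` stands in for whatever transport coroutine you actually use.
        # From the caller's perspective, awaiting yields either a value or
        # an error - the same shape as a local call returning a result.
        try:
            value = await asyncio.wait_for(send(), timeout)
            return RpcResult(ok=True, value=value)
        except asyncio.TimeoutError:
            # the "don't know" case: the remote side may or may not have acted
            return RpcResult(ok=False, error="timeout")
        except OSError as exc:
            return RpcResult(ok=False, error=str(exc))

The timeout branch is where the uncertainty you describe lives, but the caller still gets a definite result object to act on, same as with a failed local write.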
Because either the disk is down for everyone on that machine or for no one. That it is only down for my program is theoretically possible (e.g. a bug in the OS or driver) but it's so unlikely that it's usually moved into the "cosmic rays" category.
I don't disagree with that, but does that matter for your program?
Your code still has to handle the same failure modes regardless.
I've had many situations where e.g. reading locally cached files on a smartphone's eMMC had worse latency and throughput than doing the same read over the network from the server. The cache actually made performance worse in every way.
Well, I wouldn't use that as an example, as Postgres actually misimplemented write/sync in the past, causing data corruption ;)
Regarding having a distributed Postgres, that's not actually an issue. As long as all writing workers have access to the same synchronization primitives, you can easily use Postgres with e.g. k8s volumes in single-writer, multiple-reader mode. And single-master, multiple-read-replica Postgres deployments are common too.
The guarantees you're talking about aren't given by the storage implementation, but by the fact that all write workers run on the same machine.