I hope they are conservative about this, because even the smartest and best prog...

jerf · on June 19, 2023

I mentally snarked to myself that "obviously they should rewrite it in Rust first".

Then, after more thought, I'm not entirely sure that would be a bad approach. I say this not to advocate for actually rewriting it in Rust, but as a way of describing how difficult this is. I'm not actually sure rewriting the relevant bits of the system in Rust wouldn't be easier in the end, and obviously, that's really, really hard.

This is really hard transition.

I don't think multithread code quality should be measured in absolutes. There are things that are so difficult as to be effectively impossible, which is the lock-based approach that was dominant in the 90s, and convinced developers that it's just impossible difficult, but it's not multithreaded code that's impossibly difficult, it's lock-based multithreading. Other approaches range from doable to even not that hard once you learn the relevant techniques (Haskell's full immutability & Rust's borrow checker are both very solid), but of course even "not that hard" becomes a lot of bugs when scaled up to something like Postgres. But it's not like the current model is immune to that either.

mmphosis · on June 19, 2023

Concurrency isn’t a “nice layer over pthreads” - the most important thing is isolation - anything that mucks up isolation is a mistake.

— Joe Armstrong

Threads are evil. https://www.sqlite.org/faq.html#q6 https://www2.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-...

Nginx uses an asynchronous event-driven approach, rather than threads, to handle requests. https://aosabook.org/en/v2/nginx.html http://www.kegel.com/c10k.html

johannes1234321 · on June 19, 2023

The code already is multithreaded. They have shared state just across multiple processes instead of threads within a process.

They might even reduce complexity that way.

usefulcat · on June 19, 2023

It's not the same at all for global variables, of which pgsql apparently has around a couple thousand.

If every process is single threaded, you don't have to consider the possibility of race conditions when accessing any of those ~2000 global variables. And you can pretty much guarantee that little if any of the existing code was written with that possibility in mind.

ants_a · on June 19, 2023

Those global variables would be converted to thread locals and most of the code would be oblivious of the change. This is not the hard part of the change.

jrott · on June 19, 2023

The fact that they are planning on doing this across multiple releases gives me hope that they'll be cautious with this.

anarazel · on June 19, 2023

Postgres is already concurrent today. There's a lot of shared state between the processes (via shared memory).

xwdv · on June 19, 2023

Nonsense, multithreaded code can be written as bug free as regular code. No need to fear.

PhilipRoman · on June 19, 2023

I'm assuming you're referring to formally proven programs. If that's the case, do you have any pointers?

Aside from the trivial while(!transactionSucceeded){retry()} loop, I have trouble proving the correctness of my programs when the number of threads is not small and finite.

dboreham · on June 19, 2023

This is true. However, the blast radius may be smaller with a process model. Also recovering from a fatal error in one session could possibly be easier. I say this as a 30-year threading proponent.

taeric · on June 19, 2023

I think the point is that some mistakes in process based code are not realized as the bugs that they will be in threaded code?

preordained · on June 19, 2023

It can be. Anything can be. It is far more treacherous, though.

baggy_trough · on June 19, 2023

In theory, yes. In practice, no.

ajkjk · on June 19, 2023

It is just harder.