
> using simple threads instead of dealing with the massive overhead and complexity and bugs of using something like multiprocessing.

Depending on the domain, the reality can be the reverse.

Multiprocessing in the web serving domain, as in "spawning separate processes", is actually simpler and less bug-prone, because there is considerably less resource sharing. The higher difficulty of writing, testing and debugging parallel threaded code is evident to anybody who's worked on it.

As for the overhead, this again depends on the domain. It's hard to quantify, but generalizing to "massive" is not accurate, especially for app servers with COW (copy-on-write) support.




Using multiple processes is simpler in terms of locks etc., but Python libraries like multiprocessing, or even subprocess.Popen [1], which make using multiple processes seem easy, are full of footguns that cause deadlocks because fork-safe code is not well understood. I've seen this lead to code 'working' and being merged, then triggering sporadic deadlocks in production a few weeks later.

The default for multiprocessing on Linux is still to fork (fortunately changing in 3.14), which means all of your parent process's threaded code (including third-party libraries) has to be fork-safe. There are no static-analysis checks for this.
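To make the failure mode concrete, here's a minimal POSIX-only sketch of the classic fork-while-a-lock-is-held deadlock (names and timings are illustrative, not from the parent comment):

```python
import os
import threading
import time

lock = threading.Lock()

def hold_briefly():
    with lock:
        time.sleep(1.0)  # simulates a thread that owns the lock at fork time

holder = threading.Thread(target=hold_briefly)
holder.start()
time.sleep(0.1)  # make sure the lock is held before we fork

pid = os.fork()
if pid == 0:
    # The child inherits the lock in its *locked* state, but not the
    # thread that would release it, so acquire() would block forever.
    acquired = lock.acquire(timeout=0.5)
    os._exit(0 if acquired else 42)

holder.join()
_, status = os.waitpid(pid, 0)
print(os.waitstatus_to_exitcode(status))  # 42: the child never got the lock
```

Replace the toy lock with the logging module's internal lock (held by another thread mid-write) and you have exactly the sporadic production deadlock described above.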

Libraries like this, easy to use but incredibly hard to use safely, have made Python incredibly painful for long-running production services in my experience.

[1] Some arguments to subprocess.Popen look handy but actually cause Python interpreter code to be executed after the fork and before the execve, which has caused production logging-related deadlocks for me. The original author was very bright but didn't notice the footgun.
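One such argument (my guess at what's meant here, since the comment doesn't name it) is preexec_fn, which the subprocess docs themselves warn is unsafe in the presence of threads: it runs arbitrary Python in the child between fork() and execve(). A minimal POSIX-only sketch:

```python
import os
import subprocess

# preexec_fn is called in the child after fork() but before execve().
# Any lock a sibling thread held at fork time (e.g. the logging module's
# internal lock) is inherited in its locked state, so Python code running
# here can deadlock waiting on a lock no thread in the child will release.
result = subprocess.run(
    ["echo", "hello"],
    preexec_fn=lambda: os.nice(5),  # harmless here, but it IS Python-after-fork
    capture_output=True,
    text=True,
)
print(result.stdout.strip())
```

The deadlock only bites when another thread happens to hold a lock at fork time, which is why code like this "works" in review and fails weeks later.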


> The default for multiprocessing is still to fork (fortunately changing in 3.14)

If I may: Changing from fork to what?


"In Python 3.14, the default will be changed to either “spawn” or “forkserver” (a mostly safer alternative to “fork”)."

- https://pythonspeed.com/articles/python-multiprocessing/
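Until the default changes, you can opt into "spawn" explicitly today; a minimal sketch:

```python
import multiprocessing as mp

def square(x):
    return x * x

if __name__ == "__main__":
    # "spawn" starts a fresh interpreter instead of fork()ing, so children
    # never inherit the parent's locks in a half-held state.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=2) as pool:
        print(pool.map(square, [1, 2, 3]))  # [1, 4, 9]
```

The `if __name__ == "__main__"` guard is mandatory with "spawn": the child re-imports the main module, and without the guard it would recursively spawn more children.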


Same experience here; multiprocessing is such a pain in Python. It's one of those things people think they can write production code with, but they just haven't run into all the ways their code is wrong, so they discover those bugs later, in production.

As an aside, I still constantly see side effects in imports in a ton of libraries (up to and including resource allocations).


>which make using multiple processes seem easy are full of footguns which cause deadlocks due to fork-safe code not being well understood. I’ve seen this lead to code ‘working’ and being merged but then triggering sporadic deadlocks in production after a few weeks

Compared to threads being "pain free"?


Just the other day I was trying to do two things in parallel in Python using threads, and then I switched to multiprocessing. Why? I wanted to immediately terminate one task whenever the other failed. That's straightforwardly supported with multiprocessing; with threads it gets a lot more complicated and can involve approaches of dubious supportability.


There is a reason why it's "complicated" with threads: doing it correctly just IS complicated, and the same reason applies to child processes; you just ignored it. That's one example of a footgun in using multiprocessing: people write broken code but don't know it, because it appears to work... until it doesn't (in production, on a Friday night).


I don't agree. A big reason why abruptly terminating threads at an arbitrary point is risky is that it can corrupt shared memory. If you aren't using shared memory in a multiprocess solution, that's not an issue. Another big reason is that it can lead to resource leaks (e.g. a thread gets terminated before the finally clause that closes its resources runs, so the resource never gets closed). Again, that's less of an issue for processes, since many resources (file descriptors, network connections) are automatically closed by the OS kernel when the process exits.

Abruptly terminating a child process still can potentially cause issues, but there are whole categories of potential issues which exist for abrupt thread termination but not for abrupt process termination.


This is not a matter of opinion. Always clean up after yourself: the kernel doesn't know shit about your application or what state it's in, and you cannot rely on it to cleanly terminate your process. Just because it's a child process (by default it's a forked process!) and not a thread doesn't mean it can't have shared resources. Abrupt termination can lead to deadlocks, stuck processes, all kinds of resource leaks, data and stream corruption, orphaned processes, etc.


> This is not a matter of opinion. Always clean up after yourself: the kernel doesn't know shit about your application or what state it's in, and you cannot rely on it to cleanly terminate your process.

If you have an open file or network connection, the kernel is guaranteed to close it for you when the process is killed (assuming it hasn't passed the fd/socket to a subprocess, etc). That's not a matter of opinion.

Yes, if you are writing to a file, abruptly killing the writer may leave the file in an inconsistent state. But maybe you know your process doesn't write to any files (that's true in my case). Or maybe it does write to files, but you already have other mechanisms to recover their integrity in this scenario (since file-writing processes can die at any time anyway: kernel panic, power loss, an intermittent crash bug, etc.).


Different programming languages have different guarantees when it comes to threads. If IO is hidden behind object semantics, objects aren’t killed by “killing” a thread, and they can be gracefully terminated when they are deemed to no longer be in use.


Oh well I see, you will learn the hard way then (:


You have no idea what I'm actually doing, yet you are convinced something bad is bound to happen, although you can't say what exactly that bad thing will be.

That's not useful feedback.


That's why I've always liked Java's take on this. Throw an InterruptedException, and the thread is considered terminated once the exception has propagated all the way up. You can also defer the exception for a while if cleanup takes time.

The only issue there is that sometimes library code will incorrectly swallow the exception (i.e. suppress it), but otherwise it's pretty good.
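Python has no built-in equivalent of InterruptedException, but the cooperative idea translates; here's a sketch using threading.Event as the interruption flag (my analogy, not a direct port of Java's mechanism):

```python
import threading

def worker(stop: threading.Event):
    # wait() doubles as a sleep that wakes early when the flag is set
    while not stop.wait(timeout=0.05):
        pass  # a bounded unit of work would go here
    # Unlike an abrupt kill, control falls through here, so finally
    # blocks and context managers run and cleanup happens normally.
    print("worker exiting cleanly")

stop = threading.Event()
t = threading.Thread(target=worker, args=(stop,))
t.start()
stop.set()  # "interrupt" the worker; it notices on its next poll
t.join()
```

The downside mirrors Java's: every blocking call in the loop must poll the flag, and library code that never checks it effectively "suppresses" the interruption.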



