Not sure if you’re trying to troll me by taking it out of the obvious performance context but anyway: The performance penalty due to the usage of a un-JIT-ed scripting language?
Not trolling you, just wondering what you thought was more important from a performance perspective. Re-writing working software in another language is not always feasible due to real-world time constraints, and in that context the performance difference between message-passing and threads would be the first of my problems. Basically, I just don't see the argument of "use a more appropriate language than Python" as a useful counter to criticism of the GIL. The whole point of criticizing Python (in my case anyway), is to hopefully nudge the language to suiting my needs more closely.
The counter is not (even while some people try to turn it that was): THREADS SUCK, WE WON'T ADD THEM BECAUSE WE DON'T LIKE THEM. it is: given the circumstances (which Nick outlines verbosely) a removal of the GIL is not pragmatic.
And this is the last time I’ve wrote this, I feel like a street-organ. >:(
And all I was saying in this thread is that the performance gap between threads and processes isn’t that big of a deal, if you run non-native code anyway. The multiprocessing module is pretty cool.
I'm not intentionally trying to make you repeat yourself. And I know that removal of the GIL is not pragmatic at the moment (or maybe ever), but that doesn't mean it wouldn't be valuable. The GIL wasn't much of a problem ten years ago because not many personal computers had multiple cores. Today it's become a bit of a pain for me personally, and it will only become more painful as core count increases while single-core performance remains largely stagnant.
And all I was saying in this thread is that the performance gap between threads and processes isn’t that big of a deal, if you run non-native code anyway. The multiprocessing module is pretty cool.
This is a line that I hear over and over again, but I strongly disagree with. It's not always easy to predict where your performance bottlenecks will be until you actually start implementing it in some language. If I've chosen Python for a project and find I need more cores, I'm stuck with either re-implementing critical sections of code in C extensions or other languages, or using multiprocessing. And multiprocessing is not that great because it splits the memory space across processes and communication between them is extremely expensive. And there are many caveats to this which cause enormous headaches (eg., you can't fork your process while having an active CUDA context, not all Python objects are serializable, pickling is slow, marshaling doesn't work well for all data types, you must finish dequeuing large objects from a multiprocessing.Queue before joining the source process, etc.).
Yes, I could get a 10-100x speedup by re-writing everything in C. But most of the time, I would be very happy with a 6-12x performance gain from just using threads in a shared memory space.
No, I didn't try Jython. The choice of CPython was made before I took over the project, and there are also a dozen or so dependencies which I don't think are compatible with Jython.
What would be the first of my problems?