The GIL prevents the corruption of Pythons internal structures. It's hard to remove because:
1. Lots of extensions, which can control when they release the GIL unlike regular Python code, depend on it
2. Removing the GIL requires some sort of other mechanism to protect internal Python stuff
3. But for a long time, such a mechanism was resisted by th Python team because all attempts to remove the GIL either made single threaded code slower or were considered too complicated.
But, as far as I understand, the GIL does somewhere between nothing and very little to prevent races in pure Python code. And, my rough understanding, is that removing the GIL isn't expected to really impact pure Python code.
My understanding, is that many extensions will release the GIL when doing anything expensive. So, if you are doing CPU or IO bound operations in an extension _and_ you are calling that operation in multiple threads, even with the GIL you can potentially fully utilize all of the CPUs in your machine.
> removing the GIL isn't expected to really impact pure Python code.
If your Python code assumes it's just going to run in a single thread now, and it is run in a single thread without the GIL, yes, removing the GIL will make no difference.
> If your Python code assumes it's just going to run in a single thread now, and it is run in a single thread without the GIL, yes, removing the GIL will make no difference.
I'm not sure I understand your point.
Yes, singled thread code will run the same with or without the GIL.
My understanding, was that multi-threaded pure-Python code would also run more or less the same without the GIL. In that, removing the GIL won't introduce races into pure-Python code that is already race free with the GIL. (and that relatedly, pure-Python code that suffers from races without the GIL also already suffers from them with the GIL)
Are you saying that you expect that pure-Python code will be significantly impacted by the removal of the GIL? If so, I'd love to learn more.
> removing the GIL won't introduce races into pure-Python code that is already race free with the GIL.
What do you mean by "race free"? Do you mean the code expects to be run in multiple threads and uses the tools provided by Python, such as locks, mutexes, and semaphores, to ensure thread safety, and has been tested to ensure that it is race free when run multi-threaded? If that is what you mean, then yes, of course such code will still be race free without the GIL, because it was never depending on the GIL to protect it in the first place.
But there is a lot of pure Python code out there that is not written that way. Removal of the GIL would allow such code to be naively run in multiple threads using, for example, Python's support for thread pools. Anyone under the impression that removing the GIL was intended to allow this sort of thing without any further checking of the code is mistaken. That is the kind of thing my comment was intended to exclude.
> But there is a lot of pure Python code out there that is not written that way. Removal of the GIL would allow such code to be naively run in multiple threads using, for example, Python's support for thread pools.
I guess this is what I don't understand. This code could already be run in multiple threads today, with a GIL. And it would be broken - in all the same ways it would be broken without a GIL, correct?
> Anyone under the impression that removing the GIL was intended to allow this sort of thing without any further checking of the code is mistaken. That is the kind of thing my comment was intended to exclude.
Ah, so, is your point that removing the GIL will cause people to take non-multithread code and run it in multiple threads without realizing that it is broken in that context? That its not so much a technical change, but a change of perception that will lead to issues?
> This code could already be run in multiple threads today, with a GIL.
Yes.
> And it would be broken - in all the same ways it would be broken without a GIL, correct?
Yes, but the absence of the GIL would make race conditions more likely to happen.
> is your point that removing the GIL will cause people to take non-multithread code and run it in multiple threads without realizing that it is broken in that context?
Yes. They could run it in multiple threads with the GIL today, but as above, race conditions might not show up as often, so it might not be realized that the code is broken. But also, with the GIL there is the common perception that Python doesn't do multithreading well anyway, so it's less likely to be used for that. With the GIL removed, I suspect many people will want to use multithreading a lot more in Python to parallelize code, without fully realizing the implications.
> Yes, but the absence of the GIL would make race conditions more likely to happen.
Does it though? I'm not saying it doesn't, I'm quite curious. Switching between threads with the GIL is already fairly unpredictable from the perspective of pure-Python code. Does it get significantly more troublesome without the GIL?
> Yes. They could run it in multiple threads with the GIL today, but as above, race conditions might not show up as often, so it might not be realized that the code is broken. But also, with the GIL there is the common perception that Python doesn't do multithreading well anyway, so it's less likely to be used for that. With the GIL removed, I suspect many people will want to use multithreading a lot more in Python to parallelize code, without fully realizing the implications.
> Switching between threads with the GIL is already fairly unpredictable from the perspective of pure-Python code.
But it still prevents multiple threads from running Python bytecode at the same time: in other words, at any given time, only one Python bytecode can be executing in the entire interpreter.
Without the GIL that is no longer true; an arbitrary number of threads can all be executing a Python bytecode at the same time. So even Python-level operations that only take a single bytecode now must be protected to be thread-safe--where under the GIL, they didn't have to be. That is a significant increase in the "attack surface", so to speak, for race conditions in the absence of thread safety protections.
(Note that this does mean that even multi-threaded code that was race-free with the GIL due to using explicit locks, mutexes, semaphores, etc., might not be without the GIL if those protections were only used for multi-bytecode operations. In practice, whether or not a particular Python operation takes a single bytecode or multiple bytecodes is not something you can just read off from the Python code--you have to either have intimate knowledge of the interpreter's internals or you have to explicitly disassemble each piece of code and look at the bytecode that is generated. Of course the vast majority of programmers don't do that, they just use thread safety protections for every data mutation, which will work without the GIL as well as with it.)
1. Lots of extensions, which can control when they release the GIL unlike regular Python code, depend on it 2. Removing the GIL requires some sort of other mechanism to protect internal Python stuff 3. But for a long time, such a mechanism was resisted by th Python team because all attempts to remove the GIL either made single threaded code slower or were considered too complicated.
But, as far as I understand, the GIL does somewhere between nothing and very little to prevent races in pure Python code. And, my rough understanding, is that removing the GIL isn't expected to really impact pure Python code.