What’s in Which Python (nedbatchelder.com)
214 points by gammarator on May 23, 2022 | 72 comments



The `re` module in 3.11 will support "Atomic grouping and possessive quantifiers". I recently wrote a blog post about these features: https://learnbyexample.github.io/python-regex-possessive-qua...


Weren't those called "lazy" quantifiers (vs. greedy quantifiers)?

Edit: D'oh, I was mixing up non-backtracking subexpressions with lazy ones. Thanks!


Lazy quantifiers match as little as possible, in contrast to greedy ones, which match as much as possible. Both lazy and greedy quantifiers will backtrack to allow the overall regex to match.

Possessive quantifiers, on the other hand, will not backtrack (unless they are part of a group with a lazy/greedy quantifier).

A simple example: `error.*valid` or `error.*?valid` will match `error: invalid input`, but `error.*+valid` will not match, because `.*+` consumes the rest of the characters without giving anything back to allow the overall regex to match.
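
For illustration, here's a minimal sketch of all three behaviours (the possessive form needs Python 3.11+):

    import re  # possessive quantifiers require Python 3.11+

    s = "error: invalid input"

    print(re.search(r"error.*valid", s))   # greedy: backtracks, matches
    print(re.search(r"error.*?valid", s))  # lazy: expands minimally, matches
    print(re.search(r"error.*+valid", s))  # possessive: never gives back, None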


Possessive is different from both greedy and lazy. It's like greedy except that it won't give back anything it matches, even if doing so would be necessary for the full pattern to match. See https://stackoverflow.com/q/5319840/7509065 for a more detailed explanation.


How can Python 3.11 be 60% faster than 3.10?


The table of “Specializing Adaptive Interpreter” changes are interesting: https://docs.python.org/3.11/whatsnew/3.11.html#pep-659-spec...


That's the best link. Scroll up as well; the sections just above are also significant.


Up to 60%. On average it's 25% on pyperformance, which is another benchmark suite. See the Faster CPython work; it's not a JIT, but plenty of other, easier optimizations.


I think this is where they introduce the JIT (Guido and co are working on this at Microsoft) or add JIT optimizations.


Apparently the JIT won't be introduced in 3.11 and these speed gains are due to other optimisations [0].

> Q: Is there a JIT compiler?

> A: No. We’re still exploring other optimizations.

[0]: https://docs.python.org/3.11/whatsnew/3.11.html#faster-cpyth...


You're right. My mistake. Thanks for the link!


Mark Shannon and co.


And 3.10 is already 25% faster than 3.8


And because of my choice of OS I'm stuck with Python 3.8. Thanks so much Python!


Which OS prevents you from installing another python runtime?

I install Python using asdf on both macOS and various flavours of Linux.

https://asdf-vm.com/


It could be even more, but CPython (being the reference implementation) has been kept simple.

PyPy manages far more.


I'd take the official claims and benchmarks with a huge dose of skepticism. Python is no longer a free software project. The infrastructure is censored and dissent is suppressed ruthlessly.

The entire ecosystem is great at marketing. Critical people like Armin Ronacher have left long ago.

I'd like to see truly independent benchmarks from unbiased or critical outsiders. But in general those do not have any interest in Python.


> Python is no longer a free software project. The infrastructure is censored and dissent is suppressed ruthlessly.

Those are serious allegations. Can you back up these claims?


Walrus operator…

I’m kidding of course but that seems like the camel who broke the bridge too far.


I don't see how you can suggest you have a deep understanding of Python's infrastructure while concurrently not being aware of any unbiased benchmarks from critical outsiders.


Can you suggest an alternative? (not perl)


If you change programming languages based on random comments from loonies on the internet, I don't think you're ever going to stick with a language long enough to implement hello world...


Sure it's a highly productive language with an enormous variety of quality libraries available, but uh, some guy on the internet doesn't like the developers or something.


If one is on the Python 3 train, is there any reason to not always be on the latest release?

Is it better about breaking changes than the 2 to 3 adventure was?


There are very, very few breaking changes within Python 3. If you're an individual user, then upgrading to the latest version is almost certainly a good idea.

But there are lots of companies that release software using Python. To ensure a stable production environment, they upgrade more slowly, or on a schedule that's distinct from the Python release cycle.

Also, companies often don't just let people upgrade things themselves. The IT department schedules a company-wide Python upgrade, for example, testing the effects beforehand. I have several training clients still using 3.7 and 3.8, not because things would break with 3.10, but because they haven't fully vetted the newest version, and they're being super conservative.

I do wonder if companies will be a bit faster to upgrade over the coming years, given that execution speed is expected to improve rather dramatically. Upgrading won't be a matter of some nice-to-have features, but rather of real savings in system usage.


> There are very, very few breaking changes within Python 3.

But will structural pattern matching be back-ported, though?

Certainly an incompatibility, if not a breaking change per se.


GP means there are very few backwards-incompatible changes in new releases. There are plenty of forwards-incompatible ones, though.


How is pattern matching a breaking change? If your code doesn't have it how would it break?


Some people might mistakenly assume that "match" and "case" are now hard keywords, which would of course break lots of code.

But they are soft keywords, so there's no issue.
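
For instance, this is all legal in 3.10+ (a small sketch):

    match = {"case": 1}   # assigning to "match" is still allowed
    case = match["case"]  # and so is assigning to "case"

    match case:           # while pattern matching works with the same names
        case 1:
            print("one")
        case _:
            print("something else")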


Some popular packages are Python-version dependent. For example, I was using Python 3.10 early, and no PyTorch wheels for 3.10 had been released back then. The only thing you can do in this case is either wait for the publisher to provide these packages or use an older Python interpreter.


The 3.10 interpreter would still run 3.9-compatible Python code though, right?


For python code, yeah, most of it.

Python does sometimes have backwards-incompatible changes, e.g. in 3.10 they removed a bunch of stdlib modules and methods, like the "formatter" and "parser" modules [0].

So if you used those, your code wouldn't work in 3.10.
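
If you need to straddle versions, the usual workaround is a guarded import, something like this sketch:

    try:
        import parser  # removed from the stdlib in Python 3.10
    except ImportError:
        parser = None  # degrade gracefully (or switch to ast) on 3.10+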

But the main reason to wait for wheels (which is pythonese for "pre-built packages") is if they use native code (like C or rust) and you would have to compile them yourself otherwise, which increases installation time quite a bit.

(This was also why Alpine was a bad choice for Python containers for a long time: it uses musl, and wheels were only available for glibc. AFAIK musl wheels exist now, so that isn't relevant anymore.)

[0]: https://docs.python.org/3/whatsnew/3.10.html#removed


I always wondered what’s keeping packages back from having wheels ready the day a release drops, or way before even?

I imagine often it’s just another parameter in a CI somewhere, where things mostly “just work” because backward-incompatible breakages have become much rarer.


It's usually a question of having the resources (usually developers to debug issues). IIRC pretty much all Windows wheels are maintained by one guy, Christoph Gohlke, he was a godsend when I worked on a Windows laptop. I owe that guy many beers.


I've never seen a better illustration of xkcd 2347 than Christoph Gohlke. I tried to find a donation link to send him small thank yous but couldn't find one.


I think the main problem is C modules. CPython only guarantees ABI compatibility across bugfix releases of the same minor version (e.g. 3.10.0 and 3.10.8), so you may need a different binary compiled for CPython 3.9 versus 3.10. https://docs.python.org/3/c-api/stable.html


It's absolutely an issue with C modules, especially ones that aren't easy to build or if you're running Windows where it's unlikely that there's a compilation environment setup.

Pillow used to _always_ get this when a new Python release came out: while we were generally following the betas and building along the way, our quarterly releases weren't synced with the Python ones. So there'd be a gap, and every. single. time. we'd get support issues from people pip-installing on the newest Python without a compiler or the basic dependencies. (Aside: even if the last line is "Couldn't find required dependency libjpeg", many, many people just don't get that it's requiring additional libraries.)

So, we just shifted our fall releases to come after the Python releases.


By default, yes.

However, as the page you link also mentions, a C extension may opt in to the "Stable ABI", which does not change across feature releases (with some caveats).
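
As a rough sketch (module name hypothetical), opting in via setuptools looks something like this; bdist_wheel then tags the wheel abi3, so one binary works on 3.9 and up:

    # setup.py -- build a C extension against the stable ABI
    from setuptools import setup, Extension

    setup(
        name="mymod",
        version="0.1",
        ext_modules=[
            Extension(
                "mymod",
                sources=["mymod.c"],
                define_macros=[("Py_LIMITED_API", "0x03090000")],  # 3.9's stable ABI
                py_limited_api=True,  # mark the extension for abi3 tagging
            )
        ],
    )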


If using Python for a large project, sure, use the latest version.

If writing small tools, consider limiting yourself to the system Python and its stdlib / system-packaged libraries. A script which you can download as a single file, immediately run, and immediately make some changes to, all without worrying about venvs or other setup, is a beautiful thing.

This works especially well if you are writing internal tools and your company has standards like "Everyone is on Ubuntu 20.04 or 22.04".
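
For example, a minimal sketch of such a single-file tool, stdlib-only so any system Python 3 can run it (the tool itself is hypothetical):

    #!/usr/bin/env python3
    """Single-file internal tool: download, run, edit. No venv required."""
    import argparse
    import json
    import urllib.request

    def main():
        p = argparse.ArgumentParser(description="Fetch JSON, print one field")
        p.add_argument("url")
        p.add_argument("--field", default="name")
        args = p.parse_args()
        with urllib.request.urlopen(args.url) as resp:  # stdlib-only HTTP
            data = json.load(resp)
        print(data.get(args.field))

    if __name__ == "__main__":
        main()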


I think it depends on how large your codebase is and how many dependencies it has. The first issue is that any new 3.x release is going to have bugs introduced by new features, and possibly regressions in old ones. This could cause your current code to behave differently, as well as cause issues if someone immediately uses a new feature.

The other issue is the more dependencies you have, the more likely one of them will not work with the newest version. Usually these things get sorted out pretty quickly, but it's frustrating to get a bunch of errors.

I typically wait until at least the first bug fix release (3.11.1 for example) to try out a new Python version. These often have most of the common issues fixed, and give enough time for all the popular dependencies to catch up.

This is from my experience maintaining a million+ line codebase that started at Python 2.6 and was updated to 3.8 as new versions came out.


If you're writing internal tools for your teammates, don't use the latest release. Your coworkers will almost certainly be on something older and you'll get super weird bugs. And you'll get the most bizarre sounding pushback when you tell them to install a newer version of Python.


I find Conda packages often have a maximum version that's a bit out of date, sometimes several versions behind.

I'm sure using 3.10 instead of 3.9 would be fine for most packages, but I don't know if it can be "forced"


Compiled packages have ABI restrictions. Even conda-forge puts version constraints on its packages at build time (e.g. built on 3.9: `>=3.9,<3.10.0a`).


If you are dependent on binary packages rather than building extensions from source (more common on Windows), there may not be prebuilt packages for some libraries for the newest version.


Sometimes. If you're relying on a toolchain that's parsing and analyzing Python code, things like compilers and mypy will sometimes lag behind new syntax, bytecode and C API changes. Until recently mypy didn't have full support for match statements, and its corresponding compiler, mypyc, still doesn't. Same goes, or went, for full support for match statements in other compilers, as well.


There are many little incompatibilities across 3.x, especially at 3.10/3.11, where long-standing deprecations are finally being removed. But there are few big changes like the jump from 2.x.

Wheels typically lag however.

Release[-2] or Release[-3] are good choices. Definitely upgrade from [-4] or lower.


For my Python things, I aimed to be compatible with the oldest supported Python for people on older OS's or who aren't constantly updating their Python installations. That way my stuff was still accessible to them.


There have been breaking changes, I don't think anything too difficult to overcome, but it's easy to pin it intending to fix later, and fall behind. (e.g. 3.9-3.10 something to do with dataclasses iirc)


Unfortunately some python packages are not reliably able to move across versions. I’ve had issues with sqlalchemy and celery before. Let’s not get started with the python packages from snowflake.


Sometimes you're in an organization that doesn't upgrade things right away and it can be difficult to tell which features you can use and which you don't have yet.


There are breaking changes in the C api occasionally, so that can be a good reason to not upgrade. Sometimes an extension needs to use the unstable API.


Yeah, if you use standard Docker images you want to use the Python version they come with, which is usually 3.8 (Ubuntu 20.04) or 3.6 (Ubuntu 18.04).


Why not use the library/python image with the specific version over library/ubuntu?


I don’t understand your question, but to illustrate the point:

We often work with 3rd-party libraries and images provided by those third parties. These are often based directly on Ubuntu; the Nvidia CUDA image is an example of this. Then you are stuck with their Python version.

And it's not as easy as installing your own Python version into the image, because Python dependencies inside those 3rd-party images might be installed using the OS package manager rather than pip, so they end up under the wrong Python version, etc. It's a rabbit hole.

The problem is that at some point Debian/Ubuntu decided to repackage Python dependencies as apt packages.


It starts with 2.1, so it misses quite a few changes.

If I recall correctly, 1.6 was when Python started getting mainstream attention, with Zope and co.

Still, add the standard library changes as well, and there is plenty of inspiration for pub quizzes.


Zope was for Python 1.5.2 (quoting the README from https://old.zope.dev/Products/Zope/2.0.0-donotuseme/2.0.0/Zo... - "This release requires Python 1.5.2.").

Python 1.6 was a "contractual obligation" release that few used, when compared to 1.5.2 or 2.0. See https://python.readthedocs.io/en/v2.7.2/whatsnew/2.0.html#wh... .

1.6final and 2.0beta1 were released on the same day, and 2.0final was released a month later. The 2.x series was almost completely backwards compatible to 1.x.


Thanks for the clarification, it has been a while :)


I think it's intended to be useful for managing dependencies on specific python versions. And I hope 2.0 doesn't come up there for you.


How come Python is still improving in so many ways after so many years, or what has kept them back for so long? These are not small improvements.


Not quite complete enough for my liking. 3.9 doesn't have any mention of zoneinfo.ZoneInfo for example.
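
(For anyone unfamiliar: zoneinfo landed in 3.9, per PEP 615. A quick sketch:)

    from datetime import datetime
    from zoneinfo import ZoneInfo  # new in Python 3.9 (PEP 615)

    dt = datetime(2022, 5, 23, 12, 0, tzinfo=ZoneInfo("America/New_York"))
    print(dt.astimezone(ZoneInfo("UTC")))  # 2022-05-23 16:00:00+00:00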


I've now added "new modules" to the most recent handful of versions.


Global interpreter lock, will it ever be removed?


The latest effort to remove the GIL was presented at the 2022 Python Language Summit and discussed here earlier this month:

https://news.ycombinator.com/item?id=31348097


I think it will never be removed. All code using threads in Python relies on the GIL to hide its blushes, and as soon as you remove the GIL, I don't think there's any reason to believe threaded Python code will continue to work, as race conditions get exposed.

The only option, imo, is to make multiple GIL pools and use actors or message passing to work around this. However, that means the super-cheap FFI that Python enjoys would be compromised, as it would have to find a solution like Go (stack swapping) or React (bridge to the underlying platform).
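
To be fair, the GIL only makes individual bytecode operations atomic, not compound ones, so some races exist even today; removing it would mostly widen the set of shared-state code needing explicit locks. A minimal sketch of a lost-update race that threaded Python already permits:

    import threading

    counter = 0

    def bump():
        global counter
        for _ in range(100_000):
            counter += 1  # read-modify-write: not atomic, even under the GIL

    threads = [threading.Thread(target=bump) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    print(counter)  # can print less than 400000: updates get lost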


Unrelated, but I wish there were a table of which functions appeared in which version of glibc. I would love to be able to answer a question like "if I compile on version X and deploy on version Y, what functions will be missing or changed?".
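
In the meantime, here's a rough sketch of one way to answer the deploy half, by reading the versioned glibc symbols a binary requires (assumes binutils' objdump is available):

    # Print the minimum glibc a binary needs, from its GLIBC_x.y symbol refs.
    import re
    import subprocess
    import sys

    out = subprocess.run(
        ["objdump", "-T", sys.argv[1]],
        capture_output=True, text=True, check=True,
    ).stdout
    versions = set(re.findall(r"GLIBC_([0-9]+(?:\.[0-9]+)+)", out))
    if versions:
        newest = max(versions, key=lambda v: [int(x) for x in v.split(".")])
        print(f"needs glibc >= {newest}")
    else:
        print("no versioned glibc symbols found")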


You might like my tool: https://sourcedigger.io/glibc?q=*sscanf*

If it fails, make sure to let me know (I have no analytics).


Thanks, that looks useful. Is this doing the moral equivalent of git log -S?


[flagged]


You got downvoted for a crappy joke, but I agree with the gist of your comment. Python as a language keeps growing and adding more and more features, relentlessly. If you stop using it for a few years like I did, it's almost unrecognisable. I honestly can't think of another language that has decided that no feature is ever one too many.

What started as a scripting language now has pattern matching, union types, async, matrix-op syntax sugar, and type hints. What's next? Yet the project and package management story is still not good enough.


Honestly, people sometimes can't appreciate a bit of humour. I just didn't want to go the "rude critic" route.

I don't like it when a popular language goes ballistic with features. It makes people go on personal crusades to create their own "simpler, easier" languages just to prove a point. See Hare for the latest example.


Why attack sex workers though? Good humour punches upwards.


Oh look the humor police are here


Let them create their own languages, maybe they will discover better ways that eventually get adopted by other languages.



