The first stable release of PyPy3 (morepypy.blogspot.com)
299 points by pjenvey on June 20, 2014 | 84 comments



Wow, this is a very exciting moment for the Python world.

And they didn't even reach their funding goal for "py3k in pypy" [1]. This is dedication. I encourage everyone to fund this incredible project!

[1]: http://pypy.org/py3donate.html


I have been checking the py3k branch on Hg every other day waiting for this moment, what a pleasant surprise. Very, very exciting. Thanks all.

I donated a while back, will make another donation soon.

I would like to start using this immediately but I think I'll have to wait until a 3.3 release for "yield from".


Awesome, I hadn't realized this project was quite this far along. If they get PyPy 3.4/3.5 going with NumPy, it will make a really nice package. Fast Python code for the high-level logic, paired with fast low-level number crunching. This could also help speed up the adoption of Python 3.


Looks like they're over 80% of the way to hitting the funding goal for that one too:

http://pypy.org/numpydonate.html


The problem is: even if numpy gets ported we still don't have scipy and a million other packages which require C bindings.


True. Though this will likely make PyPy more mainstream, and thus it'll hopefully attain more community support.


I wish the community would just switch entirely to pypy. Being able to just write slightly performance-sensitive code in Python is a huge win.


It makes sense to use pypy if you're writing pure python code. The second you need a C extension, you're pretty much out of luck. This kills a lot of the appeal for people in the scientific/analytics side of things, who make heavy use of legacy C and Fortran routines.


> The second you need a C extension, you're pretty much out of luck.

In theory, shouldn't CFFI be the foundation of the solution to that problem?


In practice it works pretty well. I am nearing completion of a rewrite of X's XCB-based python bindings in cffi, and it has worked out quite nicely.
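
For the curious, here's a minimal sketch of what a cffi binding looks like; it uses libc's atoi purely as an illustration, and real bindings just declare more prototypes the same way:

    from cffi import FFI

    ffi = FFI()
    ffi.cdef("int atoi(const char *nptr);")  # declare the C signature we need
    libc = ffi.dlopen(None)                  # None loads the standard C library on Unix
    print(libc.atoi(b"42"))                  # prints 42; same code runs on CPython and PyPy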


yes


You can use these C and Fortran routines on PyPy, just not with the CPython C extension API.


There's a lot of established code built around the C API. It's not like it can just be rewritten using CFFI over a weekend.


Well, I'm not talking about every C extension in the world; I'm talking about thin bindings around C routines, which is what numpy has for fft, for example. Someone wrote a basic equivalent for numpypy in a few hours with no prior numpypy/cffi knowledge.


Unfortunately, NumPy uses deep knowledge of the CPython API in quite a few places, which is one of the reasons implementing NumPyPy has been so challenging.


Another approach would be needed for those, but to be clear, I wasn't talking about the entire numpy library; I'm talking about things like numpy.fft.


No, you are wrong. It supports both ctypes and cffi, both of which should be the go-to for calling native code. Using PyObject has been the stupid choice for over 4 years.


> I wish the community would just switch entirely to pypy.

What purpose would that serve?

> Being able to just write slightly performance-sensitive code in Python is a huge win.

PyPy does not work for everybody and everything (e.g. at best it's no slower for Sphinx; it really doesn't like the way docutils works). It's not like PyPy's a magic wand.


> What purpose would that serve?

If PyPy became the official/canonical implementation, PyPy would receive more attention and third-party library compatibility would be a requirement. Complaints about Python's slowness would be somewhat less relevant, and Python might see wider adoption. The RPython toolchain would receive more attention and that could be useful to other languages. There are plenty of reasons, but PyPy is usually a free speedup for your Python application. Who's going to complain about that?

> pypy does not work for everybody and everything

True, but if PyPy were the official implementation of Python, compatibility with it would be a must, and this situation would be greatly improved.


I agree with you, but it will never happen. For one, GvR wants an as-simple-as-possible reference implementation, since he has to maintain it with a volunteer dev team. Also, there's a split in the Python community between people like you and me and the scientific crowd. Until the scientific stuff works 100% in PyPy, you'd lose a significant portion of the Python userbase by dumping CPython.

GvR has done enough damage to Python with Python 3. I don't intend to encourage him to make any more changes. We Python web developers are better off using what we have (non-reference implementations, which don't hurt anyone), or just using Node.js.


I don't think it's fair to put the blame for the unfortunate way things have gone with Python 3 solely on GvR's shoulders. AFAIK, a huge part of the community felt this was the way to go. Unfortunately, it wasn't.


Killing Python 3 isn't something a majority of the community wants and it isn't objectively better either.


I agree entirely. What's kind of a pity is that until NumPy is ported over, all of the scientific stack is basically unusable on PyPy - and right now there are several incredibly good NumPy-specific JITs (numexpr, numba, parakeet).


Maybe moving libraries & code to Python 3 should be the priority.


Convincing distros to package it as the default "python" should be the priority. Until that happens, Python 3 will see limited adoption. The path of least resistance will always have the most traffic.


Ubuntu seems to have that as a near-term goal: https://wiki.ubuntu.com/Python/3


The first step is to get everything python3 compatible, and have it use a hashbang or other mechanism to select the right interpreter. After this happens, the default interpreter has no real meaning: everything will use the right interpreter.
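
For example, a script that names its interpreter explicitly keeps working no matter which version the distro ships as the default "python":

    #!/usr/bin/env python3
    # The shebang pins the interpreter, so the distro default no longer matters.
    import sys
    print(sys.version_info)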


It's a chicken and egg problem. So long as most Python libraries run on Python 2 but not Python 3, distros are going to package Python 2.


Someone has to take the first step and break the cycle to get the chicken-egg problem undone. Arch has had Python 3 as the default Python interpreter for a couple of years now and it's been working pretty much fine. Many libraries now support Python 3, and I've done full sites in Py 3. Almost all Python scripts I write these days are Python 3. I don't think I've had to downgrade a script that started in 3 down to 2 for a couple of years now. It's as ready as it's going to get.

The groundwork is done, and I think everyone who is going to support Py 3 without any extra prodding has already done so. Now we need the distros to come through and give that extra nudge to the maintainers that are still slacking, or encourage people to replace those libraries that refuse to update.


Actually, the majority of PyPI packages are python-3 compatible. For a status overview, see http://python3wos.appspot.com/


That's not what that site is saying.

For my PyCon Russia talk, I pulled down the data for all 44,402 packages (as of May 31). 13.5% of all packages on PyPI support some version of Python 3. 75.5% of the top 200 packages by download count claim to support some Python 3 version (according to their setup.py classifiers). Additionally, 64% of the top 500 support some Python 3 version.

Another interesting thing I saw was that of those 44K packages, 44% of them have seen a release within the last 12 months (representing 82% of the last month's download share), and 22% of those packages released in the last year support some version of Python 3.


How much memory do you have?


For a sufficient performance increase? As much as it takes. Memory is cheap.


On servers and especially on virtualized servers it is absolutely not.


Minor note: the openbsd support (at least for 2.x) is amd64 only. Building for i386 at some point requires running a bootstrap process that doesn't fit in memory.


> Building for i386 at some point requires running a bootstrap process that doesn't fit in memory.

Seriously, it takes more than 4 GB to build PyPy? Is that also necessary for other platforms besides OpenBSD?


When you're compiling CPython, it's neatly broken into little bite-sized chunks (.c files), each of which has all the type-annotations and such that the compiler needs to produce efficient code.

When you're compiling PyPy, it basically has to load the entire Python interpreter structure into memory so it can do its various analyses and annotations, so compiling PyPy takes a long time. I think for a while it was excluded from certain Linux distros because their package-build-farm machines wouldn't handle it.


http://stackoverflow.com/questions/8452396/does-pypy-transla...

PyPy is written in RPython, a subset of the Python language. When it's 'compiled', the PyPy RPython code runs under CPython or PyPy to translate the PyPy source into C code, which is then compiled into a binary. Lots of tuning and such occurs at the same time, so the JIT runs well on the target machine. This is why it takes a long while, and lots of memory.

The build also prints a fractal while compiling. http://pypy.readthedocs.org/en/latest/faq.html#why-does-pypy...
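
For anyone curious, the documented translation step looks roughly like this (exact paths and flags vary by PyPy version, and it needs several GB of RAM and an hour or more):

    cd pypy/goal
    pypy ../../rpython/bin/rpython --opt=jit targetpypystandalone.py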


I think it's 2 and some change, but yeah. I don't know the specifics. Once bootstrapped, it's more reasonable, but building from source is pretty wicked.


4GB is literally nothing. My laptop has 16, most servers I use have 128+. 4GB is netbook territory.


... I think the implication is that more than 4GB would exceed the pre-[PAE][1] memory limit[2]. A form of cross-compilation might work, though the PyPy build isn't exactly a simple, 'classical' build process. :P

Edit: also, looking at your comments[3E] it looks like surely you know this (sorry) so I'm now really not sure what you're getting at... :P

[1]: http://en.wikipedia.org/wiki/Physical_Address_Extension

[2]: and even with PAE you still need to split into multiple processes/address spaces to do anything useful

[3E]: https://news.ycombinator.com/threads?id=sitkack


My point is that requiring a lot of RAM for a build is not a problem. Yes, it would be nice to support low-end devices for PyPy compilation, but the intersection of people on extremely constrained hardware and people doing PyPy development who need to build from source is, well, by definition zero.

32 bit is dead except for ARM, and it will be dead on ARM in 4 years.


> 32 bit is dead except for ARM, and it will be dead on ARM in 4 years.

Uh... sure? ... but the parent post was about how building for 32 bit _today_ simply does not work and will not work.

Whilst it's not necessarily best to build for technology that's almost gone, there will definitely continue to be 32-bit devices that people expect to run Python on for quite a number of years yet - today's 32-bit ARM chips aren't going anywhere for a while, and not every form factor (say, non-desktop) is well suited to a 64-bit architecture. :/


Remember we are talking about _building_, actually JITing a JIT using a dynamic language _for_ a dynamic language.

I haven't run a 32 bit desktop or server system since 2004. 32 bit is quite dead. In 4 years, only the cheapest ARM SoCs will be 32 bits. In embedded devices, yes 32 bits will be around for a great long while.


4GB is not literally nothing, it's 25% of the memory available on your laptop. That's a significant chunk.


That's not true. $1000 ultrabooks often have 4GB. Hell, the base model rMBP has 4GB (I paid the extra for 8GB).


The people surfing the web and buying some music on iTunes are not building PyPy from source. It makes no sense to put the engineering work into supporting such memory constrained dev environments.


I've just made a small donation.


This is awesome, now just to wait for a Python 3.4 PyPy release :D


And Numpy! And ctypes (for Matplotlib)!

Although I must say, numpypy is quite usable already!


PyPy has had ctypes support for a great long while.


True. Not complete enough for Matplotlib, though.


I think you're talking about the c extension api.


I created a simple Terminal instance that compares Python and PyPy in a performance test:

https://terminal.com/tiny/shkhWWkcEV

(this lets you compare the performance on a real Linux system, without installing anything)
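
If you'd rather not use that, a toy CPU-bound script of this sort (purely illustrative) run under both interpreters shows the same effect:

    # Deliberately naive, loop-heavy code: run this file once with "python"
    # and once with "pypy" and compare the wall-clock times.
    import time

    def count_primes(limit):
        count = 0
        for n in range(2, limit):
            if all(n % d for d in range(2, int(n ** 0.5) + 1)):
                count += 1
        return count

    start = time.time()
    total = count_primes(200000)
    print(total, "primes found in", time.time() - start, "seconds")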


The PyPy people themselves have a benchmark portal at http://speed.pypy.org/ with graphs and everything.


PyPy seems to be 7x faster!


On a silly piece of code that nobody would ever have any use for. I have tried PyPy for "real" data and numerical tasks from time to time, and never have I noticed any sort of speedup. Usually it's slower than CPython. Perhaps this latest version will be different, who knows.


I'm using it in production, and speedups tend to be on the order of 4-5x for my app (the compute-intensive part involves hierarchical agglomerative clustering of documents by text similarity, so it's data/numbers-heavy). Obviously it'll depend on your individual application (and non-CPU-bound tasks won't benefit much), but we switched to PyPy because it showed major improvements in profiling of our app on production data (and we switched around PyPy's 1.9 release, so it's even better now). It's not like everyone's just imagining the speed improvements...


I've just finished writing "High Performance Python" for O'Reilly (due August). We have a chapter on Lessons from the Field, and one chap talks about his successful many-machine roll-out of a complex production system using PyPy for a 2x overall speed gain. We also cover Numba, Cython, profiling, numpy etc - all the topics you'd expect.


It's not like everyone's just imagining that it's slower for many work loads either.


Not disagreeing, but they implied that this benchmark only showed a speed improvement because it's a toy, and that real workloads with real data are usually slower. That hasn't been the case in my experience.


Help us make your code faster: please report it :)


You do remember that it's a jit and the first run is not fast? You have to let it run for a while to generate fast code and only benchmark after that.
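
A rough sketch of what that means in practice (the helper here is illustrative): time the same call several times within one process and only look at the later iterations, after the JIT has had a chance to compile the hot code.

    import time

    def timed_runs(f, *args, **kwargs):
        timings = []
        for _ in range(10):
            t0 = time.time()
            f(*args, **kwargs)
            timings.append(time.time() - t0)
        # On PyPy the first few iterations include tracing and compilation;
        # the later ones reflect steady-state JIT performance.
        return timings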


You might try again since things have changed. If you don't get any kind of speedup, the PyPy project would likely consider it a bug and it would be helpful to document that it was slower. Please consider finding some way of reporting the specific measurable issues you find!


Excellent. I've been waiting for this for a long time.


How does the performance of PyPy and Jython compare?


According to Jython's (a little dated) FAQ <https://wiki.python.org/jython/JythonFaq/GeneralInfo>, "Jython is approximately as fast as CPython--sometimes faster, sometimes slower. Because most JVMs--certainly the fastest ones--do long running, hot code will run faster over time."

PyPy aims to be (and is in many cases) faster than CPython.

The advantage with Jython isn't a performance one: it's the ability to call Java code directly.


Jython is usually slower than CPython, I believe, though it has no GIL.


Wonder if it can be faster under higher parallelism conditions. Multiple threads doing some CPU intensive work?


Jython can utilize threads as well as Java can, so on a many-core machine Jython wins by a pretty large margin.
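
A sketch of the kind of workload where that shows up (purely illustrative): CPU-bound work on two threads runs roughly serially under CPython's GIL, while a GIL-free runtime like Jython can keep both cores busy.

    import threading

    def spin(n):
        # pure-Python, CPU-bound busywork
        total = 0
        for i in range(n):
            total += i * i
        return total

    # Two threads of CPU-bound work: parallel on Jython, serialized by the GIL on CPython.
    threads = [threading.Thread(target=spin, args=(5000000,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()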


Thank you.


I don't know or use Python, but why does an implementation that is trying to be "superior" still have the GIL?


This donation page has some background on how PyPy is proposing to replace the GIL with software transactional memory:

http://pypy.org/tmdonate2.html#introduction


What DO you know or use? Did you think that the GIL was an obvious and stupid oversight made by stupid people for no good reason?


Obviously the GIL was shortsighted, yes. Leave the people out of it. The idea was stupid. There was a reason, but it wasn't a good reason.


What would you replace it with?


No GIL.


Your username is apt, but novelty accounts aren't a thing here. What exactly are you hoping to communicate?


My name is Gil. Look at my comment history and apologize.


you need something to allow concurrent access to internal interpreter data structures...


STM or MVar


Because it's very tricky to remove.

Ruby also has a GIL.


> Ruby also has a GIL.

MRI has a GIL; major alternative implementations (JRuby, Rubinius) do not.

OTOH, addressing the downsides of a GIL is not the only reasonable motivation for an alternative implementation, so there's no reason that a better-than-stock Python (or Ruby) fundamentally must remove the GIL (the current "MRI" used to be an alternative implementation, YARV, to the old MRI, and both had GILs).


True. Jython and IronPython also do not have a GIL.



