PyPy Status Blog: STM with threads

fiatmoney · on June 10, 2012

Microsoft's experiment with STM ended up not going so well [1] [2]. Does anyone know how the PyPy implementation differs, and if it might overcome these problems? Is it just laying the groundwork for leveraging hardware transactional memory support, or is the thesis that this will eventually become usable directly?

[1] http://www.infoq.com/news/2010/05/STM-Dropped [2] http://www.bluebytesoftware.com/blog/2010/01/03/ABriefRetros...

SpikeGronim · on June 10, 2012

This particular STM is promising because it is only about 5x slower than a well-optimized JIT, even though it has no JIT yet. Once they do some low-risk work it should be very close to vanilla PyPy and faster than CPython. I'd love to be able to use the programming semantics you get as an app developer from an STM language.

The usual problem with STM is that it generates many writes to memory. You ruin your cache locality and your program bottlenecks on the bus to RAM. Clojure solves this by using custom data structures that use "structural sharing". PyPy could solve this by using its JIT to elide writes.
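To illustrate what "structural sharing" means, here is a minimal sketch of a persistent linked list in Python. This is not Clojure's implementation (which uses tree-based structures) and has nothing to do with PyPy's internals; the names `Node`, `push`, and `to_list` are made up for the example. The point is only that a new version reuses the old nodes instead of copying them, so an update writes one small node.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    """One immutable cell of a persistent singly linked list."""
    value: int
    rest: Optional["Node"]


def push(lst: Optional[Node], value: int) -> Node:
    # Build a new head that points at the existing list.
    # Nothing in `lst` is copied or mutated.
    return Node(value, lst)


def to_list(lst: Optional[Node]) -> list:
    # Walk the cells from head to tail and collect the values.
    out = []
    while lst is not None:
        out.append(lst.value)
        lst = lst.rest
    return out


base = push(push(None, 1), 2)  # 2 -> 1
v1 = push(base, 3)             # 3 -> 2 -> 1
v2 = push(base, 4)             # 4 -> 2 -> 1

# Both versions share the very same tail object.
shared = v1.rest is v2.rest
```

Here `v1` and `v2` each cost one new cell, and `base` is still valid and unchanged after both updates.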

reitzensteinm · on June 11, 2012

To clarify, it's 1/5th the speed of normal PyPy without JIT per thread. Except it can now run on 8 cores, so it's slightly faster on an 8-core machine.

It's molasses compared to PyPy with JIT.
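Spelling out the arithmetic behind "slightly faster", as a back-of-the-envelope sketch: it assumes perfect scaling across all 8 cores, which real workloads with conflicting transactions will not reach. The variable names are just for illustration.

```python
# Throughput relative to normal PyPy without JIT (= 1.0).
per_thread_speed = 1 / 5   # each STM thread runs at one fifth of that speed
cores = 8                  # threads can now run in parallel on every core

# Ideal aggregate throughput, assuming no conflicts and perfect scaling.
aggregate = per_thread_speed * cores

print(f"{aggregate:.1f}x the speed of non-JIT PyPy")
```

So the best case is about 1.6x a single non-JIT PyPy thread, and any contention between threads pulls it below that.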

SpikeGronim · on June 11, 2012

Oh dear, much slower than I thought. I misinterpreted that, thanks.

masklinn · on June 11, 2012

> Does anyone know how the PyPy implementation differs

It differs significantly in that the whole interpreter works under STM, it's not just an API.

> Is it just laying the groundwork for leveraging hardware transactional memory support, or is the thesis that this will eventually become usable directly?

It's a development project at this point (you can, and should, check the backlog on PyPy's blog), but the goal is to have it usable directly and to lay the groundwork for HTM.

bfrog · on June 10, 2012

While I think STM and threading are cool, I think being able to run multiple Python loops and pass messages between them in the same process would be much cooler and much faster, akin to an Erlang-like VM. Then again, I'm pretty biased, as I've done more Erlang this year than Python!

starvinmargin · on June 11, 2012

If you can write your application in an efficient way using message passing - that's fantastic! However some applications have shared state at their core and are very inefficient when programmed using message passing.

cdavid · on June 11, 2012

Are there many such cases where you would use Python, though?

mcherm · on June 11, 2012

Yes. One example would be scientific computing, which is an area that demands extreme machine performance and which also is frequently done in Python.

nickik · on June 11, 2012

Well, if it were possible to use Python in such a case, there would probably be demand.

DanWaterworth · on June 11, 2012

What I don't get is how this is exposed to the programmer. I seem to remember that there was talk before of using an event-loop model where events are handled in separate transactions. Is this still the case? If so, then how is livelock prevented?