Small sequential portions of an otherwise parallel algorithm can have huge effects on the overall running time when you try to scale up.
"parconc" explains this while discussing a parallel version of k-means, talks about how things like data granularity need to be fine-tuned for parallel algorithms, and provides some nice visualizations of what the CPUs are actually doing on a timeline: http://chimera.labs.oreilly.com/books/1230000000929/ch03.htm...
Overall I think multicore is a good tool to have in your toolbox, but it seems like it takes a lot of tuning and effort to get good rewards for the time invested.
This is certainly true for some applications, but when Amdahl's law was originally formulated, people made estimates based on e.g. 95% of an application being parallelizable, and thus fairly rapidly reaching a point of diminishing returns. In practice, however, there's often a simple reformulation of the problem that can result in a much higher percentage, and Amdahl's law becomes less of an obstacle than it would first appear. There are still many problems that scale more or less linearly with the number of processing units and are "trivially parallelizable".
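For reference, the usual statement of Amdahl's law: with a parallel fraction p and N processors, the speedup is bounded by

    S(N) = 1 / ((1 - p) + p / N)

so even at p = 0.95 the speedup can never exceed 1 / (1 - 0.95) = 20x no matter how many cores you throw at it, which is why reformulating the problem to push p closer to 1 matters so much.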
I wrote https://github.com/abemassry/crazip in Python and it was challenging to figure out how to do multiprocessing effectively; I'm not sure how this would run if implemented in other languages.
This article contains a perfect example of why I don't like to write or read comments in code. Comments are not compiled and thus allow for sloppy verbiage such as the following:
    # Exit the completed processes
    for p in processes:
        p.join()
The comment should read something more like: "Wait for all the subprocesses to exit" but is that really any more helpful than just reading the code and seeing that join is called on each subprocess and connecting the dots from there?
> but is that really any more helpful than just reading the code and seeing that join is called on each subprocess and connecting the dots from there?
It isn't if you have been programming with threads before; "join" is obvious then.
But if you haven't, and maybe you are a physicist trying to get something done in Python, "p.join()" could mean anything: "Join to what?", "Why is there no argument to join()?", "We are joining the data together here, like a list...", "It looks like we should be stopping the processes, but the method is not called 'stop()' so that's not it"...
That is the problem with this stuff being taught by someone who has been programming for a while: this kind of thing gets internalized and becomes obvious, but it is not obvious to a beginner.
Will use this as ammo next time someone complains about my minimalist approach to commenting.
I think comments need to be targeted at a specific audience. If you know that only professionals are going to work on your code then I think only high-level comments are probably OK.
Also, I would suggest that the physicist in your example hire a software practitioner [who can be expected to know 'join'] to write his software correctly (rather than trying to half-ass it himself).
Instead of calling the 'else' condition of the for loop a 'completion-else', just call it the 'nobreak' condition. Unlike 'completion-else', 'nobreak' immediately describes when it will be executed.
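To illustrate with a toy example of my own: the else block on a for loop runs only when the loop finishes without hitting a break, hence "nobreak":

    def find(needle, haystack):
        for item in haystack:
            if item == needle:
                print("found it")
                break
        else:
            # the "nobreak" case: runs only when the loop completed
            # without ever hitting break
            print("not found")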
If you're working on CPU-bound tasks with NumPy/SciPy using threads, then you have to think very hard to make sure most of the critical sections are hitting the NumPy calls in C, which release the GIL. It's not a very reliable way to program. The way the author describes is basically the only pure-Python way of achieving parallelism for this kind of problem.
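Roughly what that looks like in practice (a sketch of my own, assuming the matrices are large enough that the BLAS-backed np.dot dominates the runtime):

    import numpy as np
    from concurrent.futures import ThreadPoolExecutor

    def worker(_):
        a = np.random.rand(2000, 2000)
        b = np.random.rand(2000, 2000)
        # np.dot on large float arrays runs in C/BLAS and releases the GIL,
        # so these threads can actually overlap on multiple cores
        return np.dot(a, b).trace()

    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(worker, range(4)))

If the hot loop is pure Python instead of a NumPy call, the threads just serialize on the GIL and you get no speedup.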
If you're holding a global mutex every 100 instructions while context switching between CPU-bound tasks, then yes, the GIL does suck. There is a class of IO-bound problems where threading/evented models in Python can be used effectively, but that's not the class of problems the author is talking about here.
Considering the author's use case (mathematical modeling) and language (Python), threading and event-based models would have no real performance benefit.
Event-based models only shine when you are doing IO-bound tasks. They won't help you when you are chewing CPU.
Threading models in Python aren't attractive because of the GIL. If you are doing a parallel matrix operation you can only ever use one CPU because of the GIL. Not attractive.
STM is not a great fit for this kind of problem; there's no need for all the transaction machinery if the problem is embarrassingly parallel. In an ideal world, what you want is threads that just split up the work sections, like they do in C/C++/Haskell.
Just curious, but would it be possible to get around this by having multiple copies of the Python interpreter installed on your system? Maybe by changing some config strings so the GIL thinks it's something else.
ProcessPoolExecutor from the futures package essentially does this: it runs multiple instances of the interpreter and then distributes work to them.
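A minimal sketch of that pattern (concurrent.futures is in the Python 3 standard library; the separate "futures" package backports it to Python 2):

    from concurrent.futures import ProcessPoolExecutor

    def cpu_bound(n):
        # pure-Python CPU work that the GIL would serialize under threads
        return sum(i * i for i in range(n))

    if __name__ == "__main__":
        # each worker is a separate interpreter process with its own GIL
        with ProcessPoolExecutor(max_workers=4) as pool:
            print(list(pool.map(cpu_bound, [10**6] * 4)))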