Parallel Processing in Python

Today we have a guest post by Ian Crossfield (UCLA) on parallel computing with Python.

When analyzing astronomical data, one often finds oneself repeating the same tasks over and over again (e.g., fitting a model to a PSF, measuring the width of a spectral feature, etc.). Sometimes the results of one computation influence the next, in which case there's no choice but to compute everything sequentially. In many cases, though, the computations are independent and (with sufficient computing power) could all be performed simultaneously. You like your laptop or workstation (and don't want the hassle of learning about your local computing cluster), but aren't sure how to take advantage of the multiple CPU cores available on even mid-range modern computers. What's an astronomer to do?

Python offers many options for this sort of parallel processing; I've only tried two. One, "Parallel Python," seemed to require a high degree of specificity about module dependencies and other issues I would rather not deal with; in contrast, getting the "PProcess" module to run is simplicity itself, and I recommend it to anyone who needs parallel-processing capabilities.

Just download the "pprocess" package and put it in your Python path.  Then it's simply a matter of using pprocess' slick "Map" wrapper functionality: essentially, this creates a wrapper function that you call with exactly the same arguments you normally pass to your own function; see the example below.  You can spawn as many simultaneous computations as you'd like with the "limit" option, but you won't see much additional benefit once it exceeds the number of CPU cores you have at your disposal.

First, here is a schematic example:

# Just replace this (calling one function twice, in series):
desired_values = [function(args) for args in [args1,args2]]
 
# ... with this (calling one function twice, in parallel):
import pprocess
nproc = 2  	# maximum number of simultaneous processes desired
results = pprocess.Map(limit=nproc, reuse=1)  # manages the queue of parallel calls
parallel_function = results.manage(pprocess.MakeReusable(function))  # parallel wrapper around 'function'
parallel_function(args1)  # these calls start the computations in the background
parallel_function(args2)
desired_values = results[0:2]  # retrieve the two results
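
A note on the options, as I understand the pprocess API: the reuse=1 flag and pprocess.MakeReusable work together to keep the worker processes alive and reuse them for successive calls, rather than spawning a fresh process for every invocation, while "limit" caps how many of those workers run simultaneously.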

Now for a concrete example:

import pprocess
import time
import numpy as np
 
# Define a function to parallelize:
def takeuptime(ntrials):
    """A function to waste CPU cycles"""
    for ii in range(ntrials):
        junk = np.std(np.random.randn(100000))
    return junk
 
list_of_args = [500, 500]
 
# Serial computation:
tic = time.time()
serial_results = [takeuptime(args) for args in list_of_args]
print "%f s for traditional, serial computation." % (time.time()-tic)
 
# Parallel computation:
nproc = 2  	# maximum number of simultaneous processes desired
results = pprocess.Map(limit=nproc, reuse=1)
parallel_function = results.manage(pprocess.MakeReusable(takeuptime))
tic = time.time()
[parallel_function(args) for args in list_of_args]  # Start computing things
parallel_results = results[0:2]  # retrieve the two results
print "%f s for parallel computation." % (time.time() - tic)

Do you parallelize your analysis?  Why or why not, and (if so) what do you use?

Information on other parallel computing packages can be found on the AstroBetter wiki.

  • John Aug 9, 2010 @ 10:14

    I think some essential context here is to understand the so-called “Global Interpreter Lock”: that is, the memory-locking mechanism used by the (C)Python interpreter. Locking the whole interpreter makes the implementation of threading within the Python virtual machine easy, but means it can’t scale across multiple CPU cores. That’s why you need to consider external libraries for writing multi-CPU code, rather than just using multithreading as you might with some other languages.

    Recent (2.6+) versions of Python include the multiprocessing package in the standard library, which provides a "thread-like" interface for scaling across multiple CPUs using subprocesses. I can't comment on how it compares to PProcess, but it's always worth considering the standard library before you go shopping around.

  • Ross Aug 9, 2010 @ 10:50

    What’s the advantage of pprocess over Python’s own multiprocessing module?

    http://docs.python.org/dev/library/multiprocessing.html#module-multiprocessing

  • Adam Ginsburg Aug 9, 2010 @ 12:01

    I’ve used mpi4py with mpirun because that’s what the simulators down the hall recommended. It’s very useful, but still tedious to get a code from serial to parallel. I’m convinced there should be an easier way to make embarrassingly parallel codes run on multiple processors. Neither PProcess nor multiprocessing seems any simpler than mpi4py as far as I can tell; it’s still up to the coder to decide how many processors to use and where.

    One question I’ve never gotten a satisfactory answer to – there are some functions in IDL that run in parallel by default. I thought fft was one of them, but I’m pretty sure matrix multiplication is. Why is numpy/python not capable of similar feats?

  • Ian J. Crossfield Aug 9, 2010 @ 12:54

    John, Ross — Would either of you be willing to sketch out what either of the sample code snippets above would look like when using the “multiprocessing” library?

  • John Aug 9, 2010 @ 15:30

    No problem. I’m not sure what formatting commands this comment box will accept, so I’ve posted a simple multiprocessing-based approach at http://pastebin.com/iGPs699r.

    (I also changed your takeuptime() function not to return at the end of the first iteration, which I’m pretty sure isn’t what you wanted…)

    • Justin Oct 9, 2016 @ 14:32

      I appreciate the transfer over to multiprocessing 🙂

  • Jessica Aug 9, 2010 @ 16:02

    The early "return" was my mistake: I didn't transcribe Ian's tabs properly. It is now fixed.

  • Ivan Zolotukhin Aug 9, 2010 @ 16:26

    I always use pypar + mpirun for embarrassingly parallel problems. It was not really straightforward in the beginning, but once the first script is ready, it becomes much easier. I wish there were a simpler (multithreaded) way to achieve parallel execution using common memory objects without GIL problems, though.

  • Wolfgang Kerzendorf Aug 11, 2010 @ 2:47

    So I had similar problems with Parallel Python, which led me to search for other packages. I needed to parallelize my problem over many machines. I also needed a scheduler, as some iterations of the task took longer than others. I wrote a little program using execnet (http://codespeak.net/execnet/), which is this incredibly cool package (meaning execnet, not my script 😉). You can run your sessions on another host without opening a client first. If people are interested, I can send them my little script.

  • George Nov 11, 2011 @ 11:28

    Hello and thanks for the post. It's a little outdated but I saw it now. I am trying it but I get the same time for serial and parallel computation. It seems that it doesn't use my other core (I have 2 cores).

  • Jessica Lu Feb 8, 2012 @ 14:08

    We have a corresponding Astrobetter Wiki page where we are compiling the links to various packages. If you have specific coding examples that are astronomy related that you are willing to share, we can post or link to them on the wiki page.

    https://www.astrobetter.com/wiki/tiki-index.php?page=Parallel+computing+with+python
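
Following up on the multiprocessing suggestions in the comments above, here is a minimal sketch of the concrete example from the post rewritten with multiprocessing.Pool. This is only an illustration of the standard-library interface (it is not the pastebin code John linked), and it simply reuses the takeuptime function and two-element argument list from above:

import multiprocessing
import time
import numpy as np

def takeuptime(ntrials):
    """A function to waste CPU cycles"""
    for ii in range(ntrials):
        junk = np.std(np.random.randn(100000))
    return junk

if __name__ == '__main__':
    list_of_args = [500, 500]
    nproc = 2  # maximum number of simultaneous worker processes
    pool = multiprocessing.Pool(processes=nproc)
    tic = time.time()
    # Pool.map distributes the calls across the worker processes and
    # blocks until all of the results are available.
    parallel_results = pool.map(takeuptime, list_of_args)
    pool.close()
    pool.join()
    print "%f s for multiprocessing.Pool computation." % (time.time() - tic)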
