Parallel Computing in Python Using mpi4py
Stephen Weston
June 2017
Parallel computing modules
There are many Python modules available that support parallel computing. See
https://wiki.python.org/moin/ParallelProcessing for a list, but a number
of the projects appear to be dead.
mpi4py
multiprocessing
jug
Celery
dispy
Parallel Python
Notes:
multiprocessing has been included in the Python distribution since version 2.6
Celery uses different transports/message brokers, including RabbitMQ, Redis,
and Beanstalk
IPython includes parallel computing support
Cython supports use of OpenMP
Multithreading support
For parallel computing, don't use multiple threads: use multiple processes
In CPython, the Global Interpreter Lock (GIL) prevents more than one thread
from executing Python bytecode at a time
The multiprocessing module provides an API very similar to the threading
module that supports parallel computing
There is no GIL in Jython or IronPython
Cython supports multithreaded programming with the GIL released
In this mpi4py example every worker displays its rank and the world size:
from mpi4py import MPI
comm = MPI.COMM_WORLD
print("%d of %d" % (comm.Get_rank(), comm.Get_size()))
Notes:
MPI_Init is called when mpi4py is imported
MPI_Finalize is called when the script exits
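Such a script is launched with the standard MPI process launcher, for example
with 4 processes (hello.py is a hypothetical file name):

mpiexec -n 4 python hello.py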
send and recv are the most basic communication operations. They're also a
bit tricky, since they can cause your program to hang.
Examples:
comm.send(obj, dest, tag=0)
comm.recv(source=MPI.ANY_SOURCE, tag=MPI.ANY_TAG, status=None)
tag can be used as a filter
dest must be a rank in communicator
source can be a rank or MPI.ANY_SOURCE (wildcard)
status used to retrieve information about the received message
These are blocking operations
from mpi4py import MPI
comm = MPI.COMM_WORLD
size = comm.Get_size()
rank = comm.Get_rank()
if rank == 0:
    msg = 'Hello, world'
    comm.send(msg, dest=1)        # blocking send to rank 1
elif rank == 1:
    s = comm.recv()               # blocking receive from any source
    print("rank %d: %s" % (rank, s))
When every rank both sends and receives, ordering the operations by rank
parity avoids deadlock (dst and src are not defined in this excerpt; a ring
pattern is assumed here):

dst = (rank + 1) % size           # assumed: right neighbor in a ring
src = (rank - 1) % size           # assumed: left neighbor in a ring
s = 'message from rank %d' % rank
if rank % 2 == 0:
    comm.send(s, dest=dst)        # even ranks send first, then receive
    m = comm.recv(source=src)
else:
    m = comm.recv(source=src)     # odd ranks receive first, then send
    comm.send(s, dest=dst)
Communicators
Objects that provide the appropriate scope for all communication operations
intra-communicators for operations within a group of processes
inter-communicators for operations between two groups of processes
MPI.COMM_WORLD is the most commonly used communicator
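As an illustration (not from the original slides), a new intra-communicator
can be created by splitting MPI.COMM_WORLD; here ranks are grouped by parity:

from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
# color chooses the group; key orders ranks within each new communicator
sub = comm.Split(color=rank % 2, key=rank)
print("world rank %d is rank %d of %d in its group"
      % (rank, sub.Get_rank(), sub.Get_size()))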
comm.barrier()
Synchronization operation
Every process in the communicator group must execute it before any can leave
Try to avoid this if possible
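A minimal sketch of barrier use (the per-rank setup work is illustrative):

from mpi4py import MPI
comm = MPI.COMM_WORLD
# ... each rank performs its own setup work here ...
comm.barrier()                    # no rank continues until all have arrived
if comm.Get_rank() == 0:
    print("all ranks reached the barrier")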
comm.bcast(obj, root=0)
Generic Python objects can be sent between processes using the lowercase
communication methods if they can be pickled.
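For example, bcast can distribute any picklable object from the root rank to
all others (a minimal sketch; the dictionary payload is illustrative):

from mpi4py import MPI
comm = MPI.COMM_WORLD
if comm.Get_rank() == 0:
    params = {'iters': 100, 'tol': 1e-6}   # hypothetical payload
else:
    params = None
params = comm.bcast(params, root=0)        # every rank now holds the dict
print("rank %d: %r" % (comm.Get_rank(), params))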
import numpy as np
from mpi4py import MPI
comm = MPI.COMM_WORLD
x = np.arange(4, dtype=np.int64) * comm.Get_rank()
a = rbind(comm, x)
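rbind is not defined in this excerpt; presumably it stacks each rank's array
into a matrix, one row per rank, like R's rbind. A minimal sketch using the
lowercase (pickle-based) allgather:

def rbind(comm, x):
    # gather every rank's array and stack them row-wise
    return np.array(comm.allgather(x))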
comm = MPI.COMM_WORLD
x = np.arange(4, dtype=np.int64) * comm.Get_rank()
a = rbind2(comm, x)
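rbind2 is also not shown; presumably it does the same job with the uppercase
Allgather, which communicates numpy buffers directly instead of pickling.
A sketch under that assumption:

def rbind2(comm, x):
    # preallocate the receive buffer: one row per rank
    a = np.zeros((comm.Get_size(), len(x)), dtype=x.dtype)
    comm.Allgather(x, a)          # buffer-based gather of numpy arrays
    return a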
The trick is to split x into chunks, compute on your chunk, and then combine
everybody's results:
m = int(math.ceil(float(len(x)) / size))
x_chunk = x[rank*m:(rank+1)*m]
r_chunk = [sqrt(y) for y in x_chunk]   # a list, so allreduce can combine them
r = comm.allreduce(r_chunk)
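A self-contained version of this pattern (the input x and the square-root
work are illustrative):

import math
from math import sqrt
from mpi4py import MPI

comm = MPI.COMM_WORLD
size = comm.Get_size()
rank = comm.Get_rank()

x = list(range(100))                       # hypothetical input, same on every rank
m = int(math.ceil(float(len(x)) / size))   # chunk size per rank
x_chunk = x[rank*m:(rank+1)*m]             # this rank's slice
r_chunk = [sqrt(y) for y in x_chunk]       # local computation
r = comm.allreduce(r_chunk)                # summing lists concatenates the chunks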
import numpy as np
from scipy.cluster.vq import kmeans, whiten