Multiprocessing with NumPy Arrays
Last Updated :
24 Apr, 2025
Multiprocessing lets a program run several tasks at the same time in separate processes, which can use multiple CPU cores and improve overall speed. In this article, we will see how to use Python's multiprocessing module with NumPy arrays.
NumPy is a library for the Python programming language that provides support for arrays and matrices. In many cases, working with large arrays can be computationally intensive, and this is where multiprocessing can help. By dividing the work into smaller pieces, each of which can be executed simultaneously, we can speed up the overall processing time.
Syntax of the multiprocessing.Pool() method:
multiprocessing.pool.Pool([processes[, initializer[, initargs[, maxtasksperchild[, context]]]]])
A process pool object controls a pool of worker processes to which jobs can be submitted. It supports asynchronous results with timeouts and callbacks and has a parallel map implementation.
Parameters:
- processes: the number of worker processes to use. If processes is None, the number returned by os.cpu_count() is used (os.process_cpu_count() from Python 3.13 onward).
- initializer: if initializer is not None, each worker process calls initializer(*initargs) when it starts.
- maxtasksperchild: the number of tasks a worker process can complete before it exits and is replaced with a fresh worker process, so that unused resources can be freed. The default is None, which means worker processes live as long as the pool.
- context: can be used to specify the context used for starting the worker processes. Usually, a pool is created using the function multiprocessing.Pool() or the Pool() method of a context object. In both cases, context is set appropriately.
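To make the parameters above more concrete, here is a minimal sketch of a pool that uses initializer, initargs, and maxtasksperchild. The names init_worker, scale_chunk, and _scale are hypothetical and chosen only for illustration, and the values passed to the pool are arbitrary.

```python
import multiprocessing as mp
import numpy as np

# Hypothetical module-level slot that each worker fills in once at startup.
_scale = None


def init_worker(scale):
    """Runs once in every worker process and stores a per-process constant."""
    global _scale
    _scale = scale


def scale_chunk(chunk):
    """Multiply one chunk by the constant stored by init_worker."""
    return chunk * _scale


if __name__ == "__main__":
    data = np.arange(8)
    chunks = np.array_split(data, 4)
    # processes=2 and maxtasksperchild=2 are arbitrary illustrative values.
    with mp.Pool(processes=2, initializer=init_worker, initargs=(3,),
                 maxtasksperchild=2) as pool:
        scaled = pool.map(scale_chunk, chunks)
    print(np.concatenate(scaled))
```

The initializer is a convenient place for one-time setup per worker, so the same value does not have to be sent along with every task.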
Example 1:
Here's an example that demonstrates how to use multiprocessing with NumPy arrays. In this example, we will create an array of random numbers, then use multiprocessing to multiply each element in the array by 2.
Python3
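import numpy as np
import multiprocessing as mp


def multiply_elements(array):
    # Runs in a worker process: double every element of the sub-array
    return array * 2


if __name__ == '__main__':
    np.random.seed(0)
    arr = np.random.randint(0, 100, (100,))
    print(arr)
    print('------------------------------------------')
    # One single-element sub-array per task
    chunks = [arr[i:i+1] for i in range(0, 100, 1)]
    # The context manager shuts the pool down when the block exits
    with mp.Pool(mp.cpu_count()) as pool:
        result = pool.map(multiply_elements, chunks)
    print(result)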
The `multiprocessing.Pool` class creates a pool of worker processes that execute the `multiply_elements` function. The `map` method applies the function to each subarray (here, 100 slices of one element each) and returns the results in input order. The result is a list of arrays, each of which contains the corresponding subarray multiplied by 2.
Output :
[44 47 64 67 67 9 83 21 36 87 70 88 88 12 58 65 39 87 46 88 81 37 25 77
72 9 20 80 69 79 47 64 82 99 88 49 29 19 19 14 39 32 65 9 57 32 31 74
23 35 75 55 28 34 0 0 36 53 5 38 17 79 4 42 58 31 1 65 41 57 35 11
46 82 91 0 14 99 53 12 42 84 75 68 6 68 47 3 76 52 78 15 20 99 58 23
79 13 85 48]
------------------------------------------
[array([88]), array([94]), array([128]), array([134]), array([134]), array([18]), array([166]),
array([42]), array([72]), array([174]), array([140]), array([176]), array([176]), array([24]),
array([116]), array([130]), array([78]), array([174]), array([92]), array([176]), array([162]),
array([74]), array([50]), array([154]), array([144]), array([18]), array([40]), array([160]),
array([138]), array([158]), array([94]), array([128]), array([164]), array([198]), array([176]),
array([98]), array([58]), array([38]), array([38]), array([28]), array([78]), array([64]),
array([130]), array([18]), array([114]), array([64]), array([62]), array([148]), array([46]),
array([70]), array([150]), array([110]), array([56]), array([68]), array([0]), array([0]),
array([72]), array([106]), array([10]), array([76]), array([34]), array([158]), array([8]),
array([84]), array([116]), array([62]), array([2]), array([130]), array([82]), array([114]),
array([70]), array([22]), array([92]), array([164]), array([182]), array([0]), array([28]),
array([198]), array([106]), array([24]), array([84]), array([168]), array([150]), array([136]),
array([12]), array([136]), array([94]), array([6]), array([152]), array([104]), array([156]),
array([30]), array([40]), array([198]), array([116]), array([46]), array([158]), array([26]),
array([170]), array([96])]
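Sending one element per task means most of the time goes into passing data between processes. As a hedged sketch of an alternative, the array can be split into one larger chunk per worker and the results joined back into a single array. The names multiply_chunk and parallel_double, and the default of 4 workers, are assumptions made for this illustration.

```python
import multiprocessing as mp
import numpy as np


def multiply_chunk(chunk):
    """Double every element of one chunk (runs in a worker process)."""
    return chunk * 2


def parallel_double(arr, num_workers=4):
    """Split arr into num_workers chunks, double each in parallel, rejoin."""
    chunks = np.array_split(arr, num_workers)
    with mp.Pool(processes=num_workers) as pool:
        doubled = pool.map(multiply_chunk, chunks)
    # map preserves input order, so concatenation restores the layout
    return np.concatenate(doubled)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    arr = rng.integers(0, 100, size=100)
    result = parallel_double(arr)
    # Check against the plain vectorized computation
    print(np.array_equal(result, arr * 2))
```

With this layout the result is a single NumPy array instead of a list of one-element arrays.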
 Example 2:
Let's consider a simple example where we want to compute the sum of all elements in an array. In a single-process implementation, we would write the following code:
Python3
# import modules
import numpy as np
import time


def single_process_sum(arr):
    return np.sum(arr)


if __name__ == '__main__':
    arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
    start_time = time.perf_counter()
    result = single_process_sum(arr)
    end_time = time.perf_counter()
    # calculating execution time
    total_time = end_time - start_time
    print(result)
    print(total_time)
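The time.perf_counter() function of the time module returns the value of a high-resolution performance counter as a float, in fractional seconds. Only the difference between two calls is meaningful, which is why the code subtracts start_time from end_time.
Output: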
55
4.579999999998474e-05
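A single measurement this short is noisy and will differ from run to run. As a minimal sketch, the timing could be repeated and the best run kept. The helper name time_call and the default of 5 repeats are assumptions for illustration, not part of the time module.

```python
import time
import numpy as np


def time_call(func, *args, repeats=5):
    """Call func(*args) several times; return its result and the best time."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(*args)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return result, best


if __name__ == "__main__":
    arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
    total, seconds = time_call(np.sum, arr)
    print(total)
    print(seconds)
```

Keeping the minimum of several runs reduces the influence of one-off delays such as other programs competing for the CPU.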
Now, let's see how we can perform the same computation using multiple processes. The following code demonstrates how to use the multiprocessing module to split the array into smaller chunks and perform the computation in parallel:
Python3
# import modules
import numpy as np
import time
import multiprocessing as mp


def worker(arr):
    # Runs in a worker process: sum one chunk
    return np.sum(arr)


def multi_process_sum(arr):
    num_processes = 4
    # array_split yields exactly num_processes chunks, even when the
    # length is not evenly divisible (here: sizes 3, 3, 2, 2)
    chunks = np.array_split(arr, num_processes)
    # The context manager shuts the pool down when the block exits
    with mp.Pool(processes=num_processes) as pool:
        results = pool.map(worker, chunks)
    return sum(results)


if __name__ == '__main__':
    arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
    start_time = time.perf_counter()
    result = multi_process_sum(arr)
    end_time = time.perf_counter()
    # calculating execution time
    total_time = end_time - start_time
    print(result)
    print(total_time)
Output:
55
0.3379081
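In the above code, the original array is split into chunks that are distributed across a pool of 4 worker processes. The worker function takes an array as input and returns the sum of its elements. The multi_process_sum function creates the pool and maps the worker function over the chunks. The partial results are collected and summed to get the final result, which is 55, just like the single-process implementation. Note the timings, however: the multi-process run took about 0.34 seconds, while the single-process run took about 0.00005 seconds. For an array this small, the cost of starting worker processes and passing data between them far outweighs the computation itself, so multiprocessing only pays off when each chunk carries a substantial amount of work.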
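How many chunks the pool receives depends on how the array is split. As a small sketch, the following compares fixed-step slicing (chunk size equal to the array length divided by the number of processes, rounded down) with np.array_split for 10 elements and 4 processes. The helper names slice_chunks and even_chunks are hypothetical.

```python
import numpy as np


def slice_chunks(arr, num_processes):
    """Fixed-step slicing with chunk size len(arr) // num_processes."""
    # max(1, ...) guards against a step of 0 for very short arrays
    chunk_size = max(1, arr.shape[0] // num_processes)
    return [arr[i:i + chunk_size]
            for i in range(0, arr.shape[0], chunk_size)]


def even_chunks(arr, num_processes):
    """np.array_split always returns exactly num_processes chunks."""
    return np.array_split(arr, num_processes)


if __name__ == "__main__":
    arr = np.arange(1, 11)
    print([len(c) for c in slice_chunks(arr, 4)])  # [2, 2, 2, 2, 2]
    print([len(c) for c in even_chunks(arr, 4)])   # [3, 3, 2, 2]
```

Fixed-step slicing produces five chunks here because 10 is not divisible by 4, so one worker has to process two chunks, whereas np.array_split gives each of the four workers exactly one chunk.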
In conclusion, multiprocessing with NumPy arrays can speed up computationally intensive operations, provided that the work done per chunk is large enough to outweigh the overhead of creating processes and transferring data. Used in that setting, it helps to make the most of a computer's CPU cores.