
WORKSHOP: Parallel Computing With MATLAB (Part I)

Jonathan Murray
Application Engineer
December 2023

© 2023 The MathWorks, Inc.


1
Chatting

▪ Send to at least the Host, Presenter & Panelists
▪ Ideally, send to All Attendees

2
Agenda

▪ Part I – Parallel Computing with MATLAB on the Desktop


– Parallel Computing Toolbox
– MATLAB Online
▪ Part II – Scaling MATLAB to an HPC cluster
– MATLAB Parallel Server

3
Why use parallel computing with MATLAB?

Save time by solving computationally and data-intensive problems in parallel (simultaneously):
▪ distribute your tasks to be executed in parallel
▪ distribute your data to solve big data problems
on your compute cores and GPUs, or scale up to clusters and cloud computing, with minimal code changes so you can focus on your research.

[Diagram: CPU with 4 cores (Core 1–Core 4)]
6


Before going parallel, optimize your code for the best performance

7
Before going parallel, optimize your code for the best performance

▪ Use the Profiler to find the code that runs slowest and determine possible performance improvements
▪ Use vectorization (matrix and vector operations) instead of for-loops
8
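The vectorization advice above can be sketched as follows; this is an illustrative example, not taken from the slides:

```matlab
% Compute square roots of a large array two ways.
n = 1e6;
x = 1:n;

% Loop version
y1 = zeros(1, n);          % preallocate
for k = 1:n
    y1(k) = sqrt(x(k));
end

% Vectorized version: one call operating on the whole array
y2 = sqrt(x);

isequal(y1, y2)            % both produce the same result
```

The vectorized form is usually much faster because the elementwise work runs in optimized, multithreaded built-in code rather than the MATLAB interpreter.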
Before going parallel, optimize your code for the best performance

▪ Use the Code Analyzer to automatically check your code for coding (and performance) problems
▪ Preallocate the maximum amount of space required for the array instead of letting MATLAB repeatedly reallocate memory for the growing array

Elapsed time is 0.075824 seconds.   (growing array)
Elapsed time is 0.013109 seconds.   (preallocated array)


9
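The preallocation comparison above can be sketched like this; the exact timings will differ per machine:

```matlab
n = 1e5;

% Growing array: MATLAB may reallocate memory on every iteration
tic
a = [];
for k = 1:n
    a(k) = k^2; %#ok<SAGROW>  % Code Analyzer flags this growing array
end
toc

% Preallocated array: memory is reserved once up front
tic
b = zeros(1, n);
for k = 1:n
    b(k) = k^2;
end
toc
```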
Before going parallel, optimize your code for the best performance

▪ Replace code with MEX functions

Techniques for accelerating MATLAB algorithms and applications 10


Before going parallel, optimize your code for the best performance
with efficient programming practices
▪ Pre-allocate memory instead of letting arrays be resized dynamically
▪ Vectorize – use matrix and vector operations instead of for-loops
▪ Use functions instead of scripts; functions are generally faster
▪ Create a new variable rather than assigning data of a different type to an existing variable
▪ Place independent operations outside loops to avoid redundant computations
▪ Avoid printing too much data on the screen; reuse existing graphics handles
Techniques to improve performance 11
MATLAB has built-in multithreading

[Diagram: MATLAB implicitly using all cores of a multi-core CPU]

MATLAB multicore 12
Run parallel code by utilizing multiple CPU cores

A parallel pool connects the MATLAB client (your MATLAB session, with Parallel Computing Toolbox) to MATLAB workers (MATLAB computational engines that run in the background without a graphical desktop).

13
Download Instructions

▪ https://tinyurl.com/ParallelComputingWorkshop
– Click on Add to my Files
– Click Copy Folder

– https://www.mathworks.com/licensecenter/classroom/4265400/
– Click Access MATLAB Online (you may be prompted to sign in again)
– Click Open MATLAB Online
– In Current Folder, double click on ParallelComputingWorkshop-2.0

14
Setup: Step 1 – Copy materials via MATLAB Drive
Click Add to my Files and then click Copy Folder.
For use on your MATLAB Desktop, click Download Shared Folder instead.

https://tinyurl.com/ParallelComputingWorkshop 15
Setup: Step 2 – Launch MATLAB Online

https://www.mathworks.com/licensecenter/classroom/4265400/ 16
Hands-On Exercise: Starting a parallel pool

parpool_intro 17
Scaling MATLAB applications and Simulink simulations

Ease of Use → Greater Control
▪ Automatic parallel support in toolboxes
▪ Common programming constructs
▪ Advanced programming constructs

18
Automatic parallel support in toolboxes

19
Scaling MATLAB applications and Simulink simulations

Ease of Use → Greater Control
▪ Automatic parallel support in toolboxes
▪ Common programming constructs (parfor, parfeval, parsim, …)
▪ Advanced programming constructs

20
Explicit parallelism using parfor (parallel for-loop)
▪ Run iterations in parallel
▪ Examples: parameter sweeps, Monte Carlo simulations

[Diagram: loop iterations distributed from the MATLAB client across workers, reducing total time]
21
Explicit parallelism using parfor
% Serial for-loop (runs on the MATLAB client)
a = zeros(5, 1);
b = pi;
for i = 1:5
    a(i) = i + b;
end
disp(a)

% Parallel parfor-loop (iterations run on the workers)
a = zeros(5, 1);
b = pi;
parfor i = 1:5
    a(i) = i + b;
end
disp(a)
22
Hands-On Exercise: Writing our first parfor

parfor_getting_started 23
DataQueue: Execute code as parfor iterations complete
▪ Send data or messages from parallel workers back to the MATLAB client
▪ Retrieve intermediate values and track computation progress

function a = parforWaitbar

D = parallel.pool.DataQueue;
h = waitbar(0, 'Please wait ...');
afterEach(D, @nUpdateWaitbar)

N = 200;
p = 1;

parfor i = 1:N
    a(i) = max(abs(eig(rand(400))));
    send(D, i)
end

    function nUpdateWaitbar(~)
        waitbar(p/N, h)
        p = p + 1;
    end
end

24
Hands-On Exercise: Sending data with a dataqueue

dataqueue_getting_started 25
Execute functions in parallel asynchronously using parfeval

▪ Asynchronous execution on parallel workers
▪ Useful for “needle in a haystack” problems
▪ Retrieve each output with fetchNext as it completes

for idx = 1:10
    f(idx) = parfeval(@magic, 1, idx);
end

for idx = 1:10
    [completedIdx, value] = fetchNext(f);
    magicResults{completedIdx} = value;
end
26
Run code in parallel

Synchronously with parfor:
▪ You wait for your loop to complete to obtain your results
▪ Your MATLAB client is blocked from running any new computations
▪ You cannot break out of the loop early

Asynchronously with parfeval*:
▪ You can obtain intermediate results
▪ Your MATLAB client is free to run other computations
▪ You can break out of the loop early

* Runs functions on parallel workers 27


Hands-On Exercise: Use parfeval to run functions in the
background

parfeval_plotter 28
Use Code Analyzer to fix problems when converting for-loops to
parfor-loops
parfor-loop iterations have no guaranteed order, and one loop iteration cannot depend on a
previous iteration; therefore, you may need to rewrite your code to use parfor

29
Common problems when rewriting a for-loop as a parfor-loop

Noninteger loop variables

Nested parallel loops

Dependent loop body

30
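A dependent loop body is the most common conversion problem. This illustrative sketch (not from the slides) shows a loop-carried dependency that parfor rejects, and a rewrite that removes it:

```matlab
% NOT valid as parfor: each iteration reads the previous one
a = zeros(1, 10);
a(1) = 1;
for k = 2:10
    a(k) = 2*a(k-1);     % depends on iteration k-1
end

% Valid parfor: each iteration is computed independently
b = zeros(1, 10);
parfor k = 1:10
    b(k) = 2^(k-1);      % closed form, no dependency between iterations
end

isequal(a, b)
```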
Hands-On Exercise: Rewrite for-loops into parfor-loops

parfor_conversions 31
Hands-On Exercise: Refactoring for-loops

parfor_refactoring 32
Optimizing parfor-loops

▪ loop variable (loop index) – must be consecutive, increasing integers
▪ sliced input variable – each iteration works on a different slice of the array → reduced communication overhead from client to workers ☺
▪ sliced output variable – reduced communication overhead from workers to client ☺
▪ broadcast variable – value required but not set inside the loop; sent to all workers; large variables increase communication overhead 
▪ reduction variable – accumulates across all iterations in the same manner; usable after the loop
▪ temporary variable – set inside the parfor body, cleared at the start of each iteration, not sent back to the client; a temporary defined outside the parfor body is thus still 0 (unchanged) at the end of the parfor
Troubleshooting variables in parfor-loops 33
Consider parallel overhead* in deciding when to use parfor

parfor can be useful ☺
▪ for-loops with loop iterations that take long to execute
▪ for-loops with many loop iterations that take a short time, e.g., parameter sweeps

parfor might not be useful 
▪ for-loops with loop iterations that take a short time to execute

Ways to improve parfor performance:
▪ Experiment with which is faster: creating arrays before the loop or having each worker create its own arrays inside the loop (saves transfer time, especially on a remote cluster)
▪ Use sliced variables or temporary variables
▪ Check mathworks.com/help/parallel-computing/improve-parfor-performance.html

* Parallel overhead: time required for communication, coordination, and data transfer from client to workers and back 34
Run multiple simulations in parallel with parsim

[Diagram: simulations distributed across workers, reducing total time]

▪ Run independent Simulink simulations in parallel using the parsim function
35
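A minimal parsim sketch, assuming a Simulink model named 'myModel' with a tunable variable 'K' (both names are hypothetical placeholders):

```matlab
% Build one SimulationInput per simulation, varying the parameter K.
numSims = 10;
in(1:numSims) = Simulink.SimulationInput('myModel');
for k = 1:numSims
    in(k) = setVariable(in(k), 'K', k);   % one parameter value per simulation
end

% Run the simulations on the workers of the current parallel pool.
out = parsim(in, 'ShowProgress', 'on');
```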
Hands-On Exercise: Parameters sweeps with Simulink

parallel_Simulink 36
Scaling MATLAB applications and Simulink simulations

Automatic parallel support in toolboxes

Greater Control
Ease of Use

Common programming constructs

Advanced programming constructs


(spmd,etc.)

37
Leverage NVIDIA GPUs without learning CUDA

[Diagram: MATLAB client or worker offloading computation to GPU cores and device memory]
38
Leverage your GPU to accelerate your MATLAB code

▪ Ideal Problems
– massively parallel and/or
vectorized operations
– computationally intensive

▪ 1000+ GPU-supported
functions

▪ Use gpuArray and gather to transfer data between CPU and GPU

MATLAB GPU Computing 39
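The gpuArray/gather workflow above can be sketched as follows (requires a supported NVIDIA GPU):

```matlab
A = rand(4000);          % data in CPU memory

G = gpuArray(A);         % transfer the data to GPU device memory
F = fft(G);              % fft is GPU-supported: it runs on the GPU
                         % because its input is a gpuArray

result = gather(F);      % transfer the result back to CPU memory
```

Keeping intermediate results on the GPU (as gpuArrays) and gathering only the final answer minimizes costly CPU–GPU transfers.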


Hands-On Exercise: Offload computations to your GPU

gpus_getting_started 40
Parallel computing on your desktop, clusters, and clouds

[Diagram: Parallel Computing Toolbox on a multi-core desktop with GPUs, scaling to MATLAB Parallel Server on clusters and clouds]

▪ Prototype on the desktop
▪ Integrate with infrastructure
▪ Access directly through MATLAB
41
Scale to clusters and clouds

With MATLAB Parallel Server, you can…

▪ Change hardware with minimal code change

▪ Submit to on-premise or cloud clusters

▪ Support cross-platform submission
  – Windows client to Linux cluster

42
Interactive parallel computing
Leverage cluster resources in MATLAB

>> parpool('cluster', 3);


>> myscript

MATLAB
Parallel Computing Toolbox

myscript.m
a = zeros(5, 1);
b = pi;
parfor i = 1:5
a(i) = i + b;
end
43
Run a parallel pool from specified profile

On your local machine:
▪ Start a parallel pool of local (process) workers
▪ Start a parallel pool of thread workers
  ☺ Reduced memory usage, faster scheduling, lower data transfer costs
   Thread-based environments support only a subset of the functions available for process workers

On a cluster:
▪ Start a parallel pool using a cluster object

mathworks.com/help/parallel-computing/choose-between-thread-based-and-process-based-environments.html 44
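The three pool types above can be sketched as follows; the cluster profile name 'myHPCProfile' is a hypothetical placeholder:

```matlab
% Pool of 4 local process workers
parpool("Processes", 4);
delete(gcp("nocreate"));     % shut down the current pool, if any

% Pool of thread workers: lower overhead, but a reduced
% set of supported functions
parpool("Threads");
delete(gcp("nocreate"));

% On a cluster: build a cluster object from a profile, then open a pool
c = parcluster("myHPCProfile");
parpool(c, 64);
```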
batch simplifies offloading computations
Submit MATLAB jobs to the cluster

>> job = batch('myscript','Pool',3);


[Diagram: batch sends the script to the cluster, where a worker runs it with a parfor pool]

45
Hands-On Exercise: Use batch to offload serial and parallel
computations

batch_getting_started 46
batch simplifies offloading simulations
Submit Simulink jobs to the cluster

job = batchsim(in,'Pool',3);

[Diagram: batchsim sends the simulations to the cluster, where a worker runs them with a parsim pool]

47
Big Data Workflows
ACCESS DATA – more data and collections of files than fit in memory
DEVELOP & PROTOTYPE ON THE DESKTOP – adapt traditional processing tools or learn new tools to work with Big Data
SCALE PROBLEM SIZE – to traditional clusters and Big Data systems like Hadoop
48
tall arrays

▪ Data type designed for data that doesn’t fit into memory
▪ Lots of observations (hence “tall”)
▪ Looks like a normal MATLAB array
– Supports numeric types, tables, datetimes, strings, etc.
– Supports several hundred functions for basic math, stats, indexing, etc.
– Statistics and Machine Learning Toolbox support
(clustering, classification, etc.)

Working with tall arrays 49
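A minimal tall-array sketch, assuming a folder of CSV files containing a numeric column named 'Duration' (the folder and column name are hypothetical):

```matlab
% A datastore references files collectively too big for memory.
ds = datastore('data/*.csv');

% A tall table backed by the datastore; operations are deferred.
t = tall(ds);

avgDuration = mean(t.Duration);    % builds up the computation lazily
avgDuration = gather(avgDuration); % triggers evaluation, chunk by chunk
```

gather is the point where MATLAB actually streams through the data, so chaining several deferred operations before a single gather avoids repeated passes over the files.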


Hands-On Exercise: Use tall arrays for Big Data

tall_getting_started 50
distributed arrays

▪ Distribute large matrices across workers running on a cluster


▪ Support includes matrix manipulation, linear algebra, and signal processing
▪ Several hundred MATLAB functions overloaded for distributed arrays

[Diagram: a matrix partitioned across workers running under MATLAB Parallel Server]
Working with distributed arrays 51
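A minimal distributed-array sketch; run it with an open parallel pool (on a cluster, the data is spread across the workers' combined memory):

```matlab
% Partition a large matrix and a vector across the pool's workers.
A = distributed(rand(5000));
b = distributed.ones(5000, 1);

% Overloaded linear algebra runs in parallel across the workers.
x = A \ b;

% Bring the (small) result back to the client.
xLocal = gather(x);
```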


Hands-On Exercise: Use distributed arrays for Big Data

distributed_getting_started 52
tall arrays vs. distributed arrays

▪ tall arrays are useful for out-of-memory datasets with a “tall” shape
– Can be used on a desktop, cluster, or with Spark/Hadoop
– Low-level alternatives are MapReduce and MATLAB API for Spark
▪ distributed arrays are useful for in-memory datasets on a cluster
– Can be any shape (“tall”, “wide”, or both)
– Low-level alternative is SPMD + gop (Global operation across all workers)

53
Further Resources

▪ MATLAB Documentation
– MATLAB → Software Development Tools → Performance and Memory
– Parallel Computing Toolbox

▪ Parallel and GPU Computing Tutorials


– https://www.mathworks.com/videos/series/parallel-and-gpu-computing-tutorials-97719.html

▪ Parallel Computing with MATLAB and Simulink


– https://www.mathworks.com/solutions/parallel-computing.html

54
MATLAB and Simulink are registered trademarks of The MathWorks, Inc. See www.mathworks.com/trademarks for a list of additional trademarks. Other
product or brand names may be trademarks or registered trademarks of their respective holders. © 2023 The MathWorks, Inc.
55
