
Lecture 6

Parallel Algorithms

Institute of Computer Science & Information Technology,
Faculty of Management & Computer Sciences,
The University of Agriculture, Peshawar, Pakistan.
Basic Terminologies (Concurrency vs. Parallelism)
 Concurrency: making progress on more than one task, seemingly at the same time.
 Parallel Execution: making progress on more than one task at the same time.
 Parallel Concurrent Execution: making progress on more than one task, seemingly at the same time, on more than one CPU.
 Parallelism: splitting a single task into subtasks that can be processed in parallel.
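To make the distinction concrete, here is a minimal Python sketch (illustrative, not from the slides): threads give concurrent execution, while separate processes give truly parallel execution on multiple CPUs.

# Concurrency vs. parallelism, an illustrative sketch.
# threading interleaves tasks, but CPython's GIL keeps CPU-bound threads on
# one core at a time; multiprocessing runs workers on separate CPUs.
import threading
import multiprocessing

def count_down(n):
    while n > 0:
        n -= 1

if __name__ == "__main__":
    # Concurrent execution: two threads make progress seemingly at the same time.
    threads = [threading.Thread(target=count_down, args=(5_000_000,)) for _ in range(2)]
    for t in threads: t.start()
    for t in threads: t.join()

    # Parallel execution: two processes make progress at the same time.
    procs = [multiprocessing.Process(target=count_down, args=(5_000_000,)) for _ in range(2)]
    for p in procs: p.start()
    for p in procs: p.join()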

Parallel Algorithm
 Algorithm development is a critical component of problem solving using computers.
 A sequential algorithm is a sequence of basic steps for solving a given problem using a serial computer.
 A parallel algorithm has the added dimension of concurrency: the algorithm designer must specify sets of steps that can be executed simultaneously.
 A parallel algorithm is thus an algorithm that can perform multiple operations in a given unit of time.
 Algorithms vary significantly in how parallelizable they are, ranging from easily parallelizable to completely unparallelizable.

Parallel Algorithms (1)
 Some problems are easy to divide into pieces that can be solved in parallel.
 Some problems cannot be split into parallel portions, because each step requires the result of the preceding step before the next can proceed; these are called inherently serial problems.
 Parallel algorithms on personal computers have become more common since the early 2000s because of extensive improvements in multiprocessing systems and the rise of multi-core processors.
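The contrast can be seen in two small loops (an illustrative sketch, not from the slides): the first is inherently serial, the second is easy to parallelize.

# Inherently serial: each iteration needs the result of the previous one,
# so the iterations cannot run in parallel.
x = [1.0] * 10
for i in range(1, len(x)):
    x[i] = 0.5 * x[i - 1] + 1.0

# Easily parallelizable: every element is independent, so the work could be
# split across cores (e.g., with multiprocessing.Pool.map).
y = [2.0 * v + 1.0 for v in x]
print(y)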
Methodical Design (PCAM)
 Partitioning: The computation that is to be performed and the data operated on by this computation are decomposed into small tasks.
 Communication: The communication required to coordinate task execution is determined, and appropriate communication structures and algorithms are defined.
 Agglomeration: The task and communication structures defined in the first two stages of a design are evaluated with respect to performance requirements and implementation costs.
 Mapping: Each task is assigned to a processor in a manner that attempts to satisfy the competing goals of maximizing processor utilization and minimizing communication costs.
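As an illustration of the four stages (a sketch, with chunk sizes and worker counts chosen arbitrarily), here is PCAM applied to summing a large array in Python:

# PCAM applied to array summation: an illustrative sketch.
from multiprocessing import Pool

def partial_sum(chunk):
    return sum(chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))

    # Partitioning: decompose the computation into many small tasks,
    # one partial sum per small chunk of the data.
    small_chunks = [data[i:i + 1_000] for i in range(0, len(data), 1_000)]

    # Communication: each task must send its partial sum to a combining step.
    # Agglomeration: 1000 tiny tasks cost more in coordination than they save,
    # so we agglomerate them into a few larger tasks.
    big_chunks = [data[i:i + 250_000] for i in range(0, len(data), 250_000)]

    # Mapping: assign the agglomerated tasks to processors (a pool of 4 here).
    with Pool(processes=4) as pool:
        print(sum(pool.map(partial_sum, big_chunks)))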
Assignment + Presentation
 Please visit the following link:
https://www.mcs.anl.gov/~itf/dbpp/text/node14.html#SECTION02300000000000000000
 Prepare a half-hour presentation on the design of parallel algorithms using the PCAM model. You can either explain the model or select one of the case studies for presentation.

Some background

Structure (Data + Network)
 To apply any algorithm properly, it is very important to select a proper data structure.
 A particular operation performed on one data structure may take more time than the same operation performed on another data structure.
 Therefore, a data structure must be selected considering the architecture and the type of operations to be performed.
 The following data structures are commonly used in parallel programming:
1. Linked List
2. Arrays
3. Hypercube Network

Linked List
 A linked list is a data structure having zero or more nodes connected by pointers.
 Nodes may or may not occupy consecutive memory locations.
 Each node has two or three parts: a data part that stores the data, and one or two link fields that store the address of the next (and, in a doubly linked list, the previous) node.
 The first node's address is stored in an external pointer called the head. The last node, known as the tail, generally does not contain any address. There are three types of linked lists:
1. Singly Linked List
2. Doubly Linked List
3. Circular Linked List
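A minimal sketch (illustrative, not from the slides) of a singly linked list node and a head-to-tail traversal:

# A minimal singly linked list: each node holds data plus a reference to the
# next node; the external pointer `head` locates the first node.
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None   # the tail's link field stays None (no address)

head = Node(1)
head.next = Node(2)
head.next.next = Node(3)

node = head
while node is not None:   # traverse from head to tail
    print(node.data)
    node = node.next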

Linked Lists (1)
[Figure: Circular Linked List]
Arrays
 An array is a data structure where we can store similar types of data.
 It can be one-dimensional or multi-dimensional.
 Arrays can be created:
1. Statically
2. Dynamically
Arrays (1)
 In statically declared arrays, the dimension and size of the array are known at compile time.
 In dynamically declared arrays, the dimension and size of the array are known only at runtime.
 For shared-memory programming, arrays can serve as a common memory; for data-parallel programming, they can be used by partitioning them into sub-arrays.
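As a small illustration of the data-parallel use (a sketch; the helper name is ours), an array can be partitioned into sub-arrays, one per worker:

# Partition an array into roughly equal sub-arrays, one per worker process.
def partition(arr, n_workers):
    size = (len(arr) + n_workers - 1) // n_workers   # ceiling division
    return [arr[i:i + size] for i in range(0, len(arr), size)]

data = list(range(10))
print(partition(data, 3))   # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]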
Hypercube Network
 The hypercube architecture is helpful for parallel algorithms in which each task has to communicate with other tasks.
 It is also known as an n-cube, where n is the number of dimensions.
 The number of vertices (nodes) in a hypercube is equal to 2^n, where n is the number of dimensions or binary digits.
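A useful property worth noting (a sketch; not stated on the slide): if the 2^n nodes are labeled with n-bit binary IDs, two nodes are connected exactly when their labels differ in a single bit, so a node's neighbors are found by flipping each bit in turn.

# Neighbors of a node in an n-dimensional hypercube: flip each of the n bits.
def hypercube_neighbors(node_id, n):
    return [node_id ^ (1 << bit) for bit in range(n)]

print(hypercube_neighbors(0b101, 3))   # [4, 7, 1], i.e. nodes 100, 111, 001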

Hypercube Network (1)
[Figure: hypercube networks]
Multiprocessor Models
a) A local memory machine model consists of a set of n processors, each with its own local memory, all attached to a common communication network.
b) A modular memory machine model consists of m memory modules and n processors, all attached to a common network.
c) A Parallel Random Access Machine (PRAM) model consists of a set of n processors all connected to a common shared memory.

Principles of Parallel Algorithm Design
 Specifying a parallel algorithm may include some or all of the following:
◦ Identifying portions of the work that can be performed concurrently.
◦ Mapping the concurrent pieces of work onto multiple processes running in parallel.
◦ Distributing the input, output, and intermediate data associated with the program.
◦ Managing accesses to data shared by multiple processors.
◦ Synchronizing the processors at various stages of the parallel program execution.

Preliminaries: Decomposition, Tasks, and Dependency Graphs
 Decomposition:
◦ The process of dividing a computation into smaller parts, some or all of which may potentially be executed in parallel.
 Tasks:
◦ The programmer-defined units of computation into which the main computation is subdivided by means of decomposition are called tasks.
◦ Simultaneous execution of multiple tasks is the key to reducing the time required to solve the entire problem.
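A classic illustration (a sketch in the spirit of the usual matrix-vector example; the code is ours): computing y = A·b can be decomposed into one task per row of A, and all of these tasks are independent.

# Decomposition sketch: y = A * b as n independent tasks, one per output row.
A = [[1, 2], [3, 4], [5, 6]]
b = [10, 20]

def task(i):
    # Task i computes one element of y; no task depends on another task.
    return sum(A[i][j] * b[j] for j in range(len(b)))

y = [task(i) for i in range(len(A))]   # each call could run on its own process
print(y)   # [50, 110, 170]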

Task Dependency Graph
 Some tasks may use data produced by other tasks and thus may need to wait for these tasks to finish execution.
 An abstraction used to express such dependencies among tasks and their relative order of execution is known as a task-dependency graph.
 It is a directed acyclic graph in which the nodes are tasks and the directed edges indicate the dependencies between them.
 The task corresponding to a node can be executed only when all the predecessor (parent) tasks have completed their execution.
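A hedged sketch of this rule (the graph below is illustrative): storing the task-dependency graph as predecessor lists, a task becomes ready only once all of its parent tasks have finished, and all ready tasks may execute simultaneously.

# Execute a task-dependency graph level by level, assuming unit-time tasks
# and as many processors as needed. deps maps each task to its parents.
deps = {"A": [], "B": [], "C": ["A", "B"], "D": ["C"]}

done = set()
step = 0
while len(done) < len(deps):
    # A task is ready when all of its predecessor (parent) tasks are done.
    ready = [t for t in deps if t not in done and all(p in done for p in deps[t])]
    step += 1
    print("step", step, ":", ready)   # step 1: A, B; step 2: C; step 3: D
    done.update(ready)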

Degree of Concurrency
 The number of tasks that can be executed in parallel is the degree of concurrency of a decomposition.
 Maximum Degree of Concurrency
◦ The number of tasks that can be executed in parallel may change over the course of program execution.
◦ The maximum number of tasks that can be executed concurrently in a parallel program at any given time is known as its maximum degree of concurrency.
◦ Usually, it is less than the total number of tasks due to dependencies.
 Rule of thumb: for task-dependency graphs that are trees, the maximum degree of concurrency is always equal to the number of leaves in the tree.
 The degree of concurrency increases as the decomposition becomes finer in granularity, and vice versa.

Average Degree of Concurrency
 A relatively better measure of the performance of a parallel program.
 The average number of tasks that can run concurrently over the entire duration of execution of the program.
 The ratio of the total amount of work to the critical-path length.
 So, what is the critical path in the graph?

Critical Path Length
 Critical Path: the longest directed path between any pair of start and finish nodes is known as the critical path.
 Critical Path Length: the sum of the weights of the nodes along this path.
◦ The weight of a node is the size, or the amount of work, associated with the corresponding task.
 A shorter critical path favors a higher average degree of concurrency.
 Both the maximum and the average degree of concurrency increase as tasks become smaller (finer).
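Tying the two definitions together (a sketch with an illustrative graph and weights): the critical-path length is a longest path in the weighted DAG, and the average degree of concurrency is the total work divided by that length.

# Critical-path length of a weighted task DAG, via a longest-path recursion,
# and the average degree of concurrency = total work / critical-path length.
deps = {"A": [], "B": [], "C": ["A", "B"], "D": ["C"]}   # parent lists
weight = {"A": 10, "B": 10, "C": 5, "D": 5}              # work per task

memo = {}   # memo[t] = length of the longest path ending at task t
def longest_to(t):
    if t not in memo:
        memo[t] = weight[t] + max((longest_to(p) for p in deps[t]), default=0)
    return memo[t]

critical_path_length = max(longest_to(t) for t in deps)   # 10 + 5 + 5 = 20
total_work = sum(weight.values())                         # 30
print(total_work / critical_path_length)                  # 1.5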

What is the average degree of concurrency for each of the two task-dependency graphs?

Cont.
 The nodes in a task-interaction graph represent tasks.
 The edges connect tasks that interact with each other.
 The edges in a task-interaction graph are usually undirected,
◦ but directed edges can be used to indicate the direction of flow of data, if it is unidirectional.
 The edge-set of a task-interaction graph is usually a superset of the edge-set of the task-dependency graph.

Processes and Mapping
 A logical processing or computing agent that performs tasks is called a process.
 The process of assigning tasks to logical computing agents (i.e., processes) is called mapping.
 Multiple tasks can be mapped onto a single process.
◦ However, independent tasks should be mapped onto different processes.
 Map tasks with high mutual interaction onto a single process.
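A hedged sketch of both guidelines (the tasks, interaction pairs, and process count are illustrative): co-locate heavily interacting tasks on one process and spread the independent tasks across the rest.

# Greedy mapping sketch: heavily interacting pairs share a process,
# remaining independent tasks are spread round-robin.
tasks = ["T0", "T1", "T2", "T3", "T4", "T5"]
heavy_pairs = [("T0", "T1"), ("T2", "T3")]   # pairs with high mutual interaction
n_procs = 3

mapping = {}
proc = 0
for a, b in heavy_pairs:              # map each interacting pair together
    mapping[a] = mapping[b] = proc
    proc = (proc + 1) % n_procs
for t in tasks:                       # independent tasks onto different processes
    if t not in mapping:
        mapping[t] = proc
        proc = (proc + 1) % n_procs
print(mapping)   # {'T0': 0, 'T1': 0, 'T2': 1, 'T3': 1, 'T4': 2, 'T5': 0}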
Example
[Figure: mapping example]
Processes and Processors
 Processes are logical computing agents that perform tasks.
 Processors are the hardware units that physically perform computations.
 Depending on the problem, multiple processes can be mapped onto a single processor.
 But in most cases there is a one-to-one correspondence between processors and processes.
◦ So, we assume that there are as many processes as the number of physical CPUs on the parallel computer.
