
An Introduction to Parallel Programming

Peter Pacheco

Chapter 1
Why Parallel Computing?

Copyright © 2010, Elsevier Inc. All rights Reserved


Roadmap
 Why we need ever-increasing performance.
 Why we’re building parallel systems.
 Why we need to write parallel programs.
 How do we write parallel programs?
 What we’ll be doing.
 Concurrent, parallel, distributed!



An intelligent solution
 Instead of designing and building faster
microprocessors, put multiple processors
on a single integrated circuit.



Now it’s up to the programmers
 Adding more processors doesn’t help
much if programmers aren’t aware of
them…
 … or don’t know how to use them.

 Serial programs don’t benefit from this approach (in most cases).



Why we need ever-increasing
performance
 Computational power is increasing, but so
are our computation problems and needs.
 Problems we never dreamed of have been
solved because of past increases, such as
decoding the human genome.
 More complex problems are still waiting to
be solved.



Climate modeling



Drug discovery



Energy research



Data analysis



Solution
 Move away from single-core systems to
multicore processors.
 “core” = central processing unit (CPU)

 Introducing parallelism!!!



Why we need to write parallel
programs
 Running multiple instances of a serial
program often isn’t very useful.
 Think of running multiple instances of your
favorite game.

 What you really want is for it to run faster.



Approaches to the serial problem
 Rewrite serial programs so that they’re
parallel.

 Write translation programs that automatically convert serial programs into parallel programs.
 This is very difficult to do.
 Success has been limited.



More problems
 Some coding constructs can be
recognized by an automatic program
generator, and converted to a parallel
construct.
 However, it’s likely that the result will be a
very inefficient program.



Example
 Compute n values and add them together.
 Serial solution:
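
In outline, the serial solution is the following C loop, where Compute_next_value stands for whatever routine produces each value:

   sum = 0;
   for (i = 0; i < n; i++) {
      x = Compute_next_value(. . .);   /* produce the next value         */
      sum += x;                        /* add it into the running total  */
   }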



Example (cont.)
 We have p cores, p much smaller than n.
 Each core uses its own private variables and executes this block of code independently of the other cores.
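
The block of code each core executes is, in outline, the following; my_first_i and my_last_i delimit this core's share of the n values, and the elided arguments are left unspecified:

   my_sum = 0;
   my_first_i = . . . ;                 /* first value assigned to this core     */
   my_last_i  = . . . ;                 /* one past the last value for this core */
   for (my_i = my_first_i; my_i < my_last_i; my_i++) {
      my_x = Compute_next_value(. . .);
      my_sum += my_x;                   /* this core's private partial sum       */
   }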



Example (cont.)
 After each core completes execution of the code, its private variable my_sum contains the sum of the values computed by its calls to Compute_next_value.

 Ex., 8 cores, n = 24; then the calls to Compute_next_value return:
1,4,3, 9,2,8, 5,1,1, 6,2,7, 2,5,0, 4,1,8, 6,5,1, 2,3,9



Example (cont.)
 Once all the cores are done computing their private my_sum, they form a global sum by sending their results to a designated “master” core, which adds them to produce the final result.



Example (cont.)
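
In outline (pseudocode), the master-core scheme just described looks like this; the send and receive operations stand for whatever communication mechanism the parallel system provides:

   if (I'm the master core) {
      sum = my_sum;
      for each core other than myself {
         receive value from core;
         sum += value;
      }
   } else {
      send my_sum to the master;
   }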



Example (cont.)
Core      0    1    2    3    4    5    6    7
my_sum    8   19    7   15    7   13   12   14

Global sum: 8 + 19 + 7 + 15 + 7 + 13 + 12 + 14 = 95

After the master forms the global sum:
Core      0    1    2    3    4    5    6    7
my_sum   95   19    7   15    7   13   12   14



But wait!
There’s a much better way
to compute the global sum.



Better parallel algorithm
 Don’t make the master core do all the
work.
 Share it among the other cores.
 Pair the cores so that core 0 adds its result
with core 1’s result.
 Core 2 adds its result with core 3’s result,
etc.
 Work with odd and even numbered pairs of
cores.



Better parallel algorithm (cont.)
 Repeat the process now with only the
evenly ranked cores.
 Core 0 adds the result from core 2.
 Core 4 adds the result from core 6, etc.

 Now the cores divisible by 4 repeat the process, and so forth, until core 0 has the final result.
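
A minimal sketch (not from the book) of this tree-structured sum, assuming the number of cores p is a power of two; my_rank identifies the core, and send/receive again stand for the system's communication operations:

   divisor = 2;             /* round 1 pairs cores 0-1, 2-3, ...             */
   core_difference = 1;     /* distance to this round's partner              */
   while (divisor <= p) {
      if (my_rank % divisor == 0) {
         /* this core stays active: receive the partner's sum and add it in  */
         receive value from core (my_rank + core_difference);
         my_sum += value;
      } else {
         /* this core is finished: send its sum to its partner and stop      */
         send my_sum to core (my_rank - core_difference);
         break;
      }
      divisor *= 2;
      core_difference *= 2;
   }
   /* when the loop ends, core 0's my_sum holds the global sum               */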



Multiple cores forming a global
sum



Analysis
 In the first example, the master core
performs 7 receives and 7 additions.

 In the second example, the master core performs 3 receives and 3 additions.

 The improvement is more than a factor of 2!



Analysis (cont.)
 The difference is more dramatic with a
larger number of cores.
 If we have 1000 cores:
 The first example would require the master to
perform 999 receives and 999 additions.
 The second example would only require 10
receives and 10 additions.

 That’s an improvement of almost a factor of 100!
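In general, the tree-structured approach needs about log₂(p) receive-and-add steps on core 0 (rounded up to a whole number), versus p - 1 steps for the master-does-everything approach; for p = 1000 that is 10 steps instead of 999, which is the source of the roughly 100-fold improvement.
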
How do we write parallel
programs?
 Task parallelism
 Partition the various tasks carried out in solving the problem among the cores.

 Data parallelism
 Partition the data used in solving the problem
among the cores.
 Each core carries out similar operations on its
part of the data.
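
In terms of the global-sum example (a sketch, not from the book): the loop below is data-parallel, since every core runs the same code on its own block of the values, while the combining step that follows is task-parallel, since the master core's tasks differ from the other cores' task:

   /* Data parallelism: each core sums its own block of roughly n/p values. */
   my_sum = 0;
   for (my_i = my_first_i; my_i < my_last_i; my_i++)
      my_sum += Compute_next_value(. . .);

   /* Task parallelism: the master's tasks (receive, add) differ from the   */
   /* other cores' single task (send).                                       */
   if (my_rank == 0) {
      sum = my_sum;
      for each core other than core 0 {
         receive value from core;
         sum += value;
      }
   } else {
      send my_sum to core 0;
   }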



Professor P

15 questions
300 exams



Professor P’s grading assistants

TA#1, TA#2, TA#3



Division of work –
data parallelism

TA#1: 100 exams
TA#2: 100 exams
TA#3: 100 exams



Division of work –
task parallelism

TA#1: Questions 1 - 5
TA#2: Questions 6 - 10
TA#3: Questions 11 - 15



Division of work –
data parallelism



Division of work –
task parallelism

Tasks:
1) Receiving
2) Addition



Coordination
 Cores usually need to coordinate their work.
 Communication – one or more cores send
their current partial sums to another core.
 Load balancing – share the work evenly
among the cores so that one is not heavily
loaded.
 Synchronization – because each core works
at its own pace, make sure cores do not get
too far ahead of the rest.
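
A minimal, self-contained C/Pthreads sketch (not from the book) of these ideas on a shared-memory system: the shared partial[] array carries the communication, every thread is given the same amount of (trivial) work for load balance, and the barrier is the synchronization point that keeps thread 0 from combining results too early.

   #include <pthread.h>
   #include <stdio.h>

   #define NUM_THREADS 4

   double partial[NUM_THREADS];    /* shared: one slot per thread (communication) */
   pthread_barrier_t barrier;      /* synchronization point                       */

   void* work(void* rank_p) {
      long my_rank = (long) rank_p;
      partial[my_rank] = my_rank + 1.0;   /* stand-in for this thread's real work */
      pthread_barrier_wait(&barrier);     /* wait until every partial sum exists  */
      if (my_rank == 0) {                 /* thread 0 combines the results        */
         double sum = 0.0;
         for (int t = 0; t < NUM_THREADS; t++) sum += partial[t];
         printf("sum = %f\n", sum);
      }
      return NULL;
   }

   int main(void) {
      pthread_t threads[NUM_THREADS];
      pthread_barrier_init(&barrier, NULL, NUM_THREADS);
      for (long t = 0; t < NUM_THREADS; t++)
         pthread_create(&threads[t], NULL, work, (void*) t);
      for (long t = 0; t < NUM_THREADS; t++)
         pthread_join(threads[t], NULL);
      pthread_barrier_destroy(&barrier);
      return 0;
   }

(Compile with cc -pthread; the values each thread computes are placeholders.)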



What we’ll be doing
 Learning to write programs that are
explicitly parallel.
 Using the C language.
 Using three different extensions to C.
 Message-Passing Interface (MPI)
 POSIX Threads (Pthreads)
 OpenMP



Types of parallel systems
 Shared-memory
 The cores can share access to the computer’s
memory.
 Coordinate the cores by having them examine
and update shared memory locations.
 Distributed-memory
 Each core has its own, private memory.
 The cores must communicate explicitly by
sending messages across a network.
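
For a concrete flavor of the difference (a sketch, not from the book): on a shared-memory system an OpenMP version of the global sum can coordinate the cores through a shared variable, as below, whereas on a distributed-memory system an MPI program combines per-process sums by message passing, typically with a single call such as MPI_Reduce(&my_sum, &sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD).

   #include <stdio.h>

   int main(void) {
      const int n = 1000000;
      double sum = 0.0;

      /* Each thread accumulates a private partial sum; the reduction clause */
      /* adds the partial sums into the shared sum when the loop finishes.   */
      #pragma omp parallel for reduction(+:sum)
      for (int i = 0; i < n; i++)
         sum += 1.0 / (i + 1.0);         /* stand-in for the real computation */

      printf("sum = %f\n", sum);
      return 0;
   }

(Compile with cc -fopenmp; without OpenMP support the pragma is ignored and the loop simply runs serially.)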



Types of parallel systems

Shared-memory Distributed-memory



Terminology
 Concurrent computing – a program is one
in which multiple tasks can be in progress
at any instant.
 Parallel computing – a program is one in
which multiple tasks cooperate closely to
solve a problem.
 Distributed computing – a program may
need to cooperate with other programs to
solve a problem.



Concluding Remarks (1)

 Serial programs typically don’t benefit from multiple cores.
 Automatic parallel program generation
from serial program code isn’t the most
efficient approach to get high performance
from multicore computers.



Concluding Remarks (2)
 Learning to write parallel programs
involves learning how to coordinate the
cores.
 Parallel programs are usually very complex and therefore require sound programming techniques and development.

