
Revision Slide

Dr. Bryan Raj


Department of Computer System & Technology
Faculty of Computer Science & Information Technology
University of Malaya
[email protected]
Pipelining
Pipelining example (1)

Add the floating point numbers 9.87 × 10⁴ and 6.54 × 10³.
Pipelining example (2)

 Assume each operation takes one nanosecond (10⁻⁹ seconds).

 This for loop (sketched below) takes about 7000 nanoseconds.
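
The loop itself is not reproduced on the slide; a minimal C sketch of the kind of loop being timed, assuming an array x of 1000 floats, might look like this:

#include <stdio.h>

/* Minimal sketch (assumed, not from the slides): 1000 floating point
   additions; without pipelining, each addition occupies all 7
   one-nanosecond stages of the adder, so ~7000 ns in total. */
int main(void) {
    float x[1000];
    for (int i = 0; i < 1000; i++) x[i] = 1.0f;

    float sum = 0.0f;
    for (int i = 0; i < 1000; i++)
        sum += x[i];          /* 7 stages x 1 ns = 7 ns per addition */

    printf("sum = %f\n", sum);
    return 0;
}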
Pipelining (3)
 Divide the floating point adder into 7 separate
pieces of hardware or functional units.
 First unit fetches two operands, second unit
compares exponents, etc.
 Output of one functional unit is input to the next.
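
A typical 7-stage split of this kind (stated here for concreteness, since the slide only names the first two units) is: fetch operands, compare exponents, shift one operand, add, normalize the result, round the result, and store the result.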
Pipelining (4)

Table 2.3: Pipelined Addition (table not reproduced here).

Numbers in the table are subscripts of operands/results.
Pipelining (5)
 One floating point addition still takes 7 nanoseconds.

 But 1000 floating point additions now take only 1006 nanoseconds!
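
(The arithmetic, filled in here for clarity: with s = 7 stages and n = 1000 additions, the pipelined time is (n + s − 1) × 1 ns = 1006 ns. The first result emerges after 7 ns; after that, one new result completes every nanosecond.)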
PERFORMANCE

Speedup
 Number of cores = p
 Serial run-time = Tserial
 Parallel run-time = Tparallel

Linear speedup: Tparallel = Tserial / p

Speedup of a parallel program

S = Tserial / Tparallel

Efficiency of a parallel program

E = S / p = (Tserial / Tparallel) / p = Tserial / (p x Tparallel)
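As a small illustration (the values are assumed for the example, not taken from the slides), the two formulas in C:

#include <stdio.h>

/* Minimal sketch: speedup S = Tserial / Tparallel,
   efficiency E = S / p.  Values are illustrative. */
int main(void) {
    double t_serial   = 20.0;   /* seconds                          */
    double t_parallel = 4.25;   /* e.g. 18/8 + 2 from a later slide */
    int    p          = 8;      /* number of cores                  */

    double s = t_serial / t_parallel;   /* speedup    ~ 4.71 */
    double e = s / p;                   /* efficiency ~ 0.59 */
    printf("S = %.2f, E = %.2f\n", s, e);
    return 0;
}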

Speedups and efficiencies of a parallel program (table of values not reproduced here).

Example
 We can parallelize 90% of a serial program.
 Parallelization is “perfect” regardless of the number of cores p we use.
 Tserial = 20 seconds
 Runtime of parallelizable part is 0.9 x Tserial / p = 18 / p

Example (cont.)
 Runtime of “unparallelizable” part is 0.1 x Tserial = 2 seconds.

 Overall parallel run-time is

Tparallel = 0.9 x Tserial / p + 0.1 x Tserial = 18 / p + 2

Example (cont.)
 Speedup:

S = Tserial / (0.9 x Tserial / p + 0.1 x Tserial) = 20 / (18 / p + 2)
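
To make this concrete (arithmetic added here, not on the slide): with p = 8, Tparallel = 18/8 + 2 = 4.25 seconds, so S = 20/4.25 ≈ 4.7 and E ≈ 0.59. Even as p grows without bound, S can never exceed 20/2 = 10, which is what Amdahl's Law predicts for a 90% parallelizable program.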

PARALLEL PROGRAM DESIGN

Foster’s methodology

1. Partitioning: divide the computation to be performed and the data operated on by the computation into small tasks.

The focus here should be on identifying tasks that can be executed in parallel.

Foster’s methodology
2. Communication: determine what communication
needs to be carried out among the tasks
identified in the previous step.

Foster’s methodology
3. Agglomeration or aggregation: combine tasks
and communications identified in the first step
into larger tasks.

For example, if task A must be executed before task B can be executed, it may make sense to aggregate them into a single composite task.

Foster’s methodology
4. Mapping: assign the composite tasks identified in
the previous step to processes/threads.

This should be done so that communication is minimized, and each process/thread gets roughly the same amount of work.
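
As a toy illustration (assumed here, not taken from the slides), the result of applying all four steps to summing an array with pthreads: each addition is a task (partitioning), partial sums must be combined (communication), the additions are agglomerated into p contiguous blocks, and each block is mapped to one thread:

#include <pthread.h>
#include <stdio.h>

#define N 1000000
#define P 4                       /* number of threads/cores */

static double x[N];
static double partial[P];         /* one partial sum per thread */

/* Each thread sums its agglomerated block of N/P elements. */
static void *block_sum(void *arg) {
    long t = (long)arg;
    double s = 0.0;
    for (long i = t * (N / P); i < (t + 1) * (N / P); i++)
        s += x[i];
    partial[t] = s;               /* communicated back via shared memory */
    return NULL;
}

int main(void) {
    for (long i = 0; i < N; i++) x[i] = 1.0;

    pthread_t tid[P];
    for (long t = 0; t < P; t++)  /* mapping: one block per thread */
        pthread_create(&tid[t], NULL, block_sum, (void *)t);

    double sum = 0.0;
    for (long t = 0; t < P; t++) {
        pthread_join(tid[t], NULL);
        sum += partial[t];        /* combine the partial sums */
    }
    printf("sum = %f\n", sum);
    return 0;
}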

Concurrency VS Parallelism
Concurrency vs. Parallelism
■ Concurrency means dealing with several things at once
□ Programming concept for the developer
□ In shared-memory systems, implemented by time sharing
■ Parallelism means doing several things at once
□ Demands parallel hardware
■ Parallel programming is a misnomer
□ Concurrent programming aiming at parallel execution
■ Any parallel software is concurrent software
□ Note: Some researchers disagree, most practitioners agree
■ Concurrent software is not always parallel software
□ Many server applications achieve scalability by optimizing concurrency only (web server)
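
A minimal C sketch of the distinction (assumed here, not from the slides): the two request handlers below are concurrent by construction; whether they also run in parallel depends on whether the hardware has more than one core.

#include <pthread.h>
#include <stdio.h>

/* Two concurrent tasks: time-shared on one core,
   potentially parallel on two or more cores. */
static void *handle_request(void *arg) {
    printf("handling request %s\n", (const char *)arg);
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, handle_request, "A");
    pthread_create(&b, NULL, handle_request, "B");
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return 0;
}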
Server Example: No Concurrency, No Parallelism

[Sequence diagram: Client, Core 1, Storage. Core 1 receives request A, fetches A's data from storage, the data is returned, and Core 1 returns the result to the client. Each request is handled start to finish before the next one begins.]
Server Example: Concurrency for Throughput

[Sequence diagram: Client 1, Client 2, Core 1, Storage. Core 1 receives request A and begins fetching A's data; while waiting on storage it receives request B from Client 2, interleaving the two requests on a single core.]
Server Example: Parallelism for Throughput

[Sequence diagram: Client 1, Client 2, Core 1, Core 2, Storage. Core 1 receives request A while Core 2 receives request B; the data for A and B is fetched and returned simultaneously, and results B' and A' go back to the clients in parallel.]
