
Revision Slide

Dr. Bryan Raj


Department of Computer System & Technology
Faculty of Computer Science & Information Technology
University of Malaya
[email protected]
Pipelining
Pipelining example (1)

Add the floating point numbers 9.87 × 10⁴ and 6.54 × 10³.
Pipelining example (2)

 Assume each operation takes one nanosecond (10⁻⁹ seconds).

 This for loop (sketched below) takes about 7000 nanoseconds.
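
The loop itself is not reproduced on the slide; a minimal C sketch of the kind of loop being timed, assuming an array x of 1000 floats, might look like this:

#include <stdio.h>

/* Minimal sketch (assumed, not from the slides): 1000 floating point
   additions; without pipelining, each addition occupies all 7
   one-nanosecond stages of the adder, so ~7000 ns in total. */
int main(void) {
    float x[1000];
    for (int i = 0; i < 1000; i++) x[i] = 1.0f;

    float sum = 0.0f;
    for (int i = 0; i < 1000; i++)
        sum += x[i];          /* 7 stages x 1 ns = 7 ns per addition */

    printf("sum = %f\n", sum);
    return 0;
}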
Pipelining (3)
 Divide the floating point adder into 7 separate
pieces of hardware or functional units.
 First unit fetches two operands, second unit
compares exponents, etc.
 Output of one functional unit is input to the next.
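
A typical 7-stage split of this kind (stated here for concreteness, since the slide only names the first two units) is: fetch operands, compare exponents, shift one operand, add, normalize the result, round the result, and store the result.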
Pipelining (4)

Table 2.3: Pipelined Addition (table not reproduced here).

Numbers in the table are subscripts of operands/results.
Pipelining (5)
 One floating point addition still takes 7 nanoseconds.

 But 1000 floating point additions now take only 1006 nanoseconds!
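
(The arithmetic, filled in here for clarity: with s = 7 stages and n = 1000 additions, the pipelined time is (n + s − 1) × 1 ns = 1006 ns. The first result emerges after 7 ns; after that, one new result completes every nanosecond.)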
PERFORMANCE

Speedup
 Number of cores = p
 Serial run-time = Tserial
 Parallel run-time = Tparallel

Linear speedup: Tparallel = Tserial / p

Speedup of a parallel program

S = Tserial / Tparallel

Efficiency of a parallel program

E = S / p = (Tserial / Tparallel) / p = Tserial / (p x Tparallel)
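As a small illustration (the values are assumed for the example, not taken from the slides), the two formulas in C:

#include <stdio.h>

/* Minimal sketch: speedup S = Tserial / Tparallel,
   efficiency E = S / p.  Values are illustrative. */
int main(void) {
    double t_serial   = 20.0;   /* seconds                          */
    double t_parallel = 4.25;   /* e.g. 18/8 + 2 from a later slide */
    int    p          = 8;      /* number of cores                  */

    double s = t_serial / t_parallel;   /* speedup    ~ 4.71 */
    double e = s / p;                   /* efficiency ~ 0.59 */
    printf("S = %.2f, E = %.2f\n", s, e);
    return 0;
}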

Speedups and efficiencies of a parallel program (table of values not reproduced here).

Example
 We can parallelize 90% of a serial program.
 Parallelization is “perfect” regardless of the number of cores p we use.
 Tserial = 20 seconds
 Runtime of parallelizable part is 0.9 x Tserial / p = 18 / p

Example (cont.)
 Runtime of “unparallelizable” part is 0.1 x Tserial = 2 seconds.

 Overall parallel run-time is

Tparallel = 0.9 x Tserial / p + 0.1 x Tserial = 18 / p + 2

Example (cont.)
 Speedup:

S = Tserial / (0.9 x Tserial / p + 0.1 x Tserial) = 20 / (18 / p + 2)
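
To make this concrete (arithmetic added here, not on the slide): with p = 8, Tparallel = 18/8 + 2 = 4.25 seconds, so S = 20/4.25 ≈ 4.7 and E ≈ 0.59. Even as p grows without bound, S can never exceed 20/2 = 10, which is what Amdahl's Law predicts for a 90% parallelizable program.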

PARALLEL PROGRAM DESIGN

Foster’s methodology

1. Partitioning: divide the computation to be performed and the data operated on by the computation into small tasks.

The focus here should be on identifying tasks that can be executed in parallel.

Foster’s methodology
2. Communication: determine what communication
needs to be carried out among the tasks
identified in the previous step.

Foster’s methodology
3. Agglomeration or aggregation: combine tasks
and communications identified in the first step
into larger tasks.

For example, if task A must be executed before task B can be executed, it may make sense to aggregate them into a single composite task.

Foster’s methodology
4. Mapping: assign the composite tasks identified in
the previous step to processes/threads.

This should be done so that communication is minimized, and each process/thread gets roughly the same amount of work.
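
As a toy illustration (assumed here, not taken from the slides), the result of applying all four steps to summing an array with pthreads: each addition is a task (partitioning), partial sums must be combined (communication), the additions are agglomerated into p contiguous blocks, and each block is mapped to one thread:

#include <pthread.h>
#include <stdio.h>

#define N 1000000
#define P 4                       /* number of threads/cores */

static double x[N];
static double partial[P];         /* one partial sum per thread */

/* Each thread sums its agglomerated block of N/P elements. */
static void *block_sum(void *arg) {
    long t = (long)arg;
    double s = 0.0;
    for (long i = t * (N / P); i < (t + 1) * (N / P); i++)
        s += x[i];
    partial[t] = s;               /* communicated back via shared memory */
    return NULL;
}

int main(void) {
    for (long i = 0; i < N; i++) x[i] = 1.0;

    pthread_t tid[P];
    for (long t = 0; t < P; t++)  /* mapping: one block per thread */
        pthread_create(&tid[t], NULL, block_sum, (void *)t);

    double sum = 0.0;
    for (long t = 0; t < P; t++) {
        pthread_join(tid[t], NULL);
        sum += partial[t];        /* combine the partial sums */
    }
    printf("sum = %f\n", sum);
    return 0;
}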

Concurrency VS Parallelism
Concurrency vs. Parallelism
■ Concurrency means dealing with several things at once
□ Programming concept for the developer
□ In shared-memory systems, implemented by time sharing
■ Parallelism means doing several things at once
□ Demands parallel hardware
■ Parallel programming is a misnomer
□ Concurrent programming aiming at parallel execution
■ Any parallel software is concurrent software
□ Note: Some researchers disagree, most practitioners agree
■ Concurrent software is not always parallel software
□ Many server applications achieve scalability by optimizing concurrency only (web server)
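
A minimal C sketch of the distinction (assumed here, not from the slides): the two request handlers below are concurrent by construction; whether they also run in parallel depends on whether the hardware has more than one core.

#include <pthread.h>
#include <stdio.h>

/* Two concurrent tasks: time-shared on one core,
   potentially parallel on two or more cores. */
static void *handle_request(void *arg) {
    printf("handling request %s\n", (const char *)arg);
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, handle_request, "A");
    pthread_create(&b, NULL, handle_request, "B");
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return 0;
}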
Server Example: No Concurrency, No Parallelism

[Sequence diagram: Client, Core 1, Storage. Core 1 receives request A, fetches A's data from storage, the data is returned, and Core 1 returns the result to the client. Each request is handled start to finish before the next one begins.]
Server Example: Concurrency for Throughput

[Sequence diagram: Client 1, Client 2, Core 1, Storage. Core 1 receives request A and begins fetching A's data; while waiting on storage it receives request B from Client 2, interleaving the two requests on a single core.]
Server Example: Parallelism for Throughput

[Sequence diagram: Client 1, Client 2, Core 1, Core 2, Storage. Core 1 receives request A while Core 2 receives request B; the data for A and B is fetched and returned simultaneously, and results B' and A' go back to the clients in parallel.]
