Pipelining
Overview
● Pipelining is widely used in modern
processors.
● Pipelining improves system performance in
terms of throughput.
● Pipelined organization requires sophisticated
compilation techniques.
Basic Concepts
Making the Execution of
Programs Faster
● Use faster circuit technology to build the
processor and the main memory.
● Arrange the hardware so that more than one
operation can be performed at the same time.
● In the latter way, the number of operations
performed per second is increased even
though the elapsed time needed to perform
any one operation is not changed.
Traditional Pipeline Concept
● Laundry Example
● Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold
● Washer takes 30 minutes
● Dryer takes 40 minutes
● Folding takes 20 minutes
● Sequential laundry takes 6 hours for 4 loads
● If they learned pipelining, how long would laundry take?
[Figure: sequential laundry timeline; loads A, B, C, D each take 30 + 40 + 20 minutes, one after another]
Traditional Pipeline Concept
● Pipelined laundry takes 3.5 hours for 4 loads
[Figure: pipelined laundry timeline starting at 6 PM; loads A, B, C, D in task order, with time segments of 30, 40, 40, 40, 40, and 20 minutes]
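The 6-hour and 3.5-hour figures can be checked with a short calculation. The following is a minimal sketch, assuming stage times of 30, 40, and 20 minutes (as in the timeline) and assuming a finished load can wait between stages until the next machine is free. The function names are illustrative only.

```python
def sequential_minutes(stage_times, n_loads):
    # Each load finishes all stages before the next load starts.
    return n_loads * sum(stage_times)


def pipelined_minutes(stage_times, n_loads):
    # finish[s] is the time at which stage s completed its most recent load.
    finish = [0] * len(stage_times)
    for _ in range(n_loads):
        ready = 0  # time at which this load leaves the previous stage
        for s, t in enumerate(stage_times):
            # A stage starts once the load has arrived AND the stage is free.
            start = max(ready, finish[s])
            finish[s] = start + t
            ready = finish[s]
    return finish[-1]


stages = [30, 40, 20]  # wash, dry, fold (minutes)
print(sequential_minutes(stages, 4) / 60)  # 6.0
print(pipelined_minutes(stages, 4) / 60)   # 3.5
```

The dryer (40 minutes) is the slowest stage, so after the first load every additional load adds 40 minutes to the total.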
Traditional Pipeline Concept
● Pipelining doesn’t help latency of a single task; it helps throughput of the entire workload
● Pipeline rate is limited by the slowest pipeline stage
● Multiple tasks operate simultaneously using different resources
● Potential speedup = number of pipe stages
● Unbalanced lengths of pipe stages reduce speedup
● Time to “fill” the pipeline and time to “drain” it reduce speedup
● Stall for dependences
[Figure: pipelined laundry timeline from 6 PM; loads A, B, C, D in task order, with time segments of 30, 40, 40, 40, 40, and 20 minutes]
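The effect of unbalanced stages and of fill/drain time on speedup can be illustrated numerically. This sketch assumes every task passes through the same stages, so the pipelined time is one full pass plus one slowest-stage time per additional task. The function `speedup` is a hypothetical helper, not part of the original slides.

```python
def speedup(stage_times, n_tasks):
    # Sequential: every task runs through all stages on its own.
    sequential = n_tasks * sum(stage_times)
    # Pipelined: the first task fills the pipeline; each later task
    # completes one slowest-stage time after the previous one.
    pipelined = sum(stage_times) + (n_tasks - 1) * max(stage_times)
    return sequential / pipelined


print(round(speedup([30, 40, 20], 4), 2))     # 1.71 (unbalanced stages, few tasks)
print(round(speedup([30, 30, 30], 4), 2))     # 2.0  (balanced stages, few tasks)
print(round(speedup([30, 30, 30], 1000), 2))  # 2.99 (approaches 3, the number of stages)
```

With balanced stages and many tasks, the fill and drain time becomes negligible and the speedup approaches the number of stages.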
Use the Idea of Pipelining in a
Computer
Fetch + Execution
[Figure: fetch (F) and execute (E) steps of instructions I1, I2, I3.
(a) Sequential execution: F1 E1 F2 E2 F3 E3 along the time axis.
Hardware: instruction fetch unit, interstage buffer B1, execution unit.
(c) Pipelined execution: I1 (F1 E1), I2 (F2 E2), I3 (F3 E3) overlapped over clock cycles 1 to 4.]
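The overlap of fetch and execute steps can be sketched as a cycle-by-cycle schedule. This is an illustrative model only, assuming an ideal two-stage pipeline in which every fetch and every execute step takes exactly one clock cycle and no stalls occur. The function name is hypothetical.

```python
def two_stage_schedule(n_instructions):
    """Return {clock cycle: [steps active in that cycle]} for an ideal
    fetch/execute pipeline."""
    schedule = {}
    for i in range(1, n_instructions + 1):
        schedule.setdefault(i, []).append(f"F{i}")      # fetch Ii in cycle i
        schedule.setdefault(i + 1, []).append(f"E{i}")  # execute Ii in cycle i + 1
    return schedule


sched = two_stage_schedule(3)
for cycle in sorted(sched):
    print(cycle, " ".join(sched[cycle]))
# 1 F1
# 2 E1 F2
# 3 E2 F3
# 4 E3
```

Three instructions complete in 4 clock cycles, compared with 6 steps when each fetch and execute is performed strictly one after another.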
Idle periods –
stalls (bubbles)
Pipeline Performance
[Figure: pipeline timing for Load X(R1), R2, illustrating a structural hazard]
Pipeline Performance
● Again, pipelining does not result in individual
instructions being executed faster; rather, it is the
throughput that increases.
● Throughput is measured by the rate at which
instruction execution is completed.
● Pipeline stalls degrade pipeline performance.
● We need to identify all hazards that may cause the pipeline to stall and find ways to minimize their impact.
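The cost of stalls can be expressed as a drop in completed instructions per clock cycle. The sketch below is an assumption-laden model: it takes a pipeline in which each stage needs one cycle, counts the fill time as (stages - 1) cycles, and treats every stall as one lost cycle. The function name and the example numbers are hypothetical.

```python
def throughput(n_instructions, n_stages, stall_cycles=0):
    # Total cycles = cycles to fill the pipeline + one cycle per
    # completed instruction + any cycles lost to stalls.
    total_cycles = (n_stages - 1) + n_instructions + stall_cycles
    return n_instructions / total_cycles


print(round(throughput(100, 4), 3))                   # 0.971 instructions per cycle
print(round(throughput(100, 4, stall_cycles=20), 3))  # 0.813 instructions per cycle
```

In this model no single instruction passes through the stages any faster; only the completion rate changes, and each stalled cycle lowers it.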
Data Hazards
● We must ensure that the results obtained when instructions are
executed in a pipelined processor are identical to those obtained
when the same instructions are executed sequentially.
● Hazard occurs
A ← 3 + A
B ← 4 × A
● No hazard
A ← 5 × C
B ← 20 + C
● When two operations depend on each other, they must be
executed sequentially in the correct order.
● Another example:
Mul R2, R3, R4
Add R5, R4, R6
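The dependence in the Mul/Add pair can be detected mechanically. This is a minimal sketch, assuming the last operand of each instruction is the destination register (so Mul writes R4 and Add reads R4). The tuple encoding and the function `has_raw_hazard` are illustrative, not part of the original material.

```python
def has_raw_hazard(first, second):
    """True if `second` reads a register that `first` writes
    (a read-after-write dependence)."""
    first_dest = first[-1]       # assumed: last operand is the destination
    second_srcs = second[1:-1]   # assumed: operands in between are sources
    return first_dest in second_srcs


mul = ("Mul", "R2", "R3", "R4")
add = ("Add", "R5", "R4", "R6")

print(has_raw_hazard(mul, add))  # True: Add needs the R4 value produced by Mul
print(has_raw_hazard(add, mul))  # False: Mul reads neither R6 nor any result of Add
```

When the check returns True, the second instruction must not read its operands until the first has written its result, which is the condition that forces a stall in a simple pipeline.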
Data Hazards
[Figure: pipeline units. D: Dispatch/Decode unit; E: Execute instruction; W: Write results]
Figure 8.10. Use of an instruction queue in the hardware organization of Figure 8.2b.