Chapter 3 PPTV 31 Sem IIv 31
Instruction Pipelining
Outline
Basics
Pipeline Hazards
◦ Structural
◦ Data
◦ Control
Branch Handling
Branch Prediction
Basics
As computer systems evolve, greater
performance is achieved through:
◦ Improvements in technology
Faster circuitry
◦ Organizational enhancements to the CPU
Use of multiple registers --- rather than a single accumulator
Use of cache memory
Use of instruction pipelining
Basics…
Pipelining
◦ Used on an assembly line in a manufacturing plant
Products at various stages can be worked on
simultaneously
New inputs are accepted at one end before the previously
accepted inputs appear at the output
Basics…
Sequential execution of an N-stage task:
Basics…
Pipelined execution of an N-stage task:
Basics…
Instruction pipelining
◦ An implementation technique in which multiple
instructions are overlapped in execution
A pipelined approach takes much less time overall
Pipelining paradox: the time to complete any single instruction
is not reduced; it is the instruction throughput that increases
Basics…
Instruction Pipelining --- applying
◦ Apply the above concept to instruction execution
A simple approach
◦ Subdivide instruction processing into two stages
◦ The pipeline has two stages
Fetch stage
Execute stage
◦ Fetch stage:
Fetches an instruction and buffers it --- instruction pre-fetch or fetch
overlap
Passes on the buffered instruction when the execute stage is free
◦ Execute stage:
Accepts the instruction from the fetch stage and processes it
This (ideally) doubles the execution rate
Basics…
A simple approach --- Drawbacks
◦ Doubling the execution rate is unlikely:
The execution time is generally longer than the fetch time
The fetch stage has to wait for some time
A conditional branch instruction makes the address of
the next instruction to be fetched unknown
The fetch stage waits until the address is supplied by the
execute stage
The execute stage may have to wait while the next instruction
is fetched
Basics…
A typical instruction execution sequence:
◦ Fetch Instruction (FI):
Fetch the instruction
◦ Decode Instruction (DI):
Determine the op-code and the operand specifiers
◦ Calculate Operands (CO):
Calculate the effective addresses
◦ Fetch Operands (FO):
Fetch the operands
◦ Execute Instruction (EI):
Perform the operation
◦ Write Operand (WO):
Store the result in memory
Basics…
Applying pipelining
Assumptions
◦ The various stages are of nearly equal duration
◦ Each instruction goes through all six stages of the pipeline
Reduces the execution time for 9 instructions
◦ From 54 time units to 14 time units
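The 54-versus-14 comparison follows from the standard timing relation for an ideal pipeline: n instructions through a k-stage pipeline with unit-time stages finish in k + (n - 1) time units instead of n * k. A minimal sketch (the function names are illustrative):

```python
def unpipelined_time(n, k):
    # Each instruction occupies all k stages by itself
    return n * k

def pipelined_time(n, k):
    # The first instruction fills the pipe in k units,
    # then one instruction completes every unit thereafter
    return k + (n - 1)

print(unpipelined_time(9, 6))  # 54, as in the slides
print(pipelined_time(9, 6))    # 14, as in the slides
```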
Basics…
Applying pipelining (continued)
Basics…
Typical Pipelining
Basics…
Typical Pipelining:
◦ Effect of a conditional branch on the instruction pipeline
Assume
◦ Instruction 3 is a conditional branch to instruction 15
◦ The outcome is not known until I3 is executed
◦ During time unit 8, instruction 15 enters the pipeline
◦ No instruction completes during time units 9 through 12
Branch penalty
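The four empty time units can be reconstructed from the six-stage timing, assuming the branch outcome becomes known in the EI stage (a sketch with hypothetical variable names, following the slides' example):

```python
STAGES = 6   # FI DI CO FO EI WO
EI = 5       # 1-based position of the stage that resolves a branch

# I3 enters FI in time unit 3, so it reaches EI in unit 3 + EI - 1 = 7
branch_fetch_cycle = 3
resolve_cycle = branch_fetch_cycle + EI - 1          # unit 7
target_fetch_cycle = resolve_cycle + 1               # unit 8: I15 enters FI
target_done_cycle = target_fetch_cycle + STAGES - 1  # unit 13: I15 leaves WO

# I3 itself completes (WO) in unit 8; nothing else completes until I15 does
branch_done_cycle = branch_fetch_cycle + STAGES - 1  # unit 8
bubble_cycles = target_done_cycle - branch_done_cycle - 1

print(bubble_cycles)  # 4 empty units: 9 through 12, the branch penalty
```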
Basics…(contd)
Alternative timing diagram for instruction pipelining
operation:
Each row shows the state of the pipeline at a given point in
time
Basics…(contd)
Number of Pipeline Stages
In general, a larger number of stages gives better
performance
However:
◦ A larger number of stages increases the overhead of
moving information between stages and of
synchronization between stages
◦ The complexity of the CPU grows with the number of
stages
◦ It is difficult to keep a large pipeline running at its maximum
rate because of pipeline hazards
Pipeline Hazard
Pipeline Hazard
◦ A situation that prevents the next instruction in the instruction
stream from executing during its designated clock cycle
◦ The instruction is said to be stalled
When an instruction is stalled:
◦ All instructions later in the pipeline than the stalled instruction
are also stalled
◦ No new instructions are fetched during the stall
◦ Instructions earlier than the stalled one continue as usual
Types of hazards (conflicts):
◦ Structural hazards
◦ Data hazards
◦ Control hazards
Pipeline Hazard...
Structural Hazards
◦ The hardware cannot support the combination of instructions that
we want to execute in the same clock cycle
◦ Hardware conflicts caused by the use of the same hardware resource
at the same time (e.g., memory conflicts)
Penalty: 1 cycle
Pipeline Hazard...
Structural Hazards … (contd)
Pipeline Hazard...
Data Hazards
◦ Occur when one step must wait for another to complete
◦ Caused by the pipeline reversing the order of data-dependent
operations (e.g., WRITE/READ conflicts)
ADD A, R1 ; Mem(A) ← Mem(A) + R1;
SUB A, R2 ; Mem(A) ← Mem(A) - R2;
Pipeline Hazard...
Data Hazards (contd…)
◦ ADD A, R1 ; Mem(A) ← Mem(A) + R1;
◦ SUB A, R2 ; Mem(A) ← Mem(A) - R2;
◦ Penalty: 2 cycles
Pipeline Hazard...
Data Hazards Solutions (contd…)
◦ The penalty due to data hazards can be reduced by a technique called
forwarding (bypassing)
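A toy model of this effect, assuming the 2-cycle penalty quoted above for back-to-back dependent instructions and an idealized bypass that removes the stall entirely (the function and its formula are illustrative, not from the slides):

```python
def data_hazard_penalty(distance, forwarding):
    """Stall cycles suffered by a consumer that is `distance`
    instructions after the producer (distance=1 means back-to-back)."""
    if forwarding:
        return 0  # result bypassed straight from producer to consumer
    base_penalty = 2  # penalty quoted in the slides for back-to-back ops
    # Each independent instruction placed between the pair hides one cycle
    return max(0, base_penalty - (distance - 1))

# ADD A, R1 immediately followed by the dependent SUB A, R2:
print(data_hazard_penalty(1, forwarding=False))  # 2 stall cycles
print(data_hazard_penalty(1, forwarding=True))   # 0 stall cycles
```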
Pipeline Hazard...
Data Hazards Solutions(contd…)
◦ ADD A, R1 ; Mem(A) ← Mem(A) + R1;
◦ SUB A, R2 ; Mem(A) ← Mem(A) - R2;
Pipeline Hazard...
Control Hazards
◦ Arise from the need to make a decision based on the
results of one instruction while others are executing
◦ The flow of instruction addresses is not what the pipeline
expected
◦ Caused by branch instructions, which change the
instruction execution order
Pipeline Hazard...
Instruction Pipelining in Practice
Branch Handling
Stall
◦ Stop the pipeline until the branch instruction reaches the
last stage
Branch Handling ...
Multiple streams
◦ Replicate the initial portion of the pipeline and allow the
pipeline to fetch both possible next instructions
Using two streams
◦ Implement hardware to deal with both possibilities
Branch Handling ...
Multiple streams --- (Drawbacks)
◦ With multiple pipelines there are contention delays for access
to the registers and to memory
◦ Additional branch instructions may enter the pipeline
Before the original branch decision is resolved
Branch Handling ...
Pre-fetch branch target
◦ When a conditional branch is recognized, the target of the
branch is pre-fetched, in addition to the instruction
following the branch
Branch Handling ...
Loop buffer:
◦ Use a small, very high-speed memory to keep the n
most recently fetched instructions in sequence. If a
branch is to be taken, the buffer is first checked to see
whether the branch target is in it
◦ Similar in principle to a cache dedicated to instructions
But
it retains instructions only in sequence
Much smaller in size and hence lower in cost
Delayed branch:
◦ Re-arrange the instructions so that branching occurs
later than specified
Branch Handling ...
Delayed Branch Example
Branch Predictions
When a branch is encountered, a prediction is made and the
predicted path is followed
The instructions on the predicted path are fetched
The fetched instructions can also be executed --- called
speculative execution
◦ The results produced by these executions are
marked as tentative
When the branch outcome is decided, if the prediction is
correct, the special tags on the tentative results are removed
If not, the tentative results are discarded and execution
resumes along the other path
Branch prediction can be based on static or dynamic
information
Branch Predictions ...
Static Branch Predictions
◦ Predict always taken
Assume that jump will happen
Always fetch target instruction
Branch Predictions ...
Static Branch Predictions (contd…)
Branch Predictions ...
Static Branch Predictions (contd…)
◦ Predict never taken
Assume that jump will not happen
Always fetch next instruction
Branch Predictions ...
Dynamic Branch Predictions
◦ Attempt to improve the accuracy of prediction
By recording the history of conditional branch instructions
Branch Predictions ...
Taken/not Taken switch (contd…)
Single bit --- Drawback
Records only whether the last execution of the instruction resulted in a
taken branch or not
For a conditional branch instruction that is almost always taken, such as a
loop-closing branch
With only one bit of history
An error in prediction will occur twice per loop execution
Once on entering the loop
Once on leaving the loop
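The two mispredictions per loop execution can be checked with a small simulation of a 1-bit predictor (a sketch; the outcome sequence models a loop whose closing branch is taken three times and then falls through):

```python
def count_mispredictions(outcomes, state=True):
    """1-bit scheme: always predict the most recently observed outcome."""
    misses = 0
    for taken in outcomes:
        if taken != state:
            misses += 1
        state = taken  # remember only the last outcome
    return misses

run = [True, True, True, False]  # one pass through a 4-iteration loop

# First run, predictor initialized to "taken": only the loop exit misses
print(count_mispredictions(run, state=True))   # 1

# Next run: the exit flipped the bit, so re-entry misses AND the exit misses
print(count_mispredictions(run, state=False))  # 2 --- twice per loop execution
```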
Branch Predictions ...
Taken/not Taken switch (contd…)
◦ Using two bits
Record the result of the last two instances of execution of
the associated instruction
Expressed as a state diagram
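The two-bit state diagram can be sketched as a saturating counter, with states 0-1 predicting not-taken and 2-3 predicting taken; the prediction flips only after two consecutive mispredictions, so the loop example above now mispredicts only on exit, not on re-entry (an illustrative sketch, not the slides' exact diagram):

```python
class TwoBitPredictor:
    def __init__(self, state=3):          # start in "strongly taken"
        self.state = state

    def predict(self):
        return self.state >= 2            # states 2-3 predict taken

    def update(self, taken):
        # Saturating counter: move one state toward the actual outcome
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)

p = TwoBitPredictor()
misses = 0
for taken in [True, True, True, False] * 2:   # two runs of the loop
    if p.predict() != taken:
        misses += 1
    p.update(taken)

print(misses)  # 2: only the exit of each run, versus 3 for the 1-bit scheme
```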
Branch Predictions ...
Dynamic Branch Predictions …
◦ Branch History Table:
A small cache memory associated with the instruction fetch
stage of the pipeline
Stores information regarding branches so as to more
accurately predict the branch outcome
Assumes that the branch will do what it did last time
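A branch-history table can be sketched as a small direct-mapped table indexed by the low bits of the branch instruction's address, each entry holding a 2-bit counter (the table size and all names here are illustrative assumptions, not from the slides):

```python
TABLE_BITS = 4  # assumed size: 16 entries

class BranchHistoryTable:
    def __init__(self):
        # Every entry starts as a 2-bit counter in "strongly taken"
        self.table = [3] * (1 << TABLE_BITS)

    def _index(self, pc):
        return pc & ((1 << TABLE_BITS) - 1)  # low bits of the branch address

    def predict(self, pc):
        return self.table[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)

bht = BranchHistoryTable()
print(bht.predict(0x40))        # True initially
bht.update(0x40, False)
bht.update(0x40, False)
print(bht.predict(0x40))        # False: two not-taken outcomes flip it
```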
Summary
Pipelining is a technique that exploits
parallelism among the instructions in a
sequential instruction stream
Pipelining has the substantial advantage of increasing
instruction throughput without shortening the execution
time of any individual instruction