Unit 5 Pipeline Hazard
Computer Architecture
Five Stages of Pipeline
Pipelining is an implementation technique in which multiple
instructions are overlapped in execution.
The stages of instruction execution / pipelining are
IF --- Instruction Fetch
ID --- Instruction Decode / Register Read
EX --- Execute in ALU / calculate address
MEM --- Data memory access
WB --- Write back to register
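The overlap of the five stages can be sketched as a small timing chart. This is a minimal illustration, assuming an ideal pipeline with no hazards in which instruction i enters IF in cycle i + 1; the helper names `stage_in_cycle`, `total_cycles`, and `print_chart` are made up for this sketch.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def stage_in_cycle(instr_index, cycle):
    """Stage occupied by instruction instr_index (0-based) in cycle (1-based), or None."""
    offset = cycle - (instr_index + 1)  # assumption: instruction i enters IF in cycle i + 1
    return STAGES[offset] if 0 <= offset < len(STAGES) else None

def total_cycles(n_instructions):
    """Cycles needed to drain n instructions through an ideal (hazard-free) pipeline."""
    return len(STAGES) + n_instructions - 1

def print_chart(names):
    """Print one row per instruction and one column per clock cycle."""
    cycles = range(1, total_cycles(len(names)) + 1)
    print(" " * 16 + "".join(f"C{c:<4}" for c in cycles))
    for i, name in enumerate(names):
        cells = "".join(f"{stage_in_cycle(i, c) or '':<5}" for c in cycles)
        print(f"{name:<16}{cells}")

print_chart(["lw $1, 100($0)", "Instr 2", "Instr 3", "Instr 4"])
```

In this model, four instructions finish in 5 + 4 - 1 = 8 cycles instead of the 20 cycles needed if each instruction ran all five stages alone.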
[Figure: pipeline timing diagram. Each instruction passes through Ifetch, Reg, ALU, DMem, Reg. Instruction labels in the figure: lw $1, 100($0), Instr 2, Instr 3, Instr 4, add $1,$2,$3, Instr 5. A single memory is used for instructions and data.]
Structural Hazard: Can't load data and fetch Instruction 4 during clock cycle 4
Pipeline Hazards Slide 4
Resolving structural hazards
Problem
Attempt to use the same hardware resource (Memory) by two
different instructions during the same cycle
Solution 1: Wait
Must detect the hazard
Must have mechanism to delay (stall) instruction access to
resource (Introduce bubble / NOP)
Serious: hazard cannot be ignored
Solution 2: Redesign the pipeline
Add more hardware to eliminate the structural hazard
In our example: use two memories with two memory ports
Instruction Memory and Data Memory can be implemented as caches
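The problem and Solution 2 can be illustrated with a short check. This is a sketch under stated assumptions: instruction i fetches in cycle i + 1, a load or store uses memory three cycles after its fetch (its MEM stage), and the function name `structural_hazards` and its parameters are hypothetical.

```python
from collections import defaultdict

def structural_hazards(accesses_data, unified_memory=True):
    """Return {cycle: [(instr_index, stage), ...]} for cycles where one memory has two users.

    accesses_data[i] is True when instruction i is a load or store.
    """
    users = defaultdict(list)
    for i, uses_data in enumerate(accesses_data):
        fetch_cycle = i + 1  # assumption: one instruction fetched per cycle, no stalls
        imem = "memory" if unified_memory else "instruction memory"
        dmem = "memory" if unified_memory else "data memory"
        users[(fetch_cycle, imem)].append((i, "IF"))
        if uses_data:
            users[(fetch_cycle + 3, dmem)].append((i, "MEM"))
    return {cycle: who for (cycle, _), who in users.items() if len(who) > 1}

# A load followed by three other instructions (0-based indices).
program = [True, False, False, False]
print(structural_hazards(program, unified_memory=True))   # conflict in cycle 4
print(structural_hazards(program, unified_memory=False))  # no conflicts
```

With one shared memory, the load's MEM stage and the fetch of the fourth instruction collide in cycle 4; with separate instruction and data memories the same program has no conflict.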
Solution 1: Detect Structural Hazard and Delay
[Figure: pipeline timing diagram over clock cycles 1 to 7, instructions in program order: Load, Instr 2, Instr 3, Bubble, Instr 4. Each instruction passes through Ifetch, Reg, ALU, DMem, Reg.]
Introduce a bubble: stall to delay instruction fetching
A bubble is a NOP instruction
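The stall in Solution 1 can be sketched as a fetch scheduler. This is a minimal model, assuming a single shared memory, that a load or store occupies it three cycles after its own fetch, and that nothing else stalls the pipeline; `fetch_cycles_with_stalls` is a hypothetical name.

```python
def fetch_cycles_with_stalls(accesses_data):
    """Return the fetch cycle of each instruction when fetch waits for a busy memory.

    accesses_data[i] is True when instruction i is a load or store.
    """
    fetch_cycles = []
    busy = set()   # cycles in which the memory is used by some MEM stage
    cycle = 1
    for uses_data in accesses_data:
        while cycle in busy:
            cycle += 1              # insert a bubble: no instruction is fetched this cycle
        fetch_cycles.append(cycle)
        if uses_data:
            busy.add(cycle + 3)     # MEM stage is three cycles after IF
        cycle += 1
    return fetch_cycles

# Load, Instr 2, Instr 3, Instr 4
print(fetch_cycles_with_stalls([True, False, False, False]))
```

For the load followed by three instructions, the fourth instruction is fetched in cycle 5 rather than cycle 4, which is the one-cycle bubble shown in the figure.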
Solution 2: Add More Hardware
(Use Instruction and data memory)
Eliminate structural hazard at design time
Use two separate memories with two memory ports
Instruction and data memories can be implemented as caches
[Figure: five-stage pipelined datapath (IF, ID, EX, MEM, WB) with pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB, a PC with incrementer, Registers, ALU, and separate Instruction Memory and Data Memory.]
[Figure: pipeline timing diagram (IM, Reg, ALU, DM, Reg) for a sub instruction followed by and, or $6, $3, $2, add $7, $2, $2, and sw $8, 10($2).]
Result of sub is needed by the and, or, add, & sw instructions
Instructions and & or will read the old value of $2 from the register file
During CC5, $2 is written and read: the new value is read
add $7, $2, $2: no stalling. The data written to register $2 by the sub instruction is read after the add instruction is fetched
sw $8, 10($2): here the ALU calculates the address $2+10
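The timing argument above can be checked with a few lines of code. This is a sketch, assuming instruction i is fetched in cycle i + 1, reads its registers in ID (cycle i + 2), writes its result in WB (cycle i + 5), and that a write and a read in the same cycle deliver the new value. The operands of `sub` are taken from a later slide; the operands of `and` are not recoverable from the figure and are assumed here purely for illustration.

```python
# (opcode, destination register or None, source registers)
program = [
    ("sub", "$2", ["$1", "$3"]),
    ("and", "$4", ["$2", "$5"]),   # assumption: operands chosen for illustration only
    ("or",  "$6", ["$3", "$2"]),
    ("add", "$7", ["$2", "$2"]),
    ("sw",  None, ["$8", "$2"]),   # sw reads $8 (data) and $2 (base address)
]

def stale_reads(program):
    """Return (consumer, register, producer) triples where the consumer reads an old value."""
    hazards = []
    for c, (op, _, sources) in enumerate(program):
        for src in dict.fromkeys(sources):           # unique sources, original order
            producers = [p for p in range(c) if program[p][1] == src]
            if not producers:
                continue
            p = producers[-1]                        # most recent earlier writer
            write_cycle = p + 5                      # WB stage of the producer
            read_cycle = c + 2                       # ID stage of the consumer
            if read_cycle < write_cycle:             # same-cycle write/read is safe
                hazards.append((op, src, program[p][0]))
    return hazards

print(stale_reads(program))
```

Under these assumptions only `and` and `or` are flagged: `add` reads $2 in CC5, the cycle in which `sub` writes it, and `sw` reads it one cycle later.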
[Figure: pipeline timing diagram (IM, Reg, ALU, DM, Reg) for or $6, $3, $2; add $7, $2, $2; and sw $8, 10($2).]
2b. Operand Forwarding Unit
The forwarding unit generates the ForwardA and ForwardB signals, which control the two forwarding multiplexers
It uses Ra and Rb in ID/EX and Rw in EX/MEM & MEM/WB
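The comparison performed by the forwarding unit might look like the sketch below. The slides do not give the signal encoding, so the values used here (0 = register file, 1 = forward from EX/MEM, 2 = forward from MEM/WB), the `reg_write` flag, and the rule that register 0 is never forwarded are assumptions, as are the names `StageReg` and `forwarding_unit`.

```python
from dataclasses import dataclass

@dataclass
class StageReg:
    """The fields of a pipeline register that the forwarding unit looks at (hypothetical)."""
    rw: int = 0              # destination register number (Rw)
    reg_write: bool = False  # True if the instruction in this stage writes a register

def forward_select(src, ex_mem, mem_wb):
    """Assumed encoding: 0 = register file, 1 = from EX/MEM, 2 = from MEM/WB."""
    if src != 0 and ex_mem.reg_write and ex_mem.rw == src:
        return 1             # the most recent result takes priority
    if src != 0 and mem_wb.reg_write and mem_wb.rw == src:
        return 2
    return 0

def forwarding_unit(ra, rb, ex_mem, mem_wb):
    """Return (ForwardA, ForwardB) for the instruction whose Ra and Rb are in ID/EX."""
    return forward_select(ra, ex_mem, mem_wb), forward_select(rb, ex_mem, mem_wb)

# An instruction reading $2 and $5 is in EX while sub (writing $2) is in MEM.
print(forwarding_unit(2, 5, StageReg(rw=2, reg_write=True), StageReg()))
```

Checking EX/MEM before MEM/WB ensures that when two earlier instructions write the same register, the newer value is the one forwarded.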
[Figure: pipelined datapath (IF/ID, ID/EX, EX/MEM, MEM/WB) with a Forwarding Unit. The unit takes Ra, Rb, and the Rw fields as inputs; the forwarding multiplexers choose the ALU operands A and B from the Registers, an ALU result, or Load data.]
[Figure: pipeline timing diagram showing sub $2, $1, $3 (IM, Reg, ALU, DM, Reg) followed by a bubble.]
[Figure series: five snapshots of the pipelined datapath (PC, +4 adder, Branch_Target, PCSrc, Instruction Memory, Registers, Extend, ALU, Main Control) while a beq instruction executes. Labels that survive in each snapshot:]
Snapshot 1: PC = 1000
Snapshot 2: PC = 1004; Op = beq, Rs = $1, Rt = $3, Imm16 = 100
Snapshot 3: PC = 1008; register values 1234 and 1234; Imm16 = 100; Beq = 1
Snapshot 4: PC = 1012; Zero = 1; Beq = 1; the value 1404 appears next to PCSrc
Snapshot 5: PC = 1404
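The numbers in the snapshots (Imm16 = 100, the value 1404, and the final PC = 1404) are consistent with a beq fetched at address 1000, assuming the usual MIPS rule that the target is PC + 4 plus the sign-extended 16-bit immediate shifted left by 2. The sketch below shows that arithmetic; the function name `branch_target` is hypothetical, and the address of the beq is inferred, not stated on the slides.

```python
def branch_target(pc, imm16):
    """Target of a PC-relative branch: PC + 4 + (sign-extended imm16 << 2)."""
    imm16 &= 0xFFFF                 # keep the low 16 bits of the field
    if imm16 & 0x8000:              # sign-extend a negative offset
        imm16 -= 0x10000
    return pc + 4 + (imm16 << 2)

# Assumption: the beq is located at address 1000.
print(branch_target(1000, 100))
```

A negative immediate works the same way: an encoded offset of 0xFFFF (that is, -1) from address 1000 yields 1000, a branch back to the beq itself.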
[Figure: pipelined datapath with a Hazard detection unit and a Forwarding unit, showing the pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB, the Control unit, Instruction memory, Registers with a comparator (=), Sign extend, Shift left 2, ALU, and Data memory.]
[Figure: Delay Slot, From Target]
Solution
Check the PC to see if the instruction being fetched is a branch
Store the branch target address in a table in the IF stage
Such a table is called the branch target buffer
If branch is predicted taken then
Next PC = branch target fetched from target buffer
Otherwise, if branch is predicted not taken then
Next PC = PC + 4
Zero-delay is achieved because Next PC is determined in IF stage
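The next-PC selection described above can be sketched with a dictionary standing in for the branch target buffer. This is only an illustration: the class name `BranchTargetBuffer`, its methods, and the idea of passing the prediction in as a boolean are assumptions, and the example addresses 1000 and 1404 are borrowed from the earlier beq snapshots.

```python
class BranchTargetBuffer:
    """Maps the PC of a branch to its target address (a dict models the table)."""

    def __init__(self):
        self.targets = {}

    def record(self, branch_pc, target):
        """Remember the target of a branch, e.g. after it has been resolved."""
        self.targets[branch_pc] = target

    def next_pc(self, pc, predict_taken):
        """Choose the next fetch address during IF."""
        target = self.targets.get(pc)       # a hit means the fetched instruction is a branch
        if target is not None and predict_taken:
            return target                   # predicted taken: fetch from the stored target
        return pc + 4                       # not a known branch, or predicted not taken

btb = BranchTargetBuffer()
btb.record(1000, 1404)
print(btb.next_pc(1000, predict_taken=True))
print(btb.next_pc(1000, predict_taken=False))
```

Because the lookup uses only the PC, the choice can be made in the same cycle as the fetch, before the instruction has been decoded.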
Branch Target and Prediction Buffer
The branch target buffer is implemented as a small cache that stores the branch target addresses of taken branches
We also have a branch prediction buffer to store the prediction bits for branch instructions
The prediction bits are determined dynamically by the hardware
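The slides do not say how the prediction bits are updated. One common scheme, used here purely as an assumed example, is a 2-bit saturating counter per entry. The table size, the indexing by word address, and the initial counter value are also assumptions, as is the class name `PredictionBuffer`.

```python
class PredictionBuffer:
    """Table of 2-bit saturating counters (assumed scheme): 0-1 predict not taken, 2-3 taken."""

    def __init__(self, entries=16):
        self.counters = [1] * entries        # assumption: start at "weakly not taken"

    def _index(self, pc):
        return (pc >> 2) % len(self.counters)  # drop the byte offset, then wrap to table size

    def predict(self, pc):
        """Return True if the branch at pc is predicted taken."""
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        """Move the counter toward the actual outcome, saturating at 0 and 3."""
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)

pb = PredictionBuffer()
for outcome in [True, True, False, True]:
    print(pb.predict(1000), outcome)
    pb.update(1000, outcome)
```

With two bits, a single not-taken outcome in a run of taken branches (such as a loop exit) does not immediately flip the prediction.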
[Figure: lookup of the Prediction Buffer using the PC, with a mux.]