Data Hazards
Three kinds
Bypassing/forwarding
Speculation
Data Dependences
A data dependence occurs whenever one instruction needs a value produced by another.
sw $t1, 0($t2)
ld $t3, 0($t2)
ld $t4, 16($s4)
[Figure: pipeline diagram; each instruction passes through Fetch, Decode, EX, Mem, Write back, overlapped with the next]
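One way to make the definition concrete is to scan a short program for register read-after-write dependences. The sketch below is only illustrative: the `Instr` tuple, the `raw_dependences` helper, and the three-instruction program are assumptions made for this example. It tracks dependences through registers only, not through memory (such as a store followed by a load from the same address).

```python
from typing import NamedTuple, Optional, Tuple

class Instr(NamedTuple):
    op: str
    dest: Optional[str]    # register written, or None (e.g. a store)
    srcs: Tuple[str, ...]  # registers read

def raw_dependences(program):
    """Return (producer_index, consumer_index, register) triples."""
    deps = []
    last_writer = {}  # register -> index of the most recent instruction writing it
    for i, ins in enumerate(program):
        for reg in ins.srcs:
            if reg in last_writer:
                deps.append((last_writer[reg], i, reg))
        if ins.dest is not None:
            last_writer[ins.dest] = i
    return deps

# Hypothetical program: sub needs $s0 from add, sw needs $t2 from sub.
program = [
    Instr("add", "$s0", ("$t0", "$t1")),
    Instr("sub", "$t2", ("$s0", "$t3")),
    Instr("sw", None, ("$t2", "$s4")),
]
print(raw_dependences(program))
```

Each triple names the producing instruction, the consuming instruction, and the register that carries the value between them.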
Ideas?
What is N?
Can it change?
What can the compiler do?
Rearrange instructions
[Figure: pipeline diagram for add $s0, $t0, $t1 (Fetch, Decode, EX, Mem, Write back)]
Solution 2: Stall
When you need a value that is not ready, stall
Suspend the execution of the executing instruction
[Figure: pipeline diagram; the second instruction stalls before Decode until the value it needs is ready, then continues through EX, Mem, Write back]
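A rough way to see how many cycles stalling costs is to count them for a small program. The sketch below assumes a five-stage pipeline without forwarding, in which a consumer may be fetched no earlier than `min_distance` cycles after its producer. The default of 3 corresponds to a register file that is written in Write back and read in Decode during the same cycle. The function name, the `(dest, srcs)` program format, and that default are assumptions for illustration.

```python
def count_stalls(program, min_distance=3):
    """program: list of (dest, srcs) pairs; returns the total stall cycles."""
    ready = {}        # register -> earliest fetch cycle for an instruction reading it
    prev_fetch = None # fetch cycle of the previous instruction
    stalls = 0
    for dest, srcs in program:
        earliest = 0 if prev_fetch is None else prev_fetch + 1
        # Wait until every source register has been written back.
        start = max([earliest] + [ready.get(reg, 0) for reg in srcs])
        stalls += start - earliest
        prev_fetch = start
        if dest is not None:
            ready[dest] = start + min_distance
    return stalls

# Dependent instructions back to back: the consumer waits two cycles.
back_to_back = [("$s0", ("$t0", "$t1")), ("$t2", ("$s0", "$t3"))]
# One independent instruction in between hides one of those cycles.
separated = [("$s0", ("$t0", "$t1")), ("$t5", ("$t6",)), ("$t2", ("$s0", "$t3"))]
print(count_stalls(back_to_back), count_stalls(separated))
```

The second program shows why rearranging instructions helps: every independent instruction placed between producer and consumer removes one stall cycle.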
The compiler can still act like there are delay slots to avoid stalls.
Implementation details are not exposed in the ISA
ET = I * CPI * CT
I and CT are constant
What is the impact of stalling on CPI?
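As a worked illustration of the question: with I and CT fixed, every stall cycle shows up directly in CPI and therefore in ET. The numbers below (1000 instructions, 200 stall cycles, a 1 ns cycle time) and the function names are made up for the example.

```python
def cpi_with_stalls(base_cpi, stall_cycles, instructions):
    """Average CPI once stall cycles are spread over all instructions."""
    return base_cpi + stall_cycles / instructions

def execution_time(instructions, cpi, cycle_time):
    """ET = I * CPI * CT."""
    return instructions * cpi * cycle_time

# Assumed workload: ideal CPI of 1, 200 stall cycles over 1000 instructions.
cpi = cpi_with_stalls(base_cpi=1.0, stall_cycles=200, instructions=1000)
et = execution_time(1000, cpi, cycle_time=1e-9)
print(cpi, et)
```

Under these assumptions CPI rises from 1.0 to about 1.2, so ET grows by about 20%.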
Solution 3: Bypassing/Forwarding
Data values are computed in Ex and Mem but publicized in write back
[Figure: pipeline diagram (Fetch, Decode, EX, Mem, Write back) marking where results are known and where inputs are needed]
Bypassing or Forwarding
Take the values, wherever they are
[Figure: pipeline diagram over cycles; add $s0, $t0, $t1 and the following instruction, each passing through Fetch, Decode, EX, Mem, Write back]
Forwarding Paths
[Figure: forwarding paths over cycles; add $s0, $t0, $t1 followed by three more instructions, each passing through Fetch, Decode, EX, Mem, Write back]
Forwarding in Hardware
[Figure: pipelined datapath with forwarding: PC, Instruction Memory, Register File, Sign Extend, Shift left 2, adders, ALU, Data Memory, and the IFetch/Dec, Dec/Exec, Exec/Mem, Mem/WB pipeline registers]
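The selection logic that such forwarding hardware performs for each ALU operand can be sketched as follows. This follows the textbook five-stage scheme rather than anything specific to the datapath pictured. Registers are plain integers with 0 standing for a hard-wired zero register, and the function name and return labels are assumptions made for illustration.

```python
def forward_select(src_reg, exmem_regwrite, exmem_rd, memwb_regwrite, memwb_rd):
    """Pick where one ALU operand should come from.

    src_reg: register number the instruction in EX wants to read.
    exmem_*: write-enable and destination held in the Exec/Mem register.
    memwb_*: write-enable and destination held in the Mem/WB register.
    """
    # The youngest producer wins: check Exec/Mem before Mem/WB.
    if exmem_regwrite and exmem_rd != 0 and exmem_rd == src_reg:
        return "EXEC_MEM"
    if memwb_regwrite and memwb_rd != 0 and memwb_rd == src_reg:
        return "MEM_WB"
    # No in-flight producer: use the value read from the register file.
    return "REGFILE"

# Register 16 was written by the instruction just ahead in the pipeline.
print(forward_select(16, True, 16, False, 0))
```

Checking Exec/Mem first matters when two in-flight instructions write the same register, because the more recent value must be the one used.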
[Figure: pipeline diagram for an instruction with operands $s0, (0)$t0 and the instruction that follows it (Fetch, Decode, EX, Mem, Write back)]
Will work.
Same dangers apply as before.
Always stall.
Forward when possible, stall otherwise
Here the compiler still has leverage
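The "forward when possible, stall otherwise" policy can be sketched as a small decision function. It assumes the classic five-stage arrangement in which a loaded value only exists at the end of Mem, so an instruction that immediately follows a load and uses its result needs one stall cycle before the value can be forwarded. The function name, the `"ld"` opcode check, and the return labels are assumptions for illustration.

```python
def resolve_hazard(prev_op, prev_dest, cur_srcs):
    """Decide how the current instruction gets a value from the previous one."""
    if prev_dest is None or prev_dest not in cur_srcs:
        return "none"               # no dependence on the previous instruction
    if prev_op == "ld":
        return "stall then forward" # value not ready until the load leaves Mem
    return "forward"                # ALU result can be bypassed directly

print(resolve_hazard("add", "$s0", ("$s0", "$t1")))
print(resolve_hazard("ld", "$s0", ("$s0", "$t1")))
print(resolve_hazard("ld", "$s0", ("$t2", "$t1")))
```

The compiler's leverage is the third case: if it can place an independent instruction right after the load, the hardware never has to stall.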
Guess!
Leads to speculation
Flushing the pipeline
Strategies for making better guesses
Control Hazards
[Figure: pipeline diagram (Fetch, Decode, EX, Mem, Write back)]
Computing the PC
Non-branch instruction
PC = PC + 4
When is PC ready?
[Figure: pipeline diagram (Fetch, Decode, EX, Mem, Write back)]
Computing the PC
Branch instructions
bne $s1, $s2, offset
[Figure: pipeline diagram (Decode, EX, Mem, Write back)]
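The two PC cases can be put side by side in a short sketch. It assumes MIPS-style branch encoding, where the offset counts instructions (words) relative to PC + 4 and is therefore shifted left by 2. The function names and example addresses are hypothetical.

```python
def bne_taken(s1, s2):
    """bne branches when the two register values differ."""
    return s1 != s2

def next_pc(pc, is_branch=False, taken=False, offset=0):
    """Return the address of the next instruction to fetch."""
    fall_through = pc + 4
    if is_branch and taken:
        # Offset is in words relative to the fall-through address (assumed).
        return fall_through + (offset << 2)
    return fall_through

pc = 0x100
print(hex(next_pc(pc)))                                   # non-branch
print(hex(next_pc(pc, True, bne_taken(1, 2), offset=4)))  # taken branch
print(hex(next_pc(pc, True, bne_taken(3, 3), offset=4)))  # not-taken branch
```

For a non-branch the next PC is known as soon as the current PC is. For a branch it depends on the register comparison, which is why the answer to "When is PC ready?" is later in the pipeline.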
Predict Not-taken
[Figure: pipeline diagram over cycles for the not-taken and taken cases; when the branch is taken, the instructions fetched after it are squashed]
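The cost of predicting not-taken can be estimated by counting how often the guess is wrong. In the sketch below, the number of instruction slots squashed per wrong guess is a parameter because it depends on the stage in which the branch is resolved. The function name and the example trace are assumptions for illustration.

```python
def predict_not_taken_cost(outcomes, squash_per_miss):
    """outcomes: iterable of booleans, True meaning the branch was taken.

    Returns (mispredictions, squashed instruction slots).
    """
    # Predict not-taken is wrong exactly when the branch is taken.
    misses = sum(1 for taken in outcomes if taken)
    return misses, misses * squash_per_miss

# Hypothetical trace of four branches, two of them taken.
trace = [False, True, True, False]
print(predict_not_taken_cost(trace, squash_per_miss=2))
```

Not-taken branches cost nothing under this scheme, since the instructions already fetched are the right ones.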
Changes in control
New inputs to the control unit
The sign of the offset
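One common way to use the sign of the offset is the static "backward taken, forward not taken" heuristic: a negative offset usually means a loop branch, which is taken most of the time. Whether the slide intends exactly this policy is an assumption, and the function names and example trace below are hypothetical.

```python
def predict_taken_from_offset(offset):
    """Predict taken for backward branches (negative offset), not taken otherwise."""
    return offset < 0

def accuracy(branches):
    """branches: list of (offset, actually_taken) pairs."""
    correct = sum(
        1 for offset, taken in branches
        if predict_taken_from_offset(offset) == taken
    )
    return correct / len(branches)

# A loop branch that runs three times (taken, taken, then falls out),
# plus one forward branch that is not taken.
trace = [(-4, True), (-4, True), (-4, False), (8, False)]
print(accuracy(trace))
```

The only wrong guess in this trace is the final loop exit, which is the typical failure mode of the heuristic.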
Pentium 4 pipeline
1. Branches take 19 cycles to resolve
2. Identifying a branch takes 4 cycles.
3. Stalling is not an option.
4. Not quite as bad now, but BP is still very important.
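A back-of-the-envelope calculation shows why branch prediction matters so much at this pipeline depth. The 19-cycle figure comes from the list above and is treated here as the misprediction penalty. The branch fraction, the misprediction rates, the base CPI, and the function name are made-up values for illustration.

```python
def cpi_with_mispredictions(base_cpi, branch_fraction, mispredict_rate, penalty_cycles):
    """Average CPI when each mispredicted branch costs penalty_cycles."""
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles

# Assumed workload: 20% of instructions are branches.
poor = cpi_with_mispredictions(1.0, 0.2, 0.10, 19)  # 10% of branches mispredicted
good = cpi_with_mispredictions(1.0, 0.2, 0.02, 19)  # 2% of branches mispredicted
print(poor, good)
```

Under these assumptions, cutting the misprediction rate from 10% to 2% brings CPI from about 1.38 down to about 1.08.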