Branch Handling
Outline
What are branches?
Reducing branch penalties
Branch prediction
Why is branch prediction necessary?
Branch prediction basics
Issues which affect accurate branch prediction
Examples of real predictors
Branches
Instructions which can alter the flow of instruction execution in a program
Types of Branches
Conditional Direct
if-then-else, for loops (beqz, bnez, etc.)
Unconditional
procedure calls (jal)
goto (j)
Indirect
Predication
Branch Prediction
Predicting the outcome of a branch
Direction:
Taken / Not Taken
Direction predictors
Target Address
PC+offset (Taken) / PC+4 (Not Taken)
Target address predictors
Branch Target Address Cache (BTAC) or Branch Target Buffer (BTB)
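A target address predictor can be sketched as a small table indexed by PC bits. The sketch below is a minimal illustration, not a description of any real BTB: the class name `BranchTargetBuffer`, the helper `next_fetch_pc`, the direct-mapped organization, the full-PC tag, and the 4-byte instruction size are all assumptions made for the example.

```python
class BranchTargetBuffer:
    """Direct-mapped BTB sketch: maps a branch PC to its last taken target."""

    def __init__(self, entries=512):
        self.entries = entries
        self.tags = [None] * entries   # full PC kept as the tag (assumption)
        self.targets = [0] * entries

    def _index(self, pc):
        # Assumes 4-byte instructions, so the two low PC bits carry no information.
        return (pc >> 2) % self.entries

    def predict(self, pc):
        """Return the predicted target, or None on a BTB miss."""
        i = self._index(pc)
        return self.targets[i] if self.tags[i] == pc else None

    def update(self, pc, target):
        """Record the target of a branch that was taken."""
        i = self._index(pc)
        self.tags[i] = pc
        self.targets[i] = target


def next_fetch_pc(btb, pc, predicted_taken):
    """Fetch the BTB target if predicted taken and the BTB hits, else PC+4."""
    target = btb.predict(pc)
    if predicted_taken and target is not None:
        return target
    return pc + 4
```

On a BTB miss the sketch falls back to PC+4, since no target is available at fetch time.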
Dynamic
Prediction decisions may change during the execution of the program
Bimodal Prediction
Table of 2-bit saturating counters
[Figure: the PC indexes the PHT, a table of 2-bit saturating counters, which yields the T/NT prediction. Counter states 11 and 10 predict Taken, 01 and 00 predict Not Taken; a Taken outcome moves the counter toward 11 and a Not Taken outcome moves it toward 00.]
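A table of 2-bit saturating counters can be sketched in a few lines. This is a minimal illustration under stated assumptions: the class name `BimodalPredictor`, the table size, the initial counter value of 01, and indexing by PC bits above the two low bits (4-byte instructions) are choices made for the example.

```python
class BimodalPredictor:
    """Bimodal predictor sketch: one 2-bit saturating counter per table entry."""

    def __init__(self, entries=1024):
        self.entries = entries
        # Counters start at 01 (weakly not taken); the initial value is an assumption.
        self.counters = [1] * entries

    def _index(self, pc):
        # Assumes 4-byte instructions: drop the two low PC bits.
        return (pc >> 2) % self.entries

    def predict(self, pc):
        """Predict taken when the counter is 10 or 11."""
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        """Increment on taken, decrement on not taken, saturating at 3 and 0."""
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
```

The second bit gives hysteresis: a counter at 11 still predicts taken after a single not-taken outcome, so a loop branch mispredicts once at loop exit rather than twice.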
Correlation
B1: if (x) ...
B2: if (y) ...
z = x && y
B3: if (z) ...
B3 can be predicted with 100% accuracy based on the outcomes of B1 and B2
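The B1/B2/B3 example can be simulated with a tiny table keyed by the outcomes of B1 and B2. This is only a sketch: the function name `run_correlated`, the rule that "taken" means the condition is true, and the default prediction of not taken for an unseen history are assumptions made for the illustration.

```python
import itertools


def run_correlated(inputs):
    """Predict B3 from the outcomes of B1 and B2; return B3 mispredictions."""
    table = {}          # (b1_taken, b2_taken) -> last B3 outcome seen with that history
    mispredicts = 0
    for x, y in inputs:
        b1_taken = bool(x)          # B1: if (x)
        b2_taken = bool(y)          # B2: if (y)
        z = x and y
        b3_taken = bool(z)          # B3: if (z)
        history = (b1_taken, b2_taken)
        prediction = table.get(history, False)   # unseen history: predict not taken
        if prediction != b3_taken:
            mispredicts += 1
        table[history] = b3_taken
    return mispredicts


# All four (x, y) combinations, repeated 25 times.
inputs = list(itertools.product([0, 1], repeat=2)) * 25
b3_mispredicts = run_correlated(inputs)
```

Once each (B1, B2) history has been seen, B3 is never mispredicted again, because the history fully determines z.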
Two-Level Prediction
Uses two levels of information to make a direction prediction
Branch History Table (BHT)
Pattern History Table (PHT)
PHT type
A (adaptive), S (static)
PHT organization
g (global), p (per branch), s (per set)
[Figure: GAs Predictor and PAs Predictor. Labels: PC, BHT (rows of T/NT history), PHT, T/NT output.]
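The two levels can be sketched as a history table feeding a table of 2-bit counters. This is a simplified illustration rather than a definitive GAs or PAs design: the class name `TwoLevelPredictor`, the `per_branch` flag (True for per-branch history as in PAs, False for one global history register as in GAs), the table sizes, and the indexing by PC bits are assumptions.

```python
class TwoLevelPredictor:
    """Two-level sketch: history (level 1) plus PC bits select a 2-bit counter (level 2)."""

    def __init__(self, history_bits=4, pht_sets=4, bht_entries=16, per_branch=True):
        self.history_bits = history_bits
        self.pht_sets = pht_sets
        # A single shared history register models global history.
        self.bht_entries = bht_entries if per_branch else 1
        self.bht = [0] * self.bht_entries
        # One PHT per set; each PHT has one counter per possible history pattern.
        self.pht = [[1] * (1 << history_bits) for _ in range(pht_sets)]

    def _slots(self, pc):
        word = pc >> 2                      # assumes 4-byte instructions
        return word % self.bht_entries, word % self.pht_sets

    def predict(self, pc):
        bht_i, set_i = self._slots(pc)
        return self.pht[set_i][self.bht[bht_i]] >= 2

    def update(self, pc, taken):
        bht_i, set_i = self._slots(pc)
        hist = self.bht[bht_i]
        c = self.pht[set_i][hist]
        self.pht[set_i][hist] = min(3, c + 1) if taken else max(0, c - 1)
        # Shift the new outcome into the history register.
        mask = (1 << self.history_bits) - 1
        self.bht[bht_i] = ((hist << 1) | int(taken)) & mask
```

With 4 bits of history, a branch that repeats the pattern T, T, NT is predicted correctly after a short warm-up, because each 4-outcome history is always followed by the same outcome.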
Hybrid Prediction
Two or more predictor components combined
Different branches benefit from different types of history
[Figure: the PC feeds the component predictors (Bimodal, ..., PAs); each produces T/NT and a Selector chooses the final T/NT.]
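A selector can be sketched as another table of 2-bit counters that tracks which component has been more accurate for each branch. The names `HybridPredictor` and `StaticPredictor`, the two-component limit, and the selector update rule (move only when exactly one component was correct) are assumptions for this illustration; `StaticPredictor` is just a stand-in for real components.

```python
class StaticPredictor:
    """Trivial stand-in component that always predicts the same direction."""

    def __init__(self, direction):
        self.direction = direction

    def predict(self, pc):
        return self.direction

    def update(self, pc, taken):
        pass


class HybridPredictor:
    """Hybrid sketch: a per-branch 2-bit selector chooses between two components."""

    def __init__(self, first, second, entries=1024):
        self.first = first
        self.second = second
        self.entries = entries
        self.selector = [1] * entries   # 0-1: use first, 2-3: use second

    def _index(self, pc):
        return (pc >> 2) % self.entries  # assumes 4-byte instructions

    def predict(self, pc):
        use_second = self.selector[self._index(pc)] >= 2
        chosen = self.second if use_second else self.first
        return chosen.predict(pc)

    def update(self, pc, taken):
        i = self._index(pc)
        first_ok = self.first.predict(pc) == taken
        second_ok = self.second.predict(pc) == taken
        # Train the selector only when the components disagree on correctness.
        if second_ok and not first_ok:
            self.selector[i] = min(3, self.selector[i] + 1)
        elif first_ok and not second_ok:
            self.selector[i] = max(0, self.selector[i] - 1)
        self.first.update(pc, taken)
        self.second.update(pc, taken)
```

Any two objects with `predict(pc)` and `update(pc, taken)` methods could be plugged in as components.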
Special Branches
Procedure calls and returns
Calls are always taken
Return address almost always known
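One common way to exploit the known return address is a return address stack: push on a call, pop on a return. The sketch below assumes 4-byte instructions with no delay slot and drops the oldest entry on overflow; the class name `ReturnAddressStack` and its method names are hypothetical.

```python
class ReturnAddressStack:
    """RAS sketch: calls push their return address, returns pop the prediction."""

    def __init__(self, depth=8):
        self.depth = depth
        self.stack = []

    def on_call(self, call_pc):
        # Return address is the instruction after the call (assumes 4-byte
        # instructions and no delay slot).
        if len(self.stack) == self.depth:
            self.stack.pop(0)           # overflow: discard the oldest entry
        self.stack.append(call_pc + 4)

    def predict_return(self):
        """Return the predicted return target, or None if the stack is empty."""
        return self.stack.pop() if self.stack else None
```

Call chains deeper than the stack lose their oldest return addresses, which is one reason the return address is "almost always" rather than always known.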
Destructive
Prediction that would have been correct, predicted incorrectly
Neutral
No change in the accuracy
More Issues
Training time
Need to see enough branches to uncover pattern
Need enough time to reach steady state
Wrong history
Incorrect type of history for the branch
Stale state
Predictor is updated after information is needed
UltraSPARC-III
14-stage pipeline, branch predictor accessed in instruction fetch stages 2-3
16K-entry 2-bit counter Gshare predictor
Bimodal predictor which XORs PC bits with global history register (except 3 lower order bits) to reduce aliasing
Miss queue
Halves mispredict penalty by providing instructions for immediate use
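The Gshare indexing idea, XORing PC bits with a global history register to pick a 2-bit counter, can be sketched as follows. This is a generic sketch and not the UltraSPARC-III implementation: it does not model the exclusion of the 3 lower order bits, and the class name `GsharePredictor`, the history length equal to the index width, and the initial counter value are assumptions.

```python
class GsharePredictor:
    """Gshare sketch: index = (PC bits XOR global history) into 2-bit counters."""

    def __init__(self, index_bits=14):          # 2**14 = 16K entries
        self.index_bits = index_bits
        self.mask = (1 << index_bits) - 1
        self.history = 0                        # global history register
        self.counters = [1] * (1 << index_bits)

    def _index(self, pc):
        # Assumes 4-byte instructions: drop the two low PC bits before the XOR.
        return ((pc >> 2) ^ self.history) & self.mask

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        c = self.counters[i]
        self.counters[i] = min(3, c + 1) if taken else max(0, c - 1)
        # Shift the outcome into the global history.
        self.history = ((self.history << 1) | int(taken)) & self.mask
```

The XOR spreads branches that share PC index bits across different counters when their histories differ, which is how it reduces aliasing.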
Pentium III
Dynamic branch prediction
512-entry BTB predicts direction and target, 4-bit history used with PC to derive direction
Static branch predictor for BTB misses
Return Address Stack (RAS), 4/8 entries
Branch Penalties:
Not Taken: no penalty
Correctly predicted taken: 1 cycle
Mispredicted: at least 9 cycles, as many as 26, average 10-15 cycles
AMD Athlon K7
10-stage integer, 15-stage floating-point pipeline, predictor accessed in fetch
2K-entry bimodal, 2K-entry BTAC
12-entry RAS
Branch Penalties:
Correct Predict Taken: 1 cycle
Mispredict penalty: at least 10 cycles