Branch Handling

This document provides an introduction to branch prediction, which is a technique used in modern processors to reduce branch penalties and increase instruction level parallelism (ILP). It discusses the types of branches that exist in programs, as well as early techniques like stalling and branch delay slots that were used to handle branches. However, these techniques are not sufficient for today's deeper pipelines. The document then introduces branch prediction, describing basic direction prediction strategies and how more advanced two-level predictors work. It also discusses issues that can impact branch prediction accuracy and provides examples of branch predictors used in real processors like the Alpha 21264 and Pentium III.

Uploaded by

Prateek Sancheti
Copyright
© Attribution Non-Commercial (BY-NC)

Intro to Branch Prediction

Michele Co
September 11, 2001
Department of Computer Science
University of Virginia



Outline
What are branches?
Reducing branch penalties
Branch prediction
Why is branch prediction necessary?
Branch prediction basics
Issues which affect accurate branch prediction
Examples of real predictors

Branches
Instructions which can alter the flow of instruction execution in a program

Types of Branches
Direct
  Conditional
    if-then-else
    for loops (beqz, bnez, etc.)
  Unconditional
    procedure calls (jal)
    goto (j)
Indirect
  return (jr)
  virtual function lookup
  function pointers (jalr)

Techniques for handling branches

[Figure: five-stage pipeline, IF ID EX MEM WB]

Stalling
Branch delay slots
  Relies on programmer/compiler to fill
  Depends on being able to find suitable instructions
  Ties resolution delay to a particular pipeline
Predication
  if-conversion: converts a control dependence into a data dependence on the branch condition

Why aren't these techniques acceptable?

Branches are frequent: 15-25%
Today's pipelines are deeper and wider
  Higher performance penalty for stalling
  Misprediction penalty = issue width * resolution delay cycles

A lot of cycles can be wasted!
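
The penalty formula above can be worked through with a couple of numbers. This is a minimal sketch; the function name and the two machine configurations are hypothetical and chosen only for illustration.

```python
def wasted_issue_slots(issue_width, resolution_delay_cycles):
    """Issue slots lost on one misprediction: issue width * resolution delay."""
    return issue_width * resolution_delay_cycles


# Hypothetical machines, not taken from the slides.
scalar_short = wasted_issue_slots(issue_width=1, resolution_delay_cycles=2)
wide_deep = wasted_issue_slots(issue_width=4, resolution_delay_cycles=7)
print(scalar_short, wide_deep)  # 2 28
```

The wider, deeper machine loses fourteen times as many issue slots per misprediction, which is why stalling scales poorly.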

Branch Prediction
Predicting the outcome of a branch
Direction:
  Taken / Not Taken
  Direction predictors

Target address:
  PC+offset (Taken) / PC+4 (Not Taken)
  Target address predictors
    Branch Target Address Cache (BTAC) or Branch Target Buffer (BTB)

Why do we need branch prediction?


Branch prediction
  Increases the number of instructions available for the scheduler to issue
  Increases instruction-level parallelism (ILP)
  Allows useful work to be completed while waiting for the branch to resolve

Branch Prediction Strategies


Static
  Decided before runtime
  Examples:
    Always-Not Taken
    Always-Taken
    Backwards Taken, Forward Not Taken (BTFNT)
    Profile-driven prediction

Dynamic
  Prediction decisions may change during the execution of the program
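
As an illustration of a static strategy, BTFNT can be sketched as a single comparison of the branch address with its target. The function name and the example addresses are hypothetical.

```python
def predict_btfnt(branch_pc, target_pc):
    """Return True (predict taken) for backward branches, False for forward ones."""
    return target_pc < branch_pc


# Hypothetical addresses: a loop-closing branch jumps backward,
# an if-then-else skip jumps forward.
loop_prediction = predict_btfnt(branch_pc=0x1020, target_pc=0x1000)
skip_prediction = predict_btfnt(branch_pc=0x1020, target_pc=0x1040)
print(loop_prediction, skip_prediction)  # True False
```

The heuristic needs no runtime state: backward branches usually close loops, so they are guessed taken.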

What happens when a branch is predicted?


On mispredict:
  No speculative state may commit
    Squash instructions in the pipeline
    Must not allow stores in the pipeline to occur
      Cannot allow stores which would not have happened to commit
  Need to handle exceptions appropriately

Bimodal Prediction
Table of 2-bit saturating counters
Predict the most common direction

[Figure: the PC indexes a Pattern History Table (PHT) of 2-bit counters, which outputs T/NT. State diagram: states 11 and 10 predict Taken (T); states 01 and 00 predict Not Taken (NT). A Taken outcome moves the counter toward 11; a Not Taken outcome moves it toward 00.]

Advantages: simple, cheap, good accuracy
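
The table of 2-bit saturating counters can be sketched as follows. This is an illustrative model, not any particular processor's design: the table size, the initial counter value, the PC-to-index mapping, and the example address are all assumptions.

```python
class BimodalPredictor:
    """PHT of 2-bit saturating counters indexed by low PC bits."""

    def __init__(self, entries=1024):
        self.entries = entries
        # Start every counter at 01 (weakly Not Taken); an assumption.
        self.counters = [1] * entries

    def _index(self, pc):
        # Assumes 4-byte instructions, so the two lowest PC bits are dropped.
        return (pc >> 2) % self.entries

    def predict(self, pc):
        # States 10 and 11 predict Taken; 00 and 01 predict Not Taken.
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


# A loop branch taken nine times and then not taken, repeated three times.
predictor = BimodalPredictor()
loop_branch_pc = 0x4000  # hypothetical address
outcomes = ([True] * 9 + [False]) * 3
mispredicts = 0
for taken in outcomes:
    if predictor.predict(loop_branch_pc) != taken:
        mispredicts += 1
    predictor.update(loop_branch_pc, taken)
print(f"{mispredicts} mispredictions out of {len(outcomes)}")
```

After the first warm-up miss, the counter only mispredicts the loop exit: the single Not Taken outcome moves it from 11 to 10, so the next loop entry is still predicted Taken.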


Correlation
B1: if (x) ...
B2: if (y) ...
    z = x && y
B3: if (z) ...

B3 can be predicted with 100% accuracy based on the outcomes of B1 and B2
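
The claim about B3 can be checked by enumerating every input. In this sketch, "taken" is assumed to mean that the branch condition is true; the function and variable names are hypothetical.

```python
from itertools import product


def run_branches(x, y):
    """Return the outcomes of B1, B2, and B3, treating 'condition true' as taken."""
    b1 = bool(x)
    b2 = bool(y)
    z = x and y
    b3 = bool(z)
    return b1, b2, b3


# Map each (B1, B2) history to the set of B3 outcomes observed after it.
seen = {}
for x, y in product([False, True], repeat=2):
    b1, b2, b3 = run_branches(x, y)
    seen.setdefault((b1, b2), set()).add(b3)

# Every history maps to exactly one B3 outcome, so B3 is fully predictable.
fully_predictable = all(len(outcomes) == 1 for outcomes in seen.values())
print(fully_predictable)  # True
```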


Two-Level Prediction
Uses two levels of information to make a direction prediction
  Branch History Table (BHT)
  Pattern History Table (PHT)

Captures patterned behavior of branches
  Groups of branches are correlated
  Particular branches have particular behavior
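
A minimal two-level sketch: the first level records recent outcomes in a shift register, and the second level is a PHT of 2-bit counters indexed by that history. It uses one global history register and one PHT, which is the simplest arrangement and an assumption made for brevity; the class name, history length, and initial counter values are hypothetical.

```python
class GlobalTwoLevelPredictor:
    """First level: global history register. Second level: PHT of 2-bit counters."""

    def __init__(self, history_bits=4):
        self.history_bits = history_bits
        self.history = 0  # most recent outcome is stored in bit 0
        self.pht = [1] * (1 << history_bits)  # start weakly Not Taken (assumption)

    def predict(self):
        return self.pht[self.history] >= 2

    def update(self, taken):
        # Train the counter selected by the current history.
        counter = self.pht[self.history]
        self.pht[self.history] = min(3, counter + 1) if taken else max(0, counter - 1)
        # Shift the new outcome into the history register.
        mask = (1 << self.history_bits) - 1
        self.history = ((self.history << 1) | int(taken)) & mask


# An alternating branch (T, NT, T, NT, ...) is a pattern a single
# 2-bit counter cannot track, but a history-indexed table can.
predictor = GlobalTwoLevelPredictor(history_bits=2)
outcomes = [True, False] * 50
late_mispredicts = 0
for step, taken in enumerate(outcomes):
    if step >= 80 and predictor.predict() != taken:
        late_mispredicts += 1
    predictor.update(taken)
print(late_mispredicts)  # 0
```

Once each history value has been seen a few times, the counter behind it settles on the outcome that always follows that history.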

Two-level Predictor Classification


Yeh and Patt 3-letter naming scheme
  Type of history collected
    G (global), P (per branch), S (per set)
    M (merge?), added by Skadron, Martonosi, Clark
  PHT type
    A (adaptive), S (static)
  PHT organization
    g (global), p (per branch), s (per set)

Some Two-level Predictors


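[Figure, two panels. GAs Predictor: the PC and a Global Branch History Register (GBHR) together select a PHT entry, which outputs T/NT. PAs Predictor: the PC selects a per-branch history in the BHT; that history, together with the PC, selects a PHT entry, which outputs T/NT.]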

Hybrid Prediction
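Two or more predictor components combined
Different branches benefit from different types of history

[Figure: the PC feeds several predictor components (Bimodal, PAs, ...), each producing T/NT; a Selector chooses which component's T/NT becomes the final prediction.]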
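
One common way to build the selector is a table of 2-bit chooser counters, trained only when the two components disagree. The slides do not specify the selection mechanism, so this is an assumed design; the class name, table size, and example address are hypothetical.

```python
class Selector:
    """Per-branch 2-bit chooser: values 0-1 favor component A, 2-3 favor component B."""

    def __init__(self, entries=256):
        self.entries = entries
        self.choice = [1] * entries  # start weakly favoring A (assumption)

    def _index(self, pc):
        return (pc >> 2) % self.entries  # assumes 4-byte instructions

    def select(self, pc, pred_a, pred_b):
        return pred_b if self.choice[self._index(pc)] >= 2 else pred_a

    def update(self, pc, pred_a, pred_b, taken):
        # When both components agree, there is nothing to learn.
        if pred_a == pred_b:
            return
        i = self._index(pc)
        if pred_b == taken:
            self.choice[i] = min(3, self.choice[i] + 1)
        else:
            self.choice[i] = max(0, self.choice[i] - 1)


# Component B is right and A is wrong for this (hypothetical) branch.
selector = Selector()
pc = 0x400
before = selector.select(pc, pred_a=False, pred_b=True)
for _ in range(3):
    selector.update(pc, pred_a=False, pred_b=True, taken=True)
after = selector.select(pc, pred_a=False, pred_b=True)
print(before, after)  # False True
```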


Special Branches
Procedure calls and returns
  Calls are always taken
  Return address almost always known

Return Address Stack (RAS)
  On a procedure call, push the address of the instruction after the call onto the stack
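
The push described above, together with the matching pop on a return, can be sketched as a small bounded stack. The depth, the 4-byte instruction size, and the overflow policy (discard the oldest entry) are assumptions for illustration.

```python
class ReturnAddressStack:
    """Bounded stack of predicted return addresses."""

    def __init__(self, depth=8):
        self.depth = depth
        self.stack = []

    def on_call(self, call_pc, instr_bytes=4):
        if len(self.stack) == self.depth:
            self.stack.pop(0)  # overflow: discard the oldest entry (assumption)
        # Push the address of the instruction after the call.
        self.stack.append(call_pc + instr_bytes)

    def predict_return(self):
        # Pop the most recent return address, or None if the stack is empty.
        return self.stack.pop() if self.stack else None


ras = ReturnAddressStack()
ras.on_call(0x1000)  # outer call (hypothetical address)
ras.on_call(0x2000)  # nested call
inner = ras.predict_return()
outer = ras.predict_return()
print(hex(inner), hex(outer))  # 0x2004 0x1004
```

Because calls and returns nest, the last address pushed is the first one needed, which is why a stack predicts returns well.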


Issues Affecting Accurate Branch Prediction


Aliasing
More than one branch may use the same BHT/PHT entry
Constructive
Prediction that would have been incorrect, predicted correctly

Destructive
Prediction that would have been correct, predicted incorrectly

Neutral
No change in the accuracy

More Issues
Training time
  Need to see enough branches to uncover pattern
  Need enough time to reach steady state

Wrong history
Incorrect type of history for the branch

Stale state
Predictor is updated after information is needed

Operating system context switches


More aliasing caused by branches in different programs

Real Branch Predictors


Alpha 21264
  8-stage pipeline, mispredict penalty 7 cycles
  64 KB, 2-way instruction cache with line and way prediction bits (Fetch)
    Each 4-instruction fetch block contains a prediction for the next fetch block
  Hybrid predictor (Fetch)
    12-bit GAg (4K-entry PHT, 2-bit counters)
    10-bit PAg (1K-entry BHT, 1K-entry PHT, 3-bit counters)

UltraSPARC-III
  14-stage pipeline, branch predictor accessed in instruction fetch stages 2-3
  16K-entry 2-bit counter Gshare predictor
    Bimodal predictor which XORs PC bits with global history register (except 3 lower order bits) to reduce aliasing
  Miss queue
    Halves mispredict penalty by providing instructions for immediate use
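
The Gshare indexing described above can be sketched as an XOR of PC bits with the global history. A 16K-entry table needs a 14-bit index. Reading "except 3 lower order bits" as "the three lowest index bits come from the PC alone" is an interpretation, not something the slides confirm; the function name, the parameter names, and the 4-byte instruction size are also assumptions.

```python
def gshare_index(pc, history, index_bits=14, unhashed_low_bits=3):
    """XOR PC bits with global history, leaving the lowest index bits unhashed."""
    mask = (1 << index_bits) - 1
    pc_bits = (pc >> 2) & mask  # assumes 4-byte instructions
    # Shift the history up so it does not touch the lowest index bits.
    history_bits = (history << unhashed_low_bits) & mask
    return pc_bits ^ history_bits


# The same (hypothetical) branch address maps to different counters
# under different global histories, which spreads branches across the table.
no_history = gshare_index(0x1000, history=0b0)
with_history = gshare_index(0x1000, history=0b1)
print(hex(no_history), hex(with_history))  # 0x400 0x408
```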

Pentium III
Dynamic branch prediction
  512-entry BTB predicts direction and target; 4-bit history used with PC to derive direction
Static branch predictor for BTB misses
Return Address Stack (RAS), 4/8 entries
Branch penalties:
  Not taken: no penalty
  Correctly predicted taken: 1 cycle
  Mispredicted: at least 9 cycles, as many as 26, average 10-15 cycles

AMD Athlon K7
10-stage integer, 15-stage fp pipeline, predictor accessed in fetch
2K-entry bimodal, 2K-entry BTAC
12-entry RAS
Branch penalties:
  Correctly predicted taken: 1 cycle
  Mispredict penalty: at least 10 cycles
