Branch Prediction

Uploaded by

ARPAN MURMU
Branch Prediction in ARM Processors

Contents
1 Purpose of Branch Predictor
2 Types of Branch Prediction in ARM Processors
2.1 Static Branch Prediction
2.2 Dynamic Branch Prediction
3 Branch Prediction Architecture in ARM Processors
3.1 Branch Target Buffer (BTB)
3.2 Pattern History Table (PHT)
3.3 Global and Local Predictors
3.4 Return Address Stack (RAS)
4 How Branch Prediction Works in ARM
4.1 Pipeline Stages with Branch Prediction
5 ARMv8 and ARMv9 Enhancements
5.1 ARMv8-A (AArch64)
5.2 ARMv9
6 Key Features in ARM Branch Predictors
6.1 Low Latency Prediction
6.2 Energy Efficiency
6.3 High Accuracy
7 Common Metrics for Evaluating Branch Predictors
8 Example of Branch Prediction in ARM
8.1 Backward Branch in a Loop
9 Conclusion

Introduction
Modern processors use branch prediction to minimize pipeline stalls caused by
conditional branches (e.g., in loops and if-else constructs). In ARM processors, which
power many mobile and embedded systems, accurate branch prediction is key to
achieving both high performance and energy efficiency.

The Need for Branch Prediction


• Problem: When a conditional branch is fetched, the processor must choose one of two paths:

– Taken: Continue fetching from the branch target.

– Not-taken: Continue fetching the sequential instructions.

• Challenge: The outcome is not known until the branch is resolved later in the
pipeline, so waiting for it delays execution.

• Solution: Use branch prediction to predict the outcome and fetch instructions
speculatively.

1 Purpose of Branch Predictor


The purpose of a branch predictor is to:

1. Reduce pipeline stalls caused by unresolved branches.

2. Maintain high throughput in pipelined architectures.

3. Optimize performance in mobile, IoT, and high-performance systems.

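Pipeline Delay
Without branch prediction, the processor must stall on every branch until its outcome is
determined. With prediction, cycles are lost only when a prediction is wrong, so the
average delay per branch is approximately:

Delay (cycles) = Misprediction Penalty × Branch Misprediction Rate

where the misprediction penalty grows with pipeline depth, since more in-flight
instructions must be flushed.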
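As a rough worked example of the delay relation above, the sketch below computes the average stall cycles per branch and the resulting effective cycles per instruction (CPI). The function names and all numeric values (a 15-cycle penalty standing in for the pipeline-depth term, a 5% misprediction rate, branches making up 20% of instructions) are illustrative assumptions, not figures for any particular ARM core.

```python
def average_branch_delay(penalty_cycles: float, mispredict_rate: float) -> float:
    """Average stall cycles per branch: penalty times misprediction rate."""
    if not 0.0 <= mispredict_rate <= 1.0:
        raise ValueError("mispredict_rate must be between 0 and 1")
    return penalty_cycles * mispredict_rate


def effective_cpi(base_cpi: float, branch_fraction: float,
                  penalty_cycles: float, mispredict_rate: float) -> float:
    """Base CPI plus the stall cycles contributed by mispredicted branches."""
    stall_per_branch = average_branch_delay(penalty_cycles, mispredict_rate)
    return base_cpi + branch_fraction * stall_per_branch


if __name__ == "__main__":
    # Assumed values: 15-cycle penalty, 5% mispredictions, 20% branches.
    print(average_branch_delay(15, 0.05))
    print(effective_cpi(1.0, 0.20, 15, 0.05))
```

Under these assumed numbers, each branch costs well under one cycle on average, which is why even a few percentage points of misprediction rate matter on deep pipelines.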

2 Types of Branch Prediction in ARM Processors


ARM processors implement two main types of branch prediction: static and dynamic.

2.1 Static Branch Prediction


Static Prediction Rules:

• Backward Branches: Predicted as taken (e.g., loops).

• Forward Branches: Predicted as not-taken.

Advantages:

• Simple to implement.

• Minimal hardware overhead (no history tables are needed).

Limitations:

• Cannot adapt to branch behavior at runtime.
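The static rules in this subsection amount to a single comparison between the address of the branch and the address of its target. A minimal sketch, in which the function name and the example addresses are hypothetical:

```python
def static_predict_taken(branch_pc: int, target_pc: int) -> bool:
    """Backward-taken, forward-not-taken rule.

    A branch whose target is at or before its own address is treated as a
    backward branch (typically a loop) and predicted taken.
    """
    return target_pc <= branch_pc


if __name__ == "__main__":
    # Loop-closing branch at 0x100C jumping back to 0x1004.
    print(static_predict_taken(0x100C, 0x1004))
    # Forward branch at 0x2000 skipping ahead to 0x2010.
    print(static_predict_taken(0x2000, 0x2010))
```

Because the decision depends only on the two addresses, the same branch always receives the same prediction, which is exactly the limitation noted above.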

2.2 Dynamic Branch Prediction


Dynamic prediction adapts to runtime branch behavior using:
• History-based predictors.

• Hybrid models (e.g., global + local history).

3 Branch Prediction Architecture in ARM Processors

ARM processors combine several cooperating hardware structures for branch prediction.

3.1 Branch Target Buffer (BTB)


The BTB is a cache storing:
• Branch instruction addresses.

• Predicted target addresses.


When a branch instruction is fetched:

Predicted Target Address = BTB[Branch PC]

Role of BTB:
• Reduces latency by prefetching target instructions.

• Updates dynamically with new branches.
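To make the lookup concrete, here is a small sketch of a direct-mapped BTB. The class name, the default of 64 entries, and the use of the full PC as a tag are assumptions chosen for illustration; real ARM cores do not publish this level of detail in a form that this code claims to match.

```python
from typing import List, Optional, Tuple


class BranchTargetBuffer:
    """Direct-mapped BTB: each entry holds a (branch PC tag, target) pair."""

    def __init__(self, num_entries: int = 64) -> None:
        # Assumed size; a power of two keeps indexing a simple bit mask.
        if num_entries <= 0 or num_entries & (num_entries - 1):
            raise ValueError("num_entries must be a power of two")
        self.num_entries = num_entries
        self.entries: List[Optional[Tuple[int, int]]] = [None] * num_entries

    def _index(self, pc: int) -> int:
        # Drop the two low bits (assumes 4-byte instructions), then mask.
        return (pc >> 2) & (self.num_entries - 1)

    def lookup(self, pc: int) -> Optional[int]:
        """Return the predicted target for pc, or None on a BTB miss."""
        entry = self.entries[self._index(pc)]
        if entry is not None and entry[0] == pc:
            return entry[1]
        return None

    def update(self, pc: int, target: int) -> None:
        """Record a resolved branch, replacing whatever shared its slot."""
        self.entries[self._index(pc)] = (pc, target)
```

A miss (returning `None`) is what sends the front end down the fallback path, and two branches that map to the same slot evict each other, which is one reason larger BTBs improve hit rates.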

3.2 Pattern History Table (PHT)


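The PHT tracks branch outcomes using 2-bit saturating counters, each in one of four states:
1. Strongly Taken (ST): Predict taken; two consecutive not-taken outcomes are needed to flip the prediction.

2. Weakly Taken (WT): Predict taken; a single not-taken outcome flips the prediction.

3. Weakly Not-Taken (WNT): Predict not-taken; a single taken outcome flips the prediction.

4. Strongly Not-Taken (SNT): Predict not-taken; two consecutive taken outcomes are needed to flip the prediction.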

Next State = f (Current State, Branch Outcome)


Example Transition:
WT → WNT (on a Not-Taken outcome)

3.3 Global and Local Predictors
• Global History Register (GHR):

– Tracks outcomes of the last N branches.


– Correlates branch predictions with global patterns.

Example:
GHR = [T, T, NT, T] (T: Taken, NT: Not-Taken)

• Local History Table (LHT):

– Tracks branch-specific history.


– Useful for branches with unique behavior.
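One familiar way to put a global history register to work is the gshare scheme, which XORs the branch PC with the GHR to select a 2-bit counter. The source does not state that ARM cores use gshare; it is used here only as a well-known illustration, and the class name, the 4-bit history length, and the initial counter value are assumptions.

```python
class GsharePredictor:
    """Global-history predictor: index = (PC bits XOR GHR), 2-bit counters."""

    def __init__(self, history_bits: int = 4) -> None:
        self.mask = (1 << history_bits) - 1
        self.ghr = 0  # newest outcome in bit 0; 1 means taken
        # One 2-bit counter per index, starting at weakly not-taken (1).
        self.counters = [1] * (1 << history_bits)

    def _index(self, pc: int) -> int:
        # Drop the two low PC bits (assumes 4-byte instructions).
        return ((pc >> 2) ^ self.ghr) & self.mask

    def predict(self, pc: int) -> bool:
        return self.counters[self._index(pc)] >= 2

    def update(self, pc: int, taken: bool) -> None:
        i = self._index(pc)
        if taken:
            self.counters[i] = min(self.counters[i] + 1, 3)
        else:
            self.counters[i] = max(self.counters[i] - 1, 0)
        # Shift the new outcome into the global history.
        self.ghr = ((self.ghr << 1) | int(taken)) & self.mask


if __name__ == "__main__":
    predictor = GsharePredictor()
    hits = 0
    for i in range(100):
        taken = (i % 2 == 0)  # strictly alternating branch
        hits += int(predictor.predict(0x1000) == taken)
        predictor.update(0x1000, taken)
    print(hits)
```

A strictly alternating branch defeats a single 2-bit counter, but with history in the index the taken and not-taken cases land on different counters, so the pattern is learned after a short warm-up. A local history table works similarly, except that the history is kept per branch rather than shared.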

3.4 Return Address Stack (RAS)


The RAS predicts the target address of function returns:

1. Pushes return addresses during function calls.

2. Pops addresses during returns.
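The push and pop behavior can be sketched as a small bounded stack. The class name, the default depth of 8, and the choice to discard the oldest entry on overflow are assumptions for illustration.

```python
from typing import List, Optional


class ReturnAddressStack:
    """Bounded stack of predicted return addresses."""

    def __init__(self, depth: int = 8) -> None:
        self.depth = depth  # assumed capacity
        self.stack: List[int] = []

    def on_call(self, return_address: int) -> None:
        """Push the address of the instruction following the call."""
        if len(self.stack) == self.depth:
            self.stack.pop(0)  # full: discard the oldest entry
        self.stack.append(return_address)

    def on_return(self) -> Optional[int]:
        """Pop the predicted return address, or None if the stack is empty."""
        return self.stack.pop() if self.stack else None
```

Because calls and returns nest, the most recent push is almost always the right prediction; mispredictions arise mainly when the call depth exceeds the stack capacity.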

4 How Branch Prediction Works in ARM


4.1 Pipeline Stages with Branch Prediction
1. Fetch Stage:

• The branch predictor identifies branch instructions.


• Checks the BTB and PHT for prediction data.

2. Prediction Stage:

• If a branch is found in the BTB:

Target Address → Prefetch

• Otherwise:
– Use static prediction.

3. Execution Stage:

• Executes the branch.


• If the prediction was wrong:
– Flush the pipeline.
– Fetch correct instructions.
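The three stages above can be sketched as two small functions: one that picks the next fetch address and one that resolves the branch and charges a flush penalty on a mismatch. Everything here is a simplification under stated assumptions: the BTB is a plain dictionary, the PHT is omitted, instructions are 4 bytes, the flush costs 3 cycles, and the function names are hypothetical.

```python
from typing import Dict

INSTR_BYTES = 4    # assumes fixed 4-byte instructions
FLUSH_PENALTY = 3  # assumed cycles lost when the pipeline is flushed


def predict_next_pc(pc: int, decoded_target: int, btb: Dict[int, int]) -> int:
    """Fetch and prediction stages: BTB hit first, static rule otherwise."""
    if pc in btb:
        return btb[pc]
    # Static fallback: backward branches taken, forward branches not taken.
    if decoded_target <= pc:
        return decoded_target
    return pc + INSTR_BYTES


def resolve_branch(pc: int, decoded_target: int, taken: bool,
                   btb: Dict[int, int]) -> int:
    """Execution stage: return the cycles lost and update the BTB."""
    predicted = predict_next_pc(pc, decoded_target, btb)
    actual = decoded_target if taken else pc + INSTR_BYTES
    # Simplification: keep only taken branches in the BTB.
    if taken:
        btb[pc] = decoded_target
    else:
        btb.pop(pc, None)
    return 0 if predicted == actual else FLUSH_PENALTY
```

Calling `resolve_branch` repeatedly for the same branch shows the flow described above: the first encounter falls back to the static rule, later encounters hit in the BTB, and any mismatch costs `FLUSH_PENALTY` cycles.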

5 ARMv8 and ARMv9 Enhancements
5.1 ARMv8-A (AArch64)
Key Features (predictor design is set by each core implementation rather than by the architecture itself):

• Improved BTB: Larger size to store more branches.

• Advanced Predictors: Combine local and global history.

5.2 ARMv9
New Enhancements:

• Neural Predictors: Leverage machine learning for irregular patterns.

• Thread-Specific Predictors: Use separate tables for multi-threaded execution.

6 Key Features in ARM Branch Predictors


6.1 Low Latency Prediction
Predictions occur at the fetch stage to minimize stalls.

6.2 Energy Efficiency


Optimized for mobile devices where power consumption is critical.

6.3 High Accuracy


Combines global and local predictors to improve prediction rates.

7 Common Metrics for Evaluating Branch Predictors

1. Branch Prediction Accuracy (BPA):

BPA = (Correct Predictions / Total Branches) × 100

2. Misprediction Penalty: Cycles lost due to incorrect predictions.

3. BTB Hit Rate:

BTB Hit Rate = Branches Found in BTB / Total Branches
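These metrics are simple ratios over event counts. A minimal sketch, with hypothetical function names and made-up counts, might look like this:

```python
def prediction_accuracy(correct_predictions: int, total_branches: int) -> float:
    """BPA as a percentage of all executed branches."""
    if total_branches <= 0:
        raise ValueError("total_branches must be positive")
    return 100.0 * correct_predictions / total_branches


def btb_hit_rate(btb_hits: int, total_branches: int) -> float:
    """Fraction of executed branches that were found in the BTB."""
    if total_branches <= 0:
        raise ValueError("total_branches must be positive")
    return btb_hits / total_branches


if __name__ == "__main__":
    # Made-up counts for illustration only.
    print(prediction_accuracy(950, 1000))
    print(btb_hit_rate(900, 1000))
```

Note that BPA is expressed as a percentage while the BTB hit rate is a fraction between 0 and 1, matching the two formulas above.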

8 Example of Branch Prediction in ARM
8.1 Backward Branch in a Loop
        MOV R0, #0          ; Initialize counter
LOOP:
        ADD R0, R0, #1      ; Increment counter
        CMP R0, #10         ; Compare counter with 10
        BNE LOOP            ; Branch if not equal (backward branch)

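• Static Prediction: Predicts the backward branch as taken, which is correct for the 9 taken iterations and wrong only on the final loop exit.

• Dynamic Prediction: Learns the branch's behavior at runtime; its advantage over static prediction shows mainly on branches less regular than this loop.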
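To see how the two schemes fare on this loop, the sketch below replays the outcomes of the BNE instruction in Python and counts correct predictions. The function names are hypothetical, and the dynamic predictor is assumed to be a single 2-bit saturating counter that starts in the weakly-taken state.

```python
from typing import List


def loop_branch_outcomes(limit: int = 10) -> List[bool]:
    """Replay BNE LOOP: taken until R0 reaches the limit."""
    r0 = 0                        # MOV R0, #0
    outcomes = []
    while True:
        r0 += 1                   # ADD R0, R0, #1
        taken = (r0 != limit)     # CMP R0, #10 ; BNE LOOP
        outcomes.append(taken)
        if not taken:
            return outcomes


def static_correct(outcomes: List[bool]) -> int:
    """Backward branch: always predicted taken."""
    return sum(1 for taken in outcomes if taken)


def two_bit_correct(outcomes: List[bool], state: int = 2) -> int:
    """2-bit counter (0..3); states 2 and 3 predict taken."""
    correct = 0
    for taken in outcomes:
        correct += int((state >= 2) == taken)
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return correct


if __name__ == "__main__":
    outcomes = loop_branch_outcomes(10)
    print(len(outcomes), static_correct(outcomes), two_bit_correct(outcomes))
```

On a loop this regular, both schemes miss only the final exit under these assumptions, so a simple counter gains nothing over the static rule here; a counter that starts in a not-taken state would additionally miss the first iteration.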

9 Conclusion
ARM processors use advanced branch prediction mechanisms to achieve high
performance, low latency, and energy efficiency. Techniques such as dynamic prediction,
neural predictors, and hybrid models help ARM architectures perform well on modern workloads.
