The document provides 4 examples of calculating branch prediction performance metrics for different processor pipeline configurations:
1. Defines the probabilities of different branch types (unconditional, conditional taken/not taken).
2. Calculates CPI contribution from conditional branch stalls given parameters like branch frequency, BTB miss rate/penalty, prediction accuracy/misprediction penalty, and base CPI.
3. Computes CPI with and without a BTB, given branch frequency, BTB hit/miss rates and penalties, and prediction accuracy, to determine speedup from adding a BTB.
4. Calculates average CPI for a 5-stage pipeline processor with a branch prediction unit, accounting for different branch penalties and prediction outcomes.
Sample Problems
Example 1:
• What is the probability that a branch is taken?
Given:
• 20% of branches are unconditional branches
• Of conditional branches, 66% branch forward and are evenly split between taken and not taken
• The rest branch backwards and are always taken

Example 2:
• What is the contribution to CPI of conditional branch stalls, given:
• 15% branch frequency
• A BTB for conditional branches only, with a 10% miss rate and a 3-cycle miss penalty
• 92% prediction accuracy and a 7-cycle misprediction penalty
• Base CPI is 1
• What is the hit rate? 90%
BTB result   Prediction   Frequency (per instruction)   Penalty (cycles)   Stalls
Miss         -            .15 * .10 = .015              3                  0.045
Hit          correct      .15 * .90 * .92 = .124        0                  0
Hit          incorrect    .15 * .90 * .08 = .011        7                  0.076
Total contribution to CPI                                                  0.121
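
As a quick numerical check of Examples 1 and 2, the sketch below recomputes both results. It assumes that unconditional branches count as always taken, and that a BTB hit with a correct prediction costs no stall cycles (as the table shows). The names `p_taken` and `btb_stall_cpi` are illustrative and do not come from the slides.

```python
# Example 1: probability that a branch is taken.
P_UNCONDITIONAL = 0.20   # unconditional branches (assumed always taken)
P_FORWARD = 0.66         # share of conditional branches that branch forward
P_FORWARD_TAKEN = 0.50   # forward branches are evenly split taken / not taken

p_conditional = 1.0 - P_UNCONDITIONAL
p_backward = 1.0 - P_FORWARD  # remaining conditional branches, always taken
p_taken = P_UNCONDITIONAL + p_conditional * (
    P_FORWARD * P_FORWARD_TAKEN + p_backward
)


# Example 2: stall cycles per instruction, one term per row of the table.
def btb_stall_cpi(branch_freq, miss_rate, miss_penalty, accuracy, mispredict_penalty):
    hit_rate = 1.0 - miss_rate
    miss = branch_freq * miss_rate * miss_penalty
    hit_incorrect = branch_freq * hit_rate * (1.0 - accuracy) * mispredict_penalty
    return miss + hit_incorrect  # the hit-and-correct row adds no stalls


stall_cpi = btb_stall_cpi(0.15, 0.10, 3, 0.92, 7)

print(f"Example 1: P(taken) = {p_taken:.3f}")      # 0.736
print(f"Example 2: stall CPI = {stall_cpi:.4f}")   # 0.1206
```

The unrounded total is 0.1206, which the table reports as 0.121. With a base CPI of 1, the overall CPI is therefore about 1.12.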
Example 3:
• Suppose we have a deeply pipelined processor, for which we implemented a branch target buffer (BTB) for conditional branches only.
• Assume that the misprediction penalty is always 4 cycles and the BTB miss penalty is always 3 cycles.
• Assume 15% branch frequency, and 90% hit rate and 80% accuracy.
• How much faster is the processor with the branch target buffer versus a processor that has a fixed 2-cycle branch penalty? Compute CPI without BTB and CPI with BTB.

Example 4:
• Assume a processor with a standard five-stage pipeline (IF, ID, EX, MEM, WB) and a branch prediction unit (a branch history table) in the ID-stage. Branch resolution is performed in the EX-stage. There are four cases for conditional branches:
• The branch is not taken and correctly predicted as not taken (NT/PNT)
• The branch is not taken and predicted as taken (NT/PT)
• The branch is taken and predicted as not taken (T/PNT)
• The branch is taken and correctly predicted as taken (T/PT)
• Suppose that the branch penalties with this design are:
• NT/PNT: 0 cycles
• T/PT: 1 cycle
• NT/PT, T/PNT: 2 cycles

Example 4 continued:
a) Calculate the average CPI for the processor assuming a base CPI of 1.2. Assume 20% conditional branches and that 65% of these are taken on average. Assume further that the branch prediction unit mispredicts 12% of the conditional branches.
b) In order to increase the clock frequency from 500 MHz to 600 MHz, a designer splits the IF-stage into two stages, IF1 and IF2. This makes it easier for the instruction cache to deliver instructions in time. This also affects the branch penalties for the branch prediction unit as follows:
• NT/PNT: 0 cycles
• T/PT: 2 cycles
• NT/PT, T/PNT: 3 cycles
c) How much faster is this new processor than the previous one, which runs at 500 MHz?
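
Examples 3 and 4 are stated without worked solutions, so the following is only one possible way to work them, under assumptions the slides do not spell out. For Example 3 it assumes a base CPI of 1 (as in Example 2) and no stall on a BTB hit with a correct prediction. For Example 4 it assumes the 12% misprediction rate applies equally to taken and not-taken branches. The function names are hypothetical.

```python
def cpi_with_btb(base_cpi, branch_freq, hit_rate, accuracy,
                 miss_penalty, mispredict_penalty):
    """CPI with a BTB: stalls come from BTB misses and mispredicted hits."""
    stalls_per_branch = ((1.0 - hit_rate) * miss_penalty
                         + hit_rate * (1.0 - accuracy) * mispredict_penalty)
    return base_cpi + branch_freq * stalls_per_branch


def cpi_with_bht(base_cpi, branch_freq, taken_frac, mispredict_rate,
                 taken_correct_penalty, mispredict_penalty):
    """CPI with a branch history table; NT/PNT is assumed to cost 0 cycles.

    Assumption: mispredictions are independent of the branch direction.
    """
    correct = 1.0 - mispredict_rate
    t_pt = taken_frac * correct * taken_correct_penalty    # T/PT
    wrong = mispredict_rate * mispredict_penalty           # NT/PT and T/PNT
    return base_cpi + branch_freq * (t_pt + wrong)


# Example 3 (base CPI of 1 is an assumption, not given in the problem).
cpi_no_btb = 1.0 + 0.15 * 2
cpi_btb = cpi_with_btb(1.0, 0.15, 0.90, 0.80, 3, 4)
speedup_btb = cpi_no_btb / cpi_btb

# Example 4 a), b), c).
cpi_500 = cpi_with_bht(1.2, 0.20, 0.65, 0.12, 1, 2)
cpi_600 = cpi_with_bht(1.2, 0.20, 0.65, 0.12, 2, 3)
# Execution time per instruction is CPI / clock frequency.
speedup_600 = (cpi_500 / 500e6) / (cpi_600 / 600e6)

print(f"Example 3: CPI {cpi_no_btb:.3f} vs {cpi_btb:.3f}, speedup {speedup_btb:.3f}")
print(f"Example 4: CPI {cpi_500:.4f} vs {cpi_600:.4f}, speedup {speedup_600:.3f}")
```

Under these assumptions, the BTB in Example 3 gives a CPI of 1.153 against 1.300, a speedup of about 1.13. In Example 4 the CPI rises from 1.3624 to 1.5008, but the faster clock still makes the new design about 1.09 times faster.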