Branch Prediction Techniques in Computer Architecture - Prof. Alan Davis, Study notes of Computer Architecture and Organization

These notes cover branch prediction techniques used in computer architecture, including static vs. dynamic prediction, bimodal prediction, 1-bit and 2-bit prediction, correlating predictors, local and global predictors, and tournament predictors. They also discuss the importance of branch prediction in reducing control hazards and improving processor performance.

Uploaded on 08/30/2009

Lecture 7: Branch prediction
Topics: bimodal, global, local branch prediction
(Sections 2.3-2.6)


Dynamic vs. Static ILP

• Static ILP:
  + The compiler finds parallelism → no scoreboarding → higher clock speeds and lower power
  + Compiler knows what is next → better global schedule
  - Compiler cannot react to dynamic events (cache misses)
  - Cannot re-order instructions unless you provide hardware and extra instructions to detect violations (eats into the low complexity/power argument)
  - Static branch prediction is poor → even statically scheduled processors use hardware branch predictors
  - Building an optimizing compiler is easier said than done

• A comparison of the Alpha, Pentium 4, and Itanium (statically scheduled IA-64 architecture) shows that the Itanium is not much better in terms of performance, clock speed or power

Pipeline without Branch Predictor

[Figure: pipeline diagram with labels IF (br), PC, Reg Read, Compare, Br-target, PC + 4]

• In the 5-stage pipeline, a branch completes in two cycles → If the branch went the wrong way, one incorrect instruction is fetched → One stall cycle per incorrect branch

Pipeline with Branch Predictor

[Figure: pipeline diagram with labels IF (br), PC, Reg Read, Compare, Br-target, Branch Predictor]

• In the 5-stage pipeline, a branch completes in two cycles → If the branch went the wrong way, one incorrect instruction is fetched → One stall cycle per incorrect branch
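The one-cycle stall can be folded into an effective CPI. The sketch below is only an illustration of that arithmetic: the function name, its parameters, and the numbers in the usage note are assumptions, not figures from the lecture.

```c
/* Effective CPI when every mispredicted branch costs a fixed number of
 * stall cycles (one cycle in the 5-stage pipeline described in the notes).
 * Hypothetical helper; all parameter names are illustrative. */
double effective_cpi(double base_cpi,         /* CPI with no branch stalls */
                     double branch_fraction,  /* branches per instruction */
                     double mispredict_rate,  /* fraction of branches predicted wrong */
                     double penalty_cycles)   /* stall cycles per wrong prediction */
{
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles;
}
```

For example, with an assumed base CPI of 1, 20% branches, and a 10% misprediction rate, `effective_cpi(1.0, 0.2, 0.1, 1.0)` comes to about 1.02. Lowering the misprediction rate is the only term a better predictor changes.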

1-Bit Prediction

For each branch, keep track of what happened last time and use that outcome as the prediction

What are prediction accuracies for branches 1 and 2 below:

    while (1) {
        for (i=0;i<10;i++) {
            branch-1 …
        }
        for (j=0;j<20;j++) {
            branch-2 …
        }
    }
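One way to explore the question is to simulate the predictor. The sketch below assumes that each numbered branch is the loop-closing branch of its `for` loop, executed once per iteration and taken every time except the last. The notes do not spell this out, so treat it as one possible reading. The function name and parameters are hypothetical.

```c
#include <stdbool.h>

/* Count correct predictions of a 1-bit predictor for one loop-closing branch.
 * Assumption: the branch executes trip_count times per pass through the loop
 * and is taken on every execution except the last one. outer_iters stands in
 * for the endless while (1). */
int one_bit_correct(int trip_count, int outer_iters, bool initial_prediction)
{
    bool last_outcome = initial_prediction;  /* the single bit of state */
    int correct = 0;

    for (int outer = 0; outer < outer_iters; outer++) {
        for (int i = 0; i < trip_count; i++) {
            bool taken = (i != trip_count - 1);  /* fall through on the last pass */
            if (last_outcome == taken)
                correct++;
            last_outcome = taken;  /* predict "same as last time" next time */
        }
    }
    return correct;
}
```

Under these assumptions, `one_bit_correct(10, 100, false)` returns 800 out of 1000 executions and `one_bit_correct(20, 100, false)` returns 1800 out of 2000. Each pass through a loop mispredicts twice: once when the loop exits and once when it is re-entered.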

2-Bit Prediction

• For each branch, maintain a 2-bit saturating counter:
    if the branch is taken: counter = min(3,counter+1)
    if the branch is not taken: counter = max(0,counter-1)

• If (counter >= 2), predict taken, else predict not taken
• Advantage: a few atypical branches will not influence the prediction (a better measure of “the common case”)
• Especially useful when multiple branches share the same counter (some bits of the branch PC are used to index into the branch predictor)
• Can be easily extended to N-bits (in most processors, N=2)
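The counter rules above translate almost directly into code. The following is a minimal sketch, assuming a loop-closing branch that is taken on every execution except the last one of each pass. The helper names are hypothetical.

```c
#include <stdbool.h>

/* Update rule from the notes: saturate at 0 and at 3. */
static int update_counter(int counter, bool taken)
{
    if (taken)
        return counter < 3 ? counter + 1 : 3;
    return counter > 0 ? counter - 1 : 0;
}

/* Count correct predictions of one 2-bit counter for a loop-closing branch
 * (assumed taken trip_count - 1 times, then not taken once, per pass). */
int two_bit_correct(int trip_count, int outer_iters, int initial_counter)
{
    int counter = initial_counter;
    int correct = 0;

    for (int outer = 0; outer < outer_iters; outer++) {
        for (int i = 0; i < trip_count; i++) {
            bool taken = (i != trip_count - 1);
            bool predict_taken = (counter >= 2);  /* 2 or 3 means "taken" */
            if (predict_taken == taken)
                correct++;
            counter = update_counter(counter, taken);
        }
    }
    return correct;
}
```

Starting from a saturated counter, `two_bit_correct(10, 100, 3)` returns 900 and `two_bit_correct(20, 100, 3)` returns 1900. The single not-taken outcome at loop exit only moves the counter from 3 to 2, so the prediction on re-entry is still "taken" and only one misprediction per pass remains.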

Global Predictor

• A single register that keeps track of recent history for all branches

[Figure: the Branch PC (8 bits) and the global history register (6 bits) together form a 14-bit index into a table of 16K entries of 2-bit saturating counters]

• Also referred to as a two-level predictor
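The indexing scheme can be sketched as follows. Several details here are assumptions rather than facts from the notes: which PC bits are used (the two low bits are dropped, assuming 4-byte instructions), that the PC bits and history bits are concatenated rather than hashed, and all type and function names.

```c
#include <stdbool.h>
#include <stdint.h>

#define PC_BITS      8
#define HISTORY_BITS 6
#define TABLE_SIZE   (1u << (PC_BITS + HISTORY_BITS))  /* 2^14 = 16K counters */

typedef struct {
    uint32_t history;               /* last 6 branch outcomes, newest in bit 0 */
    uint8_t  counters[TABLE_SIZE];  /* 2-bit saturating counters, values 0..3 */
} GlobalPredictor;

/* Assumed layout: PC bits in the upper part of the index, history below. */
static uint32_t gp_index(const GlobalPredictor *gp, uint32_t pc)
{
    uint32_t pc_part   = (pc >> 2) & ((1u << PC_BITS) - 1);
    uint32_t hist_part = gp->history & ((1u << HISTORY_BITS) - 1);
    return (pc_part << HISTORY_BITS) | hist_part;
}

bool gp_predict(const GlobalPredictor *gp, uint32_t pc)
{
    return gp->counters[gp_index(gp, pc)] >= 2;
}

void gp_update(GlobalPredictor *gp, uint32_t pc, bool taken)
{
    uint8_t *c = &gp->counters[gp_index(gp, pc)];
    if (taken) {
        if (*c < 3) (*c)++;
    } else {
        if (*c > 0) (*c)--;
    }
    /* Shift the outcome into the single shared history register. */
    gp->history = ((gp->history << 1) | (taken ? 1u : 0u))
                  & ((1u << HISTORY_BITS) - 1);
}
```

With a zero-initialised predictor, a branch that strictly alternates taken and not taken is predicted correctly once the counters have warmed up, because the two history patterns select different counters. A single 2-bit counter without history cannot do this.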

Local Predictor

[Figure: the Branch PC indexes a table of 64 entries of 14-bit histories, each for a single branch (e.g. 10110111011001); the selected history indexes a table of 16K entries of 2-bit saturating counters]

• Use 6 bits of branch PC to index into local history table
• 14-bit history indexes into next level
• Also a two-level predictor that only uses local histories at the first level
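The two levels described above can be sketched as two arrays. The choice of PC bits (low bits after dropping two alignment bits) and all identifiers are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define LOCAL_ENTRIES   64                        /* selected by 6 bits of the PC */
#define LOCAL_HIST_BITS 14
#define COUNTER_ENTRIES (1u << LOCAL_HIST_BITS)   /* 16K counters */

typedef struct {
    uint16_t histories[LOCAL_ENTRIES];   /* level 1: 14-bit history per entry */
    uint8_t  counters[COUNTER_ENTRIES];  /* level 2: 2-bit saturating counters */
} LocalPredictor;

/* Assumption: 4-byte instructions, so the two low PC bits carry no information. */
static uint32_t lp_slot(uint32_t pc)
{
    return (pc >> 2) & (LOCAL_ENTRIES - 1);
}

bool lp_predict(const LocalPredictor *lp, uint32_t pc)
{
    uint16_t hist = lp->histories[lp_slot(pc)];
    return lp->counters[hist] >= 2;
}

void lp_update(LocalPredictor *lp, uint32_t pc, bool taken)
{
    uint16_t *hist = &lp->histories[lp_slot(pc)];
    uint8_t *c = &lp->counters[*hist];
    if (taken) {
        if (*c < 3) (*c)++;
    } else {
        if (*c > 0) (*c)--;
    }
    /* Only this branch's own history entry is updated. */
    *hist = (uint16_t)(((*hist << 1) | (taken ? 1u : 0u)) & (COUNTER_ENTRIES - 1));
}
```

Because the history belongs to one branch (or to the few branches that share a slot), a short repeating pattern such as four taken outcomes followed by one not taken is fully captured by the 14 history bits and predicted correctly after warm-up.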

Tournament Predictors

• A local predictor might work well for some branches or programs, while a global predictor might work well for others

• Provide one of each and maintain another predictor to identify which predictor is best for each branch

[Figure: Tournament Predictor. The Branch PC indexes a table of 2-bit saturating counters whose output drives a MUX that selects between the Local Predictor and the Global Predictor]

Alpha 21264:
  Local predictor: 1K entries in level-1, 1K entries in level-2
  Global predictor: 4K entries, 12-bit global history
  Tournament table: 4K entries
  Total capacity: ?
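The selection step can be sketched as a third table of 2-bit counters. This sketch follows the figure in indexing that table with the branch PC. The table size, the update policy (train only when the two predictors disagree), and every identifier are assumptions, not details taken from the notes.

```c
#include <stdbool.h>
#include <stdint.h>

#define CHOOSER_ENTRIES 4096  /* assumed size for this sketch */

typedef struct {
    /* 0..1: trust the local predictor, 2..3: trust the global predictor */
    uint8_t counters[CHOOSER_ENTRIES];
} Chooser;

static uint32_t chooser_index(uint32_t pc)
{
    return (pc >> 2) & (CHOOSER_ENTRIES - 1);  /* assumes 4-byte instructions */
}

/* The MUX: pick one of the two component predictions. */
bool tournament_predict(const Chooser *ch, uint32_t pc,
                        bool local_pred, bool global_pred)
{
    return ch->counters[chooser_index(pc)] >= 2 ? global_pred : local_pred;
}

/* Assumed policy: when the components agree, the outcome says nothing about
 * which one is better, so the chooser is left alone. */
void tournament_update(Chooser *ch, uint32_t pc,
                       bool local_pred, bool global_pred, bool taken)
{
    if (local_pred == global_pred)
        return;
    uint8_t *c = &ch->counters[chooser_index(pc)];
    if (global_pred == taken) {
        if (*c < 3) (*c)++;   /* global was right: lean toward global */
    } else {
        if (*c > 0) (*c)--;   /* local was right: lean toward local */
    }
}
```

The local and global predictions passed in would come from the two component predictors, which are trained on every branch regardless of which one the chooser selected.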

Branch Target Prediction

In addition to predicting the branch direction, we must also predict the branch target address

• Branch PC indexes into a predictor table; indirect branches might be problematic
• Most common indirect branch: return from a procedure, which can be easily handled with a stack of return addresses
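A stack of return addresses can be sketched as a small circular buffer. The depth of 16, the 4-byte instruction size, and the function names are assumptions for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

#define RAS_DEPTH 16  /* assumed depth; the notes do not give one */

typedef struct {
    uint32_t entries[RAS_DEPTH];
    unsigned top;    /* next free slot, wraps around */
    unsigned count;  /* number of valid entries, at most RAS_DEPTH */
} ReturnStack;

/* On a call: remember where the matching return should go. */
void ras_on_call(ReturnStack *ras, uint32_t call_pc)
{
    ras->entries[ras->top] = call_pc + 4;  /* assumes 4-byte instructions */
    ras->top = (ras->top + 1) % RAS_DEPTH;
    if (ras->count < RAS_DEPTH)
        ras->count++;  /* when full, the oldest entry has been overwritten */
}

/* On a return: pop the predicted target. Returns false if the stack is
 * empty, in which case the caller must fall back to another predictor. */
bool ras_on_return(ReturnStack *ras, uint32_t *predicted_target)
{
    if (ras->count == 0)
        return false;
    ras->top = (ras->top + RAS_DEPTH - 1) % RAS_DEPTH;
    ras->count--;
    *predicted_target = ras->entries[ras->top];
    return true;
}
```

Calls and returns nest, so the most recent call is always the next one to return. That is why a stack predicts return targets well even though the same return instruction jumps to many different addresses.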
