Transcript
Page 1: Dynamic Branch Prediction

04/20/23 Lec4 ILP 1

Dynamic Branch Prediction

• Why does prediction work?– Underlying algorithm has regularities

– Data that is being operated on has regularities

– Instruction sequence has redundancies that are artifacts of way that humans/compilers think about problems

• Is dynamic branch prediction better than static branch prediction?

– Seems to be

– There are a small number of important branches in programs which have dynamic behavior

Page 2: Dynamic Branch Prediction

04/20/23 Lec4 ILP 2

Dynamic Branch Prediction

• Performance = ƒ(accuracy, cost of misprediction)

• Branch History Table: Lower bits of PC address index table of 1-bit values

– Says whether or not branch taken last time

– No address check

• Problem: in a loop, 1-bit BHT will cause two mispredictions (avg is 9 iteratios before exit):

– End of loop case, when it exits instead of looping as before

– First time through loop on next time through code, when it predicts exit instead of looping

Page 3: Dynamic Branch Prediction

04/20/23 Lec4 ILP 3

• Solution: 2-bit scheme where change prediction only if get misprediction twice

• Red: stop, not taken

• Green: go, taken

• Adds hysteresis to decision making process

Dynamic Branch Prediction

T

T NT

NT

Predict Taken

Predict Not Taken

Predict Taken

Predict Not TakenT

NTT

NT

Page 4: Dynamic Branch Prediction

04/20/23 Lec4 ILP 4

18%

5%

12%10%

9%

5%

9% 9%

0%1%

0%2%4%6%8%

10%12%14%16%18%20%

eqnt

ott

espr

esso gc

c li

spice

dodu

c

spice

fppp

p

mat

rix30

0

nasa

7

Mis

pre

dic

tio

n R

ate

BHT Accuracy

• Mispredict because either:– Wrong guess for that branch

– Got branch history of wrong branch when index the table

• 4096 entry table:

IntegerFloating Point

Page 5: Dynamic Branch Prediction

Correlating Branch Predictor

• It may possible to improve the accuracy if we look at the behavior of other branches.

if (aa == 2)aa = 0;

if (bb == 0)bb = 0;

if (aa != bb)

The behavior of b3 is correlated with the behavior of b1 and b2.

Page 6: Dynamic Branch Prediction

Correlating Predictors

• Two-level predictors

if (d == 0)d = 1;

if (d == 1)

Page 7: Dynamic Branch Prediction

initial value of d

b1 value of d before b2

b2

0 nt 1 nt

1 t 1 nt

2 t 2 t

Page 8: Dynamic Branch Prediction

1-bit Predictor (Initialized to NT)

d b1 predic

b1 action

new b1 pr

b2 predic

b2 action

new b2 pr

2 nt t t nt t t

0 t nt nt t nt nt

2 nt t t nt t T

0 t nt nt t nt nt

Page 9: Dynamic Branch Prediction

(1,1) Predictor

• Every branch has two separate prediction bits.

– First bit: the prediction if the last branch in the program is not taken.

– Second bit: the prediction if the last branch in the program is taken.

• Write the pair of prediction bits together.

Page 10: Dynamic Branch Prediction

Combinations & Meaning

Prediction bits Prediction if not taken

Prediction if taken

NT/NT NT NT

NT/T NT T

T/NT T NT

T/T T T

Page 11: Dynamic Branch Prediction

(1,1) Predictor Example

d b1 pred

b1 action

new b1

pred

b2 pred

b2 action

new b2

pred

2 nt/nt t t/nt nt/nt t nt/t

0 t/nt nt t/nt nt/t nt nt/t

2 t/nt t t/nt nt/t t nt/t

0 t/nt nt t/nt nt/t nt nt

Initial prediction bits are nt/ntInitial last branch is ntUsed prediction bit is colored with red

Page 12: Dynamic Branch Prediction

04/20/23 Lec4 ILP 12

Correlated Branch Prediction

• Idea: record m most recently executed branches as taken or not taken, and use that pattern to select the proper n-bit branch history table

• In general, (m,n) predictor means record last m branches to select between 2m history tables, each with n-bit counters

– Thus, old 2-bit BHT is a (0,2) predictor

• Global Branch History: m-bit shift register keeping T/NT status of last m branches.

• Each entry in table has m n-bit predictors.

Page 13: Dynamic Branch Prediction

04/20/23 Lec4 ILP 13

Correlating Branches

(2,2) predictor

– Behavior of recent branches selects between four predictions of next branch, updating just that prediction

Branch address

2-bits per branch predictor

Prediction

2-bit global branch history

4

Page 14: Dynamic Branch Prediction

04/20/23 Lec4 ILP 14

0%

Fre

quen

cy o

f M

isp

redi

ctio

ns

0%1%

5%6% 6%

11%

4%

6%5%

1%2%

4%

6%

8%

10%

12%

14%

16%

18%

20%

4,096 entries: 2-bits per entry Unlimited entries: 2-bits/entry 1,024 entries (2,2)

Accuracy of Different Schemes

4096 Entries 2-bit BHTUnlimited Entries 2-bit BHT1024 Entries (2,2) BHT

nasa

7

mat

rix3

00

dodu

cd

spic

e

fppp

p

gcc

expr

esso

eqnt

ott li

tom

catv

Page 15: Dynamic Branch Prediction

04/20/23 Lec4 ILP 15

Tournament Predictors

• Multilevel branch predictor

• Use n-bit saturating counter to choose between predictors

• Usual choice between global and local predictors

Page 16: Dynamic Branch Prediction

04/20/23 Lec4 ILP 16

Tournament Predictors

Tournament predictor using, say, 4K 2-bit counters indexed by local branch address. Chooses between:

• Global predictor

– 4K entries index by history of last 12 branches (212 = 4K)

– Each entry is a standard 2-bit predictor

• Local predictor

– Local history table: 1024 10-bit entries recording last 10 branches, index by branch address

– The pattern of the last 10 occurrences of that particular branch used to index table of 1K entries with 3-bit saturating counters

Page 17: Dynamic Branch Prediction

04/20/23 Lec4 ILP 17

Comparing Predictors

• Advantage of tournament predictor is ability to select the right predictor for a particular branch

– Particularly crucial for integer benchmarks

– A typical tournament predictor will select the global predictor almost 40% of the time for the SPEC integer benchmarks and less than 15% of the time for the SPEC FP benchmarks

0%

1%

2%

3%

4%

5%

6%

7%

8%

9%

10%

0 8 16 24 32 40 48 56 64 72 80 88 96 104 112 120 128

Total predictor size (Kbits)

Con

ditio

nal b

ranc

h m

ispr

edic

tion

rate

Local

Correlating

Tournament

2-bit BHTSPEC89

Page 18: Dynamic Branch Prediction

04/20/23 Lec4 ILP 18

Pentium 4 Misprediction Rate (per 1000 instructions, not per branch)

11

13

7

12

9

10 0 0

5

0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

164.gzip

175.vpr

176.gcc

181.mcf

186.crafty

168.wupwise

171.swim

172.mgrid

173.applu

177.mesa

Bra

nch

mis

pre

dic

tion

s p

er

10

00

In

str

ucti

on

s

SPECint2000 SPECfp2000

6% misprediction rate per branch SPECint (19% of INT instructions are branch)

2% misprediction rate per branch SPECfp(5% of FP instructions are branch)

Page 19: Dynamic Branch Prediction

• Branch target calculation is costly and stalls the instruction fetch.

• BTB stores PCs the same way as caches

• The PC of a branch is sent to the BTB

• When a match is found the corresponding Predicted PC is returned

• If the branch was predicted taken, instruction fetch continues at the returned predicted PC

Branch Target Buffers (BTB)

Page 20: Dynamic Branch Prediction

Branch Target Buffers

Branch PC Predicted PC

=?

PC

of in

structio

nF

ET

CH

Extra Prediction

statebits

Yes: inst is branch,Next PC = predicted PC

No: proceed normally (Next PC = PC+4)

Page 21: Dynamic Branch Prediction

04/20/23 Lec4 ILP 21

Dynamic Branch Prediction Summary

• Prediction becoming important part of execution

• Branch History Table: 2 bits for loop accuracy

• Correlation: Recently executed branches correlated with next branch

– Either different branches (GA)

– Or different executions of same branches (PA)

• Tournament predictors take insight to next level, by using multiple predictors

– usually one based on global information and one based on local information, and combining them with a selector

– In 2006, tournament predictors using 30K bits are in processors like the Power5 and Pentium 4

• Branch Target Buffer: include branch address & prediction


Top Related