H-Pattern: A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation
Samir Otiv, Second Year Undergraduate
Kaushik Garikipati, Second Year Undergraduate
Milan Patnaik, MTech
Dr. V Kamakoti, Professor
Indian Institute of Technology Madras, Department of Computer Science & Engineering
Approach
Conditional branch instructions often follow patterns which periodically repeat.
If a branch instruction is found to follow a certain repeating pattern, a predictor must have the ability to accurately predict its outcome for as long as the pattern persists.
Predicting ALL patterns with periods of ANY length: Impossible, given a fixed storage budget.
Approach
STRATEGY: Restrict ourselves to capturing patterns with a period only up to a certain predetermined length
Objective: Create a predictor that captures patterns with periods of up to n bits.
Challenges:
1. Using minimum space
2. The patterns followed can change, so the predictor must dynamically relearn
Solution
For every branch: Store local history of 2n bits
If a branch instruction follows a pattern of execution with a period p, where p is at most equal to n, then the most recent set of n bits must be identical to the set of n bits that occurred p executions prior.
outcome(h_i) = outcome(h_{i+p}), where h_i is the i-th most recent execution
To predict, we compare the most recent n bits with successively older History Patterns (n-bit substrings of the local history) and stop at the first match. The bit just after this matching substring, that is, the next more recent bit, is our prediction for the next execution. (The picture on the next slide should clarify.)
Illustration
Here, with n = 8, we store a local history of 16 bits.
The branch instruction follows the repeating pattern (110), which has a period of 3.
The bit string h_0 to h_7 (Current Pattern) matches precisely with the bit string h_3 to h_10 (Matched Current Pattern).
The prediction returned is the bit just after the matched current pattern: h_2.
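The matching step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `predict_from_history` and `history_of` are hypothetical, and the local history is assumed to be a list whose index 0 holds h_0, the most recent outcome.

```python
from typing import List, Optional, Sequence


def predict_from_history(history: Sequence[int], n: int) -> Optional[int]:
    """Return the pattern-based prediction, or None if no pattern matches.

    history[0] is h_0 (most recent outcome); len(history) must be 2n.
    """
    if len(history) != 2 * n:
        raise ValueError("history must hold exactly 2n outcomes")
    current = list(history[:n])  # current pattern: h_0 .. h_{n-1}
    # Compare against successively older n-bit windows; stop at the first match.
    for i in range(1, n + 1):
        if list(history[i:i + n]) == current:
            return history[i - 1]  # the bit just after the matched window
    return None


def history_of(pattern: Sequence[int], executions: int, n: int) -> List[int]:
    """Build a 2n-bit history (most recent first) for a repeating pattern."""
    chronological = [pattern[t % len(pattern)] for t in range(executions)]
    return chronological[::-1][:2 * n]


if __name__ == "__main__":
    # The slide's example: pattern (110), n = 8, 16 bits of local history.
    h = history_of([1, 1, 0], executions=16, n=8)
    print(predict_from_history(h, n=8))
```

For the 16 outcomes generated here, the first match occurs at a shift of 3, so the function returns h_2, which is 1 and agrees with the next outcome of the (110) pattern.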
H-Pattern: nBPAT + AltPred
nBPAT: n-Bit Pattern Predictor
AltPred: Any other alternate branch predictor
When no pattern is detected (i.e. no pattern match occurs), AltPred is used.
When a pattern is detected, the better performing predictor is used.
The nBPAT Predictor
Every entry of the predictor comprises:
• A 2n-bit shift register for local history
• A saturating counter to keep track of the better performing predictor (as described in ‘Combining Branch Predictors’ by Scott McFarling)
Storage: Various configurations are possible: tagged/tagless, direct-mapped/associative
The nBPAT Algorithm
To Predict:
1. Match the current pattern (h_0 to h_{n-1}) against successively older history patterns.
2. If the first match is found at h_i, then h_{i-1} is the predicted outcome. If the most significant bit of the saturating selection counter is 1, return h_{i-1}.
3. If there is no match, or if the most significant bit is 0, use AltPred.
To Update:
1. If AltPred mispredicted and nBPAT predicted correctly, increment the saturating selection counter.
2. If AltPred predicted correctly and nBPAT mispredicted, decrement the saturating selection counter.
3. If nBPAT was not ready, do not change the saturating counter.
4. Update the local history by shifting the outcome of the branch into the local history shift register.
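The predict and update steps above can be put together as a small Python sketch of a single nBPAT entry. Several details are assumptions made for illustration and are not given on the slides: the class name `NBPatEntry`, the 2-bit default counter width, the initial counter value (most significant bit set), the all-zero initial history, and the reading of "not ready" as "no pattern match found". The always-taken AltPred in the driver is a stand-in, not one of the predictors evaluated in the deck.

```python
from typing import List, Optional


class NBPatEntry:
    """One nBPAT entry: a 2n-bit local history plus a selection counter."""

    def __init__(self, n: int, counter_bits: int = 2) -> None:
        self.n = n
        self.history: List[int] = [0] * (2 * n)  # history[0] is h_0
        self.counter_bits = counter_bits
        self.counter_max = (1 << counter_bits) - 1
        # Assumption: start with the MSB set, i.e. weakly trusting nBPAT.
        self.counter = 1 << (counter_bits - 1)

    def pattern_prediction(self) -> Optional[int]:
        """First-match pattern prediction, or None when nBPAT is not ready."""
        current = self.history[:self.n]
        for i in range(1, self.n + 1):
            if self.history[i:i + self.n] == current:
                return self.history[i - 1]
        return None

    def predict(self, alt_prediction: int) -> int:
        pattern = self.pattern_prediction()
        msb_set = (self.counter >> (self.counter_bits - 1)) & 1
        if pattern is not None and msb_set:
            return pattern
        return alt_prediction  # no match, or AltPred is performing better

    def update(self, outcome: int, alt_prediction: int) -> None:
        pattern = self.pattern_prediction()
        if pattern is not None:  # counter is untouched when not ready
            if pattern == outcome and alt_prediction != outcome:
                self.counter = min(self.counter + 1, self.counter_max)
            elif pattern != outcome and alt_prediction == outcome:
                self.counter = max(self.counter - 1, 0)
        # Shift the new outcome in as h_0 and drop the oldest bit.
        self.history = [outcome] + self.history[:-1]


def count_late_mispredictions(executions: int = 60, tail: int = 30) -> int:
    """Run a (110) branch and count mispredictions in the last `tail` runs."""
    entry = NBPatEntry(n=4)
    pattern = [1, 1, 0]
    misses = 0
    for t in range(executions):
        outcome = pattern[t % 3]
        alt = 1  # hypothetical always-taken AltPred
        if t >= executions - tail and entry.predict(alt) != outcome:
            misses += 1
        entry.update(outcome, alt)
    return misses
```

Once the history has filled with the (110) pattern, the selection counter climbs each time the always-taken stand-in mispredicts a not-taken outcome, so the hybrid settles on the pattern prediction after a short warm-up.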
Combinations of H-Pattern
H-Pattern: Various configuration decisions
AltPred Component: Several possible options, for instance:
• Gshare
• TAGE
• ISL-TAGE
nBPAT Storage Structure:
• Tagged/Tagless
• Associative/Direct Mapped
H-Pattern with Gshare
Configuration:
• Tagless, direct-mapped table used for nBPAT, indexed by a few of the least significant bits of the PC
• 50% of the storage budget assigned to nBPAT
Outcome: A distinct improvement in accuracy was observed, as the results that follow show.
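For readers unfamiliar with the AltPred used here, the following is a minimal Python sketch of a Gshare predictor (XOR of the PC with global history, indexing a table of 2-bit saturating counters) together with the tagless, direct-mapped index computation for the nBPAT table. The table sizes, the initial counter value, and the names `Gshare` and `nbpat_index` are assumptions for illustration; the slides do not give these parameters.

```python
class Gshare:
    """Gshare: index a table of 2-bit counters with PC XOR global history."""

    def __init__(self, index_bits: int = 12) -> None:
        self.mask = (1 << index_bits) - 1
        # Assumption: counters start at 2 (weakly taken).
        self.table = [2] * (1 << index_bits)
        self.ghr = 0  # global history register, index_bits wide

    def _index(self, pc: int) -> int:
        return (pc ^ self.ghr) & self.mask

    def predict(self, pc: int) -> int:
        return 1 if self.table[self._index(pc)] >= 2 else 0

    def update(self, pc: int, outcome: int) -> None:
        i = self._index(pc)
        if outcome:
            self.table[i] = min(self.table[i] + 1, 3)
        else:
            self.table[i] = max(self.table[i] - 1, 0)
        # Shift the outcome into the global history.
        self.ghr = ((self.ghr << 1) | outcome) & self.mask


def nbpat_index(pc: int, index_bits: int) -> int:
    """Tagless, direct-mapped nBPAT index: low-order bits of the PC."""
    return pc & ((1 << index_bits) - 1)
```

Because the nBPAT table in this configuration is tagless, two branches whose PCs share the same low-order bits share an entry; that is the price of spending no storage on tags.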
H-Pattern with Gshare
Mispredictions per Kilo Instructions (MPKI), CBP 2014 Framework

Predictor               4KB    32KB   Unlimited
Gshare                  6.7    5.2    4.6
H-Pattern with Gshare   6.4    4.7    3.8
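To put the Gshare results in relative terms, the short calculation below computes the percentage reduction in MPKI at each budget. It assumes the chart's data labels pair up as Gshare = 6.7, 5.2, 4.6 and H-Pattern with Gshare = 6.4, 4.7, 3.8 for the 4KB, 32KB and Unlimited budgets respectively; the helper name `relative_reduction` is illustrative.

```python
# MPKI values as read from the chart's data labels (pairing is an assumption).
GSHARE_MPKI = {"4KB": 6.7, "32KB": 5.2, "Unlimited": 4.6}
HPATTERN_MPKI = {"4KB": 6.4, "32KB": 4.7, "Unlimited": 3.8}


def relative_reduction(baseline: float, improved: float) -> float:
    """Percentage reduction in MPKI relative to the baseline predictor."""
    return 100.0 * (baseline - improved) / baseline


if __name__ == "__main__":
    for budget, base in GSHARE_MPKI.items():
        gain = relative_reduction(base, HPATTERN_MPKI[budget])
        print(f"{budget}: {gain:.1f}% fewer mispredictions")
```

Under that reading of the chart, the reduction is roughly 4.5% at 4KB, 9.6% at 32KB and 17.4% for the unlimited budget, so the benefit grows with the available storage.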
H-Pattern with TAGE/ISL-TAGE
Minimal portion of storage allocated to nBPAT
The storage structure must facilitate maximum accuracy by nBPAT for very small storage spaces.
The proportion of the storage budget allocated to nBPAT differed across budgets.
The improvement in accuracy was smaller than that achieved with Gshare.
H-Pattern with TAGE/ISL-TAGE
CONFIGURATION: nBPAT STORAGE
Partially tagged, 2-way set-associative.
Selection Counter: 4 bits
Useful Counter: Included in every entry. Serves as a measure of the effectiveness of an entry in the table.
Decremented if:
1. No pattern match is found
2. nBPAT mispredicts and AltPred predicts correctly
Incremented if AltPred mispredicts and nBPAT predicts correctly.
All useful counters are reset periodically using a global reset counter.
This captures the notion of an entry in the table being effective or ineffective, and aids the entry replacement policy.
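The useful-counter rules can be sketched as follows. This is a hedged illustration: the class names `UsefulCounter` and `GlobalReset`, the 3-bit default width, the reset period, and the choice to reset counters to zero are all assumptions, since the slides state only that the counters are reset periodically.

```python
from typing import Iterable


class UsefulCounter:
    """Saturating counter measuring how effective an nBPAT entry is."""

    def __init__(self, bits: int = 3) -> None:
        self.max_value = (1 << bits) - 1
        self.value = self.max_value  # newly allocated entries start at maximum

    def update(self, pattern_matched: bool, nbpat_correct: bool,
               alt_correct: bool) -> None:
        if not pattern_matched or (not nbpat_correct and alt_correct):
            self.value = max(self.value - 1, 0)
        elif nbpat_correct and not alt_correct:
            self.value = min(self.value + 1, self.max_value)
        # Otherwise both agreed (both right or both wrong): no change.


class GlobalReset:
    """Global reset counter; clears all useful counters every `period` ticks."""

    def __init__(self, period: int) -> None:
        self.period = period  # assumption: the period is a free parameter
        self.ticks = 0

    def tick(self, counters: Iterable[UsefulCounter]) -> bool:
        self.ticks += 1
        if self.ticks < self.period:
            return False
        self.ticks = 0
        for counter in counters:
            counter.value = 0  # assumption: "reset" means clear to zero
        return True
```

Clearing the counters periodically lets entries that were once effective but no longer contribute become candidates for replacement.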
H-Pattern with TAGE/ISL-TAGE
UPDATE ALGORITHM:
1. If the TAGE predictor MISPREDICTED, there is no tag match in the nBPAT 2-way associative table, and either of the 2 potential entry locations has Useful = 0, then set Tag = [BranchTag] and Useful = [Maximum] for that location.
2. If the entry ALREADY exists in the nBPAT 2-way associative table, then:
   1. If nBPAT was not ready, OR nBPAT mispredicted and TAGE predicted correctly, decrease Useful.
   2. If nBPAT predicted correctly and TAGE mispredicted, increase Useful.
3. Update the nBPAT entry as described earlier in the nBPAT algorithm.
4. Update the TAGE/ISL-TAGE predictor.
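The allocation and Useful-counter steps for one 2-way set can be sketched as below. The names `Way` and `update_set`, the 3-bit `USEFUL_MAX`, and the returned status strings are assumptions for illustration. The sketch covers only steps 1 and 2; updating the entry's history and selection counter and updating TAGE itself are left out.

```python
from dataclasses import dataclass
from typing import List, Optional

USEFUL_MAX = 7  # assumption: 3-bit useful counter


@dataclass
class Way:
    tag: Optional[int] = None
    useful: int = 0


def update_set(ways: List[Way], tag: int, tage_correct: bool,
               nbpat_ready: bool, nbpat_correct: bool) -> str:
    """Apply steps 1 and 2 of the update algorithm to one 2-way set."""
    entry = next((w for w in ways if w.tag == tag), None)
    if entry is None:
        # Step 1: allocate only on a TAGE misprediction, into a way whose
        # useful counter has dropped to zero.
        if not tage_correct:
            for way in ways:
                if way.useful == 0:
                    way.tag = tag
                    way.useful = USEFUL_MAX
                    return "allocated"
        return "no-op"
    # Step 2: the entry already exists; adjust its useful counter.
    if not nbpat_ready or (not nbpat_correct and tage_correct):
        entry.useful = max(entry.useful - 1, 0)
    elif nbpat_correct and not tage_correct:
        entry.useful = min(entry.useful + 1, USEFUL_MAX)
    return "updated"
```

Note that a branch is never allocated while both ways still hold useful entries, which protects effective entries in a very small table.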
Reference TAGE Configurations
The optimized configuration for an 8-table TAGE predictor, as specified in the paper “A case for (partially) Tagged Geometric history length branch prediction” by André Seznec and Pierre Michaud, was used.
• 4KB: History lengths = 5 to 127
• 32KB: History lengths = 5 to 450
For the unlimited case, 18 tagged tables were used, with history lengths = 3 to 2000.
H-Pattern with TAGE Configurations
• 4KB: Tag length was reduced by 1 in every alternate table starting from T2. A 4-BPAT predictor was used with 7-bit tagged entries & 3-bit useful counters.
• 32KB: Table T6 of TAGE was halved in size. An 8-BPAT predictor was used with 8-bit tagged entries & 4-bit useful counters.
• Unlimited: An 8-BPAT predictor was used with a 16-bit tag width.
H-Pattern with TAGE
Mispredictions per Kilo Instructions (MPKI), CBP 2014 Framework

Predictor             4KB     32KB    Unlimited
TAGE                  3.735   2.678   2.177
H-Pattern with TAGE   3.712   2.644   2.164
Reference ISL-TAGE Configurations
• 4KB: The configuration was the same as the 8-component predictor specified in the paper “A case for (partially) Tagged Geometric history length branch prediction” by André Seznec and Pierre Michaud. Space was freed from the base bimodal predictor, by keeping only 2K prediction entries and 1K hysteresis entries, to accommodate the statistical corrector and loop predictor. History lengths = 5 to 126.
• 32KB: The configuration (including history lengths) was identical to the one specified in the paper “A 64KBit ISL-TAGE branch predictor” by André Seznec, with all storage tables halved.
• Unlimited: 18 tagged tables were used. History lengths = 3 to 2000.
H-Pattern with ISL-TAGE Configurations
• 4KB: From the reference 4KB ISL-TAGE, one tag bit was freed from every alternate table starting from T2. A 4-BPAT predictor was used with 7-bit tagged entries & 3-bit useful counters.
• 32KB: From the reference 32KB ISL-TAGE, the last shared table was halved and the sizes of the statistical corrector and loop predictor were reduced. A 4-BPAT predictor was used with 6-bit tagged entries & 3-bit useful counters.
• Unlimited: In combination with the reference Unlimited ISL-TAGE predictor, an 8-BPAT predictor was used with 16-bit tagged entries & 4-bit useful counters.
H-Pattern with ISL-TAGE
Mispredictions per Kilo Instructions (MPKI), CBP 2014 Framework

Predictor                 4KB     32KB    Unlimited
ISL-TAGE                  3.706   2.549   2.076
H-Pattern with ISL-TAGE   3.691   2.542   2.058
Further Statistics: Success rates
H-Pattern with   Component   Unlimited   32KB     4KB
Gshare           nBPAT       99.30%      99.10%   98.60%
Gshare           AltPred     94.50%      93.40%   91.30%
TAGE             nBPAT       99.50%      99.59%   98.98%
TAGE             AltPred     97.93%      97.15%   96.86%
ISL-TAGE         nBPAT       99.62%      99.56%   98.94%
ISL-TAGE         AltPred     98.09%      97.37%   96.93%

\[ \text{Success rate} = \frac{\text{No. of successful predictions}}{\text{No. of attempted predictions}} \]
Thank You