performance implications of faults in prediction arrays nikolas ladas yiannakis sazeides veerle...
TRANSCRIPT
![Page 1: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/1.jpg)
Performance Implications of Faults in Prediction ArraysPerformance Implications of Faults in Prediction Arrays
Nikolas LadasYiannakis Sazeides Veerle Desmet
University of Cyprus Ghent University
DFR’ 10Pisa, Italy - 24/1/2010
HiPEAC2010
![Page 2: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/2.jpg)
2
MotivationMotivation● Technology scaling: Opportunities and Challenges● Reliability and computing tomorrow
● Failures will not be exceptional● Various sources of failures
● Manufacturing: imperfections, process-variation● Physical phenomena: soft-errors, wear-out● Power constraints: control operation below Vcc-min
● Key challenge: provide reliable operation with little or no performance degradation in the presence of faults with low-overhead solutions
Nikolas Ladas 24/1/2010
![Page 3: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/3.jpg)
3
Architectural vs Non-Architectural FaultsArchitectural vs Non-Architectural Faults● So far research mainly focused on correctness● Emphasis architectural structures, e.g. caches, registers,
buses, alus etc● However, faults can occur in non-architectural structures,
e.g. predictor and replacement arrays● Faults in non-architectural structures may degrade
performance● Not issue for soft-errors● Can be problem for persistent faults: wear-out, process-
variation, operation below Vcc-min
Nikolas Ladas 24/1/2010
![Page 4: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/4.jpg)
4
Non-architectural ResourcesNon-architectural Resources Arrays
• line predictor• branch direction predictor• return-address-stack• indirect jump predictor• memory dependence prediction• way, hit/miss, bank predictors • replacement arrays (various caches)• hysteresis arrays (various predictors)
• ... Non-Arrays
• branch target address adder• memory prefetch adder• ....
EV6 like core array bits breakdown
Nikolas Ladas 24/1/2010
![Page 5: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/5.jpg)
5
This talk…This talk…● Quantify performance implications of faults in non-
architectural array-structures● Identify which non-architectural array-structures are
the most sensitive to faults● Do we need to worry about protecting these
structures?
Nikolas Ladas 24/1/2010
![Page 6: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/6.jpg)
6
OutlineOutline● Fault model / Experimental framework● Performance implications of faults when all non-
architectural arrays are faulty● Criticality of the non-architectural arrays studied● Fault semantics● Conclusions and future direction
Nikolas Ladas 24/1/2010
![Page 7: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/7.jpg)
7
Faults and ArraysFaults and Arrays Faults may occur in different parts of an array We only consider cell faults
.
.
.
. . .cell cell ce
llcell cell cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell
cell cell
cell cell cell cell cell cell
WL
WL
WL
WL
WL
WL
WL
WL
WL
WL
BL BL’ BL
BL’ BL BL’ BL BL’ BL BL’ BL BL’
cell
driver
decoder bitline
wordlinewordline
Nikolas Ladas 24/1/2010
![Page 8: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/8.jpg)
8
Array Fault Modeling Key ParametersArray Fault Modeling Key Parameters Number of faults:
• consider % of cells that are faulty: 0.125 and 0.5• Understand performance trends with increasing number of faults
Fault Locations• consider random fault locations each affecting 1 cell• Try to capture average behavior
Model for each fault• each faulty cell randomly set at either stuck-at-1 or stuck-at-0
Nikolas Ladas 24/1/2010
![Page 9: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/9.jpg)
9
Processor ModelProcessor Model• EV7 like processor with 15 stage pipeline• 4-way ooo, mispredictions resolved at commit
• Non-Architectural Arrays Considered• Line Predictor Array: 4K entries, 11
bits/entry• Line Predictor Hysteresis Array: 4K entries, 2 bits/entry• LRU array for 2-way 64KB 64B/block I$ : 512 entries, 1 bit/entry• LRU array 2-way 64KB 64B/block D$ : 512 entries, 1 bit/entry • Gshare Direction Predictor: 32Kentries, 2bits/entry• Return address stack: 16 entries, 31bits/entry• Memory dependence predictor (load-wait) 1024 entries, 1 bit/entry
• sim-alpha simulator • SPEC CPU 2000 benchmarks – 100 M instructions
• Representative regionsNikolas Ladas 24/1/2010
![Page 10: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/10.jpg)
10
ExperimentsExperiments Baseline performance: runs with no faults For experiments with faults:
• For each run all arrays with faults have same % of faulty bits 0.125, 0.5
• ALL experiments are performed using the same 100 randomly generated fault maps (50 for each % of faulty bits)
0.125% 0.5% Gshare Direction Predictor 65536 bits: 82 328 Line Predictor Array 45056 bits: 56 225 Line Predictor Hysteresis Array 8192 bits: 10 41 Memory dependence predictor 1024 bits: 1 5 2-way 64KB 64B/block I$ LRU array 512 bits: 1 3 2-way 64KB 64B/block D$ LRU array 512 bits: 1 3 Return address stack 496 bits: 1 3
Nikolas Ladas 24/1/2010
![Page 11: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/11.jpg)
11
Performance with 0.125% Faulty Bits (all arrays faulty)Performance with 0.125% Faulty Bits (all arrays faulty)
Nikolas Ladas 24/1/2010
![Page 12: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/12.jpg)
12
Performance with 0.5% of Faulty Bits (all arrays faulty)Performance with 0.5% of Faulty Bits (all arrays faulty)
Nikolas Ladas 24/1/2010
![Page 13: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/13.jpg)
13
Observations with all arrays faultyObservations with all arrays faulty• Performance degradation substantial even with small % of faulty bits• Both INT and FP benchmarks can degrade
0.125 0.5• Average degradation 1% 3.5%• Max degradation 39% 53% • Degradation is benchmark specific
• Instruction mix (different number and type of vulnerable instructions)• Programs with high accuracy more vulnerable than those with low accuracies• When few arrays entries accessed by a program it takes large number of faults to
have faulty entries accessed• Some benchmarks are memory dominated
• Worst-case degradation much greater than average • Will cause performance variation between otherwise identical
cores/chips • Are all bits equally vulnerable? Which unit(s) matter the most?
Nikolas Ladas 24/1/2010
![Page 14: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/14.jpg)
14
Performance for Each StructurePerformance for Each Structure(0.125% faulty bits)(0.125% faulty bits)
26 benchmarks x 50 experiments for each section
Nikolas Ladas 24/1/2010
![Page 15: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/15.jpg)
15
Performance for Each StructurePerformance for Each Structure(0.5% faulty bits)(0.5% faulty bits)
26 benchmarks x 50 experiments for each section
Nikolas Ladas 24/1/2010
![Page 16: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/16.jpg)
16
ObservationsObservations• For the processor configuration used in this study the
various non-architectural units are not equally vulnerable to same fraction of faults.
• RAS and BPRED are the most sensitive to faults• Line predictor and load-wait predictor degrade
performance significantly when there are 0.5% faults• 2-way I$ and D$ are not sensitive even at 0.5% of faults
in the LRU array
Nikolas Ladas 24/1/2010
![Page 17: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/17.jpg)
17
Reasons for Variable Vulnerability across unitsReasons for Variable Vulnerability across units● Semantics of faults vary across unit● Some faults cause flushing the pipeline, others delay the
execution of an instruction, others cause a one-cycle bubble● Faults causing delays can be less severe since they can be hidden
in the shadow of a misprediction or with ooo● Units with typically higher accuracy more vulnerable (RAS
and conditional predictor)● Even within a unit faults can have different semantics
Nikolas Ladas 24/1/2010
![Page 18: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/18.jpg)
18
Semantics of Faults for a 2-bit Replacement Semantics of Faults for a 2-bit Replacement State Action0x Replace1x No replace0/1 Stack-at value
00R
11N
10N
01R
00R
01R
11N
10N
11N
01R
00R
10N
00R
11N
Always Replace
Never Replace
01R
10N
Nikolas Ladas 24/1/2010
![Page 19: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/19.jpg)
19
Repair mechanism: XOR RemappingRepair mechanism: XOR RemappingAccess map
Fault map
0
40
3
20
50
100
0
0
1
0
0
1
1
0
0
0
XOR 1
•Access map: counts access/entry during an interval•Fault Map: indicates which entries are faulty (can be determined at manufacturing test or at very coarse intervals using BIST)•Remap the index using XOR to minimize faulty accesses•At regular intervals search for the optimal XOR value using the access map and fault map
After remapping
Faulty accesses: 143 70
Nikolas Ladas 24/1/2010
![Page 20: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/20.jpg)
Results
•26 benchmarks x 10 fault maps per category•Recovers most of the performance degradation•Possible to make things worse if we remap when there is no need
20Nikolas Ladas 24/1/2010
![Page 21: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/21.jpg)
21
Summary-ConclusionsSummary-Conclusions● Faults in non-architectural arrays can degrade processor
performance ● Not all faults are equally important. Fault semantics vary.
● RAS and conditional branch predictor the most critical● Faults can cause performance non-determinism across
otherwise identical chips or within the cores of the same chip
Nikolas Ladas 24/1/2010
![Page 22: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/22.jpg)
22
Future WorkFuture Work● Develop analytical model to predict the performance
distribution for a given failure rate● Understand implications of faults for other architectural
and non-architectural structures
Nikolas Ladas 24/1/2010
![Page 23: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/23.jpg)
23
AcknowledgmentsAcknowledgments Costas Kourougiannis
Funding: University of Cyprus, Ghent University, HiPEAC, Intel
Nikolas Ladas 24/1/2010
![Page 24: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/24.jpg)
24
Thanks!Thanks!
![Page 25: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/25.jpg)
25
BACKUP SLIDESBACKUP SLIDES
![Page 26: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/26.jpg)
26
Fault SemanticsFault Semantics Line Predictor Array:
• incorrect prediction• Conditional, returns get corrected within a cycle, indirects are resolved much later
Line Predictor Hysteresis Array: • Always update prediction on a misprediction• Never update
2-way 64KB 64B/block I$ and D$ LRU arrays• Converts sets with faulty LRU bit to direct mapped sets, more misses but can hide
Gshare Direction Predictor• faulty entries always predict taken or always not-taken• Incorrect prediction that gets resolved late (25% chance been lucky)
Return address stack• Return misprediction is resolved late
Memory dependence predictor (load-wait)• Independent load wait (common case we should not wait) can partially hide• Dependent load not wait (this should rarely be a serious problem)
Nikolas Ladas 24/1/2010
![Page 27: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/27.jpg)
27
Processor PipelineProcessor Pipeline
27
. . .
0
4
40954092
8
Adder
4 nops
PCCT1
CT 2
Instruction cache
Line predictor
Branch predictor
RAS
Update line prediction
adder
CT 3
Update program counter
4 nops
CT 4
Miss
Hit
L2
4xn
n n
4xn
=
NLS_PC
Correct PC
Fetch stage Slot stage Commit stage
. . .
Writeback stage
Assign value to PC
(indirect jump)
![Page 28: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/28.jpg)
28
Line predictor Logical structure Line predictor Logical structure
28
.
.
.
.
.
.
TAG 91 2
Predecode bits
Instruction Cache
way0 way1
51
2
inst0 inst1 inst2 inst3
sb0 sb1 sb2 sb3 sb0 sb1 sb2 sb3
inst0 inst1 inst2 inst3
sbX
Valid sb
![Page 29: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/29.jpg)
31
Functional Faults and Array Logical ViewFunctional Faults and Array Logical View
cell
output bit
row address
data_in
Not practical to study faults at physical levelFunctional Models: Abstractions that ease study of faultsFault locations: cell, input address, input/output dataWe only consider cell faults
![Page 30: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/30.jpg)
32
BIST for Detecting Faults and Updating Fault MapBIST for Detecting Faults and Updating Fault Map
![Page 31: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/31.jpg)
33
Example Remapping Search AlgoExample Remapping Search Algo
![Page 32: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/32.jpg)
34
Interleaved vs Non-Interleaved Design Style (1)Interleaved vs Non-Interleaved Design Style (1) Each array wordline contains many entries
Entries in the physical implementation are bit-interleaved• More area efficient
![Page 33: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/33.jpg)
35
Interleaved vs Non-Interleaved Design Style (2)Interleaved vs Non-Interleaved Design Style (2) But a cluster faults affects more entries in interleaved design
For architectural structures: • Soft-errors prefer interleaved• Hard-errors: map to spare/disable block/set
For non-architectural structures: • Soft-errors – no need for protection • Hard-errors: prefer non-interleaved (if area not issue)
![Page 34: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/34.jpg)
36
4K LP:No Interleaving vs Interleaving (average 4K LP:No Interleaving vs Interleaving (average random)random)
![Page 35: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/35.jpg)
37
Random results without and with remappingRandom results without and with remapping
![Page 36: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/36.jpg)
38
Expected InvariantsExpected InvariantsWith increasing faults more performance degradation
Frequently accessed entries more critical than less accessed entries
Cell stuck-at-1 more critical if bits stored in the cell are biased towards zero
![Page 37: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/37.jpg)
39
Worst-case - Hit rateWorst-case - Hit rate
![Page 38: Performance Implications of Faults in Prediction Arrays Nikolas Ladas Yiannakis Sazeides Veerle Desmet University of Cyprus Ghent University DFR’ 10 Pisa,](https://reader036.vdocuments.site/reader036/viewer/2022062519/5697bff41a28abf838cbcf93/html5/thumbnails/38.jpg)
40
Random results without and with remappingRandom results without and with remapping