Virtually-Aged Sampling DMR
TRANSCRIPT
![Page 1](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/1.jpg)
Raghuraman Balasubramanian
Karthikeyan Sankaralingam
Virtually-Aged Sampling DMR: Unifying Circuit Failure Prediction and Detection
![Page 2](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/2.jpg)
Microprocessor Reliability
• A lot of research on how to…
  • Mitigate / Recover / Repair…
  • Detect: DMR, DIVA, Argus, BIST, SWAT…
  • Predict: Canaries, Razor, WearMon…
  • Coverage, detection latency, fault type…
[Figure: failure rate vs. time (years)]
More devices will fail in the field in future technology nodes
![Page 3](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/3.jpg)
Circuit Failure Prediction
• Our goals
  • Low Design Complexity
  • Low Overheads
  • High Accuracy
  • Full Coverage
[Figure: timeline in years; a gate fails and causes errors. Can we predict the failure?]
![Page 4](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/4.jpg)
To get there…
Let's start from a good baseline: Sampling-DMR
![Page 5](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/5.jpg)
Sampling+DMR
Nomura, Shuou, et al. "Sampling+DMR: practical and low-overhead permanent fault detection." International Symposium on Computer Architecture (ISCA), 2011.
• Permanent fault detection
  • 100% coverage
  • < 2% energy overheads
[Figure: Sampling+DMR hardware. A checked core and a checker core, each with a processor, router, and reliability manager (signature generator, comparator, control; signals: error, checker ID, DMR active, age mode, trace, stall, cache refill)]
[Figure: timelines contrasting full-time DMR, where the checked and checker cores are always coupled, with Sampling+DMR, where short DMR-mode checking windows alternate with normal operation and the checker core is otherwise occupied or free]
![Page 6](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/6.jpg)
But There is a Problem
[Figure: timeline (years) of a µP; after a gate fails, architectural errors occur around the sampling windows]
Sampling-DMR with infrequently occurring errors: missed errors!
Virtually Aged Sampling-DMR
Virtual aging makes the gates behave as if they were 6 months older
![Page 7](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/7.jpg)
Virtually Aged Sampling DMR
Virtual Aging / Fault Exposure
• In most gates the faults are automatically exposed
• A new mechanism to expose faults in other gates
Detect Errors
[Figure: timeline (years) marking today and the virtual age, ahead of the point where a gate fails and causes errors; while Sampling-DMR is active, applications run on a checked core (virtually aged) and a checker core]
![Page 8](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/8.jpg)
Executive Summary
• Virtually Aged Sampling-DMR
  • Microprocessor Failure Prediction
  • Full logic coverage
  • With < 0.7% energy overhead
  • Negligible performance overhead
![Page 9](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/9.jpg)
Outline
• Motivation and Overview
• Virtual aging?
• Are all gates covered?
• Evaluation Methodology
• Results
• Related work
• Questions
![Page 10](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/10.jpg)
Virtual Aging
[Figure: gate delay vs. time in months, drawn against the clock period; annotations: 50 ps, slack = 70 ps, slack = 20 ps]
As a chip wears out, the gates become slower
[Figure: gate delay vs. Vdd; annotations: 50 ps, 40 mV]
As we decrease Vdd, the gates become slower
Virtual aging => Reducing Vdd == 6-month Delay Degradation
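The equivalence on this slide can be illustrated with a toy model. The sketch below assumes, purely for illustration, that path delay grows linearly with both age and Vdd reduction. The function names and the two rates (50 ps per 6 months, 50 ps per 40 mV, loosely echoing the plot annotations) are hypothetical and are not taken from the paper.

```python
# Toy linear model (an assumption, not the paper's model):
# extra delay grows linearly with age and with the Vdd reduction.
AGING_PS_PER_MONTH = 50.0 / 6.0   # hypothetical: 50 ps of extra delay after 6 months
VDD_PS_PER_MV = 50.0 / 40.0       # hypothetical: 50 ps of extra delay per 40 mV drop


def path_delay_ps(base_delay_ps, months=0.0, vdd_drop_mv=0.0):
    """Path delay after `months` of wearout and a supply reduction of `vdd_drop_mv`."""
    return base_delay_ps + AGING_PS_PER_MONTH * months + VDD_PS_PER_MV * vdd_drop_mv


def virtual_age_months(vdd_drop_mv):
    """Months of wearout that add the same delay as the given Vdd reduction."""
    return VDD_PS_PER_MV * vdd_drop_mv / AGING_PS_PER_MONTH


def slack_ps(clock_period_ps, delay_ps):
    """Timing slack; a negative value means the path misses the capture edge."""
    return clock_period_ps - delay_ps


if __name__ == "__main__":
    clock, base = 1000.0, 930.0   # hypothetical path with 70 ps of initial slack
    aged = path_delay_ps(base, months=6.0)
    virtually_aged = path_delay_ps(base, vdd_drop_mv=40.0)
    # Under this toy model both paths end up with the same remaining slack.
    print(slack_ps(clock, aged), slack_ps(clock, virtually_aged))
    print(virtual_age_months(40.0))
```

In this model, lowering Vdd today leaves a path with the same slack it would have after the matching number of months of wearout, which is the sense in which the gates are "virtually aged".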
![Page 11](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/11.jpg)
Outline
• Motivation and Overview
• Virtual aging
• **Are all gates covered?**
• Evaluation Methodology
• Results
• Related work
• Questions
![Page 12](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/12.jpg)
Are all gates covered?
• Most gates (near-critical paths) ✔
  • Initial worst-case propagation delay ≈ clock period
  • Wearout → propagation delay ↑ > clock period
  • Delay fault is naturally exposed
• Some gates (non-critical paths) ✗
  • Initial worst-case propagation delay << clock period
  • Wearout → propagation delay ↑ < clock period → Fault is not manifested
  • Delay degradation is benign
  • Eventually catastrophic breakdown
Photo credit: Wikimedia Commons
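The distinction between the two kinds of paths comes down to one comparison: a delay fault shows up only when the degraded delay exceeds the clock period. A minimal sketch of that check follows. The path names and delay values are made up for illustration.

```python
def fault_is_exposed(initial_delay_ps, degradation_ps, clock_period_ps):
    """True when the degraded path delay exceeds the clock period."""
    return initial_delay_ps + degradation_ps > clock_period_ps


def classify_paths(paths, degradation_ps, clock_period_ps):
    """Split paths (name -> initial delay in ps) into exposed and benign lists."""
    exposed, benign = [], []
    for name, delay in paths.items():
        if fault_is_exposed(delay, degradation_ps, clock_period_ps):
            exposed.append(name)
        else:
            benign.append(name)
    return exposed, benign


if __name__ == "__main__":
    # Hypothetical delays: one path close to the clock period, one far below it.
    paths = {"near_critical": 960.0, "non_critical": 400.0}
    print(classify_paths(paths, degradation_ps=50.0, clock_period_ps=1000.0))
```

With the same 50 ps of degradation, only the near-critical path crosses the clock period. The non-critical path keeps working even though its gate has degraded just as much, which is why it needs a separate exposure mechanism.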
![Page 13](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/13.jpg)
Soft and Hard breakdown
[Figure (a) Near-Critical Paths: flip-flop timing diagrams (CLK, input, D, Q) at 0 years, 2.5 years, and 3 years. Degradation consumes the guardband until a timing violation (soft breakdown) occurs at the capture edge. At 2.5 years + virtual aging the fault is manifested and exposed.]
[Figure (b) Non-Critical Paths: flip-flop timing diagrams with large slack. Degradation continues until hard breakdown. At the normal capture edge no fault is seen even though the fault is manifested; a phased clock capturing Q' exposes the fault.]
Degradation = f(utilization, operating conditions, process variations)
Any gate may fail.
![Page 14](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/14.jpg)
Fault Capture Logic for Non-Critical Paths
[Figure: circuit clocked by CLK, with near-critical paths and a non-critical path through a fast gate]
![Page 15](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/15.jpg)
Comprehensive Logic coverage
[Figure: the same circuit (CLK, near-critical paths, non-critical path through a fast gate) with additional logic inserted to cover fast gates: a capture flop driven by a phased CLK through a clock gate controlled by the aging mode signal]
Fault Capture Logic for Non-Critical Paths
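One way to picture the capture logic is as an earlier sampling point for fast paths. The sketch below is a behavioral model built on assumptions, not the paper's circuit. It supposes that in aging mode a capture flop samples a non-critical path at a phase-shifted clock edge that arrives before the main capture edge, so a modest delay increase is caught there while the main flop still sees correct data. All names and numbers are hypothetical.

```python
def main_flop_sees_fault(path_delay_ps, clock_period_ps):
    """The regular flop only misses data that arrives after the full clock period."""
    return path_delay_ps > clock_period_ps


def capture_flop_sees_fault(path_delay_ps, phased_edge_ps, aging_mode):
    """Capture flop on a phased clock (assumed behavior).

    Outside aging mode the clock gate keeps the capture flop idle, so it
    reports nothing. In aging mode it flags data arriving after the phased edge.
    """
    if not aging_mode:
        return False
    return path_delay_ps > phased_edge_ps


if __name__ == "__main__":
    clock_period = 1000.0        # hypothetical clock period in ps
    phased_edge = 430.0          # hypothetical phased edge just after the fresh delay
    fresh, degraded = 400.0, 450.0

    for label, delay in (("fresh", fresh), ("degraded", degraded)):
        print(label,
              main_flop_sees_fault(delay, clock_period),
              capture_flop_sees_fault(delay, phased_edge, aging_mode=True))
```

In this model the degraded fast gate is invisible to the main flop but is flagged by the capture flop. Because the extra flop hangs off the non-critical path, nothing on the critical paths needs to change.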
![Page 16](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/16.jpg)
Virtually Aged SDMR
[Figure: processor circuit driven by CLK, with a virtual ager that controls the supply voltage through DVS and clock phase shifting logic for fault exposure; no modifications to critical paths]
Low Overheads
Generality: { Soft Breakdown, Hard Breakdown }
High Accuracy
Low Design-Complexity
![Page 17](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/17.jpg)
Outline
• Motivation and Overview
• Virtual aging?
• Are all gates covered?
• **Evaluation Methodology**
• Results
• Related work
• Questions
![Page 18](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/18.jpg)
Evaluation Methodology
[Figure: evaluation flow. Synopsys HSPICE + MOSRA gives delay as a function of time/Vdd (plot of gate delay vs. time in months against the clock period, slack shrinking from 70 ps to 20 ps). Delay-aware simulation of the µP, driven by input sequences from applications, produces a fault vector. The fault vector is applied to a µP running applications to check for a DMR error.]
• Full SPEC benchmarks
• OpenRISC Processor
• ~400,000 Fault Injection Experiments
![Page 19](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/19.jpg)
Outline
• Motivation and Overview
• Virtual aging?
• Are all gates covered?
• Evaluation Methodology
• **Results**
• Related work
• Questions
![Page 20](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/20.jpg)
Results
1. Is delay degradation measurably observable?
2. Can voltage reduction mimic virtual aging?
3. Do the manifested faults get exposed to the μ-arch and cause timing faults?
4. Do the faults exposed to the microarchitecture translate to architectural errors that are then detected?
5. What are the overheads?
Paper includes results on running 10 SPEC benchmarks to completion, spanning almost 400,000 experimental runs
![Page 21](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/21.jpg)
1. Is delay degradation measurably observable?
• 5 gates represent fault sites
• Model paths through these gates in HSPICE
• MOSRA wearout models
![Page 22](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/22.jpg)
2. Can voltage reduction mimic virtual aging?
• HSPICE @ Vdd = 1.2 V, Vdd = 1.15 V
![Page 23](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/23.jpg)
5. What are the overheads?
• Synthesized with 32nm Synopsys process
• Implemented additional logic for fast paths

Gates on fast path: 39% (OpenRISC), 30% (OpenSPARC)

| | OpenRISC Logic | OpenRISC Processor | OpenSPARC Logic | OpenSPARC Processor |
|---|---|---|---|---|
| Area Overhead | 28.9% | 8.9% | 22.2% | 6.8% |
| Peak Power Increase | 3.2% | 2.54% | 2.21% | 0.99% |
| Energy Increase | 0.9% | 0.7% | 1.02% | 1.07% |

![Page 24](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/24.jpg)
Results - Summary
• Experimental result: predict failures 9 months in advance using a Vdd reduction of 50 mV
• Empirical result + mathematical modeling: can predict failure within 0.4 days in all but 1 of 1 billion chips
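The slide does not spell out the mathematical model. As a rough sketch of how per-window detection rates can turn into a bound on prediction latency, assume each sampling window independently detects a manifested fault with probability `p_window`. The chance of missing it in `n` consecutive windows is then `(1 - p_window) ** n`. The independence assumption and the function names are illustrative and are not the paper's model.

```python
import math


def miss_probability(p_window, n_windows):
    """Probability that n independent sampling windows all miss the fault."""
    return (1.0 - p_window) ** n_windows


def windows_needed(p_window, miss_target):
    """Smallest n such that (1 - p_window) ** n <= miss_target."""
    if not 0.0 < p_window < 1.0:
        raise ValueError("p_window must be strictly between 0 and 1")
    if not 0.0 < miss_target < 1.0:
        raise ValueError("miss_target must be strictly between 0 and 1")
    return math.ceil(math.log(miss_target) / math.log1p(-p_window))


if __name__ == "__main__":
    # 0.01% is the smallest per-window architectural error rate reported in
    # the deck's error-rate table; 1e-9 stands for "1 of 1 billion chips".
    n = windows_needed(p_window=1e-4, miss_target=1e-9)
    print(n, miss_probability(1e-4, n))
```

Turning the window count into days would also need the interval between sampling windows, which this slide does not give.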
![Page 25](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/25.jpg)
Outline
• Motivation and Overview
• Virtual aging?
• Are all gates covered?
• Evaluation Methodology
• Results
• **Related work**
• Questions
![Page 26](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/26.jpg)
Circuit Failure Prediction
• Predict the onset of failures
  • Low Design Complexity
  • Low Overheads
  • High Accuracy
  • Full Coverage
[Figure: timeline in years; a gate fails and causes errors. Can we predict the failure?]
![Page 27](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/27.jpg)
Related Work

| Technique | Complexity | Overheads | Accuracy | Coverage |
|---|---|---|---|---|
| Canary circuits | ✓ | ✓ | ✗ | ✗ |

On-chip test circuits
![Page 28](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/28.jpg)
Related Work

| Technique | Complexity | Overheads | Accuracy | Coverage |
|---|---|---|---|---|
| Canary circuits | ✓ | ✓ | ✗ | ✗ |
| Age Detection (Shadow) Latches | ✗ | ✗ | ✓ | ✗ |

Detect aging in select near-critical paths
![Page 29](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/29.jpg)
Related Work

| Technique | Complexity | Overheads | Accuracy | Coverage |
|---|---|---|---|---|
| Canary circuits | ✓ | ✓ | ✗ | ✗ |
| Age Detection (Shadow) Latches | ✗ | ✗ | ✓ | ✗ |
| BIST/DFT Aging Analysis | ✗ | ✗ | ✓ | ✗ |

Periodic testing (offline) using on-chip test vectors
![Page 30](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/30.jpg)
Related Work

| Technique | Complexity | Overheads | Accuracy | Coverage |
|---|---|---|---|---|
| Canary circuits | ✓ | ✓ | ✗ | ✗ |
| Age Detection (Shadow) Latches | ✗ | ✗ | ✓ | ✗ |
| BIST/DFT Aging Analysis | ✗ | ✗ | ✓ | ✗ |
| Continuous Delay Tracking | ✗ | ✗ | ✓ | ✗ |

Measure + Analyze (online)
![Page 31](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/31.jpg)
Related Work

| Technique | Complexity | Overheads | Accuracy | Coverage |
|---|---|---|---|---|
| Canary circuits | ✓ | ✓ | ✗ | ✗ |
| Age Detection (Shadow) Latches | ✗ | ✗ | ✓ | ✗ |
| BIST/DFT Aging Analysis | ✗ | ✗ | ✓ | ✗ |
| Continuous Delay Tracking | ✗ | ✗ | ✓ | ✗ |
| Virtually Aged Sampling DMR | ✓ | ✓ | ✓ | ✓ |

Reduce Vdd + Expose Faults
![Page 32](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/32.jpg)
Contributions
• Virtually Aged Sampling-DMR
  • Microprocessor Failure Prediction
  • Full logic coverage
  • With < 0.7% energy overhead
  • Negligible performance overhead
• A new state-of-the-art in evaluation
  • Accurate wearout models at the gate level
  • And impact on full system (running full benchmarks)
Thank You
![Page 33](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/33.jpg)
How Devices Degrade
• NBTI, HCI, TDDB
• Over time, Threshold Voltage Increases ⇒ Propagation Delay Increases
• NOT covered: Electromigration, thermal runaway

$$t_d = \frac{2LC}{W \mu_{eff} C_{ox} (V_{dd} - V_{th})^2}$$

Target failure mechanisms for which delay degradation is a symptom
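Reading the slide's delay expression as t_d = 2LC / (W µ_eff C_ox (Vdd − Vth)^2), the delay depends on the voltages only through the overdrive Vdd − Vth. Under that expression, raising Vth by some amount slows the gate exactly as much as lowering Vdd by the same amount. The sketch below checks this numerically. The constant `k` lumps the device parameters together, and the threshold voltage of 0.3 V is a hypothetical value chosen for illustration.

```python
def gate_delay(vdd, vth, k=1.0):
    """t_d = k / (Vdd - Vth)^2, with k standing in for 2LC / (W * mu_eff * C_ox)."""
    overdrive = vdd - vth
    if overdrive <= 0:
        raise ValueError("Vdd must exceed Vth")
    return k / overdrive ** 2


def relative_slowdown(vdd, vth, delta_vth=0.0, delta_vdd=0.0):
    """Delay after a Vth increase and/or a Vdd decrease, relative to the fresh gate."""
    return gate_delay(vdd - delta_vdd, vth + delta_vth) / gate_delay(vdd, vth)


if __name__ == "__main__":
    vdd, vth = 1.2, 0.3   # 1.2 V nominal supply; Vth = 0.3 V is a hypothetical value
    worn = relative_slowdown(vdd, vth, delta_vth=0.05)     # wearout raises Vth by 50 mV
    lowered = relative_slowdown(vdd, vth, delta_vdd=0.05)  # supply lowered by 50 mV
    print(worn, lowered)
```

Real devices follow more detailed models, so this one-to-one correspondence is only as good as the simplified expression it comes from.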
![Page 34](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/34.jpg)
Variations
• Process variations (Static)
  • Some processors are more susceptible
• Voltage variations (Dynamic)
  • Variations ~1 order of magnitude smaller compared to degradation
  • Similar conditions in actual failure & virtual aging
Reddi, Vijay Janapa, et al. "Voltage noise in production processors." IEEE Micro 31.1 (2011).
![Page 35](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/35.jpg)
When does this not work?
• Only when the conditions change drastically between prediction and actual failure
  • Change in program behavior
  • Operating conditions (temperature, voltage, etc.)
  • Program hides fault exposure (but stresses it)
• As long as the fault is manifested 0.4 days before the actual failure, Aged-SDMR works.
![Page 36](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/36.jpg)
Evaluation setup
[Figure: evaluation flow in four stages]
• Manifesting Faults: OpenRISC RTL → Synopsys Design Compiler (32nm lib) → netlist → gate subcircuit (gate under test on the worst-case path) → HSPICE + MOSRA, with time, voltage, and switching activity as inputs; output is delay vs. time at different gates, supply voltage reductions, and switching activity variations
• Exposing Faults: delay-aware simulation (VCS) of the netlist, driven by SPEC2000 running on an OpenRISC processor on a Xilinx Zynq FPGA; sensitivity vector extraction (sensitivity vector for fan-in for each gate during a 100,000-cycle sampling window); output is the timing fault rate 9 months from degradation, SS-DMR mode
• Detecting Architectural Errors: Aged SDMR emulation on a Xilinx Zynq FPGA with two OpenRISC processors (architecture state), a test controller, a checker, an ager, SPEC2000, and fault vectors; output is the architectural error rate
• Estimating Overheads: netlist → Synopsys Design Compiler STA → fast gates → script to insert capture logic → modified netlist; output is area, power, and energy overheads
Q1: Is delay degradation in CMOS logic measurably observable? Is this deterministic?
Q2: Can reducing supply voltage virtually manifest wearout faults?
Q3: Do these faults get exposed to the micro-architecture and cause timing faults?
Q4: Do these timing faults translate to architectural errors that are then detected?
Q5: What are the overheads?
A1, A2: Figure 8
A3: Table 4
A4: Table 5
A4: Table 6
![Page 37](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/37.jpg)
3. Do the manifested faults get exposed to the μ-arch and cause timing faults?
• Delay Aware Simulation
  • Input sequences from OpenRISC FPGA
  • 10 benchmarks (6 SPEC INT, 4 SPEC FP)
  • 5 million cycle traces x 3 phases of the program
  • Cycle accurate fault vectors
We saw timing faults appear during the sampling windows
![Page 38](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/38.jpg)
4. Do the faults exposed in the microarchitecture translate to architectural errors that are then detected?
• Fault vector from delay aware simulation
• Injected on OpenRISC on FPGA + DMR emulation

| Appln | G1 | G2 | G3 | G4 | G5 |
|---|---|---|---|---|---|
| ammp | 1.60% | 3.10% | 5.10% | 1.40% | 1.40% |
| art | 0.02% | 2.70% | 0.01% | 2.60% | 0.01% |
| bzip | 2.30% | 1.20% | 0.90% | 0.20% | 0.07% |
| gzip | 1.50% | 0.03% | 0.40% | 0.04% | 0.01% |
| mcf | 3.40% | 3.10% | 0.90% | 0.70% | 0.02% |
| mesa | 2.20% | 1.00% | 1.20% | 0.09% | 0.80% |
| parser | 4.30% | 1.30% | 1.90% | 0.50% | 1.50% |
| quake | 1.90% | 0.90% | 0.80% | 0.20% | 1.30% |
| twolf | 3.30% | 1.10% | 0.02% | 4.30% | 1.90% |
| vpr | 2.60% | 0.80% | 2.10% | 0.70% | 1.60% |

Architectural error rate using 100,000-cycle sampling windows
![Page 39](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/39.jpg)
Canary based
[Figure: processor circuit driven by CLK alongside a representative test circuit; delay measurement on the test circuit decides whether a failure is predicted]
Low Overheads
Generality: { Soft Breakdown, Hard Breakdown }
Low Design-Complexity
High Accuracy
![Page 40](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/40.jpg)
Age Detection Latches
[Figure: processor circuit driven by CLK with an age detecting latch; legend: circuit level failures, gates covered, gates missed]
Low Overheads
Generality: { Soft Breakdown, Hard Breakdown }
High Accuracy
Low Design-Complexity
![Page 41](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/41.jpg)
BIST/DFT Based (Offline)
[Figure: processor circuit driven by CLK with a BIST/DFT test circuit that generates test vectors; test vector in, test vector out, and analysis decide whether a failure is predicted]
Low Overheads
Generality: { Soft Breakdown, Hard Breakdown }†
High Accuracy
Low Design-Complexity
![Page 42](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/42.jpg)
Continuous Degradation Tracking
[Figure: processor circuit driven by CLK with degradation measurement and analysis deciding whether a failure is predicted]
Low Overheads
Generality: { Soft Breakdown, Hard Breakdown }†
High Accuracy
Low Design-Complexity
![Page 43](https://reader033.vdocuments.site/reader033/viewer/2022050114/5f4b06fc0246c13e6c081aeb/html5/thumbnails/43.jpg)
Evaluation Methodology: Key Challenges
• Aged-SDMR is a cross-layered approach
  • Wearout is a gate-level phenomenon
  • Sampling-DMR works at the architecture level
• Application dependency
  • Technique relies on the application to expose faults
Run full applications on a full system simulator & model wearout at the device level