April 30, 2014 1
Cost efficient soft-error protection for ASICs
Tuvia Liran; Ramon Chips Ltd. (tuvia@ramon-chips.com)
What is a soft error?
Soft error: An error in the functionality of the chip/system due to corruption of data, without permanent damage
Hard error: An error in the functionality of the chip/system due to permanent damage
Key topics
• SE basics
• Sources and mechanisms of SE
• Single event effects
• Methodologies for SE mitigation
• Summary
Terminology
• SE – Soft Error
• SER – Soft Error Rate
• SEE – Single Event Effect
  – SEU – Single Event Upset
  – SET – Single Event Transient
  – SEL – Single Event Latch-up
• SEFI – Single Event Functional Interrupt
• LET – Linear Energy Transfer: rate of energy loss in Si/SiO2 [MeV·cm²/mg]
• SBU – Single Bit Upset
• MCU – Multi Cell Upset
• MBU – Multi Bit Upset
• MTTF – Mean Time to Fail
• FIT – Failure in Time; FIT = 10^9 / MTTF(h)
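The relation between FIT and MTTF can be sketched in a few lines, taking FIT as the expected number of failures per 10^9 device-hours. The function names and the 1000 FIT input are illustrative assumptions, not values from the slides.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 h, ignoring leap years


def fit_from_mttf(mttf_hours: float) -> float:
    """FIT = failures per 1e9 device-hours = 1e9 / MTTF[h]."""
    return 1e9 / mttf_hours


def mttf_from_fit(fit: float) -> float:
    """Inverse relation: MTTF[h] = 1e9 / FIT."""
    return 1e9 / fit


if __name__ == "__main__":
    fit = 1000.0  # hypothetical soft-error rate of one chip, in FIT
    mttf_h = mttf_from_fit(fit)
    print(f"{fit:.0f} FIT -> MTTF = {mttf_h:.0f} h "
          f"(about {mttf_h / HOURS_PER_YEAR:.0f} years)")
```

Because FIT values add across independent components, a system built from many such chips has a proportionally shorter MTTF.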
Myth Blasting
• SE is relevant only for space environment: WRONG
• SE affects memory cells only: WRONG
• SE is not a reliability issue: WRONG
• DRAM is more sensitive to SE than SRAM: WRONG
• Latchup cannot cause SE: WRONG
Sources of soft-errors
• Terrestrial:
  – Interactions with cosmic particles (neutrons)
  – Alpha particles from die/package
• Space:
  – Protons (>93%)
  – Alpha (~6%)
  – Neutrons
  – Electrons
  – Alpha and heavy particles (up to 500 MeV)
Interaction of Neutrons
• Typical flux – 20 neutrons/h/cm² at sea level (NYC)
• Energy of cosmic neutrons – 20-300 MeV
• Strong dependence on altitude
• Higher concentration in polar areas (latitude >60°)
• Interaction with boron (20% of which is B10), mostly in BPSG (at ≥0.25 µm) – most dominant effect
• Interacts with various nuclei (W, Si, Pb, Au, …)
• Shielding is not practical
Alpha particles
• Alpha particle = He++ (helium nucleus)
• Energy: 1-6 MeV per particle (1.5 MeV from the neutron-B10 reaction)
  – ~3.6 eV is required to generate one e-h pair
• Penetration range in Si – 15-30 µm
• Generates ~45 fC/MeV
  – Critical charge of an SRAM cell is ~1 fC
  – An SEU-immune SRAM cell is not practical in VDSM
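The charge-per-MeV figure follows directly from the ~3.6 eV needed per electron-hole pair. The sketch below reproduces that arithmetic; it assumes, as a simplification, that all of the particle's energy is converted into pairs that are collected by the struck node, which overestimates the collected charge. The function name and the 5 MeV example are illustrative.

```python
E_PAIR_EV = 3.6                # energy to create one e-h pair in Si [eV] (from the slide)
Q_ELECTRON = 1.602176634e-19   # elementary charge [C]


def generated_charge_fc(energy_mev: float) -> float:
    """Charge [fC] generated when energy_mev is fully deposited in silicon."""
    pairs = energy_mev * 1e6 / E_PAIR_EV   # number of e-h pairs
    return pairs * Q_ELECTRON * 1e15       # coulombs -> femtocoulombs


if __name__ == "__main__":
    per_mev = generated_charge_fc(1.0)
    print(f"charge per MeV: {per_mev:.1f} fC")
    # Hypothetical 5 MeV alpha versus the ~1 fC critical charge quoted above.
    print(f"5 MeV alpha: {generated_charge_fc(5.0):.0f} fC, "
          f"about {generated_charge_fc(5.0) / 1.0:.0f}x the critical charge")
```

Even if only a small fraction of this charge reaches a single node, it exceeds a ~1 fC critical charge, which is why an SEU-immune cell is not practical at these geometries.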
SBU/MCU in SRAM
[Figure panels: Single bit upset; Multi cell upset]
Multi cell upset ≠ Multi bit upset
High MUX ratio -> lower MBU
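Why a higher MUX ratio lowers the MBU rate can be illustrated with a toy model of one SRAM row. The layout is an assumption made for illustration: with column-MUX ratio M, physical column c stores a bit of logical word (c mod M), so neighbouring cells belong to different words. The function names are hypothetical.

```python
from collections import Counter


def upsets_per_word(start_col: int, cluster_width: int, mux_ratio: int) -> Counter:
    """Count upset bits per logical word for a cluster of adjacent upset cells.

    Assumed layout: physical column c holds a bit of word (c % mux_ratio).
    """
    counts = Counter()
    for col in range(start_col, start_col + cluster_width):
        counts[col % mux_ratio] += 1
    return counts


def worst_bits_in_one_word(cluster_width: int, mux_ratio: int) -> int:
    """Largest number of upset bits that land in the same word."""
    return max(upsets_per_word(0, cluster_width, mux_ratio).values())


if __name__ == "__main__":
    for mux in (1, 2, 4, 8):
        print(f"MUX {mux}: a 3-cell MCU puts up to "
              f"{worst_bits_in_one_word(3, mux)} bit(s) in one word")
```

In this model an MCU becomes an MBU only when the cluster is wider than the MUX ratio; otherwise each affected word sees a single-bit error, which a single-error-correcting code can handle.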
Single Event Transient (SET)
• Pulses at logic nodes, caused by ionizing particles
• Typical pulse width is 10-200 ps (for alpha particles)
• Such pulses might cause:
  – Permanent error if sampled by an FF
  – Permanent error if they activate asynchronous signals
P(error) = T(pulse) / T(period) = T(pulse) × F
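The capture probability above can be evaluated numerically. This is a first-order sketch only: it ignores the flip-flop setup/hold window and any logical or electrical masking along the path, and the pulse width and clock frequencies are illustrative values, not measurements.

```python
def set_capture_probability(pulse_width_s: float, clock_hz: float) -> float:
    """P(error) = T(pulse) / T(period) = T(pulse) * F, capped at 1."""
    return min(1.0, pulse_width_s * clock_hz)


if __name__ == "__main__":
    pulse = 100e-12  # 100 ps transient, inside the 10-200 ps range quoted above
    for f_hz in (10e6, 100e6, 1e9):
        p = set_capture_probability(pulse, f_hz)
        print(f"F = {f_hz / 1e6:6.0f} MHz -> P(error) = {p:.4f}")
```

The linear dependence on F is why lowering the effective clock frequency reduces the SET contribution.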
Frequency dependence of SEU/SET
[Figure: BER (log) vs. frequency (log); curves: SEU, SET, total SEE]
Lower effective frequency, such as by clock gating, reduces SET
SET can be mitigated by glitch filtering before sampling
Effective glitch filtration
Glitch filtering
• Narrow pulses are filtered by C-element + delay
Weak SCAN mux
• Implementing a resistive MUX at the input of the flip-flop will filter narrow glitches
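The C-element + delay filter can be sketched as a discrete-time behavioural model. This is an assumption-laden illustration, not the circuit from the slides: the signal is a list of 0/1 samples, the delay is a whole number of samples, and the C-element changes its output only when the direct and the delayed input agree.

```python
from typing import List, Sequence


def c_element_filter(signal: Sequence[int], delay: int) -> List[int]:
    """Filter glitches using a C-element fed by the signal and a delayed copy.

    The output follows the inputs when both agree and holds its previous
    value otherwise, so pulses no wider than `delay` samples never appear.
    """
    out: List[int] = []
    state = signal[0] if signal else 0
    for t, direct in enumerate(signal):
        # Before `delay` samples have elapsed, assume the line held signal[0].
        delayed = signal[t - delay] if t >= delay else signal[0]
        if direct == delayed:
            state = direct
        out.append(state)
    return out


if __name__ == "__main__":
    glitch = [0] * 5 + [1] * 2 + [0] * 5   # 2-sample transient
    valid = [0] * 3 + [1] * 5 + [0] * 4    # 5-sample legitimate pulse
    print(c_element_filter(glitch, delay=3))  # transient is removed
    print(c_element_filter(valid, delay=3))   # pulse passes, shifted by the delay
```

The cost of the filter is visible in the second case: every legitimate transition is delayed by the filter delay, which eats into the timing budget of the protected path.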
Impact of scaling
• Scaling reduces the “critical charge”
• Faster devices do not “filter” narrow SET pulses
• More devices/memory per chip – more sensitive elements
• Higher frequency increases the probability of SET errors
• SE is a major issue in advanced UDSM
SEE mitigation concepts
• S/W techniques for error correction
  – Example: 3× processing + voting
• Mitigation at system level
  – Example: TMR
• Mitigation at chip level
  – Example: EDAC, SE protected FFs
• Mitigation by circuit
  – Example: SE protected FFs & SRAM cells
• Mitigation by Si process
  – Example: SOI (poor !!!)
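Both the software "process three times and vote" example and TMR rest on the same majority function. A minimal bitwise sketch follows; the function name and the 8-bit test word are illustrative assumptions.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority: each output bit takes the value held by at least two inputs."""
    return (a & b) | (a & c) | (b & c)


if __name__ == "__main__":
    word = 0b10110010            # hypothetical stored value
    upset = word ^ (1 << 3)      # one copy suffers a single-bit upset
    voted = majority_vote(word, upset, word)
    print(f"voted = {voted:#010b}, corrected = {voted == word}")
```

The vote is taken per bit, so upsets in two different copies are still corrected as long as they hit different bit positions; two copies corrupted in the same bit defeat the scheme.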
SEE sensitivity - application
• Types of data, sorted by SEE sensitivity:
  – Configuration (FPGA, CNFREG, …)
  – Control logic (FSM, uCode, …)
  – Executable data (.exe files, I-Cache, …)
  – Stored data (databases, D-Cache, …)
  – Temporary data (video, audio, …)
Summary
• SEE is becoming a significant reliability issue in VDSM technologies
• SRAMs are typically the most sensitive elements to SEE
• SE mitigation is possible, mostly by digital techniques and proper methodologies and tools
• Availability of EDA tools for analyzing SER and mitigating SEE is limited. Optima presents a very interesting tool.