lecture 12 silicon test and validation - stanford university › class › archive › ee › ee371...

32
EE 371 Lecture 12 J. Stinson 1 Lecture 12 Silicon Test and Validation Intel Corporation [email protected]

Upload: others

Post on 08-Jun-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 1

Lecture 12

Silicon Test and Validation

Intel [email protected]

Page 2: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 2

Introduction:With design complexity and raw transistor counts growing at a 2Xrate per generation, issues surrounding validation of silicon and test/manufacturing have become hot topics in the industry. Unlike software, hardware cannot be “patched” and must meet a much higher level of quality before being shipped to the customer. This lecture will go thru some of basic issues in both validation andmanufacturing of digitial designs.

Reading:

Page 3: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 3

Manufacturing vs. Validation

• Manufacturing Test– Reliable test of individual parts for volume shipment– Concerned with:

• Detecting defects (yield) – identifying parts that don’t work due to defects in fabrication

• Binning parts – identifying correct “bin” for parts (e.g. frequency binning)

• Reducing costs – test time and tester costs are key components

• Validation (Debug)– Verifying correct operation of the design– Concerned with:

• Logical functionality – the design produces correct logical output• Electrical functionality – the design works at speed across entire

spectrum of process, voltage, temperature, reliability spec’s

Page 4: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 4

Manufacturing Test

• Goal is highest possible quality at lowest possible cost– Quality is #1 concern

• Unlike software, silicon cannot be “patched” (usually)• Design cycles are 6 months to 6 years• EXTREMELY expensive to recall

– Reducing cost is still important• Test time directly impacts capacity and throughput• Tester costs are going up at an alarming rate

– Typical automated test equipment is $1-10M per tester• Multiple “sockets” directly impact both test time and tester costs

Page 5: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 5

Manufacturing Test (Defects)

• All manufacturing process introduce defects into wafers• Need to run minimum set of diagnostics on each part to ensure

no defects– Typical CPU goal is 500-1000 DPM (defects per million)– Heavily relies on statistics and sampling techniques to ensure

compliance

3.7 K DownbinDelay Defects

1.7 K KillerDelay Defects

1M Good Die

282 K Bad Die

Defects to Screen Per Good Unit

0.01.02.03.04.05.06.07.08.09.0

10.0

0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5Defect Density (/cm^2)

Def

ects

to S

cree

n 350 mil800 mil

Page 6: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 6

Manufacturing Test (Binning)

• Natural variation in VLSI fabrication– Marketing takes advantage by selling parts at multiple “bins”

• Frequency most common example: same design sold at 1.5GHZ, 1.4GHz and 1.3GHz

– Need to accurately determine to which bin each part belongs• Highest margin bins are usually the lowest occurrence

– Competing Goals• Need adequate “guardbands” to ensure correct operation• Maximize highest margin bins

Typical Fabrication Variance

Page 7: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 7

Manufacturing Test (Sockets)

• Process of elimination– Typical manufacturing test involves multiple “socketing” of parts– Each socket geared towards filtering defects and binning the parts

• Lower cost the “earlier” problems can be identified

Tester Tester

Lower Cost

Page 8: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 8

Validation (Debug)

• Goal is to ensure design meets all specifications– First step before going into manufacturing

• Typical CPU debug times: 6 months to 2 years• Determines binning and electrical stress content for manufacturing

– Defect content determined statistically

– Must anticipate all usage conditions w/in spec window• Process Variation

– Depending on volume, range can vary from 1-5 sigma• Voltage spec range

– Power supply and di/dt variances must be accounted (2-10% nominal)• Temperature spec range

– Typical from 0C to 110C junction temperature• “Feature” range

– Validate across all features (diff’t core frequencies, bus fractions, thermal throttling, etc.)

• Lifetime range– Typical CPU lifetime: 7-10 years

Page 9: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 9

Shmoo

• Voltage vs. Frequency graph (shmoo)– Shape is critical in determining potential issues

| |* |** |*** |***** |******* |*********-----------

Volta

ge

Period

|** |*** |*** |**** |***** |******* |*********-----------

|**********|**********|**********|*** |***** |******* |*********-----------

| |* |** |*** |**********|**********|**********-----------

| * |* * |** ** |*** ** |***** ** |******* *|*********-----------

| ****|* ***|** *|*** |***** |******* |*********-----------

|******** |****** |**** |**** |***** |******* |*********-----------

| * * * |* * ** **|** * |*** * * *|***** * *|******* * |*********-----------

Freq “Wall” Reverse Crack

Inverted Flakey Vcc Ceiling Vcc Floor

Normal

Page 10: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 10

BurnIn

• Need to check reliability across lifetime of design (7-10 years)– Obviously can’t wait that long…..– Most wearout mechanisms are accelerated by elevated temp and

voltage (burnin)• EM, SH, oxide wearout, PMOS degradation

• Many defect mechanisms can be accelerated with brief stress– “Infant Mortality” manufacturing screen to identify marginal parts

• Stress sample of parts at burnin conditions– Increased voltage/temperature dramatically increase power

• Both dynamic and leakage power• Creates an infrastructure cost (test time, capacity, etc.)

– Reduce frequency to help manage the power– Burnin functionality significantly increases design contraints

• Design must work at extreme voltage, temp and low frequency

Page 11: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 11

Test Platforms• System

– “Real world” system (cheap)• Ex: Quake running on Linux on a PC

– Easy to write code– Content can run for hours– Difficult to control electrical environment– Difficult to make deterministic

• Automated Test Equipment (ATE)– Specialized tester to “replay” waveforms at the pin level ($$$$)

• Ex: breadboard with a logic analyzer attached to input/outputs– Difficult to write code (“stored response”)

• Must simulate input stimulus to determine correct behavior– Simulation limits diagnostic content length– Excellent at controlling electrical environment– Should always be deterministic

Page 12: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 12

Test Content• Goals

– Stress design to check for functionality, electrical, defects– “Coverage” – term used to describe quality of test content to cover

specific issues or potential problems• Focused Test

– Diagnostics specifically written to test some aspect of the design– Can be functional, electrical or defect based– Best method of ensuring specific coverage– Very high cost in engineering resources

• Random Test– Random or pseudo-random test vectors

• Use software to “throw the kitchen sink” at the design– Need method of verifying output

• Simulation or comparison to “known good die”– Low engineering cost but can only achieve 60-80% coverage

Page 13: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 13

Design for Test/Manufacturing/Debug

Page 14: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 14

Scanout

• Observability registers (non-destructive)• Able to capture “snapshot” of state machine

– Does not destroy machine state (great for system debug)• Typically only done on 1-10% of state nodes

– Adds clock loading and cap to critical paths– Creates it’s own min/max delay issues

FFQD

FFQD

FFQD

FFQD

FFQD

FFQD

FFQD

FFQDLogic Block

ScanoutSO

D

SI

Smpl Shift

ScanoutSO

D

SI

Smpl Shift

ScanoutSO

D

SI

Smpl Shift

SampleShift

Page 15: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 15

Signature Mode

• Normal scanout can only “snapshot” data as fast as the longest scanout chain– Must wait to shift the entire chain out before capturing new data– If a normal scan chain is 3000 nodes, this means only one out of

every 3000 clock cycles can be “observed”• Signature mode

– Keep “sample” and “shift” signals asserted simultaneously• XOR the “Shift-In” data with the “logic” data every cycle• Creates a unique “signature” for the device running a particular

application– Excellent method of adding new observability during test/validation– Issue: need to get this working “at speed”

• Creates add’l max delay work

Page 16: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 16

Scan

• Observability and controllability registers (destructive)• Useful for gaining access to internal states• Strong movement towards full scan in industry (all sequential

elements are “scan-able”– Difficult to control domino/arrays/clocked logic– Can add up to 1-5% die area– Adds cap to critical paths

CLK_b

CLK

CLK_b

CLK

DataOut

OutData

SI SO

Shift

Shift_b

Page 17: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 17

ATPG

• Automated Test Pattern Generation– Method of content generation with high coverage, low effort– Use controllability of scan to generate automated test vectors– Can use software to ensure that every node in the design “toggles”

• Issues– Accurate modeling of design captured by ATPG software– Full vs. partial scan– Capturing delay defects and events

FFQD

FFQD

FFQD

FFQD

FFQD

FFQD

FFQD

FFQDLogic Block

FFQD

FFQD

FFQD

FFQDLogic Block

Inpu

t Pin

s

Out

put P

ins

Page 18: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 18

DAT

• Direct Array Testing– Similar to Scan– Allows test manipulation of an array (or register file)

• Uses either existing port access or adds new ports to the array– Powerful method of directly checking memory cells

• Relying on functional patterns or system to “touch” every memory cell on a 12MB cache is near impossible

Normal Address

DAT Address

Normal Data DAT Data

DAT Enable

Array

Page 19: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 19

BIST

• Built-In Self Test– Let the chip test itself– Special test state machine that can control normal state machine

• Can use either SCAN or DAT machinery to control• Can also be built into hardware itself (e.g. IA32 microcode)

– Supports algorithmic test generation• Random “seeds” to generate internal vectors• Capture “signature” output and compare to expected

– Much faster than externally manipulating internal state• Often only way to generate “back-to-back” cycle testing of portions of

the design– Can be part of power-up sequence of each part

• Good customer feature to ensure that the part is good at boot-up every time (quality)

Page 20: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 20

Debug Tools

Page 21: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 21

Focused Ion Beam (FIB)• Uses large particles (ions) to physically edit silicon after

fabrication– Enables both removal and deposition– Can “fix” problems without having to wait for full stepping

• Limited in # of parts (5-50 hours per part)• Limited in accessibility and scope

– Some edits are just too complicated

In Chamber High-Resolution (IR)

Microscope

Axial GasDeliveryMezzanines

Differential LaserInterferometer Stage

50kV-5nm Ion Column

Gas DeliveryGas DeliveryNeedleNeedle

Diffusion Diffusion

Shallow Trench Oxide

Metal Signal Line (signal)

Silicon Substrate

Gas DeliveryGas DeliveryNeedleNeedle

FocusedFocusedIonIon

BeamBeam

LCE Trench Floor

1um

Page 22: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 22

Focused Ion Beam (FIB)

A

B

C

FIB MetalDeposition

Old Signal

New Signal

FIB MetalDeposition

FIB Dielectric Deposition

FIB SignalCut Location

C

FIB ConnectionLocations A & B

Diffusion

(new signal)

Silicon Substrate

Diffusion

(old signal)

Metal Line

FIB Cut Location

C

FIB Connection Location A

FIB Connection Location B

Page 23: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 23

Device Probing

• Important to be able to collect waveforms from silicon– Best method of understanding “what’s going on?”

• Evolved substantially over last 15 years– Pico-probing – mechanical probing of metal pads on die– Scanning Electron Beam (SEM) – measure deflection of electrons

from metal lines– InfraRed Emission Microscopy (IREM) – detect thermally induced

IR emissions from silicon– Laser Voltage Probe (LVP) – measure light reflection changes in

laser beam reflected off xtors– Time Resolved Emission (TRE)/Picosecond Imaging Circuit

Analysis (PICA) – detect photonic emissions from switching xtors

Page 24: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 24

Device Probe Looping

• Need deterministic loop– Repeat the same diagnostic over and over again

• Must be 100% deterministic (same behavior every time)• Similar to oscilloscope operation

– Most probing technologies are detecting “rare” events• Need MANY of these rare events to swamp out noise

– Need excellent timing accuracy• Each time thru the “loop”, step forward a bit in time• Rely on averaging of many cycles to build a waveform• Short loop times necessary to prevent trigger “drift”

• Extremely difficult to probe w/in a system environment– Cannot easily make a system deterministic– Cannot easily create short loops

Page 25: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 25

Scanning Electron Microscope (SEM)• Measures backscatter’d electrons induced from electron beam

– High intensity electron beam “shot” at device w/in a vacuum– Detector measures secondary electrons emitted by material– Different material will have varying levels of secondary emission– Used in industry for MANY years (30+)– Cool Big Brother: Transmission Electron Microscope (TEM)

SEM Images

TEM Image (colorized ☺)Source: 130nm technology, Intel

Sour

ce: S

trai

ned

Silic

on T

rans

isto

r, IB

M

Source: Fly’s Head, Museum of Science

Page 26: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 26

Ebeam Probing• Measure secondary electron emission in time domain

– Keep e-beam pointed at single point (invasive)– Pulse the beam (or pulse the detector) in time domain to build

waveform– Technology works VERY well on metal interconnect

• Primary probing technique with “wirebond” packaging

Source: Integrated Service Technology

Page 27: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 27

Laser Voltage Probe (LVP)• Energy (light) absorbed by carriers in conduction band

– Laser beam pointed at “backside” of transistors• Requires “flip-chip” packaging• Laser photon energy close to silicon band edge• Wavelength kept in IR or NIR band (transparent thru silicon)

– Laser can induce carriers in conduction band• Need to keep intensity low enough to prevent inducing current

– SOI tends to absorb laser light• Difficult to use LVP w/SOI

– Laser must be mode-locked to test• Must be sync’d to test loop length

Source: Tsang et. al, IBM

Page 28: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 28

Laser Voltage Probe (LVP)

ObjectiveLens

Lightin/out

Beam-splitter

Referencemirror

Piezovibration

cancellation

Matched signal/reference

paths

Sample

Schematic of Laser Voltage Probe

Page 29: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 29

Laser Assisted Device Alteration (LADA)

λ

DUT

reflected intensity

AUX 1,2

pass/fail triggerCH1CH2

PC

System Control Bus

objective

Rotating mirror

LaserDetector

Laser Scanning Microscope

VCCVSS DRAIN

Page 30: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 30

Time Resolved Emission (TRE)• Detects photons emitted by switching xtors (similar to PICA)

– Carriers in the channel “thermalize”, emitting NIR light• Silicon is transparent to IR

– Need a REALLY good detector• Single photon per 10K switching events• Photons go in all directions; detector only at one angle• Need great timing resolution

– Completely non-invasive– Collection times are significant

• Longer time = better signal-to-noise ratio (SNR)

Pn+

Vgs > Vgs-Vt

-Pinch-off region

Page 31: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 31

Time Resolved Emission (TRE)• Photon emission strongest from NMOS devices

– Falling edge vs. rising edge have diff’t amplitudes– Linear dependence on device width– Exponential dependence on voltage

Ipmos

Inmos

Vout Photons

Vout

Ipmos

Inmos

TimePhoton Counts

500 ps/divSource: “Single Element Time Resolved Emission Probing

for Practical Microprocessor Diagnostic Applications” E. Varner et. al, ISTFA 2002

Page 32: Lecture 12 Silicon Test and Validation - Stanford University › class › archive › ee › ee371 › ee...– Must wait to shift the entire chain out before capturing new data –

EE 371 Lecture 12J. Stinson 32

TRE Basic Schematic

DUT

100X Objective

Imaging CameraBeamSplitter

Variable ApertureMirror Illumination

(for imaging)

OpticalFiber

Detector andData Acquisition

Tester

Trigger

Computer

Picosecond Timing Analyzer

FiberCouplingLens