
THE EFFECT OF DIFFERENT TEST SETS ON QUALITY LEVEL PREDICTION: WHEN IS 80% BETTER THAN 90%?

Peter C. Maxwell, Robert C. Aitken

Design Technology Laboratory, Hewlett-Packard Company, Santa Clara, CA 95052

Vic Johansen, Inshen Chiang

California Design Center, Hewlett-Packard Company, Santa Clara, CA 95052

ABSTRACT

This paper discusses the use of stuck-at fault coverage as a means of determining quality levels. Data from a part tested with both functional and scan tests is analyzed and compared to three existing theories. It is shown that reasonable predictions of quality level are possible for the functional tests, but that scan tests produce significantly worse quality levels than predicted. Apparent clustering of defects resulted in very good quality levels for fault coverages less than 99%.

1. INTRODUCTION

Quality levels in manufacturing are coming under increasing attention, and efforts are being expended at all stages of a product’s production cycle to improve final quality. The economic consequences of defective components are generally reduced when they are detected earlier rather than later in a manufacturing cycle. To be able to make the best decisions regarding quality goals, one needs information pertaining to existing quality levels and the amount of effort required to improve them.

Quality levels are determined by the rigorousness of testing. It is therefore desirable to be able to relate realistically determinable metrics of a test to the expected quality level achieved by that test. To obtain the best economic picture, it is necessary to be able to quantify the improvements in a test metric needed to achieve a particular quality level.

As far as digital integrated circuits are concerned, testing is typically carried out in three phases: testing for gross failures (shorts, opens etc.), digital testing (this is sometimes referred to as “functional” testing although it includes scan testing as well), and parametric testing. The efficacy of each of these contributes to the overall quality level of the chip, although it is usually the functional test that is subjected to a critical grading process.

This paper presents the results of an experimental study aimed at investigating the most common digital test metric - single stuck-at fault coverage - and its usefulness in predicting quality levels. Specifically, it attempts to answer the following questions:

1. What is a realistic relationship between quality level and fault coverage?

2. Are there significant differences between existing theories when realistic quality levels and yields are taken into account?

3. Are theoretical predictions influenced by the way/order the test(s) are presented?

4. Is stuck-at fault coverage a reliable metric?

A corollary of 3 and 4 addresses fault coverage goals, which are expressed as minimum acceptable stuck-at fault coverage for a test set. It was desired to determine if there should be any additional conditions relating to how these goals are met (i.e. the type of test being graded).

There have been very few reported studies on these topics. Early results were described in Agrawal, Seth & Agrawal (1982) and Harrison et al. (1980). A similar study to the present one was reported by Das et al. (1990), which showed significant variation in various theoretical models. Their results came from data gathered from a single set of functional tests applied to a part. The motivation for the present work was to see if the theories hold for different kinds of tests, by investigating the effect of applying a variety of tests to a part. In particular, the data includes results from a separate scan test, as well as “conventional” functional tests, allowing question 3 above to be addressed. The results also give another benchmark using another fabrication process.

Paper 13.3

INTERNATIONAL TEST CONFERENCE 1991

CH3032-0/91/0000-0358 $01.00 © 1991 IEEE

2. EXPERIMENTAL OVERVIEW

The vehicle for the study was a standard-cell chip with about 8500 gates, comprising 437 flip-flops and combinational logic. It was fabricated in a 1 µm 2-level-metal CMOS process and the design was fully static. The digital tests applied to the part consisted of 6623 designer-supplied functional vectors with a 77% stuck-at fault coverage, and 285 scan vectors with a 92% coverage. Here, a “vector” consists of a complete set of input stimuli and output responses. Each scan vector consists of 438 clocks (437 shift cycles plus 1 “functional” cycle) while each functional vector consists of a varying number (usually >1) of clocks. The combined coverage of the two test sets was 96.5%.

There was also a second, smaller set of functional vectors which was not used in the main data analysis portion due to lack of information on a vector by vector basis, but which was used to help determine which parts were defective. This set was originally written to improve fault coverage during ramp-up of production when it transpired that the proportion of defective parts was unacceptably high. Including this additional set, the total combined coverage was 98.9%. The functional vectors were run both at low speed and at full operating speed, which allowed separation of failures due to timing from those due to other causes.

The functional and scan tests were run independently, and a part which failed one set of vectors was still subjected to the other set. For defective parts, the failing vector number for each test was logged. Data was collected for a total of 18500 die from 3 separate wafer lots.

Although the scan and functional tests were generated independently, they were both graded by the same fault simulator. Cumulative values of fault coverages were determined which resulted in a table of fault coverage versus vector number. The tester data allowed extraction of the cumulative number of failed die versus vector number. Considering vector n, all die which fail at subsequent vectors contribute to the defect level which would result if the test were stopped at n. Since the fault coverage at vector n is known it is possible to produce a graph of observed defect level versus fault coverage.
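The construction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, its arguments and the toy numbers are hypothetical, and `extra_bad` stands for bad parts that pass every vector of the test being graded (parts failing another digital test plus test escapes).

```python
from bisect import bisect_right


def observed_defect_levels(total_die, first_fail_vectors, extra_bad, coverage_by_vector):
    """Return (fault coverage, observed defect level) pairs, one per vector.

    first_fail_vectors: 1-based vector number at which each failing die first failed.
    extra_bad: bad die that pass the whole test (hypothetical input).
    coverage_by_vector: cumulative fault coverage after vectors 1, 2, ...
    """
    fails = sorted(first_fail_vectors)
    points = []
    for n, coverage in enumerate(coverage_by_vector, start=1):
        rejected = bisect_right(fails, n)        # die failing at or before vector n
        passing = total_die - rejected           # die that would ship if the test stopped at n
        bad_passing = (len(fails) - rejected) + extra_bad
        points.append((coverage, bad_passing / passing))
    return points


# Toy data, not taken from the paper.
curve = observed_defect_levels(
    total_die=100,
    first_fail_vectors=[1, 1, 2, 4],
    extra_bad=1,
    coverage_by_vector=[0.20, 0.50, 0.70, 0.80],
)
for coverage, dl in curve:
    print(f"coverage {coverage:.0%}: defect level {dl:.2%}")
```

Plotting the second element of each pair against the first gives a curve of the kind used in the results section.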

One of the advantages of this study was in regard to the assessment of quality levels of ICs which had passed all the tests. In many cases this determination is extremely difficult, if not impossible, due to lack of data on parts once they leave the manufacturer. In our case, all chips were shipped to another product division, which returned parts that failed for any reason after they were assembled into the product. From these returned parts it was possible to obtain the proportion that were true test escapes, as opposed to some other induced failure, such as bent pins. This was done by firstly verifying that the parts failed in a golden system, then verifying that they still passed the original IC test. The only portion of defective chips not known is that which fail in the field, but this number is assumed to be so low that it will not impact the results significantly.

For each test (functional, scan and combined) the quality level resulting from that test was determined by first determining the number of bad parts that passed the test according to:

$$\#\text{bad parts} = \#\text{failing any other digital test} + \#\text{test escapes}$$

The quality level was then represented by calculating the defect level, expressed as:

$$\text{defect level} = \frac{\#\text{bad parts}}{\#\text{parts passing the test}}$$

The functional yield Y was determined from:

$$Y = \frac{\#\text{parts passing all tests} - \#\text{test escapes}}{\#\text{parts digitally tested}}$$
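As a small worked illustration of these three definitions, the following Python sketch evaluates them for a set of made-up counts. The function names and all of the counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not data from the study.

```python
def bad_parts(failing_other_digital_test, test_escapes):
    """Bad parts that passed the test being graded."""
    return failing_other_digital_test + test_escapes


def defect_level(n_bad, n_passing_test):
    """Fraction of parts passing the graded test that are actually bad."""
    return n_bad / n_passing_test


def functional_yield(n_passing_all_tests, test_escapes, n_digitally_tested):
    """Fraction of tested parts that are truly good."""
    return (n_passing_all_tests - test_escapes) / n_digitally_tested


# Hypothetical counts: 1000 die tested, 770 pass every digital test,
# 4 of those are later found to be test escapes, and 15 die pass the
# graded test but fail some other digital test.
n_bad = bad_parts(failing_other_digital_test=15, test_escapes=4)
dl = defect_level(n_bad, n_passing_test=770 + 15)
y = functional_yield(n_passing_all_tests=770, test_escapes=4, n_digitally_tested=1000)
print(f"bad parts = {n_bad}, defect level = {dl:.2%}, yield = {y:.1%}")
```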

3. RESULTS AND COMPARISON

3.1 Theoretical Predictions

Although there have been several theoretical approaches published, the intent here was not to carry out a detailed comparison between all of them as was done by Das et al. (1990). Rather, three were chosen. The first approach, a simple model, is due to Williams & Brown (1981), and was also used by McCluskey & Buelow (1988). This relates the defect level to fault coverage by:

$$DL = 1 - Y^{(1-f)} \qquad (1)$$

where f is the fault coverage. The second approach takes into account the fact that there is a distribution in the number of faults on a faulty chip, with the average being greater than one. This approach is due to Agrawal, Seth & Agrawal (1982), in which the relation is:

$$DL = \frac{(1-f)(1-Y)\,e^{-(n_0-1)f}}{Y + (1-f)(1-Y)\,e^{-(n_0-1)f}} \qquad (2)$$

where n0 is the average number of faults on a faulty die. At the outset it was felt that this would give closer predictions to actual data since defects are known not to be spread uniformly across a wafer, but rather occur in clusters. Indeed, if this were not the case, yields would be considerably less than obtained in practice (Stapper, 1984).
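For readers who wish to experiment with the two models, a minimal Python sketch follows. It assumes the forms in which these relations are usually quoted, DL = 1 - Y^(1-f) for Williams & Brown and DL = x / (Y + x) with x = (1-f)(1-Y)exp(-(n0-1)f) for Agrawal, Seth & Agrawal; the function names are hypothetical, and the yield used is the 76.6% value reported for this part.

```python
import math


def dl_williams_brown(f, y):
    """Defect level under the Williams & Brown model (assumed form)."""
    return 1.0 - y ** (1.0 - f)


def dl_agrawal(f, y, n0):
    """Defect level under the Agrawal, Seth & Agrawal model (assumed form).

    n0 is the average number of faults on a faulty die.
    """
    x = (1.0 - f) * (1.0 - y) * math.exp(-(n0 - 1.0) * f)
    return x / (y + x)


Y = 0.766  # functional yield reported for this part
for f in (0.80, 0.90, 0.96, 0.99):
    wb = dl_williams_brown(f, Y)
    asa = dl_agrawal(f, Y, n0=3.0)
    print(f"f = {f:.0%}: Williams & Brown {wb:.3%}, Agrawal et al. (n0 = 3) {asa:.3%}")
```

Both functions return 1 - Y at zero coverage and 0 at full coverage; between those end points the clustered model falls away much faster when n0 is greater than one.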

Finally, some investigation of the predictive model described in Seth & Agrawal (1989) was carried out, particularly with regard to the sensitivity to the test sets and the ordering.

3.2 Experimental Data

The functional yield Y for all the die was 76.6%. Data for the 3 production runs was combined together to obtain a more average representation. In fact, there was little difference between each run, the variation in yields being less than 2%. Theoretical numbers were calculated using the above value of yield, and, in the case of equation (2), for values of n0 of 2.5, 3.0, 3.5 and 4.0.

Figure 1 shows these plotted as defect level versus fault coverage. Also plotted are the corresponding curves for the data obtained for both the functional and scan tests. The reason the scan test data starts at 20% coverage is that there is an initial test carried out on the scan chain itself, which has a 20% fault coverage.

The scan and functional data show noticeably similar trends, and follow a curve which corresponds to a value of n0 of approximately 3.0. When the two test sets are combined, as they are in the production test, with the scan tests following the functional, the results are as shown in Figure 2.

Figure 1. Defect levels for functional and scan vectors. (Horizontal axis: fault coverage in %; theoretical curves include Williams & Brown.)

Figure 2. Defect level for combined (functional followed by scan) test. (Horizontal axis: fault coverage in %.)

Figure 1 illustrates some disturbing anomalies that arise from blindly using stuck-at fault coverage as a test metric. The data shows that the defect level resulting from functional vectors with 75% coverage was 1.92%, whereas scan vectors with the same coverage produced 2.43%. Another way of phrasing this is that the scan vectors had to reach 80% coverage to achieve the same quality level as the functional tests.
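One way to put a number on this gap, offered here only as an illustrative sketch, is to invert the Agrawal, Seth & Agrawal relation and ask what value of n0 each observed point would imply. The code assumes the commonly quoted form DL = x / (Y + x) with x = (1-f)(1-Y)exp(-(n0-1)f) and the 76.6% yield; the function name is hypothetical and this calculation is not part of the original analysis.

```python
import math


def effective_n0(dl, f, y):
    """Solve DL = x / (y + x), x = (1-f)(1-y)exp(-(n0-1)f), for n0."""
    x = y * dl / (1.0 - dl)                       # invert DL = x / (y + x)
    return 1.0 - math.log(x / ((1.0 - f) * (1.0 - y))) / f


Y = 0.766
n0_functional = effective_n0(dl=0.0192, f=0.75, y=Y)
n0_scan = effective_n0(dl=0.0243, f=0.75, y=Y)
print(f"functional: n0 ~ {n0_functional:.2f}, scan: n0 ~ {n0_scan:.2f}")
```

Under these assumptions the functional point implies a value of n0 a little below 3 and the scan point a noticeably smaller one, which is another way of saying that, at equal stuck-at coverage, the scan vectors behave as if each defective die offered fewer opportunities for detection.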

Such a comparison is more dramatic if we compare a combined functional + scan test with a scan test alone. In this case, the defect level after the full scan test with 92% coverage was equalled by the combined test with only 83% coverage. Figure 3 shows an enlargement of the tails of the previous plots, which shows that the scan curve has flattened considerably. An explanation for this behavior is that the scan tests target stuck-at faults only, are few in number and are applied at slow speed. The peripheral coverage of non-target faults such as bridges and delays is therefore likely to be less than that of the much more numerous functional vectors. The data suggests that even were the scan tests to have 100% stuck-at coverage, there would still be a significant defect level present if these were the only tests applied.

It is tempting to believe that the reason functional vectors are required is because they test chip timing, which scan vectors cannot do, at least in the form implemented on this chip. This is not the only reason, however, as evidenced by the data relating to the functional tests which were run at 2 MHz compared to the same tests run at 20 MHz. From the parts that had passed scan testing, a further 127 die failed the 2 MHz functional tests. When the functional tests were rerun at 20 MHz, an additional 64 die failed. These 64 (about 0.5%) therefore failed for purely timing-related causes. Conversely, the 127 (about 1%) failed for other causes. Part of this fallout was due to increased stuck-at coverage, in moving from 92% to 98.9%. This contribution can be estimated by substituting the two coverage figures into equation (2) and taking the difference. For a value of n0 of 3.25, this difference is approximately 0.3%. Since the observed fallout was 1% it can be concluded that about 0.7% of parts failed due to non-modelled faults that were not timing-related.
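The estimate of the stuck-at contribution can be reproduced with a short calculation. The sketch below assumes equation (2) has the commonly quoted Agrawal, Seth & Agrawal form, DL = x / (Y + x) with x = (1-f)(1-Y)exp(-(n0-1)f), and uses Y = 76.6% and n0 = 3.25 as stated in the text; the function name is hypothetical.

```python
import math


def dl_agrawal(f, y, n0):
    """Defect level under the Agrawal, Seth & Agrawal model (assumed form)."""
    x = (1.0 - f) * (1.0 - y) * math.exp(-(n0 - 1.0) * f)
    return x / (y + x)


Y, N0 = 0.766, 3.25
dl_scan_only = dl_agrawal(0.92, Y, N0)     # after the scan test alone
dl_all_tests = dl_agrawal(0.989, Y, N0)    # after all digital tests
delta = dl_scan_only - dl_all_tests
print(f"predicted fallout from extra stuck-at coverage: {delta:.2%}")
```

The difference comes out a little under 0.3%, in line with the figure quoted, leaving roughly 0.7% of the observed 1% fallout to be explained by non-modelled faults.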

Anomalies also show up in the predictive model of Seth & Agrawal (1989). Table 1 shows the results of applying their equations for true and observed yield to the obtained data. The differences in performance of the functional and scan vectors, coupled with sudden changes in the rate of decrease of defect level when one test takes over from the other, give rise to widely different values of the defect level. The method gives good agreement for the purely functional vectors, but such is not the case when scan vectors are used.

Insight into this behavior may be obtained by examining Figure 4, which shows the variation in predicted defect level for the functional vectors as a function of vector number. Also shown is the actual defect level and the fault coverage. It is seen that if the fault coverage is increasing only slowly, the predicted defect level drops. This corresponds to a situation where relatively few additional die are being rejected so that the predictor “thinks” that there are relatively few more faults to uncover. On the other hand, when the fault coverage jumps quickly, the predicted defect level also rises, since the relatively large number of die being rejected suggest that there are many more faults to uncover. Since the rate of die rejection at the end of the scan vectors was very low, the predicted defect level was also very low.

Figure 3. Combined set with 83% coverage performs as well as scan tests with 92% coverage. (Horizontal axis: fault coverage in %, 81 to 95.)

Table 1. The effect of test set combinations on predicted defect levels.

Figure 4. Variation of predicted defect level for functional vectors as test proceeds. (Horizontal axis: vector number, 0 to 8000; curves: actual defect level and predicted defect level.)

4. HOW MUCH FAULT COVERAGE IS ENOUGH?

Using a value of 76.6% for yield, models such as Williams & Brown indicate that for defect levels less than 0.1% (1000 ppm) fault coverages higher than 99.6% are required. On the other hand, the data here show that such defect levels are obtained with only 96% coverage. The fact that on average there were 3 faults on a faulty die reduced the required coverage to obtain a given defect level. The differences between the two theoretical approaches decrease as the yield decreases and as n0 decreases. Clearly, required coverages will be process dependent, but the data illustrates that it is not always necessary to have exceedingly high fault coverages.
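The coverage required for a given defect level can be computed directly from either model. The following Python sketch is illustrative only: it assumes the usual forms DL = 1 - Y^(1-f) (Williams & Brown) and DL = x / (Y + x) with x = (1-f)(1-Y)exp(-(n0-1)f) (Agrawal, Seth & Agrawal), the function names are hypothetical, and the second model is solved by simple bisection because it has no convenient closed-form inverse.

```python
import math


def required_coverage_wb(dl, y):
    """Coverage at which 1 - y**(1 - f) equals dl (closed form)."""
    return 1.0 - math.log(1.0 - dl) / math.log(y)


def dl_agrawal(f, y, n0):
    """Defect level under the Agrawal, Seth & Agrawal model (assumed form)."""
    x = (1.0 - f) * (1.0 - y) * math.exp(-(n0 - 1.0) * f)
    return x / (y + x)


def required_coverage_agrawal(dl, y, n0, tol=1e-9):
    """Bisection on f in [0, 1]; the model is monotonically decreasing in f for n0 >= 1."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dl_agrawal(mid, y, n0) > dl:
            lo = mid      # defect level still too high: more coverage needed
        else:
            hi = mid
    return hi


Y, TARGET = 0.766, 0.001   # 76.6% yield, 1000 ppm target
f_wb = required_coverage_wb(TARGET, Y)
f_asa = required_coverage_agrawal(TARGET, Y, n0=3.0)
print(f"Williams & Brown: {f_wb:.2%}, Agrawal et al. (n0 = 3): {f_asa:.2%}")
```

Under these assumptions the Williams & Brown requirement comes out just above 99.6%, as stated, while the clustered model with n0 = 3 asks for roughly two percentage points less coverage.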

A problem in requiring fault coverages in excess of 99% is the ability to measure them sufficiently accurately. There can be significant differences in the coverages reported by different simulators, reflecting their different modelling of the circuit, the way they handle oscillations, potential detections and their ability to deal with redundant faults.

Figure 5 shows another enlarged plot relating to the combined test set. This curve is steadily decreasing



actual and predicted defect levels. However, this agreement occurs only for functional tests. Scan-only tests depart significantly from predictions for higher fault coverages. Thus, it can be seen that scan testing alone is inadequate and may result in high residual defect levels, even for very high fault coverages. In setting fault coverage goals, one must take into account how those goals are met, and not be content merely because a sufficiently high fault coverage is attained.

The approach of Agrawal et al. (1982) produces a much better fit to the data obtained here than that of Williams & Brown (1981). This is true even in the areas of interest, namely fault coverages above 95%. Figure 5 shows that the Williams & Brown model predicts substantially higher defect levels than were actually obtained. The difference decreases as yield decreases and as n0 decreases.

There remain difficulties in using theoretical approaches when very low defect levels are considered, say less than 200 ppm. Likely variations in the determination of fault coverages can produce variations in predicted defect levels that exceed the target figure. Fault simulators that more accurately reflect realistic defects will avoid the requirement of an unrealistically high stuck-at coverage.
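The sensitivity can be illustrated numerically. The sketch below assumes the Williams & Brown form DL = 1 - Y^(1-f) with the 76.6% yield of this part, and a purely hypothetical measurement error of 0.1 percentage points in the reported fault coverage; the function name and the size of the error are assumptions made for illustration.

```python
import math


def dl_williams_brown(f, y):
    """Defect level under the Williams & Brown model (assumed form)."""
    return 1.0 - y ** (1.0 - f)


Y = 0.766
TARGET = 200e-6            # 200 ppm
COVERAGE_ERROR = 0.001     # hypothetical: coverage overstated by 0.1 percentage points

# Coverage the model says is needed to reach the target.
f_required = 1.0 - math.log(1.0 - TARGET) / math.log(Y)

# Defect level actually predicted if the true coverage is lower by the assumed error.
dl_actual = dl_williams_brown(f_required - COVERAGE_ERROR, Y)
shift_ppm = (dl_actual - TARGET) * 1e6
print(f"required coverage {f_required:.3%}; a 0.1-point overstatement adds {shift_ppm:.0f} ppm")
```

With these numbers the shift in predicted defect level is larger than the 200 ppm target itself, which is the difficulty described above.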

There is still a deficit of information available on the subject. Continued work is needed to examine the effect of other test sets and other fault models, such as bridging faults, opens and IDDQ (quiescent current monitoring).

7. ACKNOWLEDGEMENTS

The authors would like to thank Bruce Schober of Hewlett-Packard’s Boise Surface Mount Center for originally proposing this study and for many helpful discussions. Significant help was also obtained from numerous other people at the Boise, Corvallis, Palo Alto and Santa Clara sites.

8. REFERENCES

1. V.D. Agrawal, S.C. Seth & P. Agrawal (1982) “Fault Coverage Requirements in Production Testing of LSI Circuits”, IEEE Journal of Solid State Circuits, SC-17, pp. 57-61.

2. D.V. Das, S.C. Seth, P.T. Wagner, J.C. Anderson & V.D. Agrawal (1990) “An Experimental Study on Reject Ratio Prediction for VLSI Circuits: Kokomo Revisited”, Proc. Int. Test Conf., pp. 712-720.

3. F.J. Ferguson & J.P. Shen (1988) “Extraction and Simulation of Realistic CMOS Faults with Inductive Fault Analysis”, Proc. Int. Test Conf., pp. 475-484.

4. F.J. Ferguson, M. Taylor & T. Larrabee (1990) “Testing for Parametric Faults in Static CMOS Circuits”, Proc. Int. Test Conf., pp. 436-443.

5. R.A. Harrison, R.W. Holzwarth, P.R. Motz, R.G. Daniels, J.S. Thomas & W.H. Wiemann (1980) “Logic Fault Verification of LSI: How it Benefits the User”, Proc. WESCON, paper 34/1.

6. Y.K. Malaiya & S.Y.H. Su (1982) “A New Fault Model and Testing Technique for CMOS Devices”, Proc. Int. Test Conf., pp. 25-34.

7. E.J. McCluskey & F. Buelow (1988) “IC Quality and Test Transparency”, Proc. Int. Test Conf., pp. 295-301.

8. S.C. Seth & V.D. Agrawal (1989) “On the Probability of Fault Occurrence”, in Defect and Fault Tolerance in VLSI Systems, ed. I. Koren, pp. 47-52, Plenum, New York.

9. C.H. Stapper (1984) “Yield Model for Fault Clusters Within Integrated Circuits”, IBM J. Res. Develop., 28, pp. 636-640.

10. R.L. Wadsack (1978) “Fault Modelling and Logic Simulation of CMOS and MOS Integrated Circuits”, Bell System Technical Journal, 57(5), pp. 1449-1474.

11. T.W. Williams & N.C. Brown (1981) “Defect Level as a Function of Fault Coverage”, IEEE Trans. on Computers, C-30, pp. 987-988, December.
