
1248 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 16, NO. 9, SEPTEMBER 2008


An Accumulator-Based Compaction Scheme For Online BIST of RAMs

Ioannis Voyiatzis

Abstract— Transparent built-in self test (BIST) schemes for RAM modules assure the preservation of the memory contents during periodic testing. Symmetric transparent BIST skips the signature prediction phase required in traditional transparent BIST schemes, achieving considerable reduction in test time. In symmetric transparent BIST schemes proposed to date, output data compaction is performed using either single-input or multiple-input shift registers whose characteristic polynomials are modified during testing. In this paper the utilization of accumulator modules for output data compaction in symmetric transparent BIST for RAMs is proposed. It is shown that in this way the hardware overhead, the complexity of the controller, and the aliasing probability are considerably reduced.

Index Terms— Online testing, random access memories (RAMs), self testing.

I. INTRODUCTION

Embedded semiconductor memories tend to play an increasingly important role in the operation of integrated circuits and systems. Since advances in memory technology tend to make memory devices more and more complicated (due to the appearance of new defect mechanisms), considerable effort has been devoted to testing such modules efficiently [1]–[4], [14]–[21]. RAMs are typically classified as bit- or word-organized [4].

For the testing of embedded RAMs, march algorithms outperform competitive schemes, since they result in simple, yet effective, testing scenarios [5]. A march algorithm comprises a series of march elements that perform a predetermined sequence of operations (read and/or write) in every cell (for the case of bit-organized RAMs) or word (for the case of word-organized RAMs).

Manuscript received December 15, 2006; revised September 20, 2007. First published July 25, 2008; last published August 20, 2008 (projected). A preliminary version of this work was presented in the First IEEE Conference on Design and Test in Deep Submicron Technology, Tunis, August 2006.

The author is with the Department of Informatics, Technological Educational Institute of Athens, Athens 12210, Greece (e-mail: [email protected]).

Digital Object Identifier 10.1109/TVLSI.2008.2000868

Testing of RAM modules is performed both right after manufacturing and periodically in the field. During manufacturing testing, various kinds of tests are applied in order to ensure that the RAM operates normally; typical tests applied during manufacturing testing are march tests. Traditional march algorithms, e.g., [5]–[7], start with an initial write-all-zero phase, where all the RAM cells are set to 0 in order to ensure that the final signature in the output compactor is known [5].

Periodic testing is divided into start-up testing and testing during normal operation. Start-up testing is performed during the start-up of the system and resembles manufacturing testing. In testing during normal operation, the RAM is taken out of normal operation, tested, and then returned to operation. This kind of testing is applied to circuits where it is difficult and/or impractical to shut down the system since the contents of the RAM cannot be lost. In this kind of testing, traditional march tests cannot be applied since (due to the initial write-all-zero phase) the contents of the RAM cells before the test are lost.

In order to confront the previously mentioned problems, transparent built-in self test (BIST) was proposed by Nicolaidis [1]. In a transparent BIST algorithm, the initial write-all-zero phase is skipped; instead, a signature prediction phase precedes the normal march series. During this signature prediction phase, a signature is captured and stored. Subsequently, a sequence of carefully selected read and write operations is performed that leaves the RAM contents equal to the initial ones while retaining the fault coverage of the corresponding traditional march algorithm; the final signature is compared to the one captured during the signature prediction phase, and a decision is made as to whether a fault has occurred in the RAM. The concept of transparent BIST is further analyzed in Section II-B.

Yarmolik et al. [8], [9] advanced the field by proposing the concept of symmetric transparent BIST. In symmetric transparent BIST, the signature prediction phase is skipped and the march series is modified in such a way that the final signature is equal to the all-zero state, irrespective of the RAM initial contents. For response compaction of bit-organized RAMs, a single-input shift register (SISR) was utilized in [8] whose characteristic polynomial toggles between a primitive polynomial and its reciprocal during the different march elements of the march series. For the case of word-organized RAMs, it was proven in [9] that a multiple-input shift register (MISR) whose characteristic polynomial is altered in a similar fashion could serve as response compactor for symmetric transparent BIST, resulting in a predetermined (all-zero) state. The concept of symmetric transparent BIST is analyzed and exemplified in Section II-C.

The work of Yarmolik et al., although revolutionary, requires modifying existing registers (or SISRs/MISRs) in order to serve as response evaluators, and requires complicated control logic to toggle between the two different polynomials during the application of the march series.

It is widely accepted by the test community that the utilization of modules that typically exist in the circuit, e.g., accumulators [10] or arithmetic logic units (ALUs) [11], for BIST test pattern generation and/or response verification possesses advantages, such as lower hardware overhead and elimination of the need for multiplexers in the circuit path; furthermore, the modules themselves are exercised, so faults existing in them can be discovered [12].

In this paper, we propose the use of accumulator-based compaction in symmetric transparent RAM BIST (ASTRA). In modules that contain accumulators, the output of the RAM is either directly driven to the accumulator inputs or can be driven using processor instructions. It is shown that the proposed scheme imposes lower hardware overhead and less complexity in the control circuitry than previously proposed schemes.

Fig. 1. C- march algorithm: (a) original version; (b) transparent version; and (c) symmetric transparent version.

This paper is organized as follows. In Section II, a review of the march algorithms (traditional, transparent, and symmetric transparent) is given. In Section III, the proposed accumulator-based compaction scheme for symmetric transparent BIST (ASTRA) is introduced and exemplified for the case of word-organized RAMs. Next, in Section IV, the proposed scheme is compared to previously proposed schemes for response compaction in symmetric transparent BIST. Finally, in Section V, we conclude this paper.

II. MARCH ALGORITHMS

A. Traditional March Algorithms

A march algorithm consists of march elements, denoted by $M_i$, with $i \geq 0$. Each march element comprises zero (or more) write operations, denoted by $w0$/$w1$, meaning that 0/1 is written to the RAM cell, and zero (or more) read operations, denoted by $r0$/$r1$, meaning that 0/1 is expected to be read from the memory cell. For example, the C- algorithm [see Fig. 1(a)] consists of six march elements denoted by $M_0$ to $M_5$ [5]. In Fig. 1, $\Uparrow$ denotes an increasing addressing order (which can be any arbitrary addressing order) and $\Downarrow$ denotes a decreasing addressing order (which is the inverse addressing order of $\Uparrow$).
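The structure of a march algorithm can be illustrated with a short Python sketch. It assumes the commonly published form of the C- algorithm (a write-all-zero element, four read/write elements, and a final read element) and models a bit-organized RAM; the class and function names (`BitRam`, `StuckAt0Ram`, `run_march`) are hypothetical helpers for illustration, not part of the paper.

```python
# March C- as a list of (addressing order, operations) pairs.
# "up" is the increasing order, "down" is its inverse, and "any"
# marks an element whose addressing order does not matter.
MARCH_C_MINUS = [
    ("any",  ["w0"]),
    ("up",   ["r0", "w1"]),
    ("up",   ["r1", "w0"]),
    ("down", ["r0", "w1"]),
    ("down", ["r1", "w0"]),
    ("any",  ["r0"]),
]


class BitRam:
    """Fault-free bit-organized RAM model (hypothetical helper)."""

    def __init__(self, size):
        self.cells = [0] * size

    def read(self, addr):
        return self.cells[addr]

    def write(self, addr, value):
        self.cells[addr] = value


class StuckAt0Ram(BitRam):
    """RAM in which one cell always stores 0, whatever is written."""

    def __init__(self, size, faulty_addr):
        super().__init__(size)
        self.faulty_addr = faulty_addr

    def write(self, addr, value):
        self.cells[addr] = 0 if addr == self.faulty_addr else value


def run_march(ram, size, algorithm):
    """Apply the march elements; return True if every read matches."""
    for order, ops in algorithm:
        addrs = range(size - 1, -1, -1) if order == "down" else range(size)
        for addr in addrs:
            for op in ops:
                kind, bit = op[0], int(op[1])
                if kind == "w":
                    ram.write(addr, bit)
                elif ram.read(addr) != bit:
                    return False  # a read returned an unexpected value
    return True
```

Because the first element writes 0 to every cell, running this test destroys whatever the RAM held beforehand, which is the limitation that transparent BIST addresses.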

B. Transparent BIST Algorithms

Traditional march algorithms erase the memory contents prior to testing; therefore, they do not serve as good platforms for periodic BIST. Nicolaidis [1] proposed the concept of transparent BIST, where the initial write-all-zero phase is bypassed and a signature prediction phase is used instead. The signature prediction phase consists of read operations equivalent to all the read operations of the march algorithm, and it is utilized in order to calculate a signature that will be compared against the compacted signature calculated during the (remaining) march test. The transparent version of the C- algorithm is shown in Fig. 1(b). The notation for the transparent versions of the algorithms differs slightly from the one used in traditional march algorithms. Instead of $r0$, $r1$, $w0$, $w1$, the notations $ra$, $ra^c$, $(ra)^c$, $wa$, and $wa^c$ are utilized. Their meanings are as follows.

$ra$: Read the contents of a word of the RAM, expecting to read the initial contents of the RAM word (i.e., before the beginning of the test).
$ra^c$: Read the contents of a word of the RAM, expecting to read the complement of the initial contents of the RAM word.
$(ra)^c$: Read the contents of a word of the RAM, expecting to read the initial word contents, and feed the complement value to the compactor.
$wa$: Write to the memory word; the value that was stored in this memory word at the beginning of the test is (assumed to be) written to the word.
$wa^c$: Write to the memory word; the inverse of the value that was stored in this memory word at the beginning of the test is (assumed to be) written to the word.

By default, the data driven to the compactor with the $(ra)^c$ operation are identical to the data driven by the $ra^c$ operation. The importance of the $(ra)^c$ operation is the following: during the signature prediction phase the contents of the RAM are equal to the initial contents (since no write operation has been performed); therefore, in order to simulate the $ra^c$ operation we invert these contents prior to driving them to the compactor.

It has been shown in [1] and [2] that, with the transparent BIST algorithms, the contents of the memory at the end of the test are identical to those before the test. Also, since the read elements of the signature prediction phase ($M_0$) are identical to the read elements of the testing phase ($M_1$-$M_5$), storing the result of the compaction of $M_0$ and comparing it to the result of the compaction of $M_1$-$M_5$ allows us to detect faults that occur due to the write operations of the march algorithm.
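The two phases can be sketched in Python for a bit-organized RAM. This is a minimal sketch under assumptions that are not stated in the text: CRC-32 (`zlib.crc32`) stands in for the output compactor, every write stores the complement of the value just read, the final read element uses the decreasing addressing order, and all function names are hypothetical.

```python
import zlib


def compact(stream):
    # Stand-in compactor (assumption): CRC-32 over the stream of read bits.
    return zlib.crc32(bytes(stream))


def default_write(ram, addr, value):
    ram[addr] = value


def predict_signature(ram):
    """Signature prediction phase: reads only, the RAM is not modified."""
    up = range(len(ram))
    down = range(len(ram) - 1, -1, -1)
    stream = []
    stream += [ram[i] for i in up]        # read initial contents
    stream += [1 - ram[i] for i in up]    # read, feed complement to compactor
    stream += [ram[i] for i in down]      # read initial contents
    stream += [1 - ram[i] for i in down]  # read, feed complement to compactor
    stream += [ram[i] for i in down]      # final read element
    return compact(stream)


def run_test_phase(ram, write=default_write):
    """Testing phase: four read/complement-write passes, then a final read."""
    up = list(range(len(ram)))
    down = up[::-1]
    stream = []
    for order in (up, up, down, down):
        for i in order:
            value = ram[i]
            stream.append(value)
            write(ram, i, 1 - value)      # write back the complement
    stream += [ram[i] for i in down]
    return compact(stream)


def transparent_bist(ram, write=default_write):
    """Return True when the two signatures agree (no fault detected)."""
    predicted = predict_signature(ram)
    return predicted == run_test_phase(ram, write)
```

Each cell is complemented an even number of times, so a fault-free RAM ends the test holding its initial contents, while a faulty write changes the stream seen in the testing phase and, with it, the signature.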

Traditional transparent BIST has the disadvantage that the signature prediction phase increases the total testing time by (more than) 30% [8]. In order to confront this problem, Yarmolik et al. introduced the concept of symmetric transparent BIST, which is explained in the next subsection.

C. Symmetric Transparent BIST

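In order to define a symmetric transparent algorithm, some notations will be introduced first. Let $D = (d_0, d_1, \ldots, d_{k-1})$ be a data stream; then $D^* = (d_{k-1}, \ldots, d_1, d_0)$ denotes the data stream with components in reverse order and $D^c = (\bar{d}_0, \bar{d}_1, \ldots, \bar{d}_{k-1})$ denotes the data stream with inverted components. For example, if $D = (11001)$, then $D^* = (10011)$ and $D^c = (00110)$.

A data string $D$ is called symmetric if there exists a data string $D'$ with $D = (D', D'^*)$ or $D = (D', (D'^*)^c)$. For example, $(110011)$ and $(110100)$ are symmetric data strings, since $(110011) = (110, 110^*)$ and $(110100) = (110, (110^*)^c)$. Furthermore, a transparent march test is called symmetric if it produces a symmetric test data string $D$.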
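The definition can be checked mechanically. The following Python sketch represents data strings as strings of `'0'` and `'1'` characters; the function names are hypothetical, and the check assumes that the two halves of a symmetric string have equal length.

```python
def reverse(d: str) -> str:
    """Data string with its components in reverse order."""
    return d[::-1]


def complement(d: str) -> str:
    """Data string with every component inverted."""
    return "".join("1" if bit == "0" else "0" for bit in d)


def is_symmetric(d: str) -> bool:
    """True if the second half is the reversed first half,
    or the complement of the reversed first half."""
    if len(d) % 2 != 0:
        return False
    half = len(d) // 2
    first, second = d[:half], d[half:]
    mirrored = reverse(first)
    return second == mirrored or second == complement(mirrored)
```

For instance, `is_symmetric("110011")` and `is_symmetric("110100")` both hold, the first through plain reversal and the second through reversal followed by inversion.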

In order to derive a symmetric transparent algorithm, the march series is modified in such a way that the expected output response is equal to a known value. Therefore, the signature prediction phase can be skipped and the time required for the test is reduced.

In order to achieve this, Yarmolik et al. [8] noticed that most of the march algorithms used for transparent BIST produce test data with a high degree of symmetry. For example, the read elements of the transparent C- march algorithm [see Fig. 1(b)], ignoring the signature prediction phase (and the write elements), are: $\Uparrow(ra)$; $\Uparrow(ra^c)$; $\Downarrow(ra)$; $\Downarrow(ra^c)$; $\Downarrow(ra)$. It is easy to detect the approximate symmetry; furthermore, it is also easy to derive a symmetric sequence by adding an additional read element, resulting in the following sequence of read elements: $\Uparrow(ra)^c$; $\Uparrow(ra)$; $\Uparrow(ra^c)$; $\Downarrow(ra)$; $\Downarrow(ra^c)$; $\Downarrow(ra)$. For example, for a bit-organized memory with five words whose initial contents are (11010), the result of the latter sequence is (00101 11010 00101 01011 10100 01011), which is easily shown to be symmetric with respect to the given definition.
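The five-word example can be reproduced with a short Python sketch. It rests on an assumption about where the additional read element goes: here it is placed first, as a read of the initial contents in increasing order whose complement is fed to the compactor, and the final read element uses the decreasing order. With those choices the six elements yield a stream whose second half is the inverted mirror image of the first half. The function name `read_stream` is hypothetical.

```python
def read_stream(contents: str) -> str:
    """Bits fed to the compactor by the six read elements (assumed order)."""
    invert = str.maketrans("01", "10")
    up = contents                  # cells visited in increasing order
    down = contents[::-1]          # cells visited in decreasing order
    elements = [
        up.translate(invert),      # added element: read initial, feed complement
        up,                        # read initial contents
        up.translate(invert),      # read complemented contents
        down,                      # read initial contents
        down.translate(invert),    # read complemented contents
        down,                      # final read of the restored contents
    ]
    return "".join(elements)


stream = read_stream("11010")
half = len(stream) // 2
first, second = stream[:half], stream[half:]
# Symmetry check: second half equals the complement of the reversed first half.
assert second == first[::-1].translate(str.maketrans("01", "10"))
```

The same check passes for any initial contents, since the construction does not depend on the particular bits stored.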

Yarmolik et al. [8], [9] have shown that by exploiting the previously mentioned symmetry and by using linear structures as compactors for the outputs of the RAM, the final value of the compactor is equal to a known value, i.e., the all-zero value. For the case of bit-organized memories, SISRs were utilized, while for word-organized memories MISRs were exploited. In [8] and [9], it was proven that by toggling between a primitive polynomial and its reciprocal during the $\Uparrow$ and $\Downarrow$ operations, the final signature is equal to the all-zero state. They even reported a marginal increase in the fault coverage of the symmetric schemes compared to the respective transparent ones with signature prediction. For example, in Fig. 1(c) the symmetric transparent version of the C- algorithm is presented.


Fig. 2. Accumulator-based compaction in word-organized memories.

III. ACCUMULATOR-BASED COMPACTION OF THE RESPONSES IN SYMMETRIC TRANSPARENT BIST

The accumulator-based response compaction scheme proposed in this paper stems from the following two observations.

1) Observation 1: If the march algorithm is symmetric (as in the case of symmetric transparent BIST), then the number of $ra$ elements equals the number of $ra^c$ elements plus the number of $(ra)^c$ elements (without taking into account the addressing order, $\Uparrow$ or $\Downarrow$, of the march element).

2) Observation 2: Accumulator-based compaction of the responses has the order-independent property (i.e., the final signature is independent of the order of the incoming vectors [13]).

Observation 2 stems directly from the commutative property of the addition operation.

Accumulator-based compaction for symmetric transparent BIST for the case of word-organized memories is based on Lemma 1.

Lemma 1: If a symmetric transparent march algorithm is applied to a word-organized memory whose word length is $n$ and the responses are captured in an $n$-stage accumulator comprising a 1's complement adder (starting from the all-0 state), then the final content of the accumulator is equal to the all-1 state.

Proof: Let $m$ be the number of $ra$ elements of the march algorithm; since the algorithm is symmetric, the total number of $ra$ elements is equal to the total number of $ra^c$ (plus the number of $(ra)^c$) elements. Therefore, for every vector $A$ driven to the inputs of the accumulator, its complement $\bar{A}$ is also driven to the inputs of the accumulator exactly once. But

$$A + \bar{A} = 2^n - 1. \tag{1}$$

Furthermore, for 1's complement addition it holds that the sum of two numbers $X$ and $Y$ is given by $X + Y$ if $X + Y < 2^n$ and by $X + Y - (2^n - 1)$ otherwise (end-around carry); therefore, the sum of $2^n - 1$ and $2^n - 1$ is $2(2^n - 1) - (2^n - 1) = 2^n - 1$. Therefore, it is trivial to show (by induction) that (2) holds for any value of $m \geq 1$, with the sum taken in 1's complement arithmetic:

$$\sum_{i=1}^{m} (2^n - 1) = 2^n - 1. \tag{2}$$

From (1) and (2), and taking into account that the addition operation is commutative (Observation 2), we have the proof.
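Lemma 1 can be checked numerically with a small Python model. This is a sketch under assumptions: the 1's complement adder is modeled as an $n$-bit adder with end-around carry, the word values are arbitrary, and the names `ones_complement_add` and `accumulate` are hypothetical. The stream pairs every word with its bitwise complement and is then shuffled, to exercise the order-independence of the final signature.

```python
import random


def ones_complement_add(x: int, y: int, n: int) -> int:
    """n-bit addition with end-around carry."""
    mask = (1 << n) - 1
    total = x + y
    if total > mask:                  # carry out of the most significant stage
        total = (total & mask) + 1    # feed the carry back into the LSB
    return total


def accumulate(vectors, n: int) -> int:
    """Compact a stream of n-bit vectors, starting from the all-0 state."""
    acc = 0
    for vector in vectors:
        acc = ones_complement_add(acc, vector, n)
    return acc


n = 3
mask = (1 << n) - 1
words = [0b101, 0b000, 0b111, 0b010]   # a 4-word, 3-bit RAM (arbitrary contents)

stream = []
for _ in range(3):                     # three reads matched by three complemented reads
    stream += words
    stream += [word ^ mask for word in words]

random.shuffle(stream)                 # order of arrival should not matter
assert accumulate(stream, n) == mask   # final state is the all-1 vector
```

A single corrupted response breaks the pairing of vectors and complements, so the accumulator generally ends in a state other than all-1.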

TABLE I. OUTPUT DATA COMPACTION IN SYMMETRIC TRANSPARENT BIST: COMPARISON

For example, let us consider the 4-word 3-bit RAM presented in Fig. 2(a). The outputs of the memory are driven to an $n$-stage ($n = 3$) accumulator comprising a 1's complement adder, Fig. 2(b). For the implementation of the $(ra)^c$ march element, the subtraction operation of the accumulator can be utilized. In order to apply march elements of the form $(ra, wa^c)$ or $(ra^c, wa)$, the output of the RAM must be inverted and then fed back to its inputs; with the proposed scheme, this can be done by forcing the all-1 vector to one input of the adder/subtractor and performing a subtract operation. This is done with the OR gates, one input of which is driven by the inv signal in Fig. 2. Therefore, the inverse of the read vector appears at the outputs of the adder/subtractor and is applied to the RAM inputs.
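The inversion trick relies on a simple identity: subtracting an $n$-bit vector from the all-1 vector yields its bitwise complement. The Python check below assumes plain unsigned binary subtraction, which is an assumption about the adder/subtractor rather than a detail given in the text; the function name is hypothetical.

```python
def invert_by_subtraction(vector: int, n: int) -> int:
    """Subtract an n-bit vector from the all-1 vector."""
    all_ones = (1 << n) - 1
    return all_ones - vector


n = 3
for vector in range(1 << n):
    # The difference equals the bitwise complement for every 3-bit value.
    assert invert_by_subtraction(vector, n) == vector ^ ((1 << n) - 1)
```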

IV. COMPARISONS

In this section, we shall compare the proposed scheme with the one proposed in [9], with respect to the required hardware overhead. For the calculations, we assume that a D-type flip-flop requires 8 gate equivalents, a flip-flop with shift capability requires 10 gates, a flip-flop with double-shift capability requires 12 gates, and an XOR gate requires 4 gates.

For the scheme proposed in [9], a MISR with double-shift (i.e., both left-to-right and right-to-left) capability is required; in case a register is available, the transformation of $n$ flip-flops into multiplexed-input, two-way flip-flops is required; furthermore, $n$ two-input XOR gates (to invert the values of the outputs of the RAM) and another $n$ two-input XOR gates are required (in case a MISR is not available, in order to transform the register into a MISR). In case a MISR exists, the transformation of $n$ multiplexed-input flip-flops into two-way multiplexed-input flip-flops is required. The overhead is presented in Table I.

For the implementation of the proposed scheme, assuming the existence of an accumulator, $n$ two-input OR gates are required at the inputs of the accumulator. Since the outputs of the RAM can be driven to the inputs of the accumulator by proper control of the datapath module, no additional overhead is imposed.

From Table I, it is evident that the proposed scheme requires lower hardware overhead than the scheme proposed in [9] for the same purpose.

V. CONCLUSION

In this paper, we have proposed the utilization of accumulators for the compaction of the responses in ASTRA. The proposed scheme presents lower hardware overhead and requires less complicated control compared to the scheme proposed in [9]; therefore, it may prove a viable solution for periodic testing of RAMs embedded into current VLSI chips.

ACKNOWLEDGMENT

The author would like to thank the anonymous reviewers and the Associate Editor for their constructive comments on the submitted manuscript.

REFERENCES

[1] M. Nicolaidis, “Theory of transparent BIST for RAMs,” IEEE Trans. Comput., vol. 45, no. 10, pp. 1141–1156, Oct. 1996.

[2] M. Nicolaidis, “An efficient built-in self test for functional test of embedded RAMs,” in Proc. 15th Symp. Fault Tolerant Comput., Jun. 1985, pp. 118–123.


[3] A. Castro, M. Nicolaidis, P. Lestrat, and B. Courtois, “Built-in self test for multi-port RAMs,” presented at the ICCAD, Santa Clara, CA, Nov. 1991.

[4] A. J. van de Goor, Testing Semiconductor Memories, Theory and Practice. Chichester, U.K.: Wiley, 1991.

[5] A. J. van de Goor, “Using march tests to test SRAMs,” IEEE Des. Test Comput., vol. 10, no. 1, pp. 8–14, 1993.

[6] A. J. van de Goor and C. A. Verruijt, “An overview of deterministic functional RAM chip testing,” ACM Comput. Surveys, vol. 22, no. 1, pp. 5–33, Mar. 1990.

[7] M. Marinescu, “Simple and efficient algorithms for functional RAM testing,” in Proc. IEEE Int. Test Conf., 1982, pp. 236–239.

[8] V. N. Yarmolik, S. Hellebrand, and H.-J. Wunderlich, “Symmetric transparent BIST for RAMs,” presented at the DATE, Munich, Germany, Mar. 1999.

[9] V. N. Yarmolik, I. V. Bykov, S. Hellebrand, and H.-J. Wunderlich, “Transparent word-oriented memory BIST based on symmetric march algorithms,” in Proc. Eur. Dependable Comput. Conf., 1999, pp. 339–350.

[10] I. Voyiatzis, “Test vector embedding into accumulator-generated sequences: A linear-time solution,” IEEE Trans. Comput., vol. 54, no. 4, pp. 476–484, Apr. 2005.

[11] A. Stroele, “BIST pattern generators using addition and subtraction operations,” J. Electron. Test.: Theory Appl., vol. 11, pp. 69–80, 1997.

[12] R. Dorsch and H.-J. Wunderlich, “Accumulator-based deterministic BIST,” in Proc. Int. Test Conf., 1998, pp. 412–421.

[13] I. Voyiatzis, A. Paschalis, D. Gizopoulos, N. Kranitis, and C. Halatsis, “A concurrent built-in self-test architecture based on a self-testing RAM,” IEEE Trans. Reliab., vol. 54, no. 1, pp. 69–78, Mar. 2005.

[14] W. L. Wang, K. J. Lee, and J. F. Wang, “An on-chip march pattern generator for testing embedded memory cores,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 9, no. 5, pp. 730–735, Oct. 2001.

[15] C.-T. Huang, J.-R. Huang, C.-F. Wu, C.-W. Wu, and T.-Y. Chang, “A programmable BIST core for embedded DRAM,” IEEE Des. Test Comput., vol. 16, no. 1, pp. 59–70, Jan./Mar. 1999.

[16] J.-F. Li, R.-S. Tzeng, and C.-W. Wu, “Diagnostic data compression techniques for embedded memories with built-in self-test,” J. Electron. Test.: Theory Appl., vol. 18, pp. 515–527, 2002.

[17] S. Hamdioui and J. E. Q. D. Reyes, “New data-background sequences and their industrial evaluation for word-oriented random-access memories,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 24, no. 6, pp. 892–904, Jun. 2005.

[18] S. Hamdioui, Z. Al-Ars, and A. J. van de Goor, “Opens and delay faults in CMOS RAM address decoders,” IEEE Trans. Comput., vol. 55, no. 12, pp. 1630–1639, Dec. 2006.

[19] W.-L. Wang, K.-J. Lee, and J.-F. Wang, “An on-chip march pattern generator for testing embedded memory cores,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 9, no. 5, pp. 730–735, Oct. 2001.

[20] D.-C. Huang and W.-B. Jone, “A parallel built-in self-diagnostic method for embedded memory arrays,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 21, no. 4, pp. 449–465, Apr. 2002.

[21] B. H. Fang and N. Nicolici, “Power-constrained embedded memory BIST architecture,” in Proc. 18th IEEE Int. Symp. Defect Fault Tolerance VLSI Syst. (DFT), 2003, pp. 451–458.
