
Automation in Design for Test for Asynchronous Null Conventional Logic (NCL) Circuits

Venkat Satagopan, Bonita Bhaskaran, Waleed Al-Assadi, and Scott C. Smith

Department of Electrical and Computer Engineering, University of Missouri - Rolla

1870 Miner Circle, Rolla, MO 65409

Email: {venkat, bonita, waleed, smithsco}@umr.edu

Keywords: Asynchronous, Design for test, VLSI circuits, Null Conventional Logic, CAD tools, VLSI Testing, Simulation

Abstract

The semiconductor industry has thrived for the past four decades, but as chips continue to shrink while comprising millions of transistors, the associated problems grow just as quickly. Testing such large circuits is a major problem unless it is tackled prudently. Under these circumstances, the best approach is to create a favorable test environment on-chip by implementing Design for Test (DFT) techniques. Asynchronous digital design methodologies are currently gaining popularity because they offer reduced area, lower power, and low EMI. Despite these advantages, the absence of a global clock and the presence of more state-holding gates make testing these circuits a challenge for the designer. This paper proposes a DFT implementation for one such asynchronous class, known as Null Conventional Logic (NCL) circuits. The methodology inserts test points to improve the controllability of difficult feedback paths. Observability of long paths in the circuits is enhanced by introducing balanced tree structures or scan latches. The approach has been automated and works with industry-standard tool suites, such as those from Mentor Graphics and Synopsys. The implemented tool transforms gate-level NCL descriptions into descriptions understood by the ATPG library provided with the DFT CAD tools. A test coverage increase of around 75% has been achieved with a less than 5% increase in cost and area.

1.0 Introduction

The digital world has been dominated by synchronous techniques for nearly four decades, owing to the ease of designing such circuits. In addition, CAD tools for synchronous designs have become more advanced and sophisticated, allowing total automation of several stages of the design process. However, with clock speeds nearing the gigahertz range and CMOS technology reaching deep submicron ranges, serious doubts have been cast over the suitability of synchronous designs for next-generation processors and systems. Problems associated with clock synchronization, power consumption, and noise in synchronous designs have forced designers to look for alternatives [1].

Designers are looking at asynchronous circuits as a potential solution to these problems, as they are modular and do not require clock synchronization. Some of the possible benefits of asynchronous techniques include low power, less EMI, less noise, increased robustness and design reuse [2]. A variety of approaches exist for the design and implementation of asynchronous circuits. Huffman’s model and Muller’s model form the basis for many of these approaches. Asynchronous circuits fall into two main categories: delay-insensitive and bounded-delay models [3]. Delay-insensitive paradigms, like NCL, assume delays in both logic elements and interconnect to be unbounded, although they assume that wire forks are isochronic [4]. This implies the ability to operate correctly when inputs arrive at indefinite times. Completion detection of the output signals allows handshaking to control the input wavefronts. On the other hand, bounded-delay models such as Huffman circuits, burst-mode circuits, and micropipelines assume that delays in both gates and wires are bounded. NCL circuits are often able to outperform other self-timed methods since NCL targets a wider range of logical operators, whereas other methods target a more standard, restricted set [5].

For an ASIC to be successfully implemented on a silicon wafer, the design flow needs to be automated, but having test capabilities is equally important. Numerous algorithms are used to effectively test combinational logic, sequential logic, memory arrays, etc. Parametric tests and functional tests are applied to VLSI chips after fabrication, since the test processes available before fabrication are limited. Testers, or Automatic Test Equipment (ATE), use specific test algorithms to drive inputs to a chip-under-test and monitor its outputs based on the timing specifications provided by the designer. Design for Test implementations make chips easier to test using prototype testers. Asynchronous NCL delay-insensitive (DI) designs present a challenging test case to tester/DFT CAD tools because of the presence of numerous feedback loops. Testability can be strengthened by making design modifications that remain dormant under normal circuit operation and come into play only during test mode.

12th NASA Symposium on VLSI Design, Coeur d’Alene, Idaho, USA, Oct. 4-5, 2005

2.0 NCL Circuits

NULL Convention Logic (NCL) provides an asynchronous design methodology employing dual-rail signals, quad-rail signals, or other Mutually Exclusive Assertion Groups (MEAGs) to incorporate data and control information into one mixed path. NCL is a self-timed logic paradigm in which control is inherent in each datum, so there is no need for worst-case delay analysis and control path delay matching [6]. Various aspects of the paradigm, including the NULL (or spacer) logic state from which NCL derives its name, have origins in Muller’s work on speed-independent circuits in the 1950s and 1960s.

2.1 NCL Combinational Circuits

NCL gates are a special case of the logical operators or gates available in digital VLSI circuit design and are said to be delay-insensitive. Such an operator consists of a set condition and a reset condition that the environment must ensure are not both satisfied at the same time. If neither condition is satisfied, the operator maintains its current state. NCL uses symbolic completeness of expression to achieve self-timed behavior. A symbolically complete expression is defined as an expression that depends only on the relationships of the symbols present in the expression, without reference to the time of evaluation. Traditional Boolean logic is not symbolically complete; the output of a Boolean gate is only valid when referenced with time. NCL eliminates this problem of time-reference by employing dual-rail or quad-rail signals. A dual-rail signal, D, consists of two wires, D0 and D1, which may assume any value from the set {DATA0, DATA1, NULL}. Similarly, a quad-rail signal, Q, consists of four wires, Q0, Q1, Q2, and Q3, which may assume any value from the set {DATA0, DATA1, DATA2, DATA3, NULL}. The two rails of a dual-rail logic signal and the four rails of a quad-rail logic signal are all mutually exclusive.
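The rail encodings can be made concrete with a short Python sketch. The one-hot wire assignment used here (rail i asserted means DATAi, no rail asserted means NULL) is the usual NCL convention but is not spelled out in the text above, so it should be read as an assumption; the function names are illustrative only.

```python
def decode_rails(rails):
    """Decode a dual-rail (2 wires) or quad-rail (4 wires) signal.

    Assumed encoding: no rail asserted -> NULL, exactly rail i asserted -> DATAi.
    More than one asserted rail violates mutual exclusivity.
    """
    asserted = [i for i, rail in enumerate(rails) if rail]
    if not asserted:
        return "NULL"
    if len(asserted) > 1:
        raise ValueError("illegal state: rails must be mutually exclusive")
    return f"DATA{asserted[0]}"


def encode_rails(value, width):
    """Encode None (NULL) or an integer data value onto `width` rails."""
    if value is None:
        return (0,) * width
    if not 0 <= value < width:
        raise ValueError("data value does not fit on the available rails")
    return tuple(1 if i == value else 0 for i in range(width))


if __name__ == "__main__":
    print(decode_rails((0, 0)))        # dual-rail NULL
    print(decode_rails((0, 1)))        # dual-rail DATA1 (D1 asserted)
    print(decode_rails((0, 0, 1, 0)))  # quad-rail DATA2 (Q2 asserted)
```

Treating the simultaneous assertion of two rails as an error mirrors the mutual-exclusivity requirement stated above.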

All NCL systems must satisfy two criteria to be delay-insensitive:

• Input-Completeness: (i) the outputs of a circuit may not transition from NULL to DATA until all inputs have transitioned from NULL to DATA; (ii) the outputs of a circuit may not transition from DATA to NULL until all inputs have transitioned from DATA to NULL. In circuits with multiple outputs, it is acceptable, according to Seitz’s weak conditions [7], for some of the outputs to transition without having a complete input set present, as long as all outputs cannot transition before all inputs arrive.

• Observability: every gate transition must be observable at the output, which means that every gate that transitions is necessary to transition at least one of the outputs.

NCL uses threshold gates with hysteresis for its composable logic elements. One type of threshold gate is the THmn gate, where 1 ≤ m ≤ n as depicted in Figure 1. A THmn gate corresponds to an operator with at least m out of n signals asserted as its set condition and all signals de-asserted as its reset condition. At least m of the n inputs must be asserted before the output will become asserted. Since threshold gates are designed with hysteresis, all asserted inputs must be de-asserted before the output will be de-asserted.
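The set/reset behavior described above can be illustrated with a small behavioral model. This is a minimal sketch of an unweighted THmn gate, written in Python for illustration; the class and method names are hypothetical and do not come from any NCL library.

```python
class ThresholdGate:
    """Behavioral model of a THmn gate with hysteresis (1 <= m <= n)."""

    def __init__(self, m, n):
        if not 1 <= m <= n:
            raise ValueError("THmn requires 1 <= m <= n")
        self.m = m
        self.n = n
        self.output = 0  # assume the gate starts de-asserted

    def update(self, inputs):
        """Apply a new input vector and return the resulting output."""
        if len(inputs) != self.n:
            raise ValueError("expected %d inputs" % self.n)
        asserted = sum(1 for value in inputs if value)
        if asserted >= self.m:
            self.output = 1   # set condition: at least m inputs asserted
        elif asserted == 0:
            self.output = 0   # reset condition: all inputs de-asserted
        # otherwise neither condition holds and the gate keeps its state
        return self.output


if __name__ == "__main__":
    th23 = ThresholdGate(2, 3)
    for vector in [(1, 0, 0), (1, 1, 0), (1, 0, 0), (0, 0, 0)]:
        print(vector, "->", th23.update(vector))
```

The third vector in the usage loop has only one asserted input, yet the output stays asserted because the reset condition has not been met; this state-holding behavior is what later complicates test generation.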

2.2 NCL Systems

NCL systems, like synchronous circuits, have both combinational and registration portions. However, NCL circuits also have completion components that generate the controlling handshake signals. Each stage in an NCL pipeline consists of these three components: combinational logic, registration, and completion logic, all of which are built from the same threshold gates. An example of an n-stage NCL system is shown in Figure 2. In an NCL system the DATA wavefront and NULL wavefront are applied alternately. The NULL wavefront is characterized by all the inputs being NULL, while a DATA wavefront refers to all inputs of a circuit being DATA, that is, some combination of DATA0 and DATA1. The NCL registers interact with one another using handshaking signals and are responsible for ensuring that successive DATA wavefronts are separated by a NULL wavefront. When all outputs of a combinational circuit are DATA, a request for NULL, rfn, is generated on the Ko output of the register. Similarly, a request for DATA, rfd, is generated on the Ko register output when all combinational logic outputs are NULL.

Figure 1: THmn threshold gate.

3.0 Testing of NCL Designs

Testing of a chip is done in several different stages. At each stage the testing methods are aimed at detecting defects and improving reliability/yield. Functional tests, verification tests, parametric tests and manufacturing tests are some of the different test methods in use [8]. DFT methods collectively refer to the design practices used to modify existing designs in order to make them easily testable. The common objective of the DFT methods is to make the design easily testable using Automatic Test Pattern Generators (ATPGs). In the VLSI design flow, commercial DFT CAD tool suites are used to simulate the test pattern generation programs of the ATE, so achieving high test coverage with the DFT CAD tools is an important benchmark. Testing asynchronous circuits has been a major challenge compared to testing synchronous circuits. In order to compete with their synchronous counterparts, asynchronous design methods should be capable of producing VLSI circuits that are at least as readily testable as synchronous circuits. The lack of efficient and widely accepted design and test techniques for asynchronous circuits, in addition to little CAD support, has limited the widespread use of asynchronous designs in commercial VLSI applications. Synchronous testing techniques, though popular and well tested, cannot be directly applied to asynchronous circuits due to their inherently different composition. Several possible solutions are being developed to tackle this problem. The most important reason synchronous techniques do not carry over is that asynchronous design schemes use special gates or elements to achieve synchronization between the datapath and the control path. NCL uses the DI paradigm to achieve synchronization by means of handshaking. The handshaking mechanism invariably leads to the presence of feedback paths, which in turn pose a serious problem for the test pattern generation programs used by the DFT CAD tools.

Failures in VLSI circuits can be modeled at different levels of abstraction. DFT CAD tools target the design at the gate level in order to reduce the number of primitives and the complexity of computations. By doing so, the tools achieve a good correlation between actual failures at the Physical Design (PD) level of abstraction and the stuck-at fault models at the gate level. Two important parameters that govern the fault coverage of a VLSI design are controllability and observability. Controllability is defined as the ease with which the appropriate test vector can be set at the primary inputs to excite the fault location. Observability, on the other hand, is the ease with which the excitation of the fault can be observed at a primary output node or a latch. Untestable faults can either be redundant faults or faults that remain untested due to poor controllability and observability [9]. Redundant faults can be removed by removing redundant logic in the design. Circuits that have these problems suffer substantial fault coverage degradation. This phenomenon is highly prevalent in asynchronous circuits.
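To illustrate how stuck-at fault coverage depends on the applied vectors, the following Python sketch performs a brute-force fault simulation on a tiny combinational circuit, z = (a AND b) OR c. The circuit, net names, and functions are hypothetical and chosen only for illustration; commercial ATPG tools use far more efficient algorithms.

```python
from itertools import product

# Nets of the illustrative circuit: n1 = a AND b, z = n1 OR c
NETS = ("a", "b", "c", "n1", "z")


def simulate(a, b, c, fault=None):
    """Evaluate z, optionally forcing one net to a stuck-at value.

    `fault` is a (net_name, stuck_value) pair or None for the good circuit.
    """
    def drive(net, value):
        if fault is not None and fault[0] == net:
            return fault[1]
        return value

    a, b, c = drive("a", a), drive("b", b), drive("c", c)
    n1 = drive("n1", a & b)
    return drive("z", n1 | c)


def fault_coverage(vectors):
    """Fraction of single stuck-at faults detected by the given vectors."""
    faults = [(net, value) for net in NETS for value in (0, 1)]
    detected = [
        f for f in faults
        if any(simulate(*v) != simulate(*v, fault=f) for v in vectors)
    ]
    return len(detected) / len(faults)


if __name__ == "__main__":
    print(fault_coverage([(1, 1, 0)]))                        # one vector
    print(fault_coverage(list(product((0, 1), repeat=3))))   # exhaustive
```

A fault is counted as detected only when some vector both excites it (controllability) and changes the primary output z (observability); with the single vector (1, 1, 0), for example, a fault on input c is masked because n1 already forces z high.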

3.1 Previous Work

NCL design methodology is a relatively new area in asynchronous design, and ongoing research aims to improve its testability. Previous DFT methods for asynchronous DI circuits introduce scan latches. This is a well-established technique for synchronous circuits, where the latches and flip-flops in the design are converted into scan chains so that their states can be controlled and their individual outputs observed by performing a simple shift operation. The work by Y. Kang and K. Huh [5] proposed a new scan design with low overhead for asynchronous micropipeline circuits to efficiently detect stuck-at and delay faults. Reference [10] discusses a partial-scan technique for targeting delay faults in clockless systems. I. Blunno and L. Lavagno [1] have successfully demonstrated automated synthesis of micropipelines from behavioral Verilog HDL. NCL differs from other delay-insensitive schemes, which use a single state-holding gate like the C-element, in that all the threshold gates are state-holding. Consequently, converting each of the threshold gates used in the design into scan elements would lead to a large area overhead. An approach that is very pertinent to the NCL DI scheme has been used for targeting stuck-at, delay and bridging faults. M. Ligthart, K. Fant, R. Smith, A. Taubin and A. Kondratyev [11] suggest techniques for designing and synthesizing NCL circuits using conventional CAD tools. The work by L. Sorensen, A. Streich, and A. Kondratyev [12] uses commercial DFT CAD tools for testing NCL designs. The methods proposed in [12] are for acyclic and cyclic NCL pipelines, wherein the fault sites in the NCL design are mapped onto an equivalent Boolean design. A cyclic pipeline also has feedback in its data path and is more complex to test. The proposed method uses a partial scan technique for the cyclic pipelines by breaking the computational loops and inserting test vectors in the data paths. Though limited in number, the research work on testability of NCL designs is a significant step.

Figure 2: An Example NCL System.

4.0 Design for Test for NCL Systems

A Design-for-Test strategy is essential for NCL systems to be testable using conventional methods. A self-testing mechanism such as BIST would be beneficial, but would lead to a large area overhead. Commercial DFT CAD tools require the presence of a clock for sequential logic and cannot support feedback paths in combinational logic, because trying to control faults on these paths could lead to hazardous race conditions. Since these tools cannot be directly applied to NCL systems, additional logic and new test methodologies are needed to solve the problem. The structured-DFT scheme discussed in this paper uses minimal additional logic to modify the design and make it readily testable by the DFT CAD tools. The two-fold approach addresses both controllability and observability issues by

• inserting test points after breaking the feedback loops

• adding either balanced gate structures or scan latches to propagate unobservable fault sites

Feedback significantly degrades stuck-at fault coverage by increasing the complexity of the test to that of testing a sequential circuit, which needs two vectors <t1, t2>, t1 being the initialization vector and t2 the test vector. Breaking these loops ensures that the complexity of the test is that of a combinational circuit. Testability can be enhanced by inserting test points on selected nets where faults are declared untestable by the tools [13]. Test points allow the tools to probe the feedback paths and initialize them effortlessly. Test points can be inserted in a multitude of ways, but the scheme proposed here simply uses an XOR gate. As an example, consider the two-stage pipelined adder shown in Figure 3.

This is similar to the general block diagram of an NCL system shown in Figure 2, with the combinational logic replaced by full and half adders. For every registration stage, there is a feedback path from the succeeding stage, which can be seen as the Completion Detector’s (CD) output. An XOR gate is included in this path, whose other input is controlled by an external test input signal. The test signal is made a primary input (PI), Test Control (TC), whose value is ‘0’ in the normal functional mode. The XOR gate thus does not interfere with the normal operation of the system, while providing a means for the tester to excite the required faults on the feedback net. The feedback loop is controlled better, in turn leading to an improvement in the fault coverage.

Figure 3: Improving testability by breaking feedback paths.

While controllability issues are resolved by breaking feedback paths, fault sites in long paths cannot easily propagate to the output due to the topology of the design. An easy solution to this problem would be to identify the unobservable nodes in the design and make them primary outputs. Figure 4 shows the first stage of the NCL pipelined adder, along with the XOR and the TC input, in two different manifestations as Modules I and II. Module I shows 8 of the unobservable fault sites propagated to the primary outputs (POs). Although observability is greatly enhanced by doing this, design complexity would increase tremendously due to the addition of one primary output pin for each observation point added. This would not only increase the cost of the chip through the additional output pins, but the long wire connections could also make the design susceptible to signal coupling and physical defects such as bridging faults. Having a large number of outputs switching at the same time could also cause ground bounce and simultaneous switching noise (SSN), which can be difficult to model and analyze. Instead of increasing the number of POs and in turn causing more problems, a wiser solution limits the number of added observation points (OPs) at the primary output pins to a single one. This can be accomplished by consolidating the OPs through a balanced XOR tree structure, as seen in Module II of Figure 4.
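The two mechanisms described above, the XOR test point on the feedback path and the XOR tree that folds several observation points into one output, can be sketched behaviorally in Python. The function names are hypothetical, and padding unused tree inputs with 0 is an assumption made here for illustration rather than a detail taken from the design.

```python
def xor_test_point(feedback, tc):
    """XOR test point: transparent when tc = 0, inverts the net when tc = 1."""
    return feedback ^ tc


def balanced_xor_tree(points):
    """Fold observation points into a single bit through a balanced XOR tree.

    The input list is padded with 0s (assumed tie-off) up to the next power
    of two so that every input passes through the same number of XOR levels.
    """
    level = list(points)
    if not level:
        raise ValueError("need at least one observation point")
    width = 1
    while width < len(level):
        width *= 2
    level += [0] * (width - len(level))
    while len(level) > 1:
        # combine neighbouring pairs; each pass is one level of 2-input XORs
        level = [level[i] ^ level[i + 1] for i in range(0, len(level), 2)]
    return level[0]


if __name__ == "__main__":
    good = [0, 1, 1, 0, 1, 0, 0, 1]
    faulty = list(good)
    faulty[3] ^= 1  # a fault effect reaches observation point 3
    print(balanced_xor_tree(good), balanced_xor_tree(faulty))
```

With tc held at 0 the feedback value passes through unchanged, which is why normal operation is unaffected. A single flipped observation point always changes the tree output; fault effects arriving on an even number of points at once would cancel, a general property of XOR compaction.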
A balanced tree is defined as a structure formed by connecting gates of a similar type such that all the inputs pass through the same number of logic levels to reach the single output. The inherent characteristic of this structure is that fault effects at all the inputs are equally likely to appear at the output observation point. Besides reducing the number of primary output pins and the routing of long wires, this method also shortens the length of a signal path in the design, measured from an input edge, thus making it more easily observable. In terms of cost, the resulting design would additionally have a single test pin, one output pin and a few XOR gates. This is a tradeoff between yield and cost, because adding test capabilities without making many design modifications makes the chip readily testable. Yet another advantage of this method is that Weighted Random Pattern Test (WRPT) can be run effectively, since random-pattern-resistant faults become less likely. The way such faults degrade random testing is beyond the scope of this paper.

Another approach to tackle fault propagation issues is to use observation latches in the design. These are generic components used to form a scan chain during the test phase. They serve as internal probe points for the design and improve fault propagation from these nodes. Up to 4 internal nodes, or potential observation points (POPs), can be mapped to a single latch, as illustrated in Figure 5. The latch serves as an observation point for these nodes. The functionality of the NCL design is not affected, since these latches are enabled only during test mode. Any number of POPs can be observed in this fashion. The observation latches form a scan chain, with the PI scan_in tied to the scan input of the first latch and the scan output of the last latch tied to the PO scan_out. In test mode, faults on any of these nodes can be activated by applying a suitable test pattern. An applied test clock causes the latches to capture the fault values and propagate them through the scan chain to the single PO.

4.1 NCL Test Library

NCL designs can be modified to conform to existing DFT tools without any loss in functionality [12]. Any of the NCL threshold gates can be mapped to its equivalent Boolean form by using gates from the DFT tool’s library. For example, the set condition for a th22x0 gate is defined as Z=A.B. Using the equivalence principle, a th22x0 gate is replaced by a 2-input AND gate for the purpose of analyzing faults and generating test patterns. An NCL Test library is created by mapping all the THmn gates in this way. Using the NCL Test library, it is possible to convert any structural NCL design to its Boolean equivalent form. The registers present in NCL pipelines are composed of THmn gates and hence can also be converted to the equivalent Boolean form. DFT tools from companies like Synopsys and Mentor Graphics can then be used to perform testability analysis on the NCL design.
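As a rough illustration of this mapping, the set condition of an unweighted THmn gate can be written as the OR of all m-input AND terms, which reduces to Z=A.B for a th22 gate. The Python sketch below generates that expression and checks it against the threshold definition; it is only an assumed model of what a test library entry expresses, not the library format used by any DFT tool, and it does not cover weighted threshold gates.

```python
from itertools import combinations, product


def set_expression(m, names):
    """Sum-of-products string for the set condition of a THmn gate."""
    names = list(names)
    if not 1 <= m <= len(names):
        raise ValueError("THmn requires 1 <= m <= n")
    return " + ".join(".".join(term) for term in combinations(names, m))


def boolean_equivalent(m, values):
    """Evaluate the sum-of-products form: 1 if any m inputs are all asserted."""
    return int(any(all(term) for term in combinations(values, m)))


def threshold_set_condition(m, values):
    """Direct threshold definition: at least m inputs asserted."""
    return int(sum(1 for v in values if v) >= m)


if __name__ == "__main__":
    print(set_expression(2, "AB"))    # th22 maps to a 2-input AND
    print(set_expression(2, "ABC"))   # th23 maps to a majority function
    # exhaustive check that both views agree for a 2-of-3 gate
    print(all(boolean_equivalent(2, v) == threshold_set_condition(2, v)
              for v in product((0, 1), repeat=3)))
```

The Boolean form deliberately drops the hysteresis of the original gate, which is what allows combinational ATPG to be applied to the mapped design.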

NCL pipelines are unique in structure and functionality compared to their synchronous counterparts. In order to implement the DFT techniques discussed in section 4.0, it is necessary to clearly understand the factors affecting fault coverage. It is also logical to analyze the impact of the proposed testability schemes on NCL pipelines before any automation efforts.

DFT tools were used to ensure 100% fault coverage for all components of the pipeline shown in Figure 2. A single-stage pipeline consists of combinational logic sandwiched between two registration stages. A majority of the nodes in this pipeline were marked as ATPG Untestable (AU) and a few as Post Deterministic Untestable (PU) by the DFT tool. These faults were traced to the feedback path that exists between the registration stages. The path between the output of the completion component and the input of the registration stage was identified as the ideal location to break the feedback loop. FastScan results after breaking the feedback showed a 40-60% increase in fault coverage, owing to a significant reduction in the number of AU faults. Similar results were obtained for a two-stage NCL pipeline.

A number of faults were classified as UnObserved (UO), which kept the fault coverage below 100%. Adding more test points to the design did not increase the fault coverage. The majority of these faults were traced to relatively long paths in the design. Long paths have a significant number of gates between the PIs and POs, which makes the fault sites unobservable. All the identified UO and AU fault nodes were marked as Potential Observation Points (POPs). Balanced XOR trees and observation latches help combine the POPs into a PO. Test results further validate the effectiveness of these methods. A further increase in fault coverage, if necessary, could be brought about by adding additional POs.

Figure 4: Observability enhancing solutions (Module I and Module II).

Figure 5: Observation Latch.

To summarize the strategies for NCL designs:

1. Feedback paths severely reduce the controllability of the design. The ideal location to insert the test point is on the path between the completion component and the registration stage.

2. Test point insertion is best done using an XOR gate.

3. Observability and hence the fault coverage can be improved by providing paths to primary outputs from the fault nodes.

4. POPs can be consolidated to a single PO using balanced XOR-tree structure or by using observation latches.

Based on these observations a DFT tool is developed for NCL designs. The following section explains the tool and its usage.

5.0 Algorithmic DFT Tool

An automated DFT tool for NCL systems is proposed in this work. Repeated tests on NCL pipelines helped to evolve the automated DFT tool. In order to better understand the advantages of automation, a brief overview of the previous design flow is presented in Figure 6. The steps involved in using the automated DFT tool are illustrated in Figure 7.

An outline of the steps leading to a testable NCL pipeline using the previous flow is given here [14]. A structural VHDL design of the NCL pipeline is required. FastScan is used to obtain the fault coverage, and any redundancies present are removed from the design. Next, feedback paths are manually identified by inspecting the VHDL description of the design. The identified feedback paths are broken by inserting test points, and the fault analysis is repeated. The fault list obtained from FastScan is manually inspected for the presence of any AU or UO faults. For a 2-stage pipeline (Reg → combinational logic → Reg) the fault list could have roughly 50-200 faults. Commonly, a node in the design netlist has more than one fault associated with it. As proposed in section 4.1, these fault nodes are due to poor observability and are selected as POPs. The POP nodes are manually traced on the structural VHDL design and are consolidated using the proposed techniques to form one or more POs.

Figure 6: Potential Observation Points mapped to Scan Chain

The above DFT flow, though effective, is practical only for smaller designs, because the critical steps of inserting test and observation points are done manually. The only automated step in this flow is the use of a commercial DFT tool to determine fault coverage. Analyzing NCL pipelines with more than 2 stages using this approach is tedious, time-consuming, and unreliable owing to human error. The automated DFT tool proposed for NCL designs solves these issues while achieving good fault coverage. The tool is based on the DFT strategies discussed in sections 4.0 and 4.1. The tool uses three Perl scripts to implement the automation:

break_feedback: Identifies and breaks the feedback paths present in the structural VHDL design.

find_faults: Analyzes the fault list of the design to select faults of interest (AU and UO).

map_faults: Locates nodes with AU and UO faults, identifies POPs and combines them into a PO. The tool finally outputs a structural VHDL file with improved controllability and observability.

The steps involved are illustrated in Figure 7. As in the previous approach, a structural VHDL description of the NCL pipeline is required.

1) The first step is to break the feedback paths using the break_feedback script.

break_feedback design_name.vhd

2) The structural VHDL design is now flattened using Leonardo Spectrum. This reduces the complexity of the script used to identify fault nodes.

3) A command file, fs.do, invokes FastScan on the design and stores the fault list in a fault file, design_name.faults.

4) The script find_faults searches the fault file for specific fault types (UO, AU) and stores the locations of these faults in a data file, faults.dat.

find_faults design_name.faults

5) Now, POPs can be identified and combined into a PO. The script map_faults performs this function.

map_faults design_name.vhd

It reads faults.dat to identify POPs and processes the input VHDL design file to incorporate them. It automatically inserts generic balanced XOR trees or observation latches to combine the POPs onto a PO. Finally, it outputs a VHDL file, output_dft.vhd, which has all the DFT features incorporated in it.

Steps 3 through 5 may be repeated as necessary. Once the target fault coverage is achieved, functional verification is done as the final step.

Figure 6: Previous DFT Design Flow

Figure 7: Automated DFT Tool - Usage Steps


Algorithm break_feedback
/* A feedback path exists between the completion component and the registration component */
/* Input file = structural VHDL design */
a) Read Input file.
b) Modify the entity to include a test control pin tc.
c) Identify completion component, determine its output signal x.
   /* A completion component has only one output signal. */
d) Identify registration component.
e) If (signal x is an input for this component) then
f)    Break feedback;
g)    Include test point;
h) Elseif (signal x is a PO) then
i)    Null;
j) End if;
k) Repeat steps (c) through (j) until all feedbacks are broken.
l) Output a modified VHDL file.

Algorithm find_faults
/* Input file = design_name.faults (fault list file) */
/* Output file = faults.dat */
a) Open Input file.
b) Read one line at a time from Input file until end of file is reached.
c) If ((fault_type = AU) or (fault_type = UO)) then
d)    write current line to Output file
e) else
f)    go to step b.
g) end if;

An example fault file is included in the appendix.

Algorithm map_faults
/* Input file = design_name.vhd */
/* Fault file = faults.dat */
/* Output file = output_dft.vhd */
a) observation_points_array = empty array;
b) fault_count = 0;
c) fault_line = Read a line from faults.dat
d) fault_gate = 1st field from fault_line;
e) fault_port = 2nd field from fault_line;
f) Read one line at a time from Input file until end of file is reached.
g) If (fault_gate = found) then
      /* find the signal mapped to this port and mark it as observation point */
      fault_signal = signal mapped to fault_port
      /* check if fault_signal is already present in observation_points_array */
h)    If (fault_signal already present = false) then
i)       add fault_signal to observation_points_array;
j)       increment fault_count
k)    end if;
l) end if;
m) Modify the Input file as follows:
   1) Modify the entity of the design to accommodate the added PO. Additional port signals clk and scan_en are added if an observation latch is used.
   2) Include the signal declaration for the array of OPs.
      /* for the balanced XOR approach */
      /* m = number obtained by rounding fault_count up to a power of 2 */
      /* e.g., if fault_count = 10, m would be 16 */
      signal obsvn_pts : std_logic_vector (m-1 downto 0);
      /* for the observation latch approach */
      /* m = number obtained by rounding fault_count up to the nearest multiple of 4 */
      /* e.g., if fault_count = 17, m would be 20 */
      signal obsvn_pts : std_logic_vector (m-1 downto 0);
   3) Include the generic XOR-tree component or the generic observation latch component. The number m obtained in step 2 is passed as the generic map input to these components.
   4) Provide port mappings for these generic components.
   5) Write these modifications to the Output file.

A portion of a modified VHDL file is shown in the appendix.
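The bookkeeping in map_faults (collecting each fault signal once, then sizing the obsvn_pts vector) can be sketched in Python as follows. This is an illustrative re-expression of the pseudocode, not the tool's Perl source; the function names are hypothetical, and the gate and latch counts assume 2-input XOR gates and four POPs per observation latch, as described in the text.

```python
def collect_observation_points(fault_signals):
    """Keep each fault signal once, in first-seen order (steps g to l)."""
    return list(dict.fromkeys(fault_signals))


def xor_tree_width(fault_count):
    """Round fault_count up to a power of 2 (balanced XOR approach)."""
    m = 1
    while m < fault_count:
        m *= 2
    return m


def latch_vector_width(fault_count, pops_per_latch=4):
    """Round fault_count up to a multiple of pops_per_latch (latch approach)."""
    latches = -(-fault_count // pops_per_latch)  # ceiling division
    return latches * pops_per_latch


if __name__ == "__main__":
    print(xor_tree_width(10))        # example from the pseudocode: 16
    print(latch_vector_width(17))    # example from the pseudocode: 20
    m_xor = xor_tree_width(34)
    m_latch = latch_vector_width(34)
    # a tree of 2-input XORs over m inputs needs m - 1 gates
    print(m_xor, m_xor - 1, m_latch // 4)
```

For 34 POPs the sketch gives a 64-input tree built from 63 two-input XOR gates, or 9 observation latches, matching the figures quoted in the results discussion.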

6.0 Results and Analysis

In this section we demonstrate the effectiveness of the proposed DFT tool. The DFT tool was applied to the NCL adder pipeline shown in Figure 3. This design was chosen to allow a comparative analysis with the manual DFT approach [14]. A single-stage NCL pipeline was analyzed first, followed by a two-stage pipeline, and the results are presented in Table 1. Simulations were run on 900 MHz Sun SPARC machines running the SunOS operating system.

As can be seen in Table 1, the automated tool significantly increases fault coverage for both the single-stage and two-stage pipelines: from 52.19% to 98.32% and from 26.92% to 97.63%, respectively. For lower-order pipelines, high fault coverage can be obtained with a single iteration through the DFT tool. The manual DFT approach is tedious because of the large number of faults that must be processed by hand. It followed a cause-and-effect procedure that required several repetitions to reach the required fault coverage, because the fault coverage and the fault list had to be analyzed at every step. As observed in Table 1, with increasing order of pipelines the untestable faults grow exponentially, severely impeding possible manual approaches.

Table 1 - Testability analysis of NCL Adder Pipelines

Single Stage Adder Pipeline

Design Approach | # of Untestable faults processed | # of PIs Added | # of POs Added | # of POPs monitored | Fault Coverage % | # of iterations to achieve fault coverage | CPU time in secs
No DFT methods | 60 | 0 | 0 | 0 | 52.19 | 0 | 0.8
Automated DFT Tool (XOR-Tree) | 15 | 1 | 1 | 12 | 98.32 | 1 | 1.8

Two Stage Adder Pipeline

Design Approach | # of Untestable faults processed | # of PIs Added | # of POs Added | # of POPs monitored | Fault Coverage % | # of iterations to achieve fault coverage | CPU time in secs
No DFT methods | 143 | 0 | 0 | 0 | 26.92 | 0 | 0.8
Automated DFT Tool (XOR-Tree) | 86 | 1 | 1 | 34 | 97.63 | 1 | 1.8
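The hardware cost of observing the POP counts reported in Table 1 (12 and 34) can be estimated with a short calculation. The sketch below relies on two assumptions that are labeled in the code: a balanced tree of two-input XOR gates over n inputs (n a power of two) uses n - 1 gates, and each scan latch observes four POPs. The second assumption is inferred from the multiple-of-four rounding rule and is not stated explicitly in the paper. The function names are hypothetical.

```python
import math


def xor_tree_cost(pops: int):
    """Return (tree inputs, number of 2-input XOR gates) for a balanced tree.

    The input count is rounded up to a power of two; a balanced binary tree
    over that many inputs needs one gate fewer than it has inputs.
    """
    if pops < 1:
        raise ValueError("pops must be at least 1")
    inputs = 1 << (pops - 1).bit_length()
    return inputs, inputs - 1


def scan_latch_count(pops: int, pops_per_latch: int = 4) -> int:
    """Number of scan latches, ASSUMING each latch observes `pops_per_latch` POPs."""
    if pops < 1:
        raise ValueError("pops must be at least 1")
    return math.ceil(pops / pops_per_latch)


if __name__ == "__main__":
    for pops in (12, 34):  # POP counts from Table 1
        inputs, gates = xor_tree_cost(pops)
        print(pops, inputs, gates, scan_latch_count(pops))
```

For 34 POPs this gives a 64-input tree with 63 XOR gates and nine scan latches, which agrees with the figures quoted in this section.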

Using observation latches to probe potential observation points was also considered. This method requires clock inputs during the test phase. Initial testing with this approach has been very promising. Latches allow for better automation and also reduce the amount of additional logic. For a design with 34 POPs, the balanced XOR approach needs an XOR tree with 64 inputs, which requires 63 two-input XOR gates. If scan latches similar to Figure 5 are used, only nine scan latches are required. Further testing is needed to validate this approach.

7.0 Conclusion and Future Work

This paper presents an automated DFT tool for NCL pipelines. Test results on an NCL adder pipeline yielded stuck-at fault coverage of roughly 98% (98.32% for the single-stage pipeline and 97.63% for the two-stage pipeline; see Table 1). The DFT tool offers considerable time savings due to its effectiveness and ease of use. This is especially true for higher-order pipelines and larger designs, which may be too difficult to analyze manually. Scan latches were employed to increase observability while adding only one primary output pin. Compared to XOR-tree structures, scan latches allow for better control of fault propagation. With careful physical design (PD), the impact of the additional logic can be minimized without degrading functional behavior.

References

[1] I. Blunno, L. Lavagno, “Automated synthesis of micropipelines from behavioral Verilog HDL”, Proc. International Symposium on Advanced Research in Asynchronous Circuits and Systems, IEEE, Apr 2000.

[2] S. C. Smith, Gate and Throughput Optimizations for NULL Convention Self-Timed Digital Circuits, Ph.D. Dissertation, School of Electrical Engineering and Computer Science, University of Central Florida, 2001.

[3] Roig, Formal Verification and Testing of Asynchronous Circuits, Ph.D. Dissertation, Universitat Politecnica de Catalunya, May 1997.

[4] “A proposal for Research on Testability of Asynchronous Circuits”, Internal report HPCA-ECS-95/03, Aug 1998.

[5] Y-S Kang, K. H. Huh and S. Kang, “New Scan Design of Asynchronous Sequential Circuits”, Dept. of Electrical Eng., Yonsei University.

[6] S. H. Unger, Asynchronous Sequential Switching Circuits, Wiley, New York, 1969.

[7] C. J. Myers, Asynchronous Circuit Design, Wiley publications, USA, Jul 2001.

[8] M. L. Bushnell, V. D. Agrawal, Essentials of Electronic Testing for Digital, Memory, and Mixed-Signal VLSI Circuits (Frontiers in Electronic Testing, Volume 17).

[9] W. K. Al-Assadi et al., “Faulty Behavior of Storage Elements and its Effects on Sequential Circuits”, IEEE Transactions on VLSI Systems, Vol. 1, no. 4, pp. 446-452, Dec 1993.

[10] M. Kishinevsky, A. Kondratyev, L. Lavagno, A. Taubin, “Partial-Scan Delay Fault Testing of Asynchronous Circuits”, IEEE, Nov 1998.

[11] M. Ligthart, K. Fant, R. Smith, A. Taubin, A. Kondratyev, “Asynchronous design using commercial HDL synthesis tools”, Proc. International Symposium on Advanced Research in Asynchronous Circuits and Systems, IEEE, Apr 2000.

[12] A. Kondratyev, L. Sorensen, A. Streich, “Testing of Asynchronous Designs by Inappropriate Means. Synchronous approach.”, IEEE, 2002.

[13] M. L. Bushnell, V. D. Agrawal, Essentials of Electronic Testing for Digital, Memory and Mixed Signal VLSI Circuits, Kluwer Academic Publishers, 2001.

[14] B. Bhaskaran, V. Satagopan, W. K. Al-Assadi, S. C. Smith, “Implementation of Design for Tests for NCL Designs”, CDES 2005.

Appendix

a) Example of a fault file for a single stage adder pipeline. Each entry lists the kind of stuck-at fault, the fault classification, and the gate and port location.

1 UO /reg_0_reg3_and02_4/Y
0 AU /reg_1_reg2_and02_4/Y
1 AU /reg_1_reg2_or02_1/Y
1 AU /reg_1_reg2_and02_4/A0
1 AU /reg_1_reg2_and02_4/A1
0 AU /reg_1_reg1_and02_4/Y
1 PU /reg_1_reg1_or02_1/Y
1 PU /reg_1_reg1_and02_4/Y
1 PU /reg_1_reg1_and02_4/Y
1 AU /reg_0_reg1_and02_2/Y
1 AU /reg_0_reg1_and02_3/Y
1 AU /reg_0_reg2_and02_2/Y
1 AU /reg_0_reg2_and02_3/Y
1 AU /reg_0_reg3_and02_2/Y
1 AU /reg_0_reg3_and02_3/Y

b) Illustration of some of the modifications made to the original VHDL file to incorporate testability features.

Original design entity

use work.ncl_signals.all;
library ieee;
use ieee.std_logic_1164.all;

entity fapl_two is
  port (
    a     : IN  dual_rail_logic;
    b     : IN  dual_rail_logic;
    c     : IN  dual_rail_logic;
    reset : IN  std_logic;
    ki    : IN  std_logic;
    car   : OUT dual_rail_logic;
    sum   : OUT dual_rail_logic;
    ko    : OUT std_logic
  );
end entity fapl_two;

Design entity modified to improve testability

The balanced xor component and the mapping of POPs are shown here. Notice the added test control pin tc and the added PO m_obsvn_pt.

library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity fapl_two is
  port (
    a          : IN  std_logic_vector (1 DOWNTO 0);
    b          : IN  std_logic_vector (1 DOWNTO 0);
    c          : IN  std_logic_vector (1 DOWNTO 0);
    reset      : IN  std_logic;
    ki         : IN  std_logic;
    tc         : IN  std_logic;
    car        : OUT std_logic_vector (1 DOWNTO 0);
    sum        : OUT std_logic_vector (1 DOWNTO 0);
    ko         : OUT std_logic;
    m_obsvn_pt : OUT std_logic
  );
end fapl_two;

-- portion of architecture
architecture bool_gtlev of fapl_two is
  signal obsn_pts : std_logic_vector(15 downto 0);

  component balxor_nto1
    generic (stage : in integer := 4);
    port (
      a      : in  std_logic_vector((2**stage)-1 downto 0);
      obs_pt : out std_logic
    );
  end component;


  -- other lines
begin
  -- POPs mapped to the balanced xor tree input
  gnds <= '0';
  obsn_pts(0)  <= ncl_reg0_out_2_RAIL0;
  obsn_pts(1)  <= ncl_reg0_out_2_RAIL1;
  obsn_pts(2)  <= ncl_reg0_out_1_RAIL0;
  obsn_pts(3)  <= ncl_reg0_out_1_RAIL1;
  obsn_pts(4)  <= ncl_reg0_out_0_RAIL0;
  obsn_pts(5)  <= ncl_reg0_out_0_RAIL1;
  obsn_pts(6)  <= reg_0_reg3_andout1;
  obsn_pts(7)  <= reg_1_reg1_andout1;
  obsn_pts(8)  <= tempc_RAIL1;
  obsn_pts(9)  <= ko1_out_1;
  obsn_pts(10) <= reg_1_reg2_andout1;
  obsn_pts(11) <= temps_RAIL1;
  obsn_pts(12) <= temps_RAIL0;
  obsn_pts(13) <= ko1_out_0;
  obsn_pts(14) <= gnds;
  obsn_pts(15) <= gnds;
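The behavior of a balanced XOR tree such as the one driven by this 16-bit vector can be modeled in Python to show why grounded padding inputs are harmless and how fault effects reach the single added output. This is a behavioral sketch only: the function names and the bit pattern are hypothetical, and the internals of the balxor_nto1 component are assumed rather than taken from the paper.

```python
def balanced_xor(bits):
    """Reduce a power-of-two-length bit list pairwise, one tree level at a time."""
    level = list(bits)
    if not level or len(level) & (len(level) - 1):
        raise ValueError("input length must be a power of two")
    while len(level) > 1:
        # Each pass models one level of 2-input XOR gates.
        level = [level[i] ^ level[i + 1] for i in range(0, len(level), 2)]
    return level[0]


def flip(bits, *positions):
    """Return a copy of `bits` with the given positions inverted (fault effects)."""
    out = list(bits)
    for p in positions:
        out[p] ^= 1
    return out


# Hypothetical fault-free values on 14 POPs, padded with two grounded inputs.
reference = [1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0] + [0, 0]

if __name__ == "__main__":
    good = balanced_xor(reference)
    print(balanced_xor(flip(reference, 6)) != good)     # one fault effect is visible
    print(balanced_xor(flip(reference, 6, 7)) != good)  # two effects cancel out
```

Because the tree output is the parity of its inputs, a fault effect arriving at a single POP always changes the observed output, whereas effects arriving at an even number of POPs at the same time cancel. This masking is one reason scan latches can give finer control over fault propagation than an XOR tree.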

